More than 20 years have passed since free-viewpoint video technology was proposed, with which a user's viewpoint can be set freely within a three-dimensional reconstruction of a target scene captured by multi-view cameras. This technology allows us to capture and reproduce the real world as recorded. Once the world is captured in digital form, we can modify it as in augmented reality (i.e., by placing virtual objects in the digitized real world). In contrast to this concept, the augmented world allows us to see through real objects by synthesizing backgrounds that cannot be observed directly from our own perspective. The key idea is to generate the background image using multi-view cameras that observe the background from different positions, and to seamlessly overlay the recovered image onto our digitized perspective. In this paper, we review such desired-view generation techniques from the perspective of free-viewpoint image generation and discuss challenges and open problems through a case study of our implementations.