Naked-eye 3D technical paper

Research on Depth-Map-Based Multi-View Naked-Eye 3D Stereoscopic Video Technology

Abstract: 3D stereoscopic video technology is attracting more and more attention, but most current 3D video systems either require special glasses to perceive the stereoscopic effect or require viewers to watch from a fixed angle. A multi-view naked-eye 3D stereoscopic video system avoids both limitations and provides the best 3D viewing experience. The most advanced 3D video research worldwide currently focuses on multi-view 3D stereoscopic video technology based on depth maps. This paper studies several key technologies of a depth-map-based multi-view naked-eye 3D stereoscopic video system, including depth map extraction, virtual view synthesis and multi-view video synthesis, and carries out corresponding simulation experiments. The experimental results show that the depth-map-based multi-view naked-eye 3D stereoscopic video system has the advantages of small data volume, high transmission efficiency, adaptive adjustment of display content and good user interaction.

Keywords: naked-eye 3D stereoscopic video; depth map; 3D TV

At present, 3D stereoscopic video technology is receiving more and more attention. The mainstream approaches are binocular stereoscopic video (containing video data from two viewpoints) and multi-view stereoscopic video (containing video data from more than eight viewpoints). Binocular stereoscopic video comes in two forms: glasses-based viewing and binocular naked-eye stereoscopic display. The former requires the viewer to wear polarized glasses, which is inconvenient, while the latter requires viewing from a fixed angle; when many people watch the same display at once, most viewers cannot occupy the best viewing position, which greatly degrades the experience. With multi-view stereoscopic video technology, a single naked-eye 3D display presents content from multiple perspectives simultaneously, so the audience can watch from any angle, which greatly improves viewing convenience. Multi-view stereoscopic video has therefore become the mainstream of current research. However, compared with binocular stereoscopic video, the data volume of multi-view stereoscopic video is several times larger, which complicates storage and transmission. Multi-view stereoscopic video technology based on depth maps keeps the data volume small, so it has become the most promising multi-view scheme. This paper studies several key technologies of depth-map-based multi-view stereoscopic video and carries out corresponding simulation experiments. The paper is organized as follows: Section 1 introduces the overall architecture of the depth-map-based multi-view 3D stereoscopic system, Section 2 introduces depth map extraction, Section 3 introduces virtual viewpoint generation, Section 4 introduces multi-view video synthesis, and Section 5 concludes the paper.

1. Multi-view 3D stereoscopic video system framework based on depth map.

The technical framework of the depth-map-based multi-view 3D stereoscopic video system is shown in Figure 1. First, the original video sequence must be shot. Although the final multi-view naked-eye stereoscopic display needs video content from 9 or more views, only 2-3 views need to be shot in the original capture stage: depth-map-based virtual viewpoint generation can produce videos for multiple virtual viewpoints (9 in this paper) at the decoding end. This is why depth-map-based multi-view stereoscopic video has the advantages of a small data volume and easy transmission.

After the original video sequence is shot, the depth maps must be extracted and the camera parameters calculated. The quality of the depth map extracted in this step directly determines the quality of the virtual viewpoint videos generated later. The data is then compressed, encoded and transmitted over the network to the decoding end. After decoding, the receiver performs depth-map-based virtual viewpoint generation, turning the original 2-3 views of video data into 9 views. These 9 views cannot be played directly on a multi-view naked-eye 3D stereoscopic display, so they must be synthesized into a single interleaved image matched to the 3D raster (parallax barrier) structure used in the multi-view display.

In the following sections, depth map extraction, virtual viewpoint generation and multi-view video synthesis are introduced in detail, together with the corresponding simulation experiments.

2. Depth map extraction

2.1 Introduction to the depth map

The depth map is a gray-scale image (as shown in Figure 2-b) whose pixel values range from 0 to 255. As mentioned above, the depth map is mainly used for virtual viewpoint generation, and that process works with actual depth values, so a conversion relationship must be established to turn the gray value of each pixel in the depth map into an actual scene depth:

In formula (1), z is the depth value needed during virtual viewpoint generation, v is the gray value of a pixel in the depth image of Figure 2-b, and Znear and Zfar are the nearest and farthest depths in the captured scene, respectively; these two values must be measured when the original video sequence is shot.
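Formula (1) itself is not reproduced in this text. The gray-to-depth conversion it describes is conventionally the MPEG-style non-linear mapping, which is what the sketch below assumes (the function name and the exact form of the mapping are assumptions, not taken verbatim from the paper):

```python
def gray_to_depth(v, z_near, z_far):
    """Convert an 8-bit depth-map gray value v (0-255) to a scene depth z.

    Assumes the standard MPEG-style mapping in which v = 255 corresponds to
    the nearest depth z_near and v = 0 to the farthest depth z_far; z_near
    and z_far are measured when the original sequence is shot.
    """
    return 1.0 / ((v / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far)
```

Note that the mapping is non-linear in z: equal gray-level steps represent finer depth steps near the camera, where depth errors are most visible in the synthesized views.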

2.2 Depth Map Extraction Based on Block Matching

Two cameras placed side by side shoot the same scene, yielding two images. To obtain the depth map of one image, its pixels must be matched against the other image. After matching, the parallax (disparity) of each pixel between the two images is obtained. The relationship between depth and disparity is as follows:

Here z is the required depth value, d is the disparity obtained from pixel matching, f is the focal length of the camera, and b is the baseline distance between the two cameras. Given the disparity d, the depth z is easily obtained; the key step is obtaining an accurate disparity, which requires accurate pixel matching. In practice, because different cameras have different exposure parameters, pixel brightness differs between the two images even when the same scene is shot, so we match image blocks rather than single pixels to improve robustness. In this experiment, 3×3 image blocks are used. Note that, by default, the original video sequence is shot by two strictly horizontal, parallel cameras, so block matching searches only in the horizontal direction, never vertically. The whole depth map extraction process is shown in Figure 3.
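The block-matching procedure above can be sketched as follows. The sum-of-absolute-differences (SAD) cost and the search range are illustrative assumptions; the 3×3 block size and the horizontal-only search follow the text, and numpy is used for brevity:

```python
import numpy as np

def disparity_map(left, right, block=3, max_disp=16):
    """Brute-force SAD block matching between two rectified gray images.

    left/right: 2-D integer arrays from strictly horizontal parallel
    cameras, so only a horizontal search is performed, as in the paper.
    Returns per-pixel integer disparities (a simplified sketch).
    """
    h, w = left.shape
    r = block // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(r, h - r):
        for x in range(r, w - r):
            patch = left[y - r:y + r + 1, x - r:x + r + 1].astype(np.int64)
            best, best_d = None, 0
            for d in range(0, min(max_disp, x - r) + 1):
                cand = right[y - r:y + r + 1,
                             x - d - r:x - d + r + 1].astype(np.int64)
                sad = np.abs(patch - cand).sum()
                if best is None or sad < best:
                    best, best_d = sad, d
            disp[y, x] = best_d
    return disp

def depth_from_disparity(d, f, b):
    """z = f * b / d, the depth-disparity relation of Section 2.2."""
    return f * b / d
```

Matching blocks instead of single pixels averages out per-camera exposure differences; larger blocks are more robust but blur depth discontinuities.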

The depth map extracted from the multi-view video sequence provided by MPEG, an international video standard organization, is shown in Figure 4.

3. Virtual viewpoint generation

Virtual viewpoint generation technology [2] projects the pixels of the left and right viewpoints to any position between them, generating the image of a virtual viewpoint that was never actually shot by a camera (as shown in Figure 5); this requires the depth maps of the left and right viewpoints together with the camera parameters. The technique uses a 3D projection algorithm to find corresponding points between two image planes: a point on one image plane is first projected into the 3D world coordinate system, and then projected from the world coordinate system onto the other image plane.

For any point p0 with coordinates (u0, v0) on image plane V0, to find the coordinates (u1, v1) of the corresponding point p1 on image plane V1, the whole 3D projection is computed as follows:

Here, z is the distance from the point in 3D world coordinates to the camera along the Z axis of the camera coordinate system, and P is the corresponding projection matrix. P is composed of the camera intrinsic matrix K, the rotation matrix R and the translation vector t. K is a 3×3 upper-triangular matrix composed of the focal length f, the skew parameter and the principal point (u', v'); R and t describe the position and orientation of the camera in world coordinate space.
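The two-step projection described above (back-project a pixel to the world using its depth z, then re-project it through the second camera) can be sketched as follows, assuming 3×4 projection matrices P = K[R|t]; the function and variable names are illustrative:

```python
import numpy as np

def warp_point(u0, v0, z, P0, P1):
    """Project pixel (u0, v0) with depth z from image plane 0 to plane 1.

    P0, P1: 3x4 projection matrices P = K[R|t]. Step 1 back-projects the
    pixel into the 3-D world coordinate system; step 2 re-projects the
    world point into the second camera and dehomogenizes.
    """
    # Step 1: solve P0 @ [X, Y, Z, 1]^T = z * [u0, v0, 1]^T for the
    # world point (z is the depth along the camera's Z axis).
    M0, p0 = P0[:, :3], P0[:, 3]
    world = np.linalg.solve(M0, z * np.array([u0, v0, 1.0]) - p0)
    # Step 2: project into camera 1 and divide by the homogeneous scale.
    q = P1 @ np.append(world, 1.0)
    return q[0] / q[2], q[1] / q[2]
```

In a full depth-image-based renderer this warp is applied per pixel from both reference views, with z-buffering to resolve occlusions and hole filling for newly exposed regions.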

Through the above steps, viewpoint synthesis based on depth map can be realized initially.

4. Multi-view video synthesis

4.1 Naked-eye 3D stereoscopic display principle

To let viewers experience a 3D stereoscopic effect, the core principle is to present different images to the two eyes at the same time. The simplest method is special glasses, which force each eye to see a different image but are inconvenient for viewers (especially those who already wear glasses). This paper adopts naked-eye 3D stereoscopic display: a parallax barrier is added in front of the screen to control the exit direction of each pixel's light, so that some sub-images enter only the left eye and others only the right eye, forming binocular parallax and producing stereoscopic vision (as shown in Figure 6).

4.2 Multi-view video synthesis

The naked-eye 3D parallax barrier used in this paper has a complex structure: it directs the image content of nine viewpoints so that nine views are displayed on the same screen at the same time. Although a viewer still sees only two images at any moment, the viewing angle of the display is greatly increased. To match the 9-view raster barrier, the RGB sub-pixels of the 9 view images must be rearranged in the order shown in Figure 7, where the numbers indicate viewpoint indices. Rearranging the RGB values of the nine viewpoint images in this order yields a stereoscopic image with 9 times the resolution of each original view image, which can be played on a multi-view naked-eye 3D display. A stereoscopic image composed of 9 view images is shown in Figure 8 (the stereoscopic effect can only be seen on a 9-view naked-eye 3D stereoscopic display).
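The sub-pixel rearrangement can be sketched as below. The diagonal view-assignment mask used here is a common illustrative pattern for slanted-barrier displays, not necessarily the exact Figure 7 ordering, which depends on the specific barrier hardware:

```python
import numpy as np

def interleave_9_views(views):
    """Interleave nine RGB view images into one display image.

    views: list of 9 arrays of shape (h, w, 3). Produces a (3*h, 3*w, 3)
    image with 9x the pixel count, in which each R, G or B sub-pixel is
    drawn from one of the nine views via a diagonal assignment mask
    (an assumed pattern standing in for the Figure 7 ordering).
    """
    h, w, _ = views[0].shape
    out = np.zeros((3 * h, 3 * w, 3), dtype=views[0].dtype)
    for y in range(3 * h):
        for x in range(3 * w):
            for c in range(3):                # R, G, B sub-pixels
                view = (3 * x + c + y) % 9    # hypothetical diagonal mask
                out[y, x, c] = views[view][y // 3, x // 3, c]
    return out
```

Because every sub-pixel of the output carries exactly one view, each view contributes one ninth of the output sub-pixels, which is why per-view resolution is traded for viewing angle.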

5. Conclusion

Multi-view stereoscopic video technology based on depth maps is a current research hotspot in 3D stereoscopic video. It requires no special 3D glasses and has the advantages of a small total data volume and a wide viewing angle. This paper has studied several key technologies of the depth-map-based multi-view naked-eye 3D stereoscopic video system, including depth map extraction, virtual view synthesis and multi-view video synthesis, and has carried out corresponding simulation experiments.

References

[1] Müller, K.; Merkle, P.; Wiegand, T., "3-D Video Representation Using Depth Maps", Proceedings of the IEEE, vol. 99, no. 4, pp. 643-656, April 2011.

[2] Ndjiki-Nya, P.; Köppel, M.; Doshkov, D.; Lakshman, H.; Merkle, P.; Müller, K.; Wiegand, T., "Depth Image-Based Rendering With Advanced Texture Synthesis for 3-D Video", IEEE Transactions on Multimedia, vol. 13, no. 3, pp. 453-465, June 2011.

[3] Müller, K.; Merkle, P., "Challenges in 3D Video Standardization", Visual Communications and Image Processing (VCIP), 2011 IEEE, pp. 1-4, Nov. 2011.

[4] Sourimant, G., "A Simple and Efficient Way to Compute Depth Maps for Multi-View Videos", 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON), 2010, pp. 1-4, June 2010.

[5] Hopf, K., "An Autostereoscopic Display Providing Comfortable Viewing Conditions and a High Degree of Telepresence", IEEE Transactions on Circuits and Systems for Video Technology, vol. 10, no. 3, pp. 359-365, April 2000.
