Viewport-based 360 Video Streaming: Motion-Constrained Tile and Viewport Extraction
Presenter: Eun-Seok Ryu (esryu@gachon.ac.kr)
Dongmin Jang, Jong-Beom Jeong, Eun-Seok Ryu
Multimedia Communications and Systems Lab. (MCSL), http://mcsl.gachon.ac.kr
Department of Computer Engineering, Gachon University
Introduction
High Bandwidth Requirement of VR
§ Various HMD devices have recently come to market
§ 12K resolution is recommended for high-quality VR with reduced nausea
§ High bandwidth and high computational complexity are the hurdles
Need to reduce the required bandwidth!
Figure: the emergence of various HMDs (Gear VR, Oculus Rift, Daydream, PlayStation VR)

Requirements for high-quality VR (Source: Technicolor, Oct. 2016; m39532, MPEG 116th Meeting)
Requirement        Details
pixels/degree      40 pix/deg
video resolution   11520 x 6480
framerate          90 fps

3079-18-0045-00-0003
Viewport Independent vs Viewport Dependent
Viewport Independent
§ Transmit the whole picture
§ Projection and packing
§ Downsampling / adjusting QP
Viewport Dependent (Proposed Method)
§ Transmit the viewport only
§ Bitrate saving
§ Reduced decoding complexity
§ But adds delay
Figure: select the tiles corresponding to the viewport; the decoder performs viewpoint-based decoding
Keypoint for Viewport Dependent Streaming
§ Field of View (FOV)
  • The FOV of HMDs: 96° to 110°
  • The FOV covers only part of the 360° picture
  • The user's current viewport: high resolution
  • Remaining part: low resolution
§ Tiles
  • Parallelization tools
  • The picture is divided into rectangular regions
  • Flexible horizontal and vertical boundaries
  • A tile refers spatially only to itself, but temporally it can refer to other tiles
  • Decoding problems occur when only some of the tiles are transmitted
Figures: Field of View (FOV); a frame divided into 8 tiles
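The tile selection implied by this slide can be sketched as follows. This is a minimal sketch, not the presenters' implementation: it assumes an equirectangular picture, a uniform tile grid, and a viewport approximated as a rectangle in yaw/pitch. The function name `tiles_for_viewport` and all of its parameters are hypothetical.

```python
def tiles_for_viewport(yaw_deg, pitch_deg, hfov_deg, vfov_deg, cols, rows):
    """Return raster-scan indices of the tiles that overlap the viewport.

    Assumptions (not from the slides): equirectangular projection with yaw in
    [-180, 180) mapped linearly to x and pitch in [-90, 90] mapped to y
    (top row = +90), and a uniform cols x rows tile grid.
    """
    tile_w = 360.0 / cols
    tile_h = 180.0 / rows
    left = yaw_deg - hfov_deg / 2.0
    right = yaw_deg + hfov_deg / 2.0
    top = min(90.0, pitch_deg + vfov_deg / 2.0)
    bottom = max(-90.0, pitch_deg - vfov_deg / 2.0)

    selected = set()
    for r in range(rows):
        t_top = 90.0 - r * tile_h
        t_bot = t_top - tile_h
        if t_bot >= top or t_top <= bottom:
            continue  # no vertical overlap with the viewport
        for c in range(cols):
            t_left = -180.0 + c * tile_w
            t_right = t_left + tile_w
            # Yaw wraps around, so also test the tile shifted by +/-360 degrees.
            for shift in (-360.0, 0.0, 360.0):
                if t_left + shift < right and t_right + shift > left:
                    selected.add(r * cols + c)
                    break
    return sorted(selected)


# Example: 3x3 grid, 100-degree horizontal FOV, looking at the seam (yaw = 180).
print(tiles_for_viewport(180, 0, 100, 40, 3, 3))
```

The wraparound test matters because a viewport centred near yaw = ±180° straddles the left and right picture edges.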
Proposed Motion-Constrained Tile Sets (MCTS)
§ Original (motion estimation before applying MCTS): motion vectors can refer to anywhere in the reference picture
§ Motion estimation after applying MCTS (motion vector range modified): motion vectors refer only to the tile at the same position in the reference picture
§ MPEG adopted Gachon Univ.'s MCTS code in HM ver. 16.18
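The motion vector restriction can be illustrated with a small sketch. It is not the HM code: it assumes integer-pel motion vectors and a tile given by its sample coordinates, and the names `Tile`, `mcts_search_range` and `clamp_mv` are hypothetical.

```python
from typing import NamedTuple, Tuple


class Tile(NamedTuple):
    x0: int  # left edge, inclusive (luma samples)
    y0: int  # top edge, inclusive
    x1: int  # right edge, exclusive
    y1: int  # bottom edge, exclusive


def mcts_search_range(tile: Tile, bx: int, by: int, bw: int, bh: int) -> Tuple[int, int, int, int]:
    """Integer-pel MV range (min_x, max_x, min_y, max_y) that keeps a bw x bh
    block located at (bx, by) inside the co-located tile of the reference picture."""
    return (tile.x0 - bx, tile.x1 - (bx + bw), tile.y0 - by, tile.y1 - (by + bh))


def clamp_mv(mv: Tuple[int, int], rng: Tuple[int, int, int, int]) -> Tuple[int, int]:
    """Clamp an integer-pel motion vector into the allowed range."""
    lo_x, hi_x, lo_y, hi_y = rng
    return (min(max(mv[0], lo_x), hi_x), min(max(mv[1], lo_y), hi_y))


tile = Tile(0, 0, 128, 128)
rng = mcts_search_range(tile, bx=60, by=60, bw=16, bh=16)
print(rng, clamp_mv((-100, 80), rng))
```

An encoder would more likely restrict the search window up front than clamp a vector after the search; the clamp is shown only because it makes the allowed range easy to see.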
MCTS with HEVC and SHVC - Structure
SHM Encoder
§ When referring to the same tile, TIP is performed
§ When referring to the other tiles, ILP is performed using the upsampled BL
§ When referring to the other tiles, intra prediction is performed
Figure labels: Pic EL t-1, Pic EL t, Pic t, Upsampled Pic BL t
Abbreviations
  • EL: Enhancement Layer
  • BL: Base Layer
  • TIP: Temporal Inter Prediction
  • ILP: Inter Layer Prediction
* Filter interpolation, temporal candidate of AMVP and MERGE
MCTS Considerations (1/2)
Interpolation
§ An eight-tap filter is used to interpolate the luma prediction
§ Interpolation uses 3 pixels to the left and top and 4 pixels to the right and bottom of the current pixel
§ Modify the reference range of motion vectors accordingly
Figure: the current pixel and the pixels used for interpolation (3 pixels / 4 pixels)
Figure: interpolation problem when referring to a tile at the same position in TIP
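The effect of the 3/4-pixel margin on the allowed motion vectors can be sketched as below. This is an illustration under assumptions: quarter-pel luma motion vectors, the eight-tap margin applied only along an axis with a fractional component, and hypothetical function names `reference_footprint` and `fits_in_tile`.

```python
def reference_footprint(x, y, w, h, mvx_q, mvy_q):
    """Return (left, top, right, bottom), inclusive, of the reference samples
    read for a w x h luma block at (x, y) with a quarter-pel MV (mvx_q, mvy_q).

    A fractional component needs the eight-tap filter, which reads 3 extra
    samples before and 4 extra samples after the block along that axis.
    """
    ix, fx = mvx_q >> 2, mvx_q & 3  # floor integer part and quarter-pel fraction
    iy, fy = mvy_q >> 2, mvy_q & 3
    left = x + ix - (3 if fx else 0)
    right = x + ix + w - 1 + (4 if fx else 0)
    top = y + iy - (3 if fy else 0)
    bottom = y + iy + h - 1 + (4 if fy else 0)
    return left, top, right, bottom


def fits_in_tile(footprint, tile):
    """tile = (x0, y0, x1, y1) with x1 and y1 exclusive."""
    left, top, right, bottom = footprint
    x0, y0, x1, y1 = tile
    return left >= x0 and top >= y0 and right < x1 and bottom < y1


# A block touching the left tile edge: a zero MV is fine, a half-pel MV is not.
tile = (64, 0, 128, 128)
print(fits_in_tile(reference_footprint(64, 64, 16, 16, 0, 0), tile))
print(fits_in_tile(reference_footprint(64, 64, 16, 16, 2, 0), tile))
```

The second call fails because the half-pel position pulls in three samples from the neighbouring tile, which is the situation the modified reference range has to rule out.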
MCTS Considerations (2/2)
Temporal Candidate of AMVP and MERGE
§ Temporal candidates: C3 and H blocks (right figure)
§ Problem: a temporal candidate can cross the column boundary between tiles
§ Exclude the H block at the column boundary between tiles
Figure: temporal candidate problem at the column boundary between tiles (Tile 1, Tile 2, PU, H)
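The exclusion rule can be sketched as a position check. The sketch assumes the usual candidate positions, with H just outside the bottom-right corner of the prediction unit (PU) and C3 at its centre; the function name and the inclusive `tile_right` parameter are hypothetical and not taken from the HM code.

```python
def usable_temporal_candidates(pu_x, pu_y, pu_w, pu_h, tile_right):
    """Return the temporal candidate positions that stay inside the tile column.

    tile_right is the x coordinate of the last luma column of the current tile
    (inclusive). H is dropped when it falls to the right of that column.
    """
    c3 = (pu_x + pu_w // 2, pu_y + pu_h // 2)  # centre of the co-located PU
    h = (pu_x + pu_w, pu_y + pu_h)             # bottom-right neighbour
    candidates = {"C3": c3}
    if h[0] <= tile_right:
        candidates["H"] = h
    return candidates


# A PU whose right edge coincides with the tile's right column boundary.
print(usable_temporal_candidates(112, 32, 16, 16, tile_right=127))
# A PU well inside the tile keeps both candidates.
print(usable_temporal_candidates(96, 32, 16, 16, tile_right=127))
```

C3 always lies inside the PU, so only H needs the boundary test in this sketch.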
Extraction Information Sets (EIS) SEI Message
§ Contains the replacement parameter sets (max: around 2000 MCTS sets)
§ An MCTS set contains the set of tiles to be extracted
Structure
  Extraction Information Set (1 ~ 2048)
    • MCTS Set (1 ~ 2048): Tile MCTS index 1, Tile MCTS index 2, …
    • Replacement Information of Parameter Set (1): VPS, SPS, PPS
    • Slice address
Parameter set replacement information
  • VPS: Level and Tier
  • SPS: Picture Resolution
  • Tile Partition
Bitstream before extraction: Original VPS, Original SPS, Original PPS, EIS SEI, Slice / Tile NAL Unit, NAL Unit, …
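The nesting shown on this slide can be modelled in memory roughly as follows. This is only a sketch of the hierarchy: every class and field name is hypothetical, the parameter sets are kept as opaque bytes, and nothing here reflects the actual SEI syntax.

```python
from dataclasses import dataclass, field
from typing import Dict, List

MAX_SETS = 2048  # upper bound shown on the slide for EIS and MCTS set counts


@dataclass
class MctsSet:
    mcts_indices: List[int]  # tiles (MCTSs) to extract together


@dataclass
class ExtractionInfoSet:
    replacement_vps: bytes  # carries level and tier for the extracted stream
    replacement_sps: bytes  # carries the new picture resolution
    replacement_pps: bytes
    mcts_sets: Dict[int, MctsSet] = field(default_factory=dict)

    def add_mcts_set(self, set_id: int, indices: List[int]) -> None:
        if not 1 <= set_id <= MAX_SETS:
            raise ValueError("MCTS set id out of range")
        self.mcts_sets[set_id] = MctsSet(list(indices))


eis = ExtractionInfoSet(b"vps", b"sps", b"pps")
eis.add_mcts_set(1, [0, 1, 3, 4])  # e.g. the four top-left tiles of a 3x3 grid
print(eis.mcts_sets[1].mcts_indices)
```

Keeping the replacement parameter sets next to the MCTS sets mirrors the idea that an extractor can rewrite the stream without re-encoding.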
NAL Bitstream Extractor: Functional Flow
Input: MCTS bitstream (Original VPS, Original SPS, Original PPS, EIS SEI, Slice / Tile, …)
§ Input options
  • Target EIS Id
  • Target MCTS set Id
  • Target Highest Temporal Id
§ Parse the original PPS
  • Number of original tiles
§ Parse the EIS SEI message
  • Number of EISs, MCTS sets, MCTSs
  • MCTS Id
  • Slice reordering info
  • Replacement parameter sets (VPS, SPS, PPS)
§ Replace the original parameter sets with the replacement parameter sets
§ Select the target tiles / slices corresponding to the input options
§ Adjust the slice header
  • First Slice Segment in Pic Flag
  • Slice Segment Address
Output: extracted bitstream (Replacement VPS, Replacement SPS, Replacement PPS, Target Slice / Tile, …)
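The flow above can be sketched on a toy in-memory model of the bitstream. This is not a real NAL parser: the `Nal` record, its fields and the `extract` function are hypothetical, the slice address is simplified to the tile's rank among the kept tiles, and slices are assumed to arrive in ascending MCTS order within each picture.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Nal:
    kind: str                      # "VPS", "SPS", "PPS", "SEI" or "SLICE"
    payload: bytes = b""
    poc: int = 0                   # picture the slice belongs to
    temporal_id: int = 0
    mcts_id: Optional[int] = None  # tile carried by a slice NAL unit
    slice_address: int = 0
    first_in_pic: bool = False


def extract(nals: Iterable[Nal], target_mcts: Set[int], max_tid: int,
            replacement: Dict[str, bytes]) -> List[Nal]:
    """Keep the target tiles, swap in replacement parameter sets, fix slice headers."""
    rank = {m: i for i, m in enumerate(sorted(target_mcts))}
    seen_pictures = set()
    out = []
    for n in nals:
        if n.kind in replacement:
            # Replace the original parameter set with its replacement.
            out.append(replace(n, payload=replacement[n.kind]))
        elif n.kind == "SLICE":
            if n.mcts_id not in rank or n.temporal_id > max_tid:
                continue  # not a target tile, or above the target temporal layer
            first = n.poc not in seen_pictures
            seen_pictures.add(n.poc)
            # Adjust the slice header for the smaller extracted picture.
            out.append(replace(n, slice_address=rank[n.mcts_id], first_in_pic=first))
        # Other NAL units (for example the EIS SEI) are dropped in this sketch.
    return out


stream = [Nal("VPS"), Nal("SPS"), Nal("PPS"), Nal("SEI")]
stream += [Nal("SLICE", mcts_id=i, slice_address=i, first_in_pic=(i == 0)) for i in range(4)]
result = extract(stream, {1, 3}, max_tid=0, replacement={"VPS": b"v", "SPS": b"s", "PPS": b"p"})
print([(n.kind, n.mcts_id, n.slice_address, n.first_in_pic) for n in result])
```

No decoding or re-encoding happens anywhere in the loop; the extractor only filters NAL units and rewrites header-level fields.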
Implemented Renderer and Player
Experimental Results (1/2)
Experimental Setup
§ 8K test sequences defined in the JVET CTC (common test conditions)
§ Random Access (RA) coding structure
§ Uniform 3 x 3, 9 tiles
§ HM ver. 16 encoder / SHM ver. 12.3 encoder

8K test sequences
Name        Resolution    Frame Length   Frame Rate
KiteFlite   8192×4096     300            30 fps
Harbor      8192×4096     300            30 fps
Trolley     8192×4096     300            30 fps
GasLamp     8192×4096     300            30 fps

Coding options
Coding Option           Parameter
Version                 12.3 (SHM), 16.16 (HM)
CTU size                64×64
Coding structure        RA
QP                      -
Base Layer QP           22
Enhancement Layer QP    22
Tile                    Uniformly 3 x 3 = 9 tiles
Slice mode              Disable all slice options
WPP mode                Disable all WPP options
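For the 8192×4096 sequences with 64×64 CTUs, the uniform 3 x 3 partition cannot give equal tiles, since neither 128 nor 64 CTUs divide by 3. The sketch below assumes the encoder applies the uniform-spacing rule of the HEVC specification for tile columns and rows; the function name `uniform_tile_sizes` is hypothetical.

```python
def uniform_tile_sizes(pic_size_in_ctus: int, num_tiles: int) -> list:
    """Tile widths (or heights) in CTUs under uniform spacing:
    size[i] = ((i + 1) * N) // T - (i * N) // T."""
    return [((i + 1) * pic_size_in_ctus) // num_tiles - (i * pic_size_in_ctus) // num_tiles
            for i in range(num_tiles)]


ctu = 64
width_ctus = 8192 // ctu    # 128 CTUs per row
height_ctus = 4096 // ctu   # 64 CTUs per column
print(uniform_tile_sizes(width_ctus, 3))   # column widths in CTUs
print(uniform_tile_sizes(height_ctus, 3))  # row heights in CTUs
```

Under this assumption the nine tiles differ slightly in area, which is worth keeping in mind when comparing per-tile bitrates.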
Experimental Results (2/2)
§ Bitrate savings when only some of the 9 tiles are transmitted using the proposed MCTS
§ Experiments with various numbers of tiles and eye tracking with deep learning (DL) are under way, targeting NOSSDAV 2019

Bitrate saving
                    Proposed SHM           Proposed HM
Name                4 tiles    1 tile      4 tiles    1 tile
KiteFlite           52%        88%         51%        87%
Harbor              53%        88%         51%        87%
Trolley             50%        87%         49%        87%
GasLamp             49%        87%         47%        86%
Average             51%        87%         49%        86%
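As a rough reference for reading the table, the saving that would result if bits were spread evenly over the nine tiles can be computed directly. This even-distribution assumption is not from the slides, and the function name `ideal_saving` is hypothetical.

```python
def ideal_saving(tiles_sent: int, tiles_total: int) -> float:
    """Fraction of bitrate saved if every tile cost the same number of bits."""
    return 1.0 - tiles_sent / tiles_total


for sent in (4, 1):
    print(f"{sent} of 9 tiles: {ideal_saving(sent, 9) * 100:.1f}% saving")
```

The reported averages (51% and 87% for SHM, 49% and 86% for HM) sit a few points below these area-proportional figures.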
Conclusion
Motivation
§ High-quality VR >= 12K resolution
§ High bandwidth, high computational complexity
§ Viewport tile streaming for 360 VR
Proposed method
§ Motion-Constrained Tile Sets (MCTS)
§ Extraction Information Set SEI Message (EIS SEI)
§ NAL Packet Extractor for Selected (ROI) Tiles
Results
§ Transmit selected tiles without decoding errors
§ Save bitrate, reduce computational complexity at the decoder side
Future work
§ Eye tracking for accurate viewport extraction
§ Deep learning for ROI estimation and prefetching