3D Shape Reconstruction from Sketches via Multi-view Convolutional Networks

3D Shape Reconstruction from Sketches via Multi-view Convolutional Networks. Zhaoliang Lun, Matheus Gadelha, Evangelos Kalogerakis, Subhransu Maji, Rui Wang

Creating 3D shapes is not easy. (Image from Autodesk Maya)

Goal: 2D line drawings in, 3D shapes out! (front view + side view → ShapeMVD → 3D shape)

Why line drawings? Simple & intuitive medium to convey shape! (Image from Suggestive Contour Gallery, DeCarlo et al. 2003)

Challenges: ambiguity in shape interpretation

Challenges: need to combine information from multiple input drawings

Challenges: favor interpretations by learning plausible shape geometry

Related work [Igarashi et al. 1999] [Xie et al. 2013] [Rivers et al. 2010]

Deep net architecture: front view + side view → ShapeMVD → 3D shape

Deep net architecture: Encoder (inputs: front view, side view). Feature representations capture increasingly larger context in the sketches.
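A minimal sketch of the kind of encoder the slide describes; every layer size, channel count, and the two-channel input are illustrative assumptions, not the authors' exact configuration. Each stride-2 convolution halves the spatial resolution, which is how deeper features come to summarize larger context in the line drawings.

```python
# Illustrative encoder sketch (assumed sizes; not the authors' exact network).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, in_channels=2):  # e.g. front + side sketch as two channels (assumption)
        super().__init__()
        chans = [in_channels, 64, 128, 256, 512]
        self.blocks = nn.ModuleList([
            nn.Sequential(
                # stride-2 conv halves resolution, growing the receptive field
                nn.Conv2d(chans[i], chans[i + 1], kernel_size=4, stride=2, padding=1),
                nn.BatchNorm2d(chans[i + 1]),
                nn.LeakyReLU(0.2),
            )
            for i in range(len(chans) - 1)
        ])

    def forward(self, x):
        features = []          # keep per-scale features for later skip connections
        for block in self.blocks:
            x = block(x)
            features.append(x)
        return features        # coarsest (largest-context) feature map last
```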

Deep net architecture: Decoder. Infers depth and normal maps (front view → depth map + normal map). Feature representations generate shape information at increasingly finer scales.
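A hedged sketch of one view's decoder: transposed convolutions upsample the code back toward image resolution, and two small heads emit a 1-channel depth map and a 3-channel normal map. All layer shapes are assumptions for illustration.

```python
# Illustrative single-view decoder sketch (assumed sizes).
import torch.nn as nn

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        chans = [512, 256, 128, 64]
        self.blocks = nn.ModuleList([
            nn.Sequential(
                # transposed conv doubles resolution: finer scales at each step
                nn.ConvTranspose2d(chans[i], chans[i + 1], 4, stride=2, padding=1),
                nn.BatchNorm2d(chans[i + 1]),
                nn.ReLU(),
            )
            for i in range(len(chans) - 1)
        ])
        self.depth_head = nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1)   # depth map
        self.normal_head = nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1)  # normal map

    def forward(self, code):
        x = code
        for block in self.blocks:
            x = block(x)
        return self.depth_head(x), self.normal_head(x)
```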

Deep net architecture: Multi-view Decoder. Infers depth and normal maps for several views (front view, side view → output views 1…12).
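The multi-view variant can be read as one decoder branch per output view (12 in the slides), all consuming the same encoded sketch features. A hypothetical sketch reusing the Decoder class above:

```python
# Hypothetical multi-view decoder: 12 branches over a shared code.
import torch.nn as nn

class MultiViewDecoder(nn.Module):
    def __init__(self, num_views=12):
        super().__init__()
        self.branches = nn.ModuleList([Decoder() for _ in range(num_views)])

    def forward(self, code):
        # one (depth_map, normal_map) pair per output view
        return [branch(code) for branch in self.branches]
```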

Deep net architecture: U-net structure. Feature representations in the decoder depend on the previous layer & the encoder's corresponding layer. (U-net: Ronneberger et al. 2015; Isola et al. 2016)
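A structural sketch of the U-net wiring the slide describes: each decoder stage upsamples the previous decoder output, then concatenates the encoder feature map at the matching resolution. Channel sizes inside `decoder_blocks` are assumed to account for the concatenation; this is a wiring sketch, not the authors' code.

```python
# U-net skip-connection sketch: decoder layers see both the previous
# decoder layer and the encoder's corresponding layer.
import torch

def unet_decode(encoder_features, decoder_blocks):
    """encoder_features: per-scale encoder maps, coarsest last.
    decoder_blocks: upsampling blocks sized for the concatenated inputs (assumption)."""
    x = encoder_features[-1]                        # start from the coarsest code
    for block, skip in zip(decoder_blocks, reversed(encoder_features[:-1])):
        x = block(x)                                # upsample previous decoder layer
        x = torch.cat([x, skip], dim=1)             # concat matching encoder features
    return x
```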

Training: initial loss. Penalize per-pixel depth reconstruction error & per-pixel normal reconstruction error, summed over the output views.
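The slide's equations did not survive extraction. A hedged reconstruction of per-pixel reconstruction losses of this kind follows; the exact norms and weights are assumptions (L1 penalties are a common choice), not a guaranteed transcription of the paper's formulas.

```latex
% Hedged sketch of per-pixel reconstruction losses over views v and pixels p,
% with predictions d_v, n_v and ground truth d_v^*, n_v^* (L1 norm assumed):
L_{\text{depth}} = \sum_{v=1}^{12} \sum_{p} \big\| d_v(p) - d_v^{*}(p) \big\|_1,
\qquad
L_{\text{normal}} = \sum_{v=1}^{12} \sum_{p} \big\| n_v(p) - n_v^{*}(p) \big\|_1
```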

Training: discriminator network. Checks whether the output depth & normal maps look real or fake. Trained by treating ground-truth maps as real and generated maps as fake. (Generator network → output views 1…12 → discriminator network: real? fake?) (cGAN: Isola et al. 2016)
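A minimal conditional discriminator sketch in the spirit of the cGAN the slide cites (Isola et al. 2016): it sees the input sketches together with depth/normal maps and scores them real vs. fake. Channel counts and the patch-wise output are illustrative assumptions.

```python
# Illustrative conditional discriminator sketch (assumed sizes).
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, in_channels=2 + 4):  # 2 sketch channels + depth(1) + normal(3), assumed
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 4, stride=1, padding=1),  # patch-wise real/fake score
        )

    def forward(self, sketches, depth, normal):
        # condition on the sketches by channel-wise concatenation
        return self.net(torch.cat([sketches, depth, normal], dim=1))
```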

Training: full loss. Penalize per-pixel depth reconstruction error, per-pixel normal reconstruction error, & "unreal" outputs flagged by the discriminator. (cGAN: Isola et al. 2016)
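A hedged sketch of how the three terms could combine; the adversarial form and the weight λ are assumptions in the standard cGAN style, not a transcription of the slide's missing equation.

```latex
% Hedged sketch: reconstruction terms plus an adversarial term that
% penalizes "unreal" outputs (weight \lambda assumed):
L = L_{\text{depth}} + L_{\text{normal}}
  + \lambda \, \mathbb{E}\!\left[ \log\big(1 - D(\text{sketches}, \hat{d}, \hat{n})\big) \right]
```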

Training data: Character 10K models, Chair 10K models, Airplane 3K models. Models from "The Models Resource" & 3D Warehouse.

Training data: synthetic line drawings, plus training depth and normal maps rendered from 12 views.
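The slide does not say how the 12 viewpoints are placed, so the following is a purely hypothetical sketch of one common choice: cameras at equal azimuth steps around the shape, looking at the origin, from which depth and normal maps would be rendered.

```python
# Hypothetical 12-viewpoint sampling (equal azimuth steps; elevation assumed).
import numpy as np

def camera_positions(num_views=12, radius=2.0, elevation_deg=20.0):
    elev = np.radians(elevation_deg)
    azimuths = np.linspace(0.0, 2.0 * np.pi, num_views, endpoint=False)
    return np.stack([
        radius * np.cos(elev) * np.cos(azimuths),
        radius * np.sin(elev) * np.ones(num_views),
        radius * np.cos(elev) * np.sin(azimuths),
    ], axis=1)  # (12, 3) camera centers; render a depth/normal map from each
```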

Test time: predict multi-view depth and normal maps! (front view + side view → output views 1…12)

Multi-view depth & normal map fusion: output views 1…12 → multi-view depth & normal maps → consolidated point cloud.

Multi-view depth & normal map fusion: an optimization problem (a hedged energy sketch follows below).
• Depth derivatives should be consistent with normals
• Corresponding depths and normals across different views should agree
(Multi-view depth & normal maps → consolidated point cloud)
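A hedged sketch of a fusion energy matching the two bullets; the weight μ and the exact term forms are assumptions. Here g(n) = (-n_x/n_z, -n_y/n_z) is the depth gradient implied by a normal n under orthographic projection.

```latex
% Hedged fusion-energy sketch: depth gradients should match the predicted
% normals, and corresponding depths across views should agree (weights assumed):
E(d) = \sum_{v,p} \big\| \nabla d_v(p) - g\big(n_v(p)\big) \big\|^2
     + \mu \sum_{(v,p) \sim (v',p')} \big\| d_v(p) - d_{v'}(p') \big\|^2
```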

Surface reconstruction: multi-view depth & normal maps → consolidated point cloud → surface reconstruction [Kazhdan et al. 2013].
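The slide cites Kazhdan et al. 2013 (screened Poisson surface reconstruction). As a hedged stand-in, Open3D exposes a Poisson reconstruction that can be run on the consolidated, oriented point cloud; `points` and `normals` are assumed NumPy arrays of shape (N, 3).

```python
# Hedged stand-in for the Poisson reconstruction step using Open3D.
import numpy as np
import open3d as o3d

def reconstruct_surface(points, normals, depth=9):
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    pcd.normals = o3d.utility.Vector3dVector(normals)  # Poisson needs oriented normals
    mesh, _densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=depth)
    return mesh
```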

Surface deformation: consolidated point cloud → surface reconstruction [Kazhdan et al. 2013] → surface "fine-tuning" [Nealen et al. 2005].
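A hedged sketch of the "fine-tuning" step in the spirit of Laplacian surface editing (Nealen et al. 2005): pull vertices toward targets derived from the predicted maps while preserving surface detail. Here L is the mesh Laplacian, δ_i the original differential coordinates, t_i the target positions, and the weight α an assumption.

```latex
% Hedged Laplacian-editing-style energy (exact terms and weights assumed):
E(V) = \sum_{i} \big\| L(v_i) - \delta_i \big\|^2
     + \alpha \sum_{i} \big\| v_i - t_i \big\|^2
```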

Experiments

Qualitative Results: side-by-side figure comparisons of reference shape, nearest retrieval, volumetric net, and our result.

Quantitative Results: Character (human drawings); lower is better.

                      Our method   Volumetric decoder   Nearest retrieval
Hausdorff distance       0.120          0.638                0.242
Chamfer distance         0.023          0.052                0.045
Normal distance          34.27          56.97                47.94
Depth map error          0.028          0.048                0.049
Volumetric distance      0.309          0.497                0.550

Quantitative Results: Man-made shapes (human drawings); lower is better.

                      Our method   Volumetric decoder   Nearest retrieval
Hausdorff distance       0.171          0.211                0.228
Chamfer distance         0.028          0.032                0.038
Normal distance          34.19          48.81                43.75
Depth map error          0.037          0.046                0.059
Volumetric distance      0.439          0.530                0.560
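For reference, a hedged sketch of how two of the table's metrics are typically computed on point sets sampled from the surfaces; exact definitions vary across papers, and the symmetric variants shown here are an assumption.

```python
# Hedged sketches of Hausdorff and Chamfer distances between point sets
# a, b: NumPy arrays of shape (N, 3) and (M, 3).
import numpy as np
from scipy.spatial import cKDTree

def hausdorff(a, b):
    da = cKDTree(b).query(a)[0]   # nearest-neighbor distances a -> b
    db = cKDTree(a).query(b)[0]   # and b -> a
    return max(da.max(), db.max())

def chamfer(a, b):
    da = cKDTree(b).query(a)[0]
    db = cKDTree(a).query(b)[0]
    return da.mean() + db.mean()
```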

Single vs. two input line drawings: single sketch → resulting shape; two sketches → resulting shape.

More results

Summary
• A multi-view net for 3D shape synthesis from sketches
• Trained on synthetic sketches; generalizes well to human-drawn sketches
• View-based reconstruction predicts shape structure & geometry more accurately than voxel-based methods

Thank you! Acknowledgements: NSF (CHS-1422441, CHS-1617333, IIS-1617917, IIS-1423082), Adobe, NVidia, Facebook. Experiments were performed in the UMass GPU cluster (400 GPUs!) obtained under a grant from the Mass. Tech Collaborative. Project page: people.cs.umass.edu/~zlun/SketchModeling. Code & data available!