3D Point Capsule Networks
Lifting Capsule Networks to Raw 3D Data
Yongheng Zhao, Tolga Birdal, Haowen Deng, Federico Tombari
CVPR Tutorial, Sunday, June 16, 2019

representing 3D data: octree, graphs, implicit surfaces, algebraic surfaces

point clouds
N×D matrix of attributes (x, y, z, r, g, b, …)
  • Raw data: efficient
  • Sparse: memory friendly
  • Generic
  • Arbitrary accuracy
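
A minimal sketch of the N×D layout described above, assuming NumPy and a made-up toy cloud with D = 6 (position plus colour). The values and variable names are illustrative only.

```python
import numpy as np

# Hypothetical toy cloud: N = 4 points, D = 6 attributes per point.
# Columns are x, y, z (position) followed by r, g, b (colour).
cloud = np.array([
    [0.0, 0.0, 0.0, 255, 0, 0],
    [1.0, 0.0, 0.0, 0, 255, 0],
    [0.0, 1.0, 0.0, 0, 0, 255],
    [0.0, 0.0, 1.0, 255, 255, 255],
], dtype=np.float32)

n_points, n_attributes = cloud.shape
xyz = cloud[:, :3]   # geometry only, shape (N, 3)
rgb = cloud[:, 3:]   # colour only, shape (N, 3)

# Storage grows with the number of points, not with the volume they span.
bytes_used = cloud.nbytes
```

Only occupied locations are stored, which is what makes the representation sparse compared with a dense voxel grid.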

why are point clouds hard?
  • unstructured
  • geometry cannot be projected on a single plane (different manifolds exist)
  • basic representation is permutation dependent
  • sparse input: dense convolutions are wasteful
  • varying data density

consuming point clouds in networks: point-net
N×3 point set X → MLP → local feature → global feature
Qi, Charles R., et al. "PointNet: Deep learning on point sets for 3D classification and segmentation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
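
The pipeline on this slide can be sketched roughly as follows: a shared MLP produces one local feature per point, and a symmetric max-pool turns them into a global feature that does not depend on row order. This is a simplified illustration, not the PointNet reference implementation. The layer sizes, random untrained weights, and function names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def shared_mlp(points, w1, b1, w2, b2):
    """Apply the same two-layer MLP to every point independently."""
    hidden = np.maximum(points @ w1 + b1, 0.0)   # ReLU
    return np.maximum(hidden @ w2 + b2, 0.0)     # (N, F) local features

def global_feature(points, params):
    """Max-pool the per-point features into one order-independent vector."""
    local = shared_mlp(points, *params)
    return local.max(axis=0)                     # (F,)

# Hypothetical sizes: 3 -> 16 -> 32 features, random untrained weights.
params = (rng.normal(size=(3, 16)), np.zeros(16),
          rng.normal(size=(16, 32)), np.zeros(32))

X = rng.normal(size=(128, 3))          # N x 3 point set
X_shuffled = X[rng.permutation(128)]   # same set, different row order

g1 = global_feature(X, params)
g2 = global_feature(X_shuffled, params)
```

Because the maximum over points is a symmetric function, `g1` and `g2` agree up to floating-point error, which addresses the permutation dependence of the raw matrix.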

point-net (figure)

a new 3D auto-encoder
Figure labels: Input Shape, E (encoder), Latent Capsules, Feature, D (decoder), Reconstruction

upsample vs deform
Figure labels: latent code, concatenate, deform (MLP), Fixed Grid Template
"Learn to Fold a Napkin into Almost Any 3D Shape, Deeply"
Yang, Yaoqing, et al. "FoldingNet: Point cloud auto-encoder via deep grid deformation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
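
The "concatenate, then deform with an MLP" idea can be sketched as below. This is a single folding step with random untrained weights, assuming a 2D grid template and hypothetical sizes. It illustrates the mechanism only and is not the FoldingNet code.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_grid(side):
    """Fixed 2D grid template with side*side points in [-1, 1]^2."""
    axis = np.linspace(-1.0, 1.0, side)
    u, v = np.meshgrid(axis, axis)
    return np.stack([u.ravel(), v.ravel()], axis=1)   # (M, 2)

def fold(latent, grid, w1, b1, w2, b2):
    """Concatenate the latent code to every grid point, then deform with an MLP."""
    tiled = np.tile(latent, (grid.shape[0], 1))        # (M, C)
    inputs = np.concatenate([grid, tiled], axis=1)     # (M, 2 + C)
    hidden = np.maximum(inputs @ w1 + b1, 0.0)         # ReLU
    return hidden @ w2 + b2                            # (M, 3) output points

# Hypothetical sizes chosen for illustration.
latent_dim, hidden_dim = 64, 128
latent = rng.normal(size=latent_dim)
grid = make_grid(side=8)
w1 = rng.normal(size=(2 + latent_dim, hidden_dim)) * 0.1
b1 = np.zeros(hidden_dim)
w2 = rng.normal(size=(hidden_dim, 3)) * 0.1
b2 = np.zeros(3)

points = fold(latent, grid, w1, b1, w2, b2)
```

The number of output points is set by the grid resolution, so the same latent code can be decoded at a different density by changing `side`.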

a quick look at the decoders
Groueix, Thibault, et al. "A papier-mâché approach to learning 3D surface generation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
Yang, Yaoqing, et al. "FoldingNet: Point cloud auto-encoder via deep grid deformation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.

architecture

capsules act locally

optionally supervise
Figure labels: Input point cloud with part label, Latent Capsules, Reconstruction, capsule-part association, Part label for the capsule

optionally supervise
Figure labels: Latent Capsules, Conv, Part prediction, Cross Entropy Loss, Part label for the capsule
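
A rough sketch of the supervision on this slide: each latent capsule is mapped to part logits and compared with its part label through a cross-entropy loss. The slide's "Conv" is modelled here as a linear layer shared across capsules, which is an assumption. The capsule count, dimensions, labels, and weights are also hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def part_cross_entropy(capsules, weight, bias, labels):
    """Mean cross-entropy between per-capsule part predictions and labels."""
    logits = capsules @ weight + bias            # same weights for every capsule
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(picked).mean(), probs.argmax(axis=1)

# Hypothetical sizes: 64 capsules of dimension 64, 4 part classes.
num_capsules, capsule_dim, num_parts = 64, 64, 4
capsules = rng.normal(size=(num_capsules, capsule_dim))
labels = rng.integers(0, num_parts, size=num_capsules)

# Zero weights give uniform predictions, so the loss starts at log(num_parts).
weight = np.zeros((capsule_dim, num_parts))
bias = np.zeros(num_parts)
loss, predicted = part_cross_entropy(capsules, weight, bias, labels)
```

Training would lower this loss by updating `weight` and `bias` (and the encoder), which this sketch leaves out.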

part segmentation
Figure labels: Latent Capsules, Conv, Part prediction, D (decoder)

part segmentation

a rather new application: part interpolation/replacement
Figure labels: Source shape, Target shape, E (encoder), Latent Capsules, Capsule-Part Association (A), Segmentation (Tail, Wing, Body), Part interpolation, Part replacement, D (decoder)
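
One way to picture part interpolation/replacement in capsule space: given a part label for each capsule, blend the source capsules of one part towards the target capsules carrying the same label, then decode. The sketch below is an illustration under simplifying assumptions and not the authors' procedure. Capsules are paired in index order, and the function name, `alpha` parameter, and toy data are hypothetical.

```python
import numpy as np

def replace_part(source_caps, source_parts, target_caps, target_parts,
                 part, alpha=1.0):
    """Blend the source capsules of `part` towards the target's capsules.

    alpha = 1.0 replaces the part, 0 < alpha < 1 interpolates.
    Simplification: capsules are paired in index order, and only as many
    pairs are used as both shapes provide for this part.
    """
    src_idx = np.flatnonzero(source_parts == part)
    tgt_idx = np.flatnonzero(target_parts == part)
    k = min(len(src_idx), len(tgt_idx))
    out = source_caps.copy()
    out[src_idx[:k]] = ((1.0 - alpha) * source_caps[src_idx[:k]]
                        + alpha * target_caps[tgt_idx[:k]])
    return out

# Toy data: 6 capsules of dimension 4, parts 0 = body, 1 = wing, 2 = tail.
source_caps = np.zeros((6, 4))
target_caps = np.ones((6, 4))
source_parts = np.array([0, 0, 1, 1, 2, 2])
target_parts = np.array([0, 1, 1, 1, 2, 2])

swapped = replace_part(source_caps, source_parts,
                       target_caps, target_parts, part=1)
blended = replace_part(source_caps, source_parts,
                       target_caps, target_parts, part=1, alpha=0.5)
```

The modified capsule set would then be passed to the decoder D to obtain the edited shape.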

part interpolation/replacement

extracting invariant 3D local descriptors
Deng, Haowen, Tolga Birdal, and Slobodan Ilic. "PPFNet: Global context aware local features for robust 3D point matching." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.

reconstruction quality, transfer learning, part segmentation with limited data

to take home...
  • representation matters and is unsolved.
  • a rich latent space is desirable and can (to a certain extent) be achieved by capsules + dynamic routing.
  • can we make capsules specialize on other 3D shape properties?
https://tinyurl.com/yxq2tmv3
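
For readers unfamiliar with the "dynamic routing" mentioned in the take-home points, here is a minimal NumPy sketch of generic routing-by-agreement as introduced for capsule networks by Sabour et al. (2017). It is not necessarily the exact variant used in this work. The tensor sizes and random inputs are assumptions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Shrink vectors so their length lies in [0, 1) while keeping direction."""
    norm_sq = np.sum(s * s, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq + eps)

def dynamic_routing(u_hat, iterations=3):
    """Route predictions u_hat of shape (num_in, num_out, dim) to output capsules."""
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                   # routing logits
    for _ in range(iterations):
        c = np.exp(b - b.max(axis=1, keepdims=True))
        c /= c.sum(axis=1, keepdims=True)             # softmax over output capsules
        s = np.einsum('ij,ijd->jd', c, u_hat)         # weighted sum per output capsule
        v = squash(s)                                 # output capsule vectors
        b = b + np.einsum('ijd,jd->ij', u_hat, v)     # reward agreeing predictions
    return v, c

# Hypothetical sizes: 32 input capsules, 8 output capsules of dimension 16.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(32, 8, 16))
v, c = dynamic_routing(u_hat)
```

Each row of `c` sums to one, so every input capsule distributes its vote across the output capsules, and the votes concentrate where predictions agree.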