DL methods for shape as parametric surfaces

DL methods that treat [shape] as a parametric surface in space

Jianfei Guo, published in survey

2021-01-30

learning parametric surface

keyword
- neural parametric surface
- parametric surface generation/generative
overview
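- Represent a surface with a parametric equation $[x(s,t),y(s,t),z(s,t)]$
- The mapping from $(s,t)$ to $(x,y,z)$ can be built either explicitly by hand or implicitly with a neural network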
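
As a concrete illustration of the explicit, hand-built case, here is a minimal sketch that maps $(s,t)\in[0,1]^2$ to a torus. The torus, the function name `torus`, and the radii `R` and `r` are illustrative choices, not taken from any of the papers below.

```python
import numpy as np

def torus(s, t, R=2.0, r=0.5):
    """Map parameters (s, t) in [0, 1]^2 to points [x, y, z] on a torus."""
    a, b = 2 * np.pi * s, 2 * np.pi * t
    x = (R + r * np.cos(b)) * np.cos(a)
    y = (R + r * np.cos(b)) * np.sin(a)
    z = r * np.sin(b)
    return np.stack([x, y, z], axis=-1)

rng = np.random.default_rng(0)
st = rng.uniform(size=(1000, 2))        # uniform samples in the unit square
points = torus(st[:, 0], st[:, 1])      # array of shape (1000, 3)
```

The learned methods in this survey replace the closed-form `torus` with a neural network that plays the same role.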

continuous patches

<AtlasNet> A papier-mâché approach to learning 3d surface generation

CVPR2018 Proceedings of the IEEE conference on computer vision and pattern recognition

Thibault Groueix, Matthew Fisher, Vladimir G Kim, Bryan C Russell, Mathieu Aubry

École des ponts, Adobe

continuous 2D patches, learning 2-manifold parameterization, 2-manifold generation

PDF Code Project code-easy-to-understand

Motivation

represents a surface as a collection of parametric surface elements
The learned family of mappings from the unit square to local 2-manifolds closely resembles an atlas of the surface
Every 3D point ultimately gets a 2D UV value

overview

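The point-cloud baseline decodes a latent shape code into a set of points
This method additionally takes a 2D point sampled uniformly from the unit square and uses it to generate a single point on the surface
- learns this parameterization of a 2-manifold (i.e. two-dimensional manifold) from point clouds / data
- belongs to the branch of parametric approaches
- ==In essence this is a mapping from a 2D uniform distribution to a distribution over a 2-manifold in space, conditioned on a shape code==
It is easily extended by repeating it several times, so that a 3D shape is represented as the union of several surface elements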
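
The mapping described above can be sketched as follows. This is a toy forward pass with random, untrained weights, assuming one small ReLU MLP per patch; the layer sizes, `latent_dim`, `n_patches`, and all function names are hypothetical and not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random (untrained) weights for a fully connected network."""
    return [(rng.normal(scale=1 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """ReLU hidden layers followed by a linear output layer."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b

def decode_patches(patch_params, z, n_per_patch):
    """For each patch: sample uv uniformly in [0,1]^2, map [uv, z] to xyz."""
    out = []
    for params in patch_params:
        uv = rng.uniform(size=(n_per_patch, 2))
        z_tiled = np.broadcast_to(z, (n_per_patch, z.shape[0]))
        out.append(mlp(params, np.concatenate([uv, z_tiled], axis=1)))
    return np.concatenate(out, axis=0)

latent_dim, n_patches = 16, 4                      # hypothetical sizes
patch_params = [init_mlp([2 + latent_dim, 64, 64, 3]) for _ in range(n_patches)]
z = rng.normal(size=latent_dim)                    # stand-in for an encoded shape code
points = decode_patches(patch_params, z, n_per_patch=250)   # shape (1000, 3)
```

Because `uv` can be resampled at will, the same decoder yields as many surface points as needed, which is the main difference from a decoder that emits a fixed-size point set.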

locally parameterized surface generation

Treat the surface as a generalized 2-manifold (self-intersections & disjoint sets allowed) and consider a local parameterization
consider a 2-manifold $\mathcal{S}$, a point $\boldsymbol{p} \in \mathcal{S}$, a parameterization $\varphi$ of $\mathcal{S}$ in a local neighborhood of $\boldsymbol{p}$
Assume this local parameterization is a map $\varphi_{\theta}(x)$ from the unit square $[0,1]^2$ to a 2-manifold $\mathcal{S}_{\theta}$: $\mathcal{S}_{\theta}=\varphi_{\theta}([0,1]^2)$
Let $\mathcal{S}_{\theta}$ approximate the local 2-manifold $\mathcal{S}_{loc}$
i.e. find parameters $\theta$ that minimize the objective $\underset{\theta}{\min}\,\mathcal{L}(\mathcal{S}_{\theta},\mathcal{S}_{loc})+\lambda\mathcal{R}(\theta)$
Here $\mathcal{L}$ is a loss between two 2-manifolds and $\mathcal{R}$ is a regularizer on the parameters $\theta$;
In practice the loss is not computed between the two 2-manifolds themselves but as the Chamfer and Earth Mover's distances between point sets sampled from them
Proves that MLP+ReLU can produce 2-manifolds
Proves that the 2-manifolds produced by MLP+ReLU can be learned to approximate target 2-manifolds well
Uses the universal approximation theorem:
Approximation capabilities of multilayer feedforward networks. Neural Networks, 1991
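
For reference, here is a minimal sketch of a Chamfer distance between two sampled point sets. It uses squared Euclidean distances averaged in both directions, which is one common convention and an assumption here; the paper's exact normalization may differ.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)   # (N, M) squared distances
    # nearest neighbour in b for every point of a, and vice versa
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

rng = np.random.default_rng(0)
a = rng.normal(size=(128, 3))
b = a + 0.1                     # a slightly shifted copy of the same cloud
```

The pairwise matrix costs O(NM) memory, so this form only suits small point sets; larger sets usually need a nearest-neighbour structure or batching.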

polygon mesh
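Establishing a link between a 3D shape and a 2D domain is a long-standing problem in geometry processing, with applications such as texture mapping, re-meshing, and shape correspondence
Previous methods require the input data to be already parameterized; this paper learns the parameterization directly from point clouds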

<Deep geometric prior> Deep geometric prior for surface reconstruction

CVPR2019 Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Francis Williams, Teseo Schneider, Claudio Silva, Denis Zorin, Joan Bruna, Daniele Panozzo

New York University

chart representation, auto-decoder

PDF Code

Motivation

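First split the input point cloud into several overlapping parts, then fit a manifold to each part with an MLP;
each local manifold is fit with a 2-Wasserstein loss / EMD loss;
and consistency is enforced across all manifolds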
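
One simple way to obtain overlapping parts is sketched below: pick spread-out centers by farthest point sampling and assign every point to its `m` nearest centers. This is an illustrative scheme under my own assumptions, not the partitioning procedure used in the paper.

```python
import numpy as np

def farthest_point_sampling(points, k):
    """Greedily pick k indices that are spread out over the cloud."""
    idx = [0]
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(k - 1):
        idx.append(int(dist.argmax()))
        dist = np.minimum(dist, np.linalg.norm(points - points[idx[-1]], axis=1))
    return np.array(idx)

def overlapping_patches(points, k, m=2):
    """Assign every point to its m nearest centers, so neighbouring patches overlap."""
    centers = points[farthest_point_sampling(points, k)]
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)   # (N, k)
    nearest = np.argsort(d, axis=1)[:, :m]
    return [np.where((nearest == j).any(axis=1))[0] for j in range(k)]

rng = np.random.default_rng(0)
cloud = rng.normal(size=(500, 3))
patches = overlapping_patches(cloud, k=8)    # list of 8 index arrays
```

With `m=2` every point belongs to two patches, which gives the shared points on which a cross-patch consistency term can be evaluated.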

results

<Pix2Surf> Pix2surf: Learning parametric 3d surface models of objects from images

ECCV2020 European conference on computer vision

Jiahui Lei, Srinath Sridhar, Paul Guerrero, Minhyuk Sung, Niloy Mitra, Leonidas J Guibas

Zhejiang University, Stanford, UCL, Adobe

parametric 3D shape/parameterization, 3D reconstruction, multi-view, single-view, surface reconstruction in NOCS

Preprint Code Project

Result

Comment: the learned surfaces need not be closed

Motivation

learning to generate 3D parametric surface representations for novel object instances, as seen from one or more views
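Use 2D patches as the UV parameterization, handle multiple non-adjacent views, and establish correspondence between 2D pixels and 3D surface points
Surfaces represented by implicit functions need an expensive post-processing step such as Marching Cubes to obtain an explicit surface; this paper learns to generate explicit surfaces directly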

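Main contributions

high-quality parametric surfaces that respect multi-view consistency
The generated 3D surface preserves exact correspondence between image pixels and 3D surface points, so texture information can be lifted to reconstruct shapes with rich geometry and appearance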

Cited works that directly reconstruct a parametric representation of a shape's surface

class-specific templates (canonical template / mean shape in canonical space)
Shape templates designed by hand for each category
- [ECCV2018] Learning category-specific mesh reconstruction from image collections.
- [ICCV2019] Canonical surface mapping via geometric cycle consistency
general structured templates
Generic shape-template learning methods that apply across categories (handling different shapes and topologies)
- [ICCV2019] Learning shape templates with structured implicit functions.
more generic surface representations
- meshes deform
  - [ECCV2018] Pixel2mesh: Generating 3d mesh models from single rgb images.
  - [ICCV2019] Pixel2mesh++: Multi-view 3d mesh generation via deformation
  - [CVPR2019] 3DN: 3d deformation network.
- differentiable mesh renderer + image supervision
  - [CVPR2018] Neural 3d mesh renderer
  - [2019] Soft rasterizer: A differentiable renderer for image-based 3d reasoning
  - [2019] Pix2vex: Image-to-geometry reconstruction using a smooth differentiable renderer.
  - [CVPR2019] Learning view priors for single-view 3d reconstruction.
- ==continuous 2D patches== similar to this paper: 2D patches are used as the UV parameterization
  - [CVPR2018] Atlasnet: A papier-mâché approach to learning 3d surface generation.
  - AtlasNet for video clip
    [CVPR2019] Photometric mesh optimization for video-aligned 3d object reconstruction.
  - introduce topology modification to atlasnet
    [ICCV2019] Deep mesh reconstruction from single rgb images via topology modification networks

preliminaries

NOCS
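- can predict the NOCS map and mask of an image
surface parameterization
- a UV parameterization of the surface is a chart
- multiple charts are learned with a set of fully connected networks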

overview

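==Note==: unlike AtlasNet, the uv does not come from uniform sampling but from a learned network, the uv predictor
So the uv value of every image pixel is predicted first; the set of uv values belonging to the object is then concatenated with the image feature to output a set of 3D points (the 3D coordinates of points on the 2-manifold)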

graph LR
	img[image coordinate] -.per index prediction.-> uv[uv value] --> MLP
	image --> z[global latent code z] --> MLP
	MLP --> 3d[3D surface coordinate]

single view single chart pix2surf

NOCS-UV branch
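- add two extra channels on top of the existing NOCS output to predict uv values
- uv is not sampled uniformly but predicted directly from the image as a 2-channel uv image
- a chart is observed to emerge, and this chart is already nearly multi-view consistent and multi-object consistent
  - i.e. the network learns on its own how to unwrap an object shape into a flat space
- code-extractor: a small CNN
  - takes a single image as input and outputs a global latent code z
- UV amplifier
  - the UV coordinates have only 2 dimensions while the global latent code z is high-dimensional, so the two inputs are unbalanced
  - so a set of MLPs first lifts the UV to a higher dimension
SP(surface parameterization) branch
- like AtlasNet, takes the concatenation of the lifted UV and the global latent code as input and outputs 3D point coordinates
- differences from AtlasNet:
  - the uv is lifted to a higher dimension
  - there is a learned chart that directly links image coordinates to 3D surface coordinates
  - uv is not sampled uniformly but learned by a network (the NOCS-UV branch above)
- the output 3D point coordinates live in NOCS space
loss / train
- ground truth for the NOCS map
- ground truth for the 3D surface points (taken directly from the ShapeNet 3D models)
- everything else is trained end to end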
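
The SP branch described above can be sketched as a toy forward pass. The weights are random and untrained; the layer sizes, `uv_dim`, `code_dim`, and the final sigmoid that squashes the output into the unit cube are my assumptions, and the `uv` and `z` inputs are random stand-ins for the outputs of the NOCS-UV branch and the code-extractor.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random (untrained) weights for a fully connected network."""
    return [(rng.normal(scale=1 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """ReLU hidden layers followed by a linear output layer."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b

uv_dim, code_dim = 64, 128                           # hypothetical sizes
uv_amplifier = init_mlp([2, 32, uv_dim])             # lifts 2D uv to uv_dim features
sp_branch = init_mlp([uv_dim + code_dim, 128, 3])    # [lifted uv, z] -> xyz

def surface_points(uv, z):
    uv_feat = mlp(uv_amplifier, uv)
    z_tiled = np.broadcast_to(z, (uv.shape[0], z.shape[0]))
    xyz = mlp(sp_branch, np.concatenate([uv_feat, z_tiled], axis=1))
    return 1.0 / (1.0 + np.exp(-xyz))                # assumption: keep output in [0, 1]^3

uv = rng.uniform(size=(200, 2))     # stand-in for per-pixel uv from the NOCS-UV branch
z = rng.normal(size=code_dim)       # stand-in for the code-extractor output
xyz = surface_points(uv, z)         # shape (200, 3)
```

Lifting `uv` before concatenation keeps the 2D input from being swamped by the much larger latent code, which is the imbalance the UV amplifier is meant to address.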

multi view atlas pix2surf

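The latent codes of the different views are max-pooled, and the max-pooled code is concatenated with each view's own code
Using the ground-truth NOCS map of a pixel in one view, find the exactly corresponding pixel location in another view
Minimizing the distance between the 3D points predicted from these two pixels defines the multi-view consistency loss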
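
A minimal sketch of the two ingredients above, assuming the per-view codes are stacked in a `(V, D)` array and the corresponding pixels have already been matched. The concatenation order and the use of a mean Euclidean distance are assumptions, and the function names are hypothetical.

```python
import numpy as np

def fuse_view_codes(codes):
    """codes: (V, D) per-view latent codes -> (V, 2D): [own code, max-pooled code]."""
    pooled = codes.max(axis=0, keepdims=True)                 # element-wise max over views
    return np.concatenate([codes, np.repeat(pooled, codes.shape[0], axis=0)], axis=1)

def consistency_loss(xyz_a, xyz_b):
    """Mean distance between 3D points predicted from corresponding pixels of two views."""
    return np.linalg.norm(xyz_a - xyz_b, axis=1).mean()

codes = np.array([[1.0, 5.0, 2.0],
                  [4.0, 0.0, 3.0]])
fused = fuse_view_codes(codes)
# fused == [[1, 5, 2, 4, 5, 3],
#           [4, 0, 3, 4, 5, 3]]
```

Max pooling makes the shared part of the code independent of the number and order of views.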

<Meshlet> Meshlet priors for 3d mesh reconstruction

CVPR2020 Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen

UCSB NVIDIA

point to mesh, local shape prior, geodesic parameterization, VAE

Preprint Code

Motivation

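Input a point cloud, output a mesh
Previous shape-learning methods learn one of two kinds of prior:
- object-level priors, which are not decoupled from pose;
- smoothness-regularizer priors, which lose local detail
This paper aims to learn local natural meshlets in a canonical pose; such meshlets are fully shared across objects and categories, and a complete mesh is then assembled from this purely local prior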


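P: the test object's pose lies within the dataset's pose distribution; red P: it lies outside. N: low noise; red N: moderate noise. T: an object category seen in training; red T: a category not seen in training. As can be seen, this paper emphasizes learning local meshlets that are decoupled from pose and assembling a complete mesh from them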

geodesic parameterization

Geodesic polar coordinates on polygonal meshes.
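Map a vertex and its surrounding points to coordinates on the vertex's tangent plane; then transform the tangent plane to a canonical pose (i.e. the vertex is moved to the origin, the tangent plane's normal becomes the z-axis, and its u, v axes coincide with the x, y axes)
This decouples pose and makes it possible to learn the full variety of local meshlets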
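
The canonical-pose step alone can be sketched as a rigid transform. This does not compute geodesic polar coordinates; it assumes the vertex normal and one tangent direction are already known, and the function name and example values are illustrative.

```python
import numpy as np

def to_canonical_pose(points, vertex, normal, tangent_u):
    """Move `vertex` to the origin, align `normal` with z and `tangent_u` with x."""
    n = normal / np.linalg.norm(normal)
    u = tangent_u - np.dot(tangent_u, n) * n     # project onto the tangent plane
    u = u / np.linalg.norm(u)
    v = np.cross(n, u)                           # completes a right-handed frame (u, v, n)
    R = np.stack([u, v, n])                      # rows are the canonical x, y, z axes
    return (points - vertex) @ R.T

vertex = np.array([1.0, 2.0, 3.0])
normal = np.array([1.0, 1.0, 0.0])
tangent_u = np.array([0.0, 0.0, 1.0])
patch = vertex + np.array([[0.0, 0.0, 0.0],      # the vertex itself
                           [0.0, 0.0, 1.0],      # one step along tangent_u
                           [1.0, 1.0, 0.0]])     # one step along the normal
local = to_canonical_pose(patch, vertex, normal, tangent_u)
```

After the transform, two patches with the same local shape but different world poses produce the same coordinates, which is what lets a single prior be shared across them.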

VAE

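Use a VAE to compress the various meshlets into a latent space
When it is then applied to fit a point cloud, the encoder first extracts an initial latent code, and the latent code is then refined for a few steps in auto-decoder fashion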
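
The auto-decoder refinement can be illustrated with a toy example in which the decoder is frozen and only the latent code receives gradient updates. The linear decoder `D`, the squared-error loss, and the noisy `z_init` standing in for an encoder output are all assumptions made for the sake of a self-contained sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, out_dim = 4, 30
D = rng.normal(size=(out_dim, latent_dim))       # frozen toy linear "decoder"

def decode(z):
    return D @ z

def loss(z, target):
    return 0.5 * np.sum((decode(z) - target) ** 2)

def refine_latent(z0, target, steps=5):
    """A few gradient steps on the latent code only; the decoder stays fixed."""
    lr = 1.0 / np.linalg.norm(D, 2) ** 2         # step size that is safe for this quadratic loss
    z = z0.copy()
    for _ in range(steps):
        grad = D.T @ (decode(z) - target)        # d loss / d z
        z -= lr * grad
    return z

z_true = rng.normal(size=latent_dim)
target = decode(z_true)                              # the observation to fit
z_init = z_true + 0.5 * rng.normal(size=latent_dim)  # stand-in for the encoder's initial guess
z_ref = refine_latent(z_init, target)
```

With a real nonlinear decoder the gradient would come from automatic differentiation, but the structure is the same: the encoder supplies a starting point and optimization over the code does the final fitting.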

overall optimization

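First initialize an arbitrary rough mesh and extract meshlets from it, making sure every vertex is covered by at least 3 meshlets
- note that two quantities are then updated iteratively during training: the mesh and the set of meshlets;
- each meshlet consists of vertices and a shape code
Iterate: update each local shape
- update each meshlet's shape using the loss between the point cloud and the mesh "assembled" from the meshlets
Iterate: then make the local shapes globally consistent
- minimize the error between the updated meshlet shapes and the "assembled mesh"
- first fix the meshlets' shape codes and update the mesh vertices
- then fix the mesh vertices and update the meshlets' shape codes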

Shape reconstruction by learning differentiable surface representations

CVPR2020 Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Jan Bednarik, Shaifali Parashar, Erhan Gundogdu, Mathieu Salzmann, Pascal Fua

EPFL

patch, control over patches, overlap, collapse, differential surface properties

Preprint Code Video

Motivation

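There are existing methods that learn an ensemble of parametric representations
- but they do not control how the surface patches deform, so nothing prevents patches from overlapping one another or collapsing into a point or a line
- in that case computing surface normals becomes difficult and unreliable
This paper proposes exploiting the inherent differentiability of deep neural networks during training
- using the differential properties of the surface to prevent patch collapse and significantly reduce mutual overlap
- which also makes it possible to reliably compute surface normals, curvature, etc.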
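
To make the differential quantities concrete, here is a minimal sketch that computes the tangent vectors, the first fundamental form $(E, F, G)$, the area element, and the unit normal of a map $f(u,v)$. It uses central finite differences as a stand-in for automatic differentiation and an analytic unit-sphere patch as a stand-in for a learned network; both are my assumptions.

```python
import numpy as np

def sphere_patch(uv):
    """Toy analytic patch standing in for a learned map f: [0,1]^2 -> R^3."""
    theta, phi = 0.5 + uv[0], uv[1]              # offset keeps the patch away from the pole
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

def differential_properties(f, uv, eps=1e-5):
    du, dv = np.array([eps, 0.0]), np.array([0.0, eps])
    f_u = (f(uv + du) - f(uv - du)) / (2 * eps)  # tangent along u
    f_v = (f(uv + dv) - f(uv - dv)) / (2 * eps)  # tangent along v
    E, F, G = f_u @ f_u, f_u @ f_v, f_v @ f_v    # first fundamental form
    cross = np.cross(f_u, f_v)
    area = np.linalg.norm(cross)                 # equals sqrt(E*G - F**2)
    normal = cross / area                        # ill-defined when the patch collapses
    return normal, (E, F, G), area

uv = np.array([0.3, 0.4])
normal, (E, F, G), area = differential_properties(sphere_patch, uv)
```

When a patch collapses to a line or a point, `area` tends to zero and the normal becomes unreliable, which is why penalizing such degenerate values during training keeps the normals well defined.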

Learning to Reconstruct Texture-Less Deformable Surfaces. 3DV2018
Marr Revisited: 2D-3D Model Alignment via Surface Normal Prediction. CVPR2016
A Two-Stream Network for Fast and Accurate 3D Cloth Draping. ICCV2019

overview

results

The main baseline for comparison is AtlasNet
Pointcloud Autoencoding (PCAE)
single view reconstruction (SVR), i.e. monocular reconstruction

Better patch stitching for parametric surface reconstruction

3DV2020 2020 international conference on 3D vision (3DV)

Zhantao Deng, Jan Bednařík, Mathieu Salzmann, Pascal Fua

EPFL

patch stitching, atlas, learning

Preprint

Motivation

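For current multiple patch based parametric surface representations (atlas), improve the global consistency of the patches (i.e. prevent **holes and "jagged" regions where multiple patches intersect incorrectly**)
Typical stitching problems (1D illustration)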

FoldingNet Foldingnet: Point Cloud Auto-Encoder via Deep Grid Deformation. CVPR2018
The first deep-neural-network-based work to learn a parametric function that embeds a 2D manifold in 3D space
Later work shifted to ensembles of such learned functions for patch-wise representation:
- learning (encoder)
  - Atlasnet: A papier-mâché approach to learning 3d surface generation. CVPR2018
  - Learning elementary structures for 3d shape generation and matching. NeurIPS2019
  - Shape reconstruction by learning differentiable surface representations. CVPR2020 (the authors' earlier work, which uses regularization to reduce surface distortion and overlap)
  - Tearingnet: Point cloud autoencoder to learn topology-friendly representations. arXiv, 2020.
- optimization (auto-decoder)
  - Deep geometric prior for surface reconstruction. CVPR2019
  - Meshlet priors for 3d mesh reconstruction. CVPR2020
- 2D output domain
  - Deep parametric shape predictions using distance fields. CVPR2020
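- Because continuous patches can be sampled at arbitrary resolution, fitting can reach high precision
- Main shortcomings of current methods
  - The learned surfaces are highly distorted and overlap heavily; this can only be mitigated with suitable regularization (i.e. the authors' previous work, Shape reconstruction by learning differentiable surface representations)
  - A more pressing problem: global inconsistency in how individual patches are placed, which leads to surface artifacts such as holes or regions where multiple patches intersect incorrectly
    - Meshlet and Deep geometric prior for surface reconstruction tackle this to some extent, but only in optimization settings, which are slow and still require geometric observations (e.g. a noisy point cloud) at test time;
  - This paper builds mainly on the authors' earlier learning-based (encoder-based) work, exploiting its low distortion and low overlap to improve the global consistency of the patches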
