DIST: Rendering Deep Implicit Signed Distance Function with Differentiable Sphere Tracing

Editor's notes

  • The paper explains a large number of technical details thoroughly and is well worth reading

  • sphere tracing
    https://longtimenohack.com/posts/paper_reading/2020cvpr_liu_dist/image-20201215111200177.png

  • Trains a neural network that predicts both the signed distance and the color of every 3D location

  • Requires ground-truth silhouettes
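
For readers new to sphere tracing, the idea can be sketched in a few lines. This is a minimal illustration and not the paper's implementation: the analytic `sphere_sdf` stands in for a learned network, and the names `sphere_trace`, `max_steps`, `eps`, and `far` are assumptions chosen for this example.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: positive outside, negative inside."""
    return math.dist(p, center) - radius

def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-5, far=100.0):
    """March along a unit-length ray; return the hit distance, or None on a miss."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:
            return t
        # The SDF value is a safe step size: no surface is closer than this.
        t += dist
        if t > far:
            return None
    return None

hit = sphere_trace((0.0, 0.0, -3.0), (0.0, 0.0, 1.0), sphere_sdf)
miss = sphere_trace((0.0, 0.0, -3.0), (0.0, 1.0, 0.0), sphere_sdf)
```

Here `hit` is the distance from the camera to the front of the unit sphere, and `miss` is `None` because the second ray never comes close to the surface.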

Motivation

  • Adds a differentiable renderer to SDFs, building a bridge between inverse graphics models and deep implicit surface fields
  • Solving vision problems as an inverse graphics process is one of the fundamental approaches, where the solution is the visual structure that best explains the given observations
    • In 3D geometry understanding this approach has been used for a long time (1974, 1999, etc.)
    • It usually requires an efficient renderer that accurately simulates the observations (e.g. depth maps) from an optimizable 3D structure, and that is also differentiable so that errors from partial observations can be back-propagated
    • The paper presents (as the first of its kind) a differentiable renderer for learning-based SDF
  • The differentiable renderer renders a learning-based SDF into depth images, surface normals, and silhouettes from arbitrary camera viewpoints
  • Applications: inferring 3D shape from various inputs, e.g. multi-view images and a single depth image
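
To make the three rendered outputs concrete, the sketch below renders depth, surface normal, and silhouette maps from an SDF. It rests on stated assumptions: the analytic unit sphere in `sdf` stands in for a learned SDF, normals are taken as finite-difference gradients of the SDF, and the camera model and the names `trace`, `normal_at`, and `render` are hypothetical choices for this example. It is not differentiable; it only shows what each map contains.

```python
import math

def sdf(p):
    """Analytic stand-in for a learned SDF: a unit sphere at the origin."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - 1.0

def trace(origin, direction, max_steps=64, eps=1e-5, far=10.0):
    """Plain sphere tracing; returns the distance along the ray, or None on a miss."""
    t = 0.0
    for _ in range(max_steps):
        p = [o + t * d for o, d in zip(origin, direction)]
        dist = sdf(p)
        if dist < eps:
            return t
        t += dist
        if t > far:
            break
    return None

def normal_at(p, h=1e-4):
    """Surface normal as the normalized central-difference gradient of the SDF."""
    grad = []
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += h
        lo[axis] -= h
        grad.append((sdf(hi) - sdf(lo)) / (2 * h))
    length = math.sqrt(sum(g * g for g in grad))
    return [g / length for g in grad]

def render(size=9, cam=(0.0, 0.0, -3.0)):
    """Render depth, normal, and silhouette maps with a pinhole camera looking along +z."""
    depth, normals, mask = [], [], []
    for row in range(size):
        depth.append([])
        normals.append([])
        mask.append([])
        for col in range(size):
            # Pixel centre on an image plane one unit in front of the camera, spanning [-1, 1].
            x = (col + 0.5) / size * 2.0 - 1.0
            y = 1.0 - (row + 0.5) / size * 2.0
            norm = math.sqrt(x * x + y * y + 1.0)
            ray = (x / norm, y / norm, 1.0 / norm)
            t = trace(cam, ray)
            hit = t is not None
            mask[-1].append(1 if hit else 0)  # silhouette: 1 where the ray hits the shape
            depth[-1].append(t)               # distance along the ray; None marks background
            normals[-1].append(
                normal_at([c + t * r for c, r in zip(cam, ray)]) if hit else None
            )
    return depth, normals, mask

depth, normals, mask = render()
```

In this toy setup the centre pixel sees the sphere head-on, so its normal points straight back at the camera, while the corner pixels fall outside the silhouette.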

Overview

  • https://longtimenohack.com/posts/paper_reading/2020cvpr_liu_dist/image-20201215111010407.png

  • [auto-decoder] Given a pre-trained generative model, e.g. DeepSDF, search the latent code space for the 3D shape that is most consistent with the given observations

  • https://longtimenohack.com/posts/paper_reading/2020cvpr_liu_dist/image-20210105074006003.png

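  • [sphere tracing] Uses a sphere-tracing-like framework for differentiable rendering

    • Applying sphere tracing directly is expensive in time and memory, because it queries the network repeatedly and produces a recursive computational graph during back-propagation (note: as in SRN); both the forward and the backward pass therefore need to be optimized
    • The sphere-traced results (i.e. the distances along the camera rays) can be turned into various outputs such as depth maps, surface normals, and silhouettes, so losses can be applied to them conveniently in an end-to-end manner
    • Forward pass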
    • https://longtimenohack.com/posts/paper_reading/2020cvpr_liu_dist/image-20210105073826931.png
  • https://longtimenohack.com/posts/paper_reading/2020cvpr_liu_dist/image-20201215164421303.png

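    • A coarse-to-fine strategy saves computation in the initial steps
      • In the first few steps of sphere tracing, the rays of different pixels are very close to each other
      • Tracing starts at 1/4 of the image resolution, and every 3 steps each pixel is split into 4
      • After 6 steps, every pixel at full resolution has its own ray, which keeps marching until convergence
    • An aggressive strategy to speed up ray marching
      • The marching step is $\alpha=1.5$ times the queried SDF value
      • Marches toward the surface faster when far away from it
      • Speeds up convergence in ill-posed cases (when the angle between the surface normal and the ray direction is small)
        • Q: what?
      • Rays can pierce through the surface and sample its interior (SDF<0), so supervision can be applied on both sides of the surface
    • Dynamic synchronized inference
    • A safe convergence criterion prevents unnecessary network queries while preserving resolution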
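  • Backward pass

    • Uses an approximation of the SDF gradient, which has little effect on training but significantly reduces computation and memory usage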
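
The aggressive marching step and the approximate backward pass can be illustrated together on a toy problem. Everything here is a sketch under assumptions: the "decoder" is an analytic sphere whose radius plays the role of the latent code, the stopping rule `abs(f) < eps` is a simple placeholder rather than the paper's convergence criterion, and the gradient approximation is read as "differentiate only the final SDF query and treat the distance accumulated before it as a constant", which these notes do not spell out. The coarse-to-fine schedule and dynamic synchronized inference are left out.

```python
import math

ALPHA = 1.5  # step multiplier quoted in the notes above

def sdf(p, radius):
    """Toy stand-in for a decoder f(p; z): a sphere whose radius acts as the latent code."""
    return math.sqrt(sum(c * c for c in p)) - radius

def trace(origin, direction, radius, max_steps=200, eps=1e-7):
    """Over-relaxed sphere tracing: each step advances by ALPHA times the queried SDF."""
    t = 0.0
    for _ in range(max_steps):
        p = [o + t * d for o, d in zip(origin, direction)]
        f = sdf(p, radius)
        if abs(f) < eps:  # placeholder stopping rule (an assumption)
            return t, p
        t += ALPHA * f    # f < 0 after an overshoot, which pulls the sample back
    raise RuntimeError("ray did not converge")

def depth_and_grad(origin, direction, radius, h=1e-5):
    """Depth plus an approximate d(depth)/d(latent) that ignores all earlier steps."""
    t, p = trace(origin, direction, radius)
    depth = t + ALPHA * sdf(p, radius)  # t is treated as a constant in the backward pass
    # Finite-difference derivative of the final SDF query with respect to the latent.
    df_dz = (sdf(p, radius + h) - sdf(p, radius - h)) / (2 * h)
    return depth, ALPHA * df_dz

origin, direction = (0.0, 0.0, -3.0), (0.0, 0.0, 1.0)
observed_depth = 1.5  # depth this ray would see for a sphere of radius 1.5
radius = 1.0          # initial latent code
for _ in range(100):
    depth, grad = depth_and_grad(origin, direction, radius)
    loss_grad = 2.0 * (depth - observed_depth) * grad  # gradient of the squared depth error
    radius -= 0.1 * loss_grad
```

For this toy the exact derivative of depth with respect to the radius is -1, while the approximation yields -1.5: the sign is right and the magnitude is scaled by the step multiplier, which is enough for gradient descent to recover the radius that explains the observed depth. No computational graph over the marching steps is ever stored.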

Experiments

  • Convergence speed
    https://longtimenohack.com/posts/paper_reading/2020cvpr_liu_dist/image-20201215112912115.png
  • Texture Re-rendering
    https://longtimenohack.com/posts/paper_reading/2020cvpr_liu_dist/image-20201215114347510.png
  • Shape Completion from Sparse Depths
    https://longtimenohack.com/posts/paper_reading/2020cvpr_liu_dist/image-20201215114703816.png
  • Shape Completion over Different Sparsity
    https://longtimenohack.com/posts/paper_reading/2020cvpr_liu_dist/image-20201215114227972.png
  • Inverse Optimization over Camera Extrinsics
    https://longtimenohack.com/posts/paper_reading/2020cvpr_liu_dist/image-20201215113343946.png
  • Multi-view Reconstruction from Video Sequences
    https://longtimenohack.com/posts/paper_reading/2020cvpr_liu_dist/image-20201215115145581.png