SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes
Abstract
We present SpectroMotion, a novel approach that combines 3D Gaussian Splatting (3DGS) with physically-based rendering (PBR) and deformation fields to reconstruct dynamic specular scenes. Previous methods extending 3DGS to dynamic scenes have struggled to represent specular surfaces accurately. Our method addresses this limitation with a residual correction technique for accurate surface normal computation during deformation, complemented by a deformable environment map that adapts to time-varying lighting conditions. We further implement a coarse-to-fine training strategy that significantly improves both scene geometry and specular color prediction. We demonstrate that SpectroMotion outperforms prior view-synthesis methods on scenes containing dynamic specular objects and is the only existing 3DGS-based method capable of synthesizing photorealistic real-world dynamic specular scenes.
Pipeline
Our method trains in three stages. In the static stage, we stabilize the geometry of the static scene by minimizing the photometric loss \(\mathcal{L}_{\text{color}}\) between vanilla 3DGS renders and ground-truth images. The dynamic stage combines canonical 3D Gaussians \(\textbf{G}\) with a deformable Gaussian MLP to model scene motion while also minimizing the normal loss \(\mathcal{L}_{\text{normal}}\) between the rendered normal map \(\mathbf{N}^t\) and the normal map derived from gradients of the depth map \(\mathbf{D}^t\), further refining the overall scene geometry. Finally, the specular stage introduces a deformable reflection MLP to handle changing environment lighting: it deforms reflection directions \(\omega^t_r\) to query a canonical environment map for the specular color \(\mathbf{c}_s^t\). This specular color is then combined with the diffuse color \(\mathbf{c}_d\) (represented by zero-order spherical harmonics) and a learnable per-Gaussian specular tint \(\mathbf{s}_{\text{tint}}\) to obtain the final color \(\mathbf{c}_{\text{final}}^t\). This approach enables the modeling of dynamic specular scenes and high-quality novel view rendering.
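The specular-stage color composition described above can be sketched as follows. This is a minimal NumPy illustration, not the actual implementation: the function names are hypothetical, the environment map is queried with a simple nearest-pixel equirectangular lookup rather than a learned representation, the reflection directions are assumed to already be deformed by the reflection MLP, and the additive blend \(\mathbf{c}_{\text{final}}^t = \mathbf{c}_d + \mathbf{s}_{\text{tint}} \odot \mathbf{c}_s^t\) is an assumed form of the "combined" operation the text describes.

```python
import numpy as np

def query_env_map(env_map, directions):
    """Sample an equirectangular environment map (H, W, 3) at unit
    reflection directions (N, 3). Nearest-pixel lookup for simplicity;
    a real implementation would interpolate and use a learned map."""
    h, w, _ = env_map.shape
    x, y, z = directions[:, 0], directions[:, 1], directions[:, 2]
    u = np.arctan2(x, -z) / (2 * np.pi) + 0.5        # azimuth -> [0, 1]
    v = np.arccos(np.clip(y, -1.0, 1.0)) / np.pi     # polar   -> [0, 1]
    cols = np.clip((u * (w - 1)).astype(int), 0, w - 1)
    rows = np.clip((v * (h - 1)).astype(int), 0, h - 1)
    return env_map[rows, cols]                       # (N, 3) specular color

def compose_final_color(c_diffuse, s_tint, env_map, reflect_dirs_t):
    """Assumed blend: c_final^t = c_d + s_tint * c_s^t, per 3D Gaussian.
    c_diffuse, s_tint: (N, 3); reflect_dirs_t: (N, 3) unit vectors
    already deformed for time t by the reflection MLP."""
    c_specular = query_env_map(env_map, reflect_dirs_t)
    return c_diffuse + s_tint * c_specular
```

With a zero specular tint the output reduces to the diffuse color alone, which matches the intuition that the tint gates how much environment reflection each Gaussian contributes.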
SpectroMotion outperforms other SOTA methods
Baseline method (left) vs SpectroMotion (right)
Ablation study on coarse-to-fine and losses.
Ablation method (left) vs Full model (right)
Ablation study on the stages of the coarse-to-fine training strategy.
Static and Dynamic training strategy stages (left) vs Specular stage (Full model) (right)
Ablation study on SH, static, and deformable environment maps.
SH and static environment map variants (left) vs Deformable environment map (Full model) (right)
Citation
Acknowledgements
This research was funded by the National Science and Technology Council, Taiwan, under Grant NSTC 112-2222-E-A49-004-MY2. The authors are grateful to Google, NVIDIA, and MediaTek Inc. for their generous donations. Yu-Lun Liu acknowledges the Yushan Young Fellow Program by the MOE in Taiwan.
The website template was borrowed from NaRCan and ReconFusion.