Large-scale Dynamic Scene Reconstruction Based on Neural Fields

Authors

  • Yinan Qi

DOI:

https://doi.org/10.54097/kfy6ge79

Keywords:

3D Gaussian Splatting, Dynamic scene, Neural rendering, Large-scale

Abstract

Large-scale dynamic scene reconstruction methods based on spatiotemporal field models demonstrate significant potential in autonomous driving applications. However, existing neural radiance field (NeRF) and 3D Gaussian splatting (3DGS) techniques remain constrained by their dynamic element modeling capabilities and computational efficiency, failing to effectively address complex reconstruction tasks involving intertwined static and dynamic regions in driving scenarios. To address these challenges, this study proposes a novel framework integrating spatiotemporal attention mechanisms with sparse encoding strategies. The method employs a spatiotemporal attention module that captures dynamic motion patterns through self-supervised inter-frame prediction, addressing spatiotemporal inconsistencies caused by non-rigid deformations. Simultaneously, a KL divergence-guided hierarchical sparse encoding strategy achieves efficient multi-scale scene feature representation while preserving reconstruction accuracy. Furthermore, a mean-variance decoupled stochastic sampling mechanism enhances modeling robustness in dynamic regions. Experimental results demonstrate substantial improvements in reconstruction quality compared to state-of-the-art large-scale dynamic scene reconstruction methods, ultimately enabling more photorealistic 3D reconstruction outcomes.
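
The abstract names a mean-variance decoupled stochastic sampling mechanism and a KL divergence-guided encoding, but does not give their formulation. As a rough illustration only, the sketch below assumes a diagonal Gaussian latent per feature, drawn with the standard reparameterization trick, and a KL term against a standard normal prior. The function names, the choice of prior, and the log-variance parameterization are assumptions for this example and are not taken from the paper.

```python
import math
import random


def sample_gaussian(mean, log_var, rng):
    """Reparameterized draw: mean + std * eps, with eps ~ N(0, 1).

    Mean and (log-)variance are separate inputs, so the random noise
    only scales the variance branch. This is an assumed reading of
    "mean-variance decoupled" sampling, not the paper's definition.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, exp(log_var)) || N(0, 1) ), summed over dimensions.

    Assumed regularizer: it is zero only when the latent matches the
    prior, so minimizing it pushes uninformative features toward the prior.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mean, log_var))


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed for repeatability
    mean = [0.2, -0.1, 0.0]         # hypothetical feature means
    log_var = [-2.0, -1.0, 0.0]     # hypothetical log-variances
    print(sample_gaussian(mean, log_var, rng))
    print(kl_to_standard_normal(mean, log_var))
```

In a training loop under these assumptions, the sampled feature would feed the renderer and the KL value would be added to the reconstruction loss with a small weight.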

Published

21-07-2025

Section

Articles