Implicit Function Super-Resolution Reconstruction Based on Group Propagation Vision Transformer

Jiacun Song

doi:10.54097/edyyr806

Keywords:

Implicit function; Super-resolution; GP-ViT.

Abstract

Reconstruction-based single-image super-resolution methods perform well but often suffer from training instability, artifacts, information loss, and weak control over global information. To address these challenges, we propose an implicit function super-resolution reconstruction algorithm based on the Group Propagation Vision Transformer (GP-ViT). The method uses GP-ViT as the encoder: its group propagation mechanism captures global contextual information efficiently, reducing computational complexity and memory consumption while significantly enhancing local feature extraction. In the decoding phase, the algorithm decodes image features with a continuous implicit function representation, which supports super-resolution reconstruction at magnifications of up to 32× and recovers high-frequency details in a continuous manner to generate high-quality images. Experimental results show that, compared with classical super-resolution models, our method significantly improves two key metrics, PSNR and SSIM, while effectively reducing artifacts and preserving more detail.
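The decoding idea in the abstract can be illustrated with a minimal NumPy sketch of an implicit function decoder. It is not the authors' implementation: the GP-ViT encoder is replaced by a random feature map, the MLP weights are random rather than trained, and the names `make_coord_grid`, `implicit_decode`, and `upsample` are hypothetical. The sketch only shows, under those assumptions, how a continuous coordinate query lets one feature map be decoded at an arbitrary, even non-integer, scale.

```python
import numpy as np

def make_coord_grid(h, w):
    """Pixel-centre coordinates in [-1, 1] for an h x w grid, shape (h*w, 2)."""
    ys = -1 + (2 * np.arange(h) + 1) / h
    xs = -1 + (2 * np.arange(w) + 1) / w
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([yy.ravel(), xx.ravel()], axis=-1)

def implicit_decode(feat, coords, w1, b1, w2, b2):
    """Decode RGB values at continuous coordinates.

    feat:   (H, W, C) encoder features (assumed: stand-in for GP-ViT output)
    coords: (N, 2) query coordinates (y, x) in [-1, 1]
    returns (N, 3) predicted colours
    """
    h, w, _ = feat.shape
    # Index of the feature cell that contains each query coordinate.
    iy = np.clip(np.floor((coords[:, 0] + 1) / 2 * h).astype(int), 0, h - 1)
    ix = np.clip(np.floor((coords[:, 1] + 1) / 2 * w).astype(int), 0, w - 1)
    # Centre of that cell, in the same [-1, 1] coordinate system.
    cy = -1 + (2 * iy + 1) / h
    cx = -1 + (2 * ix + 1) / w
    # Offset of the query from the cell centre, scaled to roughly [-1, 1].
    rel = np.stack([(coords[:, 0] - cy) * h, (coords[:, 1] - cx) * w], axis=-1)
    # A two-layer MLP maps (latent code, relative offset) to a colour.
    x = np.concatenate([feat[iy, ix], rel], axis=-1)  # (N, C + 2)
    hidden = np.maximum(x @ w1 + b1, 0.0)             # ReLU
    return hidden @ w2 + b2

def upsample(feat, scale, params):
    """Query the implicit decoder on a grid that is `scale` times denser."""
    h, w, _ = feat.shape
    out_h, out_w = int(round(h * scale)), int(round(w * scale))
    rgb = implicit_decode(feat, make_coord_grid(out_h, out_w), *params)
    return rgb.reshape(out_h, out_w, 3)

# Usage with random (untrained) features and weights, for shape checking only.
rng = np.random.default_rng(0)
C, HIDDEN = 8, 16
feat = rng.standard_normal((4, 6, C))
params = (
    rng.standard_normal((C + 2, HIDDEN)) * 0.1, np.zeros(HIDDEN),
    rng.standard_normal((HIDDEN, 3)) * 0.1, np.zeros(3),
)
out = upsample(feat, 3.5, params)  # non-integer scale: 4x6 -> 14x21
```

Because the decoder is queried by coordinate rather than by a fixed upsampling layer, the same weights serve every scale; only the density of the query grid changes. A trained model would learn the MLP jointly with the encoder.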
