Analysis of the Integration Strategies of LLM and VLM Models with the Transformer Architecture
DOI:
https://doi.org/10.54097/3fhs5d75

Keywords:
LLM, VLM, Transformer architecture, Integration strategies

Abstract
With the rapid development of artificial intelligence technology, the Transformer architecture has become the core framework of natural language processing (NLP) and the multimodal domain. This paper studies in depth the fusion strategies of Large Language Models (LLMs) and Visual Language Models (VLMs) with the Transformer architecture. It first introduces the basic principles and characteristics of the Transformer architecture and of LLM and VLM models, then comprehensively analyzes the advantages and challenges of different fusion strategies, and demonstrates the practical effect of these strategies on multimodal tasks through application cases such as visual question answering (VQA) and image description generation. The results show that, by optimizing the model structure, training strategy, and data processing, integrating LLMs and VLMs with the Transformer architecture can significantly improve model performance on language and vision tasks, offering new ideas and methods for the development of multimodal artificial intelligence.
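As a concrete illustration of the kind of LLM-VLM fusion the abstract describes, the sketch below shows one common strategy: projecting visual-encoder features into the language model's hidden space and fusing them with text tokens through cross-attention inside a Transformer block, as used in VQA-style pipelines. This is a minimal sketch under stated assumptions, not the paper's implementation; all dimensions, module names, and the toy forward pass are hypothetical.

```python
# Minimal sketch (assumed setup, not the paper's code) of cross-attention fusion:
# text tokens from a language model attend to projected visual patch features.
import torch
import torch.nn as nn


class CrossAttentionFusionBlock(nn.Module):
    """One Transformer block in which text tokens attend to visual tokens."""

    def __init__(self, d_model: int = 512, n_heads: int = 8, d_visual: int = 768):
        super().__init__()
        # Project visual features (e.g. ViT patch embeddings) into the LLM hidden size.
        self.visual_proj = nn.Linear(d_visual, d_model)
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, text_tokens: torch.Tensor, visual_feats: torch.Tensor) -> torch.Tensor:
        # Self-attention over the text (question) tokens.
        h = self.norm1(text_tokens)
        text_tokens = text_tokens + self.self_attn(h, h, h, need_weights=False)[0]
        # Cross-attention: text queries attend to projected visual keys/values.
        v = self.visual_proj(visual_feats)
        h = self.norm2(text_tokens)
        text_tokens = text_tokens + self.cross_attn(h, v, v, need_weights=False)[0]
        # Position-wise feed-forward network with residual connection.
        return text_tokens + self.ffn(self.norm3(text_tokens))


if __name__ == "__main__":
    # Toy VQA-style forward pass: 2 questions of 16 tokens fused with 196 visual patches.
    block = CrossAttentionFusionBlock()
    question = torch.randn(2, 16, 512)
    patches = torch.randn(2, 196, 768)
    fused = block(question, patches)
    print(fused.shape)  # torch.Size([2, 16, 512])
```

Alternative strategies the abstract's framing also covers include concatenating projected visual tokens directly into the language model's input sequence (prefix-style fusion) rather than adding cross-attention layers; the choice trades added parameters against context length.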
License
Copyright (c) 2025 Journal of Computer Science and Artificial Intelligence

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.