Abstract:
In recent years, advances in artificial intelligence (AI) have driven innovation in mineral analysis methods. Traditional mineral analysis relies heavily on specialized geological expertise and a range of sophisticated instruments. Multimodal Large Language Models (MLLMs), with their strengths in multi-source information fusion and complex scene understanding, offer a new research path for this field. However, general-purpose MLLMs perform poorly in specialized domains such as mining. To address this limitation, this study developed a mineral analysis system based on MLLMs. Using a mineral "image-text" dataset and Qwen2.5-VL as the multimodal base model, the system compared two parameter-efficient fine-tuning strategies, IA3 and LoRA, and adopted the better-performing one. The system also incorporates retrieval-augmented generation (RAG) over a domain knowledge base and provides interactive visualization through a lightweight web-based front-end and back-end architecture. Experimental results demonstrate that the system effectively improves the accuracy of mineral analysis, offering a viable solution for the intelligent development of the field.