This study explores the potential of agent-based frameworks powered by Large Language Models (LLMs) to integrate and interpret multimodal data in well drilling operations. The research leverages autonomous agents to process diverse datasets, including images and mechanical parameters, to assess wellbore stability and drilling efficiency. Specialized agents perform tasks such as caving detection from shale shaker images and the analysis of Mechanical Specific Energy (MSE) and Drilling Strength (DS).
Two language models, Phi-3 Mini and GPT-3.5, are compared to evaluate the trade-offs between performance and cost-effectiveness. The Phi-3 Mini model, despite being cost-efficient, exhibited occasional hallucinations, whereas GPT-3.5 consistently provided more precise and reliable outputs. Fine-tuned models such as LLaVA were used for image-based analyses, enhancing multimodal integration.
The agent-based framework can autonomously reason, plan, and interact with external tools, producing actionable insights to guide operational decisions. Results indicate that the system effectively identifies risks, such as significant cavings and unusual MSE/DS ratios, and offers recommendations to mitigate potential challenges. The proposed agent-based framework highlights the scalability and flexibility of LLM-powered agent systems, paving the way for advanced applications in the energy sector while addressing cost considerations through smaller, efficient models.
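
To make the mechanical-parameter analysis concrete, the following is a minimal sketch of the kind of external tool an agent might call. It is not the study's implementation. It assumes MSE is computed with the widely used Teale formulation in field units (weight on bit in lbf, rotary speed in rpm, torque in ft-lbf, rate of penetration in ft/hr, bit diameter in inches). The abstract does not define how DS is calculated, so DS is taken here as an already-computed input, and the function names and the `limit` threshold are hypothetical.

```python
import math


def teale_mse(wob_lbf, rpm, torque_ftlbf, rop_ft_per_hr, bit_diameter_in):
    """Return Mechanical Specific Energy in psi (Teale form, field units).

    Assumption: the study's exact MSE formulation is not given in the
    abstract; this is the standard Teale expression.
    """
    if rop_ft_per_hr <= 0 or bit_diameter_in <= 0:
        raise ValueError("ROP and bit diameter must be positive")
    bit_area_in2 = math.pi * bit_diameter_in ** 2 / 4.0
    # Thrust term: axial load spread over the bit face.
    thrust_term = wob_lbf / bit_area_in2
    # Rotary term: the factor 120*pi converts rpm, ft-lbf and ft/hr to psi.
    rotary_term = (120.0 * math.pi * rpm * torque_ftlbf) / (
        bit_area_in2 * rop_ft_per_hr
    )
    return thrust_term + rotary_term


def flag_unusual_ratio(mse_psi, ds_psi, limit):
    """Return True when MSE/DS exceeds a caller-supplied limit.

    Hypothetical helper: DS is supplied by the caller, and `limit` is an
    illustrative threshold, not a value reported by the study.
    """
    if ds_psi <= 0:
        raise ValueError("DS must be positive")
    return mse_psi / ds_psi > limit
```

In an agent setting, functions like these would typically be registered as tools, with the language model deciding when to call them and how to phrase the resulting recommendation.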