
Abstract

Summary

Geological cutting descriptions are essential data sources for the exploration of natural resources. These descriptions often detail vital attributes such as depths, lithology types and percentages, and physical characteristics such as color, grain size, grain shape, hardness, sorting, cementation, porosity type and degree, and accessory minerals. Many of the reports are handwritten, with quality ranging from clear and legible to poor, which complicates the digitization process. The use of abbreviations and symbols in these handwritten records adds another layer of complexity. We address these complexities with Optical Character Recognition (OCR) vision models that are transformer-based and further fine-tuned, although their output still requires additional effort to interpret the reports accurately. To mitigate these challenges, a comprehensive Generative AI framework is developed, which begins by obtaining the best available OCR output from our framework. Next, open-source LLMs ranging from 13B to 70B parameters are deployed on high-performance GPU servers. Although LLMs resolve many challenges, some issues related to percentages or the physical characteristics of lithologies still require the judgement of professional geologists. Therefore, it is highly recommended that AI-driven automation be supervised by experts to ensure high-quality data transformation.
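
The hand-off between automated extraction and expert review described above can be illustrated with a minimal sketch. It is not the authors' framework: a simple rule-based parser stands in for the OCR and LLM stages, and the abbreviation table, the line format, and the names `Interval` and `parse_line` are all hypothetical. It only shows how a structured record might be flagged for a geologist when percentages or abbreviations look doubtful.

```python
import re
from dataclasses import dataclass, field

# Hypothetical abbreviation table (an assumption for illustration only);
# real cutting reports use many more abbreviations and conventions vary.
ABBREVIATIONS = {"ss": "sandstone", "sh": "shale", "ls": "limestone"}

@dataclass
class Interval:
    top: float
    base: float
    lithologies: dict = field(default_factory=dict)  # name -> percentage
    needs_review: bool = False
    notes: list = field(default_factory=list)

# Assumed line format: "<top>-<base>: <pct>% <abbr>, <pct>% <abbr>"
LINE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)\s*[:;]?\s*(.*)$")
PART_RE = re.compile(r"(\d+)\s*%\s*([A-Za-z]+)")

def parse_line(text: str) -> Interval:
    """Turn one transcribed line into a record, flagging doubtful fields."""
    match = LINE_RE.match(text)
    if match is None:
        return Interval(0.0, 0.0, {}, True, ["depth range not recognised"])
    interval = Interval(float(match.group(1)), float(match.group(2)))
    for pct, abbr in PART_RE.findall(match.group(3)):
        name = ABBREVIATIONS.get(abbr.lower())
        if name is None:
            # Unknown abbreviation: keep the raw token and ask for review.
            interval.needs_review = True
            interval.notes.append(f"unknown abbreviation: {abbr}")
            name = abbr
        interval.lithologies[name] = interval.lithologies.get(name, 0) + int(pct)
    if sum(interval.lithologies.values()) != 100:
        interval.needs_review = True
        interval.notes.append("percentages do not sum to 100")
    return interval

clean = parse_line("1200-1210: 70% SS, 30% SH")
doubtful = parse_line("1210-1220 60% ss 30% xx")
```

In this sketch `clean` passes both checks, while `doubtful` is flagged twice, once for the unrecognised abbreviation and once for the percentage total, so it would be routed to a geologist rather than accepted automatically.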

/content/papers/10.3997/2214-4609.202539019
2025-03-24
2026-02-14
References

  1. Abdin, Marah, et al. “Phi-3 technical report: A highly capable language model locally on your phone.” arXiv preprint arXiv:2404.14219 (2024).
  2. Jiang, Albert Q., et al. “Mixtral of experts.” arXiv preprint arXiv:2401.04088 (2024).
  3. Dubey, Abhimanyu, et al. “The llama 3 herd of models.” arXiv preprint arXiv:2407.21783 (2024).