Geological cutting descriptions are essential data sources for the exploration of natural resources. These descriptions often detail vital attributes such as depth, lithology type and percentage, and physical characteristics such as color, grain size, grain shape, hardness, sorting, cementation, porosity type and degree, and accessory minerals. Many of the reports are handwritten, with legibility ranging from clear to poor, which complicates the digitization process. Moreover, the use of abbreviations and symbols in these handwritten records introduces another layer of complexity. We have addressed these complexities using fine-tuned, transformer-based Optical Character Recognition (OCR) vision models, although these models require additional effort to interpret the reports accurately. To mitigate these challenges, a comprehensive Generative AI framework has been developed, beginning with the acquisition of the best available OCR output from our framework. Next, open-source LLMs ranging from 13B to 70B parameters are deployed on high-performance GPU servers. Although LLMs resolve many challenges, some issues related to percentages or the physical characteristics of lithologies still require the judgement of professional geologists. Therefore, it is highly recommended that AI-driven automation be supervised by experts to ensure high-quality data transformation.
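As an illustration of how expert oversight might be wired into such a workflow, the following minimal Python sketch expands lithology abbreviations in an already-extracted interval and flags records whose percentages or depths look inconsistent, so that a geologist can review them. It is not the framework described above: the `Interval` structure, the `normalize` function, the abbreviation table, and the 1% tolerance are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field

# Hypothetical abbreviation table; real reports use operator-specific shorthand.
ABBREVIATIONS = {"ss": "sandstone", "sh": "shale", "ls": "limestone"}
KNOWN_LITHOLOGIES = set(ABBREVIATIONS.values())


@dataclass
class Interval:
    top: float            # depth to the top of the interval
    base: float           # depth to the base of the interval
    lithology_pct: dict   # lithology label -> percentage
    flags: list = field(default_factory=list)  # reasons for expert review


def normalize(interval, tolerance=1.0):
    """Expand abbreviations and collect review flags for failed sanity checks."""
    expanded = {}
    flags = []
    for label, pct in interval.lithology_pct.items():
        key = label.strip().lower().rstrip(".")
        name = ABBREVIATIONS.get(key, key)
        if name not in KNOWN_LITHOLOGIES:
            flags.append(f"unknown lithology: {label!r}")
        # Merge duplicates such as "SS" and "sandstone" into one entry.
        expanded[name] = expanded.get(name, 0.0) + float(pct)
    total = sum(expanded.values())
    if abs(total - 100.0) > tolerance:
        flags.append(f"percentages sum to {total:g}, expected 100")
    if interval.base <= interval.top:
        flags.append("base depth is not below top depth")
    return Interval(interval.top, interval.base, expanded, flags)


# Assumed example records, not data from the reports discussed above.
clean = normalize(Interval(1000.0, 1010.0, {"SS": 60, "Ls.": 40}))
suspect = normalize(Interval(1010.0, 1020.0, {"SS": 70, "Sh": 20}))
needs_review = [rec for rec in (clean, suspect) if rec.flags]
```

In this sketch only `suspect` ends up in `needs_review`, because its percentages do not sum to 100. Records with an empty `flags` list could pass through automatically, which keeps the geologist's attention on the cases that the automated steps cannot settle.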