Geological formation properties are essential for interpreting the subsurface. Well reports often contain this information, but because the reports are scanned and unstructured, extracting the formation properties that geologists need is cumbersome. These properties are usually presented in tables; however, existing automatic information extraction methods struggle to process tables in scanned documents. To address this limitation, we introduce a novel method that extracts formation properties from tables in scanned reports. Our method has two main components. The first preprocesses the report pages, detects the tables, and converts them to a structured, readable format. The second uses a chain of large language model calls to extract and standardize the formation properties from the formation tables. We evaluate our method on a sample of test reports provided by TotalEnergies and achieve an accuracy of 90.2% and a marker coverage of 83%. Future work includes running a marker quality check on the extracted formation properties and handling implicit tables with a visual language model.
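As a rough illustration of the second component, the sketch below chains two language model calls over a table that has already been detected and converted to rows of cells: one call extracts formation names and top depths, and a second standardizes each name. Everything here is an assumption made for illustration, not the authors' implementation: the `FormationRecord` fields, the `EXTRACT` and `STANDARDIZE` prompt prefixes, the `name;depth` reply format, and `stub_llm`, a deterministic stand-in used so that the example runs without a real model.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interface: any function mapping a prompt string to a reply string.
LLM = Callable[[str], str]


@dataclass
class FormationRecord:
    # Illustrative fields only; the abstract does not list the extracted properties.
    name: str
    top_depth_m: float


def table_to_text(rows: List[List[str]]) -> str:
    """Serialize a table (rows of cell strings) as pipe-separated lines."""
    return "\n".join(" | ".join(cell.strip() for cell in row) for row in rows)


def extract_formations(rows: List[List[str]], llm: LLM) -> List[FormationRecord]:
    """Chain two model calls: extract raw records, then standardize each name."""
    # Call 1 (assumed prompt format): ask for one "name;depth" line per formation.
    raw = llm("EXTRACT\n" + table_to_text(rows))
    records = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        name, depth = line.split(";")
        # Call 2 (assumed prompt format): map the raw name to a standard spelling.
        standard_name = llm("STANDARDIZE\n" + name.strip()).strip()
        records.append(FormationRecord(standard_name, float(depth)))
    return records


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model, used only to make the sketch runnable."""
    task, _, payload = prompt.partition("\n")
    if task == "EXTRACT":
        lines = payload.splitlines()[1:]  # skip the header row
        out = []
        for line in lines:
            cells = [c.strip() for c in line.split("|")]
            out.append(f"{cells[0]};{cells[1]}")
        return "\n".join(out)
    if task == "STANDARDIZE":
        return payload.strip().title()
    raise ValueError(f"unknown task: {task}")


# Made-up example table, not data from the evaluated reports.
example_rows = [
    ["Formation", "Top (m)"],
    ["  brent GP", "2450.5"],
    ["DUNLIN", "2710"],
]
example_records = extract_formations(example_rows, stub_llm)
```

Passing the model in as a plain callable keeps the chain testable: a real client can replace `stub_llm` without changing `extract_formations`.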