Futureproofing Rich Metadata File Ingestion with OSDU
- Publisher: European Association of Geoscientists & Engineers
- Source: Conference Proceedings, Third EAGE Digitalization Conference and Exhibition, Mar 2023, Volume 2023, p.1 - 4
Abstract
Acting as a technology-agnostic, standards-based data platform, the OSDU has reduced energy data silos and enabled application developers to build new solutions and data ingestion services.
The current OSDU schemas are designed primarily to store file metadata so that users can query common business content extracted from the files. We used a machine-learning and subject-matter-expert classification process to auto-generate detailed file metadata for millions of files and ingest that metadata, together with the source files, directly into the user's OSDU instance.
The file classification process currently generates a graph database representation of files and rich metadata labels at the data-object level. The classification results, alongside data lineage and quality information, are stored in OSDU work product components and datasets and are ready to migrate to the OSDU platform.
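As a rough illustration of what a graph representation of files and data-object-level labels might look like, the sketch below uses a tiny in-memory structure in place of a real graph database. The class `FileGraph`, the relation names `CONTAINS` and `HAS_LABEL`, and the node identifiers are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class FileGraph:
    """Minimal in-memory stand-in for a graph database (hypothetical)."""

    def __init__(self):
        self.nodes = {}                 # node id -> property dict
        self.edges = defaultdict(list)  # source id -> [(relation, target id)]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, src, relation, dst):
        self.edges[src].append((relation, dst))

    def labels_for(self, file_id):
        """Collect the labels attached to every data object in a file."""
        labels = []
        for relation, obj in self.edges[file_id]:
            if relation != "CONTAINS":
                continue
            labels.extend(
                dst for rel, dst in self.edges[obj] if rel == "HAS_LABEL"
            )
        return labels


# Assumed example: one file containing one classified data object.
graph = FileGraph()
graph.add_node("file:report.pdf", kind="file")
graph.add_node("obj:report.pdf#p3", kind="data_object")
graph.add_node("label:well-log", kind="label")
graph.add_edge("file:report.pdf", "CONTAINS", "obj:report.pdf#p3")
graph.add_edge("obj:report.pdf#p3", "HAS_LABEL", "label:well-log")

print(graph.labels_for("file:report.pdf"))
```

Attaching labels to data objects rather than to the file itself is what would allow one file to carry several distinct classifications.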
The process spares users from manually filling in or supplying file manifests during file ingestion into their OSDU implementation. With over 700 distinct data types and 250,000 subsurface terminology entities, millions of ingested files can be enriched with highly granular metadata manifests that guarantee rapid data search and access to high-quality data.
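To make the idea of an auto-generated manifest concrete, the following minimal sketch turns one classification result into a manifest-like dictionary with a dataset and a work product component. The function `build_manifest`, the default partition name, the identifier patterns, and the field names are assumptions that only loosely follow the general shape of an OSDU manifest. They are not validated against the published OSDU schemas and do not represent the authors' implementation.

```python
import json


def build_manifest(file_name, data_type, entities, partition="opendes"):
    """Build an illustrative manifest-like dict from one classified file.

    All field names below are placeholders for illustration only; check
    the actual OSDU schemas before using anything like this for ingestion.
    """
    dataset_id = f"{partition}:dataset--File.Generic:{file_name}"
    wpc_id = f"{partition}:work-product-component--Document:{file_name}"

    dataset = {
        "id": dataset_id,
        "data": {"Name": file_name},
    }
    component = {
        "id": wpc_id,
        "data": {
            "Name": file_name,
            "Datasets": [dataset_id],          # link component to its file
            "Classification": {                # hypothetical field
                "DataType": data_type,
                "Entities": sorted(entities),
            },
        },
    }
    return {
        "Data": {
            "WorkProductComponents": [component],
            "Datasets": [dataset],
        }
    }


# Assumed classification output for a single file.
manifest = build_manifest("report.pdf", "WellLog", ["Gamma Ray", "Depth"])
print(json.dumps(manifest, indent=2))
```

Generating this structure from classification output, rather than asking users to write it by hand, is the step the abstract describes as removing manual manifest preparation.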