
Abstract

Summary

Acting as a technology-agnostic, standards-based data platform, OSDU has reduced energy data silos and enabled application developers to build new solutions and data ingestion services.

The current OSDU schemas are designed primarily to store file metadata, so that users can query the common business content that can be extracted from files. We used a classification process that combines machine learning with subject matter expert input to auto-generate detailed file metadata for millions of files and to ingest that metadata, together with the source files, directly into the user's OSDU instance.

The file classification process currently generates a graph database representation of the files, with rich metadata labels at the data-object level. The classification results, together with data lineage and quality information, are stored as OSDU work product components and datasets, ready to migrate to the OSDU platform.

The process spares users from manually filling in or supplying file manifests when ingesting files into their OSDU implementation. With over 700 distinct data types and 250,000 subsurface terminology entities, millions of ingested files can be enriched with highly granular metadata manifests that guarantee rapid data search and access to high-quality data.
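
As an illustration of how a classification result might be turned into a manifest without manual entry, the following Python sketch builds a manifest-shaped dictionary from one classified file. It is not the authors' implementation. The `ClassifiedFile` type, the `build_manifest` function, the `opendes` partition name, and the hash-based identifier are hypothetical, and the `kind` strings and field names are modeled loosely on the public OSDU schemas, so they should be checked against the schema versions deployed in a given instance.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class ClassifiedFile:
    """Hypothetical output of the classification step for one file."""
    file_name: str
    file_source: str                 # assumed location of the uploaded source file
    data_type: str                   # assumed label, e.g. "Well Log"
    entities: list = field(default_factory=list)  # subsurface terms found in the file


def build_manifest(item: ClassifiedFile, partition: str = "opendes") -> dict:
    """Build a manifest-shaped dict linking one dataset to one work product component.

    Field names and kinds are assumptions, not a validated OSDU payload.
    """
    # Derive a stable identifier suffix from the source location (illustrative only).
    suffix = hashlib.sha1(item.file_source.encode("utf-8")).hexdigest()
    dataset_id = f"{partition}:dataset--File.Generic:{suffix}"

    dataset = {
        "id": dataset_id,
        "kind": "osdu:wks:dataset--File.Generic:1.0.0",
        "data": {
            "Name": item.file_name,
            "DatasetProperties": {
                "FileSourceInfo": {
                    "FileSource": item.file_source,
                    "Name": item.file_name,
                }
            },
        },
    }

    component = {
        "id": f"{partition}:work-product-component--Document:{suffix}",
        "kind": "osdu:wks:work-product-component--Document:1.0.0",
        "data": {
            "Name": item.file_name,
            # Classification labels carried as searchable tags (assumed mapping).
            "Tags": [item.data_type, *item.entities],
            "Datasets": [dataset_id],
        },
    }

    return {
        "kind": "osdu:wks:Manifest:1.0.0",
        "Data": {
            "WorkProductComponents": [component],
            "Datasets": [dataset],
        },
    }


example = ClassifiedFile(
    file_name="well_A1_composite.las",
    file_source="/staging/well_A1_composite.las",
    data_type="Well Log",
    entities=["Gamma Ray", "Composite Log"],
)
manifest = build_manifest(example)
manifest_json = json.dumps(manifest, indent=2)
```

In this sketch the dataset record describes the source file, while the work product component carries the classification labels and points back to the dataset by identifier, so a search on a label can resolve to the underlying file.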

/content/papers/10.3997/2214-4609.202332024
2023-03-20
2024-04-28