Geological CO₂ storage is a promising climate mitigation option, but its success relies on ensuring long-term containment. We present a proof-of-concept framework that couples synthetic seismic modeling with machine learning to assess the seismic detectability of CO₂ leakage. Based on well log data from the Sleipner field, we create 1000 random synthetic velocity and density models that incorporate realistic stratigraphy and CO₂-brine fluid substitution. Using finite-difference modeling, we generate 2D acoustic gathers for each model, which serve as input to our machine learning (ML) framework. An unsupervised autoencoder learns latent features from 900 training gathers and is validated on the remaining 100 gathers. A regression model (ML1) maps latent features to geologic/leakage parameters and generalizes to 100 new, unseen models. ML1 recovers the latent features closely: its training loss drops from 9.67 to 0.029 and its validation loss reaches 0.026. A binary classifier (ML2) detects leakage from ML1-predicted features. It achieves 80% validation accuracy but exhibits weak probability calibration, with the training loss decreasing only from 0.69 to 0.67, consistent with the limited dataset (90 training / 10 validation). Applied to Sleipner 2001 monitor baseline data (expected to be contained), ML2 returns a leakage probability of 51%, a result obtained despite the classifier's weak performance and the domain shift from synthetic to real data. This indicates that synthetic-trained models can distinguish between leaky and contained responses at the threshold level; however, robust deployment will require broader and more diverse training data. This framework provides a path to quantify sensitivity and define detection limits for monitoring CO₂ leakage.
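To put the reported ML2 loss values in context, the short Python sketch below works through the arithmetic under one loudly stated assumption: that the ML2 loss is a mean binary cross-entropy computed with the natural logarithm, which the abstract does not specify. Under that assumption, a classifier that outputs 0.5 for every sample has a loss of ln 2 ≈ 0.693, and exp(-loss) equals the geometric mean of the probability assigned to the true class. The function names and the balanced 90-sample label set are hypothetical and serve only as illustration.

```python
import math


def binary_cross_entropy(y_true, p_pred):
    """Mean binary cross-entropy (natural log) for labels in {0, 1}.

    ASSUMPTION: the abstract does not name the ML2 loss function;
    binary cross-entropy is used here purely for illustration.
    """
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def geometric_mean_true_class_prob(loss):
    """Geometric mean of the probability given to the true class.

    For a mean cross-entropy L, this quantity is exactly exp(-L).
    """
    return math.exp(-loss)


# Hypothetical balanced label set of 90 samples (the class balance of the
# actual training set is not stated in the abstract).
labels = [0, 1] * 45

# A classifier that always answers 0.5 carries no information.
chance_loss = binary_cross_entropy(labels, [0.5] * len(labels))
print(f"chance-level loss: {chance_loss:.4f}")  # ln 2, about 0.6931

# Loss values quoted in the abstract for ML2 training.
for loss in (0.69, 0.67):
    p = geometric_mean_true_class_prob(loss)
    print(f"loss {loss:.2f} -> geometric-mean true-class probability {p:.3f}")
```

Read this way, a loss that moves from 0.69 to 0.67 corresponds to a typical true-class probability rising from about 0.50 to about 0.51, so the predicted probabilities stay close to chance even when the thresholded accuracy is higher. This is only an interpretation aid under the stated assumption, not a reconstruction of the authors' training setup.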