
Abstract

Summary

We built an end-to-end pipeline that turns 1D stratigraphic realizations into playable 2D subsurface worlds, trains a Deep Q-Network (DQN) to plan drilling, and measures exploration-exploitation trade-offs. Facies sequences were generated with Markov chains conditioned on sea-level history, then expanded and perturbed via deformation, tilting, faulting, and unconformities to inject realistic heterogeneity. Hydrocarbon migration was simulated agent by agent with facies- and permeability-aware move-probability matrices (MPMs); inverted MPM logic provided plausible source-accumulation connectivity for labeling fluid presence. Fluid properties were mapped into elastic responses using Gassmann fluid substitution, and reflectivity was convolved to produce synthetic seismic amplitudes that serve as DQN observations. The game-like reinforcement learning (RL) environment enforced budgets, per-action costs, well limits, and sparse rewards; the DQN used CNN features, experience replay, and a target network. We learned that annealing ε while using a high discount factor (γ ≈ 0.99) consistently outperformed constant-ε policies, yielding deeper, more profitable wells. This is evidence that deliberate early exploration plus strong long-term valuation beats premature exploitation. We also learned that geologic complexity synthesized with the Markov-chain and agent-based modeling (ABM) stack improves policy robustness, because facies transitions and permeability contrasts expose the agent to the failure modes it must learn to avoid.
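
The exploration result above can be illustrated with a minimal Python sketch of an annealed ε-greedy policy and a discounted return. It is not the authors' implementation: the linear schedule, the function names (`epsilon_at`, `select_action`, `discounted_return`), the default values other than γ = 0.99, and the toy reward sequence are all assumptions made for illustration.

```python
import random


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Linearly anneal epsilon from eps_start to eps_end over decay_steps.

    The linear form and the default values are assumptions; the abstract
    only states that epsilon was annealed.
    """
    frac = min(max(step / decay_steps, 0.0), 1.0)
    return eps_start + frac * (eps_end - eps_start)


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else argmax Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def discounted_return(rewards, gamma=0.99):
    """Return sum_t gamma**t * r_t, accumulated from the last step backwards."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


if __name__ == "__main__":
    # Hypothetical sparse-reward episode: 49 drilling steps that each cost 1,
    # followed by a single payoff of 100 when the accumulation is reached.
    rewards = [-1.0] * 49 + [100.0]
    for gamma in (0.90, 0.99):
        print(f"gamma={gamma}: return={discounted_return(rewards, gamma):.2f}")

    # Exploration rate at the start, middle, and end of the assumed schedule.
    for step in (0, 5_000, 10_000):
        print(f"step={step}: epsilon={epsilon_at(step):.3f}")
```

With these hypothetical numbers, the late payoff outweighs the accumulated drilling costs at γ = 0.99 but not at γ = 0.90, which is one way to see why a high discount factor can favor deeper wells under sparse rewards.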

/content/papers/10.3997/2214-4609.202639100
2026-03-09
2026-02-07