This study presents a reinforcement learning (RL) approach to optimize well-drilling sequences in oilfield development—an NP-hard problem analogous to the generalized Traveling Salesman Problem. Traditional methods rely on computationally expensive brute-force searches or suboptimal heuristic rules. Our solution frames the problem as a Markov Decision Process, where an RL agent learns efficient drilling policies through a synthetic simulator modeling reservoir dynamics and economic constraints.
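The MDP framing above can be illustrated with a minimal sketch: the state is the set of wells already drilled, an action selects the next well, and the reward is that well's incremental NPV. All names here (`State`, `npv_of`, the toy NPV table) are hypothetical; the paper's simulator models reservoir dynamics and economic constraints in far more detail, and real per-well NPVs depend on drilling order.

```python
# Hypothetical sketch of well sequencing as a Markov Decision Process.
# The paper's actual simulator and reward function are much richer.
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    drilled: frozenset    # wells already drilled
    remaining: frozenset  # wells still available

def actions(state):
    """An action is the choice of the next well to drill."""
    return sorted(state.remaining)

def step(state, well, npv_of):
    """Drill `well`; the reward is its incremental NPV, which in a
    real reservoir depends on what was drilled before it."""
    reward = npv_of(well, state.drilled)
    next_state = State(state.drilled | {well}, state.remaining - {well})
    return next_state, reward

# Toy run with order-independent NPVs, using a greedy policy as a baseline.
npv = {"W1": 5.0, "W2": 3.0, "W3": 7.0}
s = State(frozenset(), frozenset(npv))
total = 0.0
while s.remaining:
    a = max(actions(s), key=lambda w: npv[w])       # greedy: best NPV first
    s, r = step(s, a, lambda w, drilled: npv[w])
    total += r
print(total)  # 15.0
```

An RL agent replaces the greedy `max` above with a learned policy, which is what lets it account for order-dependent effects the heuristic ignores.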
Experiments on a real-field case (8 wells, 2 rigs) demonstrate that the RL agent achieves 97.3% of the optimal economic value ($45.27M vs. $46.55M brute-force) while reducing computations by 99.9% (48 simulations vs. 40,320). The greedy heuristic, though faster, yields a 4.4% lower NPV ($44.60M).
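The computational savings follow directly from the combinatorics: brute force must simulate every drilling order, which for 8 wells is 8! = 40,320 runs, against the 48 simulations reported for the RL agent. A quick check of the reduction figure:

```python
# Brute force enumerates all orderings of 8 wells; the RL agent's 48
# simulations are the count reported in the study.
import math

n_wells = 8
brute_force = math.factorial(n_wells)   # 40320 orderings
rl_runs = 48
reduction = 1 - rl_runs / brute_force
print(brute_force, f"{reduction:.3%}")  # 40320 99.881%
```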
The system serves as a decision-support tool, augmenting (not replacing) engineers' expertise. Its scalability, coupled with HPC parallelization, addresses industry-scale challenges (e.g., 300 wells, 20 rigs), where brute-force methods are infeasible (>10^390 combinations).
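The infeasibility claim at industry scale is easy to sanity-check: even the loosest count, the number of orderings of 300 wells on a single rig, already has over 600 decimal digits, well beyond the >10^390 combinations cited (the exact count depends on how wells are assigned across the 20 rigs, which is not detailed here).

```python
# Lower-bound scale check: 300! (orderings of 300 wells) versus 10^390.
# The paper's exact combinatorial count for 300 wells / 20 rigs may differ.
import math

digits = len(str(math.factorial(300)))  # decimal digits in 300!
print(digits)  # 615
print(math.factorial(300) > 10**390)    # True
```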