In a Full-Waveform Inversion (FWI) workflow, we often tune the inversion parameters to avoid cycle skipping and to obtain high-resolution models. For example, we typically start with objective functions that are robust to cycle skipping and later switch to the least-squares misfit to admit high-resolution information. Such hierarchical approaches are common in FWI, but they usually rely on manual intervention guided by many factors, so the results depend on the practitioner's experience. Given the large data volumes and the complexity of the process, making optimal choices is difficult even for an experienced practitioner. As an example of automating such choices within the framework of reinforcement learning, we use a deep Q-network (DQN) to learn an optimal policy for deciding when to switch between misfit functions. Specifically, we train the state-action value function (Q) to predict when to use the conventional L2-norm misfit function and when to use the more advanced optimal-transport matching-filter (OTMF) misfit, in order to mitigate cycle skipping, obtain high resolution, and improve convergence. We use simple yet illustrative shifted-signal inversion examples to demonstrate the basic principles of the proposed method.
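The idea can be illustrated with a minimal Python sketch; it is not the authors' implementation. It assumes a Ricker wavelet with a 10 Hz peak frequency as the shifted signal, and the helper names (`ricker`, `l2_misfit`, `select_misfit`, `td_target`) are hypothetical. The first part shows why the L2 misfit is prone to cycle skipping for a shifted signal. The second part shows the two standard DQN ingredients that would drive the choice between the misfits: epsilon-greedy action selection and the temporal-difference target. The OTMF misfit itself, the state definition, and the reward are not specified in the abstract and are not implemented here.

```python
import numpy as np

# Action set assumed for illustration: index 0 = L2 misfit, index 1 = OTMF misfit.
ACTIONS = ("L2", "OTMF")


def ricker(t, f0=10.0, t0=0.0):
    """Ricker wavelet with peak frequency f0 (Hz), centred at time t0 (s)."""
    a = (np.pi * f0 * (t - t0)) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)


def l2_misfit(predicted, observed, dt):
    """Conventional least-squares misfit between two traces."""
    return 0.5 * np.sum((predicted - observed) ** 2) * dt


def select_misfit(q_values, epsilon, rng):
    """Epsilon-greedy choice of a misfit index from the predicted Q values."""
    if rng.random() < epsilon:
        return int(rng.integers(len(ACTIONS)))  # explore: random action
    return int(np.argmax(q_values))             # exploit: best predicted action


def td_target(reward, next_q_values, gamma, done):
    """Standard Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return float(reward)
    return float(reward) + gamma * float(np.max(next_q_values))


# Part 1: L2 misfit as a function of the time shift between two wavelets.
dt = 0.001
t = np.arange(-1.0, 1.0, dt)
observed = ricker(t)
shifts = np.linspace(0.0, 0.3, 61)  # trial shifts in seconds, 5 ms apart
l2_curve = np.array([l2_misfit(ricker(t, t0=s), observed, dt) for s in shifts])
peak = int(np.argmax(l2_curve))

# Beyond the peak the misfit decreases as the shift grows, so a gradient
# method started there moves away from the true (zero) shift.
print(f"L2 misfit peaks at a shift of {shifts[peak]:.3f} s")

# Part 2: choosing a misfit from (placeholder) Q values.
rng = np.random.default_rng(0)
q_values = np.array([0.2, 0.9])  # placeholder numbers, not network outputs
choice = ACTIONS[select_misfit(q_values, epsilon=0.0, rng=rng)]
print("Greedy choice:", choice)
```

In a full DQN, `q_values` would come from a neural network evaluated on the current inversion state, and `td_target` would supply the regression target used to train that network.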




