Handling different seismic processing tasks with a single pretrained neural network model is the main advantage of StorSeismic, a recently developed Transformer-based model with a pretraining and fine-tuning framework. The pretraining stage uses a mix of synthetic and unlabeled field data, which allows the network to store the features of both datasets and enables direct inference on field data after a task-driven fine-tuning stage on the labeled synthetic data. Although the vanilla architecture performed well on various processing tasks, recent developments in the Transformer components, particularly the positional encoding and the attention mechanism, open opportunities for a more effective StorSeismic model and framework. We therefore experimented with a relative positional encoding approach and a low-rank form of the attention matrix to replace the vanilla sinusoidal positional encoding and dot-product self-attention, respectively. Compared to the vanilla model, these alternatives require fewer pretraining iterations and yield competitive results on the fine-tuning tasks of denoising, demultiple, and first-arrival picking on the Marmousi example.
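
To make the two swapped-in components concrete, the following is a minimal PyTorch sketch, not the authors' implementation: it shows a learned relative-position bias (one common realization of relative positional encoding) and a Linformer-style projection of keys and values to a small rank (one way to obtain a low-rank attention matrix). All module and parameter names are hypothetical, and the specific forms used in the paper may differ.

```python
# Hedged sketch; illustrative only, not the StorSeismic code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelativePositionBias(nn.Module):
    """Learned scalar bias per head for every query-key offset, added to the
    attention scores instead of summing absolute (sinusoidal) encodings into
    the input embeddings."""

    def __init__(self, num_heads: int, max_len: int):
        super().__init__()
        # one parameter per head and per relative offset in [-(max_len-1), max_len-1]
        self.bias = nn.Parameter(torch.zeros(num_heads, 2 * max_len - 1))
        self.max_len = max_len

    def forward(self, seq_len: int) -> torch.Tensor:
        pos = torch.arange(seq_len)
        offset = pos[None, :] - pos[:, None] + self.max_len - 1  # shift offsets to >= 0
        return self.bias[:, offset]                              # (heads, seq, seq)


class LowRankSelfAttention(nn.Module):
    """Self-attention whose keys and values are projected from sequence length
    n down to a small rank along the sequence axis, so the attention matrix is
    (n x rank) instead of (n x n)."""

    def __init__(self, d_model: int, num_heads: int, seq_len: int, rank: int):
        super().__init__()
        self.h, self.d = num_heads, d_model // num_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # low-rank projections of the sequence dimension: seq_len -> rank
        self.proj_k = nn.Parameter(torch.randn(rank, seq_len) / seq_len**0.5)
        self.proj_v = nn.Parameter(torch.randn(rank, seq_len) / seq_len**0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:          # x: (batch, n, d_model)
        b, n, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, n, self.h, self.d).transpose(1, 2)         # (b, h, n, d)
        k = k.view(b, n, self.h, self.d).transpose(1, 2)
        v = v.view(b, n, self.h, self.d).transpose(1, 2)
        k = torch.einsum("rn,bhnd->bhrd", self.proj_k, k)        # compress keys:   n -> rank
        v = torch.einsum("rn,bhnd->bhrd", self.proj_v, v)        # compress values: n -> rank
        scores = q @ k.transpose(-2, -1) / self.d**0.5           # (b, h, n, rank)
        out = F.softmax(scores, dim=-1) @ v                      # (b, h, n, d)
        return self.out(out.transpose(1, 2).reshape(b, n, -1))


if __name__ == "__main__":
    # hypothetical shapes: 2 gathers, 128 traces/time samples, 64 features per token
    x = torch.randn(2, 128, 64)
    attn = LowRankSelfAttention(d_model=64, num_heads=4, seq_len=128, rank=32)
    bias = RelativePositionBias(num_heads=4, max_len=128)
    print(attn(x).shape, bias(128).shape)   # (2, 128, 64) and (4, 128, 128)
```

In this sketch the relative-position bias depends only on query-key offsets rather than absolute positions, and the low-rank projection shrinks the softmax from an n-by-n matrix to n-by-rank, which is the intuition behind the reduced pretraining cost reported in the abstract.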