Generating Custom Word Embeddings for Geoscientific Corpi

C.E. Birnie; M. Ravasi

doi:10.3997/2214-4609.202032059

Authors C.E. Birnie¹, M. Ravasi¹
Affiliations: ¹ Equinor ASA
Publisher: European Association of Geoscientists & Engineers
Source: Conference Proceedings, First EAGE Digitalization Conference and Exhibition, Nov 2020, Volume 2020, p.1 - 5
DOI: https://doi.org/10.3997/2214-4609.202032059

Abstract

Summary

In natural language processing, word embeddings are a set of techniques that map words from an input corpus into a low-dimensional space with the aim of capturing the relationships between words. It is well known that these relationships depend strongly on the context of the input corpus, which in science varies considerably from field to field. In this work we compare the performance of word embeddings pre-trained on generic text with that of custom-made word embeddings trained on an extensive corpus of geoscientific papers. Numerous examples highlight the differences in meaning and closeness of words between geoscientific and generic contexts. A prime example is the term ‘ghost’, which has a specific definition in geophysics that differs from its common usage in the English language. Moreover, domain-specific analogies, such as ‘Compressional is to P-wave what shear is to… S-wave’, are investigated to understand the extent to which the different word embeddings capture the relationships between terms. Finally, we anticipate some use cases of word embeddings aimed at extracting key information from documents and providing better indexing.
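
The analogy test mentioned in the summary can be illustrated with a minimal sketch. It uses the common vector-offset approach (add and subtract word vectors, then pick the closest remaining word by cosine similarity), which is an assumption here and not necessarily the procedure used in the paper. The vectors in `toy_embeddings` are hypothetical, hand-picked values for illustration only; they are not trained embeddings, and the helper names `cosine`, `nearest` and `analogy` are likewise made up for this example.

```python
import numpy as np

# Hypothetical 3-D toy vectors, hand-picked for illustration only.
# Assumed axes: [wave-like, compressional, shear]. Real embeddings are
# learned from a corpus and typically have hundreds of dimensions.
toy_embeddings = {
    "compressional": np.array([0.0, 1.0, 0.0]),
    "shear": np.array([0.0, 0.0, 1.0]),
    "p-wave": np.array([1.0, 1.0, 0.0]),
    "s-wave": np.array([1.0, 0.0, 1.0]),
    "reflection": np.array([1.0, 0.5, 0.5]),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def nearest(embeddings, query_vec, exclude=()):
    """Return the word whose vector is closest to query_vec by cosine similarity."""
    candidates = [w for w in embeddings if w not in exclude]
    return max(candidates, key=lambda w: cosine(embeddings[w], query_vec))


def analogy(embeddings, a, b, c):
    """Answer 'a is to b what c is to ?' with the vector-offset method."""
    target = embeddings[b] - embeddings[a] + embeddings[c]
    # The three query words are excluded so that they cannot be returned.
    return nearest(embeddings, target, exclude={a, b, c})


answer = analogy(toy_embeddings, "compressional", "p-wave", "shear")
print(answer)
```

With trained embeddings the same query would be run against the full vocabulary, and whether the expected term comes back is one way to judge how well a generic or a custom model captures a domain relationship.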
