In this paper, we presented our efforts to advance the geological text understanding of language models through domain-adapted training. Using our in-house geological text corpus and innovative training strategies, we developed two geological language models. The geological BERT model became significantly better at capturing the semantic characteristics of words used in geological contexts, which led to improved performance on the critical named entity recognition (NER) task. The geological GTE model, adapted with a topic classification task, showed promise in categorizing geological content despite limited domain-specific training data. These efforts highlight the importance of domain adaptation for achieving state-of-the-art performance in specialized fields such as geoscience.