The analysis and description of petrographic thin sections is a critical, highly specialized, and time-consuming activity. Considerable effort has recently gone into digitizing this process: image acquisition equipment has been integrated into microscopes, and a vast collection of thin section images is continually being acquired. This creates an opportunity to apply machine learning techniques to streamline the current workflow.
However, supervised methodologies rely heavily on labeled data. Labeling thin sections is expensive and, considering the large backlog of still-unanalyzed sections, mostly impractical. Unsupervised and self-supervised approaches could generate insights without imposing additional workload. Combined with section descriptions and metadata - which can be used as weak labels or as multimodal training data - they may prove to be powerful tools for accelerating the workflow.
This work therefore presents an initial attempt at developing a self-supervised approach, based on self-distillation, capable of creating high-quality latent representation spaces of petrographic thin sections - playfully referred to as the equivalent of a Thin Section DNA. We aim to create a foundation on which applications ranging from similarity retrieval of sections to automatic multimodal description generation can be developed without extensive labeling work.
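As an illustration of the similarity retrieval use case, the following minimal sketch ranks thin sections by cosine similarity between their latent vectors. It is not the method of this work: the section identifiers, the three-dimensional embeddings, and the function names `cosine_similarity` and `retrieve_similar` are all hypothetical, and in practice the vectors would come from the trained encoder and be far higher-dimensional.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve_similar(query, embeddings, top_k=3):
    """Return the top_k (section_id, score) pairs, most similar first."""
    scored = [(sid, cosine_similarity(query, vec))
              for sid, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical latent vectors, one per thin section image.
embeddings = {
    "section_a": [0.9, 0.1, 0.0],
    "section_b": [0.8, 0.2, 0.1],
    "section_c": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of the query section.
query = [1.0, 0.0, 0.0]
results = retrieve_similar(query, embeddings, top_k=2)
for section_id, score in results:
    print(f"{section_id}: {score:.3f}")
```

Cosine similarity is only one possible choice here; any distance defined on the latent space could be substituted, and a large collection would likely call for an approximate nearest-neighbor index rather than a linear scan.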