1887

Abstract

This study presents an automated pipeline for fossil identification and annotation by leveraging Large Vision Models (LVMs), Vision Transformers (ViTs), and Optical Character Recognition (OCR). The proposed framework processes unstructured fossil images to segment individual fossils, extract textual annotations, and align them with corresponding names and descriptions. Using zero-shot object detection with OWL-ViT, the system achieves robust segmentation without extensive labeled data, while a transformer-based OCR model extracts and refines textual information. The pipeline was tested on diverse paleontological datasets featuring various fossil types and image complexities. Results demonstrate high segmentation accuracy, effective text-to-image mapping, and adaptability across datasets. Additionally, an interactive interface enables human-in-the-loop refinement, enhancing reliability and usability for domain experts. Overall, the study establishes a scalable and intelligent approach for digitizing and organizing paleontological imagery, supporting automated fossil identification and advancing digital fossil documentation.
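The abstract does not say how extracted text is aligned with the segmented fossils. As a minimal sketch of one plausible approach, the Python below pairs each OCR label with the nearest fossil bounding box by center distance. The `Box` type, the `align_labels` function, and the greedy nearest-center heuristic are all assumptions made for illustration, not the authors' method; in the described pipeline the boxes would presumably come from the OWL-ViT detector and the OCR model.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates, (x0, y0) to (x1, y1). Hypothetical type."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def center(self) -> tuple[float, float]:
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)


def align_labels(fossils: list[Box], labels: list[tuple[str, Box]]) -> dict[int, str]:
    """Map fossil index -> label text using greedy nearest-center matching.

    Assumption: each label belongs to at most one fossil and each fossil
    receives at most one label. Closest pairs are assigned first.
    """
    # Build every (distance, fossil index, label index) candidate pair.
    pairs = []
    for fi, fossil_box in enumerate(fossils):
        fx, fy = fossil_box.center
        for li, (_, label_box) in enumerate(labels):
            lx, ly = label_box.center
            pairs.append((hypot(fx - lx, fy - ly), fi, li))

    # Visit candidates from closest to farthest, skipping anything already used.
    pairs.sort()
    assigned: dict[int, str] = {}
    used_labels: set[int] = set()
    for _, fi, li in pairs:
        if fi in assigned or li in used_labels:
            continue
        assigned[fi] = labels[li][0]
        used_labels.add(li)
    return assigned


if __name__ == "__main__":
    # Two fossils side by side, each with a number printed just below it.
    fossils = [Box(0, 0, 100, 100), Box(200, 0, 300, 100)]
    labels = [("2", Box(240, 110, 260, 125)), ("1", Box(40, 110, 60, 125))]
    print(align_labels(fossils, labels))
```

A greedy match is easy to inspect and correct by hand, which suits the human-in-the-loop refinement the abstract mentions. Densely packed plates may need an optimal assignment or layout-aware rules instead.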

/content/papers/10.3997/2214-4609.202639034
2026-03-09
2026-02-15
