IBM and NASA collaborate on creating language models to enhance accessibility of scientific information

IBM and NASA have collaborated to create a set of efficient language models trained on scientific literature. Transformer-based language models, a family that includes BERT, RoBERTa, Slate, and Granite, are useful for tasks such as classification, entity extraction, question answering, and information retrieval. Because the new models are trained on a corpus drawn from multiple scientific domains, they outperform generic models such as RoBERTa. Specifically, they show improved performance on biomedical tasks, question answering benchmarks, and Earth science entity recognition tests.

One key aspect of these models is their specialized tokenizer, which is trained to recognize scientific terms and vocabulary. This lets the models handle tokens unique to scientific text and perform better in scientific domains. The models have been trained on 60 billion tokens from fields such as astrophysics, planetary sciences, Earth sciences, heliophysics, and biological and physical sciences.
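
To make the tokenizer point concrete, here is a minimal sketch of why a domain vocabulary can help. It assumes a greedy longest-match subword scheme in the style of WordPiece; the article does not say which algorithm the IBM-NASA tokenizer uses, and both vocabularies below are invented purely for illustration.

```python
def greedy_subword_tokenize(word, vocab):
    """Split a word into the longest matching vocabulary pieces, left to right.

    Continuation pieces carry a "##" prefix, as in WordPiece-style vocabularies.
    Returns ["[UNK]"] if no vocabulary piece matches at some position.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return ["[UNK]"]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical vocabularies, invented for this illustration only.
GENERIC_VOCAB = {"he", "##lio", "##phy", "##sics"}
SCIENCE_VOCAB = GENERIC_VOCAB | {"heliophysics"}

generic_pieces = greedy_subword_tokenize("heliophysics", GENERIC_VOCAB)
science_pieces = greedy_subword_tokenize("heliophysics", SCIENCE_VOCAB)

print(generic_pieces)  # the term is broken into several fragments
print(science_pieces)  # the term survives as a single token
```

With the generic vocabulary the term is split into four fragments, while the science vocabulary keeps it as one token, so the model sees the term as a single unit rather than having to reassemble it from pieces.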

In addition to their strong performance on various benchmarks, these models can be fine-tuned for specific linguistic tasks and generate information-rich embeddings for document retrieval. The retrieval model built on top of the encoding model uses a contrastive loss function to produce embeddings that map the similarity between text pairs. This allows the models to excel at retrieving relevant passages and answering questions with fidelity to the retrieved document.
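
The article says only that the retriever is trained with "a contrastive loss function". As a rough sketch of what such a loss can look like, the example below uses one common variant: an InfoNCE-style loss with in-batch negatives over cosine similarities. The variant, the `temperature` value, the function names, and the toy 2-D embeddings are all assumptions made for illustration, not details from IBM or NASA.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def in_batch_contrastive_loss(query_embs, passage_embs, temperature=0.05):
    """Mean InfoNCE-style loss over a batch.

    passage_embs[i] is treated as the positive for query_embs[i];
    every other passage in the batch acts as a negative.
    """
    losses = []
    for i, query in enumerate(query_embs):
        logits = [cosine_similarity(query, p) / temperature for p in passage_embs]
        # Numerically stable log-sum-exp over all passages in the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        # Cross-entropy with the matching passage as the correct class.
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


def rank_passages(query_emb, passage_embs):
    """Return passage indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_emb, p) for p in passage_embs]
    return sorted(range(len(passage_embs)), key=lambda i: scores[i], reverse=True)


# Toy 2-D embeddings, invented for illustration.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned_passages = [[0.9, 0.1], [0.1, 0.9]]  # each query sits near its own passage
swapped_passages = [[0.1, 0.9], [0.9, 0.1]]  # each query sits near the wrong passage

aligned_loss = in_batch_contrastive_loss(queries, aligned_passages)
swapped_loss = in_batch_contrastive_loss(queries, swapped_passages)
print(aligned_loss < swapped_loss)
print(rank_passages(queries[0], aligned_passages))
```

The loss is low when each query embedding lies closest to its own passage and high when it lies closer to another one. Minimizing it therefore pulls matching pairs together, which is what makes ranking by similarity at retrieval time meaningful.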

The collaboration between IBM and NASA has resulted in significant improvements in AI models for scientific research and information retrieval. Both the encoder and retriever models are available on Hugging Face for the scientific and academic community to use and further develop. This open and transparent approach aligns with IBM and NASA’s commitment to advancing AI technology for the benefit of society.

Moving forward, IBM and NASA are working together to enhance the scientific search engine using these advanced language models. By leveraging their expertise in AI and scientific research, the collaboration aims to improve information retrieval and knowledge discovery in the scientific community.

Article Source
https://research.ibm.com/blog/science-expert-LLM