Introducing GraphRAG: A new tool for complex data discovery, now available on GitHub

Earlier this year, GraphRAG, a graph-based approach to retrieval-augmented generation (RAG), was introduced to answer questions about private or previously unseen datasets. GraphRAG is now available on GitHub and offers more structured information retrieval and richer response generation than naive RAG methods. The repository is accompanied by a solution accelerator that allows easy deployment on Azure with no code required.

GraphRAG uses a large language model (LLM) to extract a knowledge graph from text documents. Closely related entities in the graph are grouped into communities, and each community is summarized, which yields a hierarchical summary of the dataset. This approach can answer global questions about the entire dataset, unlike naive RAG methods, which retrieve only a few text chunks and struggle to give comprehensive responses. By drawing on these community summaries, GraphRAG outperforms naive RAG in terms of comprehensiveness and diversity.
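
The flow described above (extract a graph, group it into communities, summarize each community, then answer a global question from the summaries) can be illustrated with a toy Python sketch. This is not the GraphRAG library's API: every function name and the sample data are hypothetical, the triples stand in for what an LLM would extract, connected components stand in for real community detection, and the "summaries" are simple string joins where GraphRAG would call an LLM.

```python
from collections import defaultdict

# Hypothetical (source, relation, target) triples standing in for LLM extraction output.
TRIPLES = [
    ("Alice", "works_at", "Contoso"),
    ("Bob", "works_at", "Contoso"),
    ("Carol", "founded", "Fabrikam"),
    ("Fabrikam", "acquired", "Northwind"),
]


def build_graph(triples):
    """Build an undirected adjacency map from the extracted triples."""
    graph = defaultdict(set)
    for source, _relation, target in triples:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def detect_communities(graph):
    """Group nodes by connected component.

    Assumption: this is a deliberately simple stand-in for a real
    community detection algorithm.
    """
    seen, communities = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        communities.append(sorted(members))
    return communities


def summarize_community(members, triples):
    """Placeholder for an LLM-written summary: list the facts inside the community."""
    facts = [f"{s} {r} {t}" for s, r, t in triples if s in members and t in members]
    return "; ".join(facts)


def answer_global_question(question, summaries):
    """Combine one partial answer per community into a single response."""
    partials = [f"[community {i}] {summary}" for i, summary in enumerate(summaries)]
    return question + "\n" + "\n".join(partials)


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    communities = detect_communities(graph)
    summaries = [summarize_community(c, TRIPLES) for c in communities]
    print(answer_global_question("What are the main themes in this dataset?", summaries))
```

The point of the sketch is the shape of the data flow: a global question is answered from summaries that together cover the whole dataset, rather than from a handful of retrieved text chunks.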

In a recent study comparing GraphRAG to hierarchical summarization of source text, GraphRAG demonstrated superior performance in generating responses to global questions. Results showed that GraphRAG was more comprehensive and diverse while using fewer tokens per query, making it a cost-effective solution for data analysis.

Future research aims to reduce the upfront cost of building the graph index while maintaining response quality, including work on adapting LLM extraction prompts to specific problem domains. The goal is to make graph-based RAG approaches accessible to a wider range of users and use cases that require a global understanding of the data.

The research team invites feedback and suggestions on the code repository and solution accelerator as they continue to develop GraphRAG and enhance the next generation of RAG experiences.

Article Source
https://www.microsoft.com/en-us/research/blog/graphrag-new-tool-for-complex-data-discovery-now-on-github/