graphrag
4 TopicsGraphRAG and PostgreSQL integration in docker with Cypher query and AI agents (Version 2*)
This is update from previous blog (version 1): GraphRAG and PostgreSQL integration in docker with Cypher query and AI agents | Microsoft Community Hub Review the business needs of this solution from version 1 What's new in version 2? MCP tools for GraphRAG and PostgreSQL with Apache AGE This solution now includes MCP tools for GraphRAG and PostgreSQL. There are five MCP tools exposed: [graphrag_search] Used to run query (local or global) with runtime-tunable API parameters. One important aspect is that query behavior can be tuned at runtime, without changing the underlying index. [age_get_schema_cached] Used for schema inspection and diagnostics. It returns the graph schema (node labels and relationship types) from cache by default; and can optionally refresh the cache by re‑querying the database. This tool is typically used for introspection or debugging, not for answering user questions about data. [age_entity_lookup] Used for quick entity discovery and disambiguation. It performs a simple substring match on entity names or titles and is especially useful for questions like “Who is X?” or as a preliminary step before issuing more complex graph queries. [age_cypher_query] Executes a user‑provided Cypher query directly against the AGE graph. This is intended for advanced users who already know the graph structure and want full control over traversal logic and filters. [age_nl2cypher_query] Bridges natural language and Cypher. This tool converts a natural‑language question into a Cypher query (using only Entity nodes and RELATED_TO edges), executes it, and returns the results. It is most effective for multi‑hop or structurally complex questions where semantic interpretation is needed first, but execution must remain deterministic. Besides that, This solution now uses Microsoft agent framework. It enables clean orchestration over MCP tools, allowing the agent to dynamically select between GraphRAG and graph query capabilities at runtime, with a looser coupling and clearer execution model than traditional Semantic Kernel function plugins. The new Docker image includes graphRAG3.0.5. This version stabilizes the 3.x configuration‑driven, API‑based architecture and improves indexing reliability, making graph construction more predictable and easier to integrate into real workflows. New architecture Updated Step 7 - run query in Jupyter notebook This step runs Jupyter notebook in docker, which is the same as stated in previous blog. > docker compose up query-notebook After clicking the link highlighted in the above screen shot, you can explore all files within the project in the docker, then find the query-notebook.ipynb. https://github.com/Azure-Samples/postgreSQL-graphRAG-docker/blob/main/project_folder/query-notebook.ipynb But in this new version of notebook, the graphRAG3.0.5 uses different library for local Search and global Search. New Step 8 - run agent and MCP tools in Jupyter notebook This step runs Jupyter notebook in docker. > docker compose up mcp-agent Click on the highlighted URL, you can start working on agent-notebook.ipynb. https://github.com/Azure-Samples/postgreSQL-graphRAG-docker/blob/main/project_folder/agent-notebook.... Multiple scenarios of agents with MCP tools are included in the notebook: GraphRAG search: local search and global search examples with direct mcp call. GraphRAG search: local search and global search examples with agent and include mcp tools. Cypher query in direct mcp call. Agent to query in natural language, and mcp tool included to convert NL2Cypher. Agent with unified mcp (all five mcp tools), and based on the question route to the corresponding tool. ['graphrag_search', 'age_get_schema_cached', 'age_cypher_query', 'age_entity_lookup', 'age_nl2cypher_query'] Router agent: selecting the right MCP tool The notebook also includes a router agent that has access to all five MCP tools and decides which one to invoke based on the user’s question. Rather than hard‑coding execution paths, the agent reasons about intent and selects the most appropriate capability at runtime. General routing guidance used in this solution Use [graphrag_search] when the question requires: full dataset understanding, themes, patterns, or trends across documents, exploratory or open‑ended analysis, global understanding or evaluation where we have a corpus of many tokens. In these cases, GraphRAG’s semantic retrieval and aggregation are a better fit than explicit graph traversal. Use AGE‑based tools [age_get_schema_cached, age_entity_lookup, age_cypher_query, age_nl2cypher_query] when the question involves: specific entities or explicit relationships, deterministic graph traversal or filtering, questions that depend on graph structure rather than document semantics, complex graph queries involving multiple entities or multi‑hop paths. Within the AGE toolset: [age_entity_lookup] is typically used for quick entity discovery or disambiguation. [age_cypher_query] is used when a precise Cypher query is already known. [age_nl2cypher_query] is used when the question is expressed in natural language but requires a non‑trivial Cypher query to answer. [age_get_schema_cached] is reserved for schema inspection and diagnostics. The router agent dynamically selects between semantic search and deterministic graph tools based on question intent, keeping retrieval, graph execution, and orchestration clearly separated and extensible. Note: The repository also includes [age_get_schema] and [age_get_schema_details] MCP tools for debugging and development purposes. These are not exposed to agents by default and are superseded by [age_get_schema_cached] for normal use. Key takeaways GraphRAG and postgreSQL AGE querying serve different purposes and each has its advantages. MCP tools provide a uniform interface to both semantic search and deterministic graph operations. Microsoft Agent Framework enables tool‑centric orchestration, where agents select the right capability at runtime instead of hard‑coding logic in prompts. The Jupyter‑based agent workflow makes it easy to experiment with different interaction patterns, from direct tool calls to fully routed agent execution. What's next In this solution, the MCP server and agent runtime are architecturally separated but deployed together in a single Docker container to demonstrate how MCP tools work and to keep local experimentation simple. There are other deployment options, such as running MCP servers remotely, where tools can be hosted and operated independently of the agent runtime. Contributions and enhancements are welcome.240Views1like0CommentsGeneral Availability of Graph Database Support in Azure Database for PostgreSQL
We are excited to announce the general availability of the Apache AGE extension for Azure Database for PostgreSQL! This marks a significant milestone in empowering developers and businesses to harness the potential of graph data directly within their PostgreSQL environments, offering fully managed graph database service. Unlocking Graph Data Capabilities Apache AGE (A Graph Extension) is a powerful PostgreSQL extension. It allows users to store and query graph data within Postgres seamlessly, enabling advanced insights through intuitive graph database queries via the openCypher query language. Graph data is instrumental in applications such as social networks, recommendation systems, fraud detection, network analysis, and knowledge graphs. By integrating Apache AGE into Azure Database for PostgreSQL, developers can now benefit from a unified platform that supports both relational and graph data models, unlocking deeper insights and streamlining data workflows. Benefits of Using Apache AGE in Azure Database for PostgreSQL The integration of Apache AGE (AGE) in Azure Database for PostgreSQL brings numerous benefits to developers and businesses looking to leverage graph processing capabilities: Enterprise-grade Managed Graph Database Service: AGE in Azure Database for PostgreSQL provides a fully managed graph database solution, eliminating infrastructure management while delivering built-in security, updates, and high availability. Simplified Data Management: AGE's ability to integrate graph and relational data simplifies data management tasks, reducing the need for separate graph database solutions. Enhanced Data Analysis: With AGE, you can perform complex graph analyses directly within your PostgreSQL database, gaining deeper insights into relationships and patterns in your data. Cost Efficiency: By utilizing AGE within Azure Database for PostgreSQL, you can consolidate your database infrastructure, lowering overall costs and reducing the complexity of your data architecture. Security and Compliance: Leverage Azure's industry-leading security and compliance features, ensuring your graph data is protected and meets regulatory requirements. Index Support: Index graph properties with BTREE and GIN indexes. Real-World Applications Apache AGE opens up a range of possibilities for graph-powered applications. Here are just a few examples: Social Networks: Model and analyze complex relationships, such as user connections and interactions. Fraud Detection: Identify suspicious patterns and connections in financial transactions. Recommendation Systems: Leverage graph data to deliver personalized product or content recommendations. Knowledge Graphs: Structure facts and concepts as nodes and relationships, enabling AI-driven search and data discovery. In the following example, we need to provide Procurement with an updated status of all statements of work (SOW) by vendor, including their invoice status. With AGE and Postgres, this once complex task becomes quite simple. We’ll start by creating the empty graph. SELECT ag_catalog.create_graph('vendor_graph'); Then, we’ll create all the ‘vendor’ nodes from the vendors table. SELECT * FROM ag_catalog.cypher( 'vendor_graph', $$ UNWIND $rows AS v CREATE (:vendor { id: v.id, name: v.name }) $$, ARRAY( SELECT jsonb_build_object('id', id, 'name', name) FROM vendors ) ); Next, we’ll create all the ‘sow’ nodes. SELECT * FROM ag_catalog.cypher( 'vendor_graph', $$ UNWIND $rows AS s CREATE (:sow { id: s.id, number: s.number }) $$, ARRAY( SELECT jsonb_build_object('id', id, 'number', number) FROM sows ) ); Then, we’ll create the ‘has_invoices’ relationships (edges). SELECT * FROM ag_catalog.cypher( 'vendor_graph', $$ UNWIND $rows AS r MATCH (v:vendor { id: r.vendor_id }) MATCH (s:sow { id: r.sow_id }) CREATE (v)-[:has_invoices { payment_status: r.payment_status, amount: r.invoice_amount }]->(s) $$, ARRAY( SELECT jsonb_build_object( 'vendor_id', vendor_id, 'sow_id', sow_id, 'payment_status', payment_status, 'invoice_amount', amount ) FROM invoices ) ); Now that we’ve completed these steps, we have a fully populated vendor_graph with vendor nodes, sow nodes, and has_invoices edges with the invoice attributes. We’re ready to query the graph to start our report for Procurement. SELECT * FROM ag_catalog.cypher('vendor_graph' , $$ MATCH (v:vendor)-[rel:has_invoices]->(s:sow) RETURN v.id AS vendor_id, v.name AS vendor_name, s.id AS sow_id, s.number AS sow_number, rel.payment_status AS payment_status, rel.amount AS invoice_amount $$) AS graph_query(vendor_id BIGINT, vendor_name TEXT, sow_id BIGINT, sow_number TEXT, payment_status TEXT, invoice_amount FLOAT); This statement invokes Apache AGE’s Cypher engine that treats our graph as a relational table: ag_catalog.cypher('vendor_graph', $$ … $$) executes the Cypher query against the graph named “vendor_graph.” The inner Cypher fragment, MATCH (v:vendor)-[rel:has_invoices]->(s:sow) RETURN v.id AS vendor_id, v.name AS vendor_name, s.id AS sow_id, s.number AS sow_number, rel.payment_status AS payment_status, rel.amount AS invoice_amount finds every vendor node with outgoing has_invoices edges to SOW nodes projects each vendor’s ID/name, the target sow’s ID/number, and invoice attributes. Wrapping that in … ) AS graph_query( vendor_id BIGINT, vendor_name TEXT, sow_id BIGINT, sow_number TEXT, payment_status TEXT, invoice_amount FLOAT ); tells PostgreSQL how to map each returned column into a regular SQL result set with proper types. The result? You get a standard table of rows—one per invoice edge—with those six columns populated and ready for further SQL joins, filters, aggregates, etc. Performance notes for this example: AGE will scan all “vendor–has_invoices–sow” paths in the graph. If the graph is large, consider an index on the vendor or sow label properties or filter by additional predicates. You can also push WHERE clauses into the Cypher fragment for more selective matching. Scaling to Large Graphs with AGE The Apache AGE extension in Azure Database for PostgreSQL enables seamless scaling to large graphs. Indexing plays a pivotal role in enhancing query performance, particularly for complex graph analyses. Effective Indexing Strategies To optimize graph queries, particularly those involving joins or range queries, implementing the following indexes is recommended: BTREE Index: Ideal for exact matches and range queries. For vertex tables, create an index on the unique identifier column (e.g., id). CREATE INDEX ON graph_name."VLABEL" USING BTREE (id); GIN Index: Designed for efficient searches within JSON fields, such as the properties column in vertex tables. CREATE INDEX ON graph_name."VLABEL" USING GIN (properties); Edge Table Indexes: For relationship traversal, use BTREE indexes on start_id and end_id columns. CREATE INDEX ON graph_name."ELABEL" USING BTREE (start_id); CREATE INDEX ON graph_name."ELABEL" USING BTREE (end_id); Example: Targeted Key-Value Indexing For targeted queries that focus on specific attributes within the JSON field, a smaller BTREE index can be created for precise filtering. CREATE INDEX ON graph_name.label_name USING BTREE (agtype_access_operator(VARIADIC ARRAY[properties, '"KeyName"'::agtype])); Using these indexing strategies ensures efficient query execution, even when scaling large graphs. Additionally, leveraging the EXPLAIN command helps validate index utilization and optimize query plans for production workloads. How to Get Started Enabling Apache AGE in Azure Database for PostgreSQL is simple: 1. Update Server Parameters Within the Azure Portal, navigate to the PostgreSQL Flexible Server instance and select the Server Parameters option. Adjust the following settings: azure.extensions: In the parameter filter, search for and enable AGE among the available extensions. shared_preload_libraries: In the parameter filter, search for and enable AGE. Click Save to apply these changes. The server will restart automatically to activate the AGE extension. Note: Failure to enable the shared_preload_libraries will result in the following error when you first attempt to use the AGE schema in a query. “ERROR: unhandled cypher(cstring) function call error on first cypher query” 2. Enable AGE Within PostgreSQL Once the server restart is complete, connect to the PostgreSQL instance using the psql interpreter. Execute the following command to enable AGE: CREATE EXTENSION IF NOT EXISTS AGE CASCADE; 3. Configure Schema Paths AGE adds a schema called ag_catalog, which is essential for handling graph data. Ensure this schema is included in the search path by executing: SET search_path=ag_catalog,"$user",public; That’s it! You’re ready to create your first graph within PostgreSQL on Azure. Ready to dive in? Experience the power of graph data with Apache AGE on Azure Database for PostgreSQL. Visit AGE on Azure Database for PostgreSQL Overview for more details, and explore how this extension can transform your data analysis and application development. Get started for free with an Azure free account1.1KViews2likes6Comments