We're excited to announce the public preview of Azure Logic Apps as a document indexer for Azure Cosmos DB!
With this release, you can now use Logic Apps connectors and templates to ingest documents directly into Cosmos DB's vector store, powering AI workloads like Retrieval-Augmented Generation (RAG) with ease.
This new capability orchestrates the full ingestion pipeline, from fetching documents to parsing, chunking, embedding, and indexing, so you can unlock insights from unstructured content across your enterprise systems.
Check out the announcement from the Azure Cosmos DB team about this capability!
How It Works
Here's how Logic Apps powers the ingestion flow:
Connect to Source Systems
Logic Apps has more than 1,400 prebuilt connectors for pulling documents from various systems. This experience streamlines the whole process with out-of-the-box templates that pull data from sources like Azure Blob Storage.
Parse and Chunk Documents
AI-powered parsing actions extract raw text. Then, the Chunk Document action:
- Tokenizes content into language model-friendly units
- Splits it into semantically meaningful chunks
This ensures optimal size and quality for embedding and retrieval.
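To make the chunking idea concrete, here is a minimal Python sketch of token-window chunking with overlap. It is an illustration only, not the Chunk Document action's actual algorithm: whitespace splitting stands in for a real language-model tokenizer, and the function name `chunk_tokens` and its `max_tokens` and `overlap` parameters are hypothetical.

```python
from typing import List


def chunk_tokens(text: str, max_tokens: int = 8, overlap: int = 2) -> List[str]:
    """Split text into overlapping windows of at most max_tokens tokens.

    Assumption: whitespace-separated words approximate tokens. A real
    pipeline would use the tokenizer that matches the embedding model.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in the range [0, max_tokens)")

    tokens = text.split()
    step = max_tokens - overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        # Stop once a window has reached the end of the token list,
        # so no trailing chunk made only of overlap is emitted.
        if start + max_tokens >= len(tokens):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_tokens("a b c d e f g h i j", max_tokens=4, overlap=1):
        print(chunk)
```

The overlap repeats the last token or tokens of one chunk at the start of the next, which helps keep context that would otherwise be cut at a chunk boundary.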
Generate Embeddings with Azure OpenAI
The chunks are passed to Azure OpenAI via connector to generate embeddings (e.g., using text-embedding-3-small). These vectors capture the meaning of your content for precise semantic search.
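As a rough illustration of how embedding vectors support semantic search, the sketch below ranks documents by cosine similarity to a query vector. The tiny three-dimensional vectors are made-up toy data, not real model output, and the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical. A vector store performs this kind of comparison for you at scale.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: Sequence[float],
                       docs: Dict[str, Sequence[float]]) -> List[str]:
    """Return document ids ordered from most to least similar to the query."""
    return sorted(docs, key=lambda doc_id: cosine_similarity(query, docs[doc_id]),
                  reverse=True)


if __name__ == "__main__":
    # Toy vectors for illustration only; real embeddings have far more dimensions.
    docs = {
        "invoice": [0.9, 0.1, 0.0],
        "contract": [0.1, 0.9, 0.0],
    }
    print(rank_by_similarity([1.0, 0.0, 0.0], docs))
```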
Write to Azure Cosmos DB Vector Store
Embeddings and metadata (like title, tags, and timestamps) are indexed in Cosmos DB's vector store, using a schema optimized for filtering, semantic ranking, and retrieval.
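For a sense of what one indexed record might look like, here is a hedged Python sketch that assembles a chunk, its embedding, and its metadata into a single JSON-serializable item. The field names (`sourceUri`, `chunkIndex`, `embedding`, `indexedAt`, and so on) and the `build_chunk_item` helper are assumptions for illustration. The schema the templates actually write may differ.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def build_chunk_item(source_uri: str, chunk_index: int, text: str,
                     embedding: List[float], title: str, tags: List[str],
                     indexed_at: Optional[datetime] = None) -> Dict[str, Any]:
    """Build one item holding a chunk, its vector, and filterable metadata.

    All field names here are hypothetical, not the templates' real schema.
    """
    # Deterministic id: re-indexing the same chunk overwrites it
    # instead of creating a duplicate.
    item_id = hashlib.sha256(f"{source_uri}#{chunk_index}".encode("utf-8")).hexdigest()
    timestamp = (indexed_at or datetime.now(timezone.utc)).isoformat()
    return {
        "id": item_id,
        "sourceUri": source_uri,
        "chunkIndex": chunk_index,
        "text": text,
        "embedding": list(embedding),
        "title": title,
        "tags": list(tags),
        "indexedAt": timestamp,
    }


if __name__ == "__main__":
    item = build_chunk_item(
        source_uri="https://example.invalid/docs/handbook.pdf",  # placeholder URL
        chunk_index=0,
        text="Employees accrue leave monthly.",
        embedding=[0.12, -0.03, 0.57],  # toy vector, not real model output
        title="Handbook",
        tags=["hr", "policy"],
    )
    print(json.dumps(item, indent=2))
```

If you were writing such items yourself with the azure-cosmos Python SDK rather than through Logic Apps, a call like `container.upsert_item(item)` would be one way to store them, assuming the container is configured with a vector policy on the embedding path.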
Logic Apps Templates: Fast Start, Full Flexibility
We've created ready-to-use templates to help you get started fast:
- Blob Storage: Simple Text Parsing
- Blob Storage: OCR with Azure Document Intelligence
- SharePoint: Simple Text Parsing
- SharePoint: OCR with Azure Document Intelligence
Each template is customizable, so you can adapt it to your business needs or extend it with additional steps.
We'd Love Your Feedback
We're just getting started, and we're building this with you.
Tell us:
- What data sources should we support next?
- Are there specific formats or verticals you need (e.g., legal docs, invoices, contracts)?
- What enhancements would make ingestion even easier?
Reply to this post or share feedback through this form. Your input shapes the future of AI-powered document indexing in Cosmos DB.