Blog Post
Enabling SharePoint RAG with LogicApps Workflows
Hi,
Thank you for the write up.
How would you scale this for a SharePoint with about 1 million documents? Would the costs go through the roof if you had to sync them daily or even weekly?
Hi, thanks for reading the blog post and for your question.
The solution above has 2 workflows - Historic and Ongoing.
For the first-time load (e.g. the 1 million document scenario), use the Historic workflow. It employs a sliding time window so that documents are exported in batches rather than all at once, which helps stay within SharePoint throttling limits. You can customise the window increment to your preference (e.g. 1 hr, 8 hrs, 24 hrs). Once the historic export has completed, which may take some time, disable the Historic workflow and enable the Ongoing workflow so that only documents created from then on are exported.
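To illustrate the sliding time window idea, here is a minimal Python sketch. It is not the Logic Apps workflow itself; the function name `sliding_windows` and its parameters are hypothetical and only show how a date range can be split into fixed-size increments that are processed one at a time.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def sliding_windows(
    start: datetime, end: datetime, increment: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (window_start, window_end) pairs covering [start, end).

    Hypothetical helper: each pair stands for one batch of documents,
    e.g. those created within that window.
    """
    if increment <= timedelta(0):
        raise ValueError("increment must be positive")
    cursor = start
    while cursor < end:
        # Clamp the last window so it never runs past the overall end time.
        window_end = min(cursor + increment, end)
        yield cursor, window_end
        cursor = window_end


if __name__ == "__main__":
    # Example: split one day into 8-hour windows.
    for lo, hi in sliding_windows(
        datetime(2024, 1, 1), datetime(2024, 1, 2), timedelta(hours=8)
    ):
        print(f"export documents created in [{lo.isoformat()}, {hi.isoformat()})")
```

A smaller increment means more runs with fewer documents each, which is gentler on throttling limits but takes longer to cover the full history.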
For cost estimation, consider running the workflow on a small set of documents (e.g. 1,000 or 5,000) and extrapolating from that to get indicative costs.
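The extrapolation can be sketched as simple linear scaling. This assumes cost grows roughly in proportion to document count, which ignores fixed costs, tiered pricing, and differences in document size, so treat the result as indicative only. The function name and the sample figures below are hypothetical placeholders, not measured values.

```python
def extrapolate_cost(sample_cost: float, sample_docs: int, total_docs: int) -> float:
    """Scale the cost observed on a sample run linearly to the full document count.

    Assumption: cost is proportional to the number of documents processed.
    """
    if sample_docs <= 0:
        raise ValueError("sample_docs must be positive")
    if total_docs < 0:
        raise ValueError("total_docs must not be negative")
    return sample_cost * total_docs / sample_docs


if __name__ == "__main__":
    # Placeholder inputs: replace with the cost you actually observe
    # for your own sample run in your billing reports.
    observed_cost = 2.50   # hypothetical cost of the sample run
    observed_docs = 5_000  # documents processed in the sample run
    estimate = extrapolate_cost(observed_cost, observed_docs, 1_000_000)
    print(f"Indicative cost for 1,000,000 documents: {estimate:.2f}")
```

Running two samples of different sizes (e.g. 1,000 and 5,000) and comparing the per-document cost is a quick way to check whether the linear assumption holds for your content.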
Since the blog was posted, we have announced a number of features that can enable RAG on SharePoint. Please check them out:
- Use a SharePoint indexer to ingest permission metadata - Azure AI Search | Microsoft Learn
- SharePoint in Microsoft 365 indexer (preview) - Azure AI Search | Microsoft Learn
- Create a SharePoint (Indexed) Knowledge Source - Azure AI Search | Microsoft Learn
- Create a SharePoint (Remote) Knowledge Source - Azure AI Search | Microsoft Learn
- Add SharePoint as a knowledge source - Microsoft Copilot Studio | Microsoft Learn