Forum Discussion
prathyoo
Jan 28, 2024 · Copper Contributor
Bulk updating performance
Hi,
I am currently working on a migration project from SQL Server to SharePoint lists and SharePoint document libraries. The SQL Server stores around 1.29 million files (1.2 TB) as filestream objects, organised in a folder structure of approximately 150 thousand folders.
I was able to extract the filestream objects to an HDD and use the SharePoint Migration Tool to upload the files to SharePoint libraries. This took approximately 10 hours.
Now, my next job is to update a custom attribute on each of the subfolders. For this I have written Python code in a Jupyter notebook. Here is a snippet -
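(Simplified sketch of the loop; the site URL, credentials, and column name below are placeholders rather than the real values, and folder_urls is the list of server-relative folder paths assigned to this notebook.)

```python
from office365.runtime.auth.user_credential import UserCredential
from office365.sharepoint.client_context import ClientContext

# Placeholder site and credentials
ctx = ClientContext("https://contoso.sharepoint.com/sites/migration").with_credentials(
    UserCredential("user@contoso.com", "password")
)

folder_urls: list[str] = []  # server-relative folder paths, ~15,000 per notebook

for folder_url in folder_urls:
    folder = ctx.web.get_folder_by_server_relative_url(folder_url)
    ctxFolderItem = folder.list_item_all_fields
    ctxFolderItem.set_property("CustomAttribute", "new value")  # placeholder column name
    ctxFolderItem.update()
    ctxFolderItem.execute_query()  # one HTTP round trip per folder - this is where the time goes
```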
I have made 10 copies of the Jupyter notebook and split the load so that each notebook updates 15,000 records. The total time per notebook was around 400 minutes, almost all of it spent in the API call ctxFolderItem.execute_query(), which is called once per record update. Is there an API to update records in a single call? The current performance is abysmal, and I need a better approach because my next task is to update custom columns on all 1.29 million files. If this were a migration from one database to another, the entire job would be completed in under 20 minutes. Is there a better approach that I could investigate?
Any feedback is appreciated.
- That's because you're being throttled. Check this article on how to mitigate it:
https://learn.microsoft.com/en-us/sharepoint/dev/general-development/how-to-avoid-getting-throttled-or-blocked-in-sharepoint-online
Also make use of the Graph API to perform this task.
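For example, Graph JSON batching lets you pack up to 20 field updates into a single request. A rough sketch (the site id, list id, column name, and token acquisition are placeholders you would fill in yourself):

```python
import requests

GRAPH = "https://graph.microsoft.com/v1.0"
token = "<access token with Sites.ReadWrite.All>"  # e.g. acquired via MSAL
site_id = "<site-id>"
list_id = "<list-id>"

def batch_update(item_ids, field_values):
    """Update the same custom column on up to 20 list items in one $batch call."""
    batch = {
        "requests": [
            {
                "id": str(i),
                "method": "PATCH",
                "url": f"/sites/{site_id}/lists/{list_id}/items/{item_id}/fields",
                "headers": {"Content-Type": "application/json"},
                "body": field_values,  # e.g. {"CustomAttribute": "new value"}
            }
            for i, item_id in enumerate(item_ids, start=1)
        ]
    }
    resp = requests.post(
        f"{GRAPH}/$batch",
        headers={"Authorization": f"Bearer {token}"},
        json=batch,
    )
    resp.raise_for_status()
    return resp.json()["responses"]  # inspect individual responses for 429s / Retry-After
```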
prathyoo · Copper Contributor
I switched execute_query() to execute_query_with_incremental_retry(), but that doesn't seem to have improved performance. I noticed that the Python library also has an execute_batch() function, but I have not been able to figure out the structure of the list or dict that needs to be the input to the load function preceding the call to execute_batch().
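Looking at the library samples, my current understanding (not verified at scale yet) is that load() may not be needed for updates at all: you just queue a batch of update() calls on the context and then call execute_batch() to flush them in one request, roughly like this (same placeholder context and folder list as above):

```python
from office365.runtime.auth.user_credential import UserCredential
from office365.sharepoint.client_context import ClientContext

ctx = ClientContext("https://contoso.sharepoint.com/sites/migration").with_credentials(
    UserCredential("user@contoso.com", "password")
)
folder_urls: list[str] = []  # server-relative folder paths, as before

BATCH_SIZE = 50  # a guess; would need tuning against throttling limits

for start in range(0, len(folder_urls), BATCH_SIZE):
    for folder_url in folder_urls[start:start + BATCH_SIZE]:
        item = ctx.web.get_folder_by_server_relative_url(folder_url).list_item_all_fields
        item.set_property("CustomAttribute", "new value")  # placeholder column name
        item.update()        # only queues the update on the context, no HTTP call yet
    ctx.execute_batch()      # sends the queued updates as a single $batch request
```

Does that look like the intended usage?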