Yesterday I had a discussion with a customer and realized that the BCP, TDS pattern could use a bit of documentation.
Customer Scenario/Question:I am using BCP.exe to import a large data set. BCP.exe takes 15 minutes to complete but I only see a few seconds for the BULK INSERT command when monitoring dm_exec_requests. What is taking so long if the BULK INSERT is only showing a few seconds in dm_exec_requests?
The customer was correct in her observation. The BCP.exe takes 15 minutes and dm_exec_requests only shows a few seconds for the BULK INSERT command. What she overlooked was the repeated nature of the BULK INSERT command.
The BCP.exe client honors the batch size and each batch submits the internal command ‘INSERT BULK’, which you can see in a trace of your SQL Server and shows up as BULK INSERT in the CMD column of dm_exec_requests. The pattern of activity is the following:
Begin Transaction Insert Bulk Stream the Rows Commit Transaction
— Repeat for all rows/batches
BCP.exe submits commands over the TDS (Tabular Data Stream) connection to the SQL Server. Using XEvent you can trace the various events (Connect, Disconnect, Transaction and Batch) and observe the behavior.
What the customer was observing is the BULK INSERT command in dm_exec_requests. BULK INSERT only lasts for a second or so as the rows are streamed. The session transitions to commit transaction and as the next batch is submitted the BULK INSERT returns to dm_exec_requests for the same session.
Leveraging the BCP, batch size parameter changes the number of INSERT BULK and Commit Transaction pairs. For example, placing all rows in the same batch yields the behavior the customer is expecting, showing a 15 minute BULK INSERT in dm_exec_requests.
Use caution when adjusting the BCP batch size. For example, the appropriate locks are held with each Insert Bulk transaction and can impact the concurrency.