Feb 08 2023 07:48 AM
I am working with a text file with approximately 1.5 million rows. I am not able to import directly into SSMS due to some errors in the format (unclosed quotation marks), so am trying to use Excel to import into a format that I can save and import to SSMS. Due to the large size, Excel only allows me to use the PowerQuery feature and import the file to a Data Model. How can I save the query/view/data in order to import into SSMS? thank you!
Feb 14 2023 06:11 AM
@MichelleK365 << Excel only allows me to use the PowerQuery… >>
But Excel also includes VBA. So if your file is not larger than 2 GB, you can use VBA code to load the content to Excel (as that seems to be desirable) or just split the content with unbalanced quotes out from the rest into two different files.
Here's a procedure that writes the content to Main and Errors worksheets. If you have more than 1,048,576 good lines (or error lines, I hope not), I included a couple commented-out lines of code to allow you to work with multiple passes of the data:
Sub LoadBigFile()
' This procedure loads lines from a text file to one of two worksheets:
' if the line contains no quotes or an even number of qutoes, it
' is loaded to the main worksheet; if it contains an odd number of
' quotes, it is loaded to the errors worksheet.
Const QUOTE = """"
Dim in2VBFileNum As Integer
Dim strLine As String
Dim in4Line As Long
Dim in4CharPosn As Long
Dim in2Quotes As Integer
'
Dim in4GoodLines As Long
Dim in4Errors As Long
'---- Process the file.
' -- Don't let Excel screen updating slow processing down:
Application.ScreenUpdating = False
' --
in2VBFileNum = FreeFile(0)
Open "F:\Temp\BigFile.txt" For Input As in2VBFileNum
Do Until EOF(in2VBFileNum)
' -- Read a line:
Line Input #in2VBFileNum, strLine
in4Line = in4Line + 1
' --
'If in4Line > 750000 Then Exit Do
'If in4Line <= 750000 Then GoTo NextLine
' -- Examine the content of the line:
If InStr(1, strLine, QUOTE) = 0 Then
'(An efficient way of determining that...)
'The line contains no quotes.
in2Quotes = 0
Else
'Count the quotes:
in2Quotes = 0
For in4CharPosn = 1 To Len(strLine)
If Mid$(strLine, in4CharPosn, 1) = QUOTE Then
in2Quotes = in2Quotes + 1
End If
Next in4CharPosn
End If
' -- Output this line, depending on that count:
If in2Quotes Mod 2 = 1 Then
in4Errors = in4Errors + 1
Debug.Assert in4Errors <= 1048576
With Sheets("Errors")
.Cells(in4Errors, 1) = in4Line
.Cells(in4Errors, 2) = strLine
End With
Else
in4GoodLines = in4GoodLines + 1
Debug.Assert in4GoodLines <= 1048576
With Sheets("Main")
.Cells(in4GoodLines, 1) = in4Line
.Cells(in4GoodLines, 2) = strLine
End With
End If
NextLine:
Loop
Close in2VBFileNum
' -- Restore Excel screen updating:
Application.ScreenUpdating = False
'---- Report results:
Call MsgBox("Processed " & Format$(in4Line, "##,###,##0") & " lines, with " _
& Format$(in4Errors, "##,###,##0") & " unbalanced quotes" _
, vbInformation)
End Sub
(You don't even have to save the result as a macro-enabled workbook; you can save the workbook without the VBA if it runs satisfactorily.)