Excel Import

Copper Contributor

I am working with a text file with approximately 1.5 million rows. I am not able to import directly into SSMS due to some errors in the format (unclosed quotation marks), so am trying to use Excel to import into a format that I can save and import to SSMS. Due to the large size, Excel only allows me to use the PowerQuery feature and import the file to a Data Model. How can I save the query/view/data in order to import into SSMS? thank you!

1 Reply

@MichelleK365 << Excel only allows me to use the PowerQuery… >>

But Excel also includes VBA.  So if your file is not larger than 2 GB, you can use VBA code to load the content to Excel (as that seems to be desirable) or just split the content with unbalanced quotes out from the rest into two different files.

 

Here's a procedure that writes the content to Main and Errors worksheets.  If you have more than 1,048,576 good lines (or error lines, I hope not), I included a couple commented-out lines of code to allow you to work with multiple passes of the data:

Sub LoadBigFile()
'   This procedure loads lines from a text file to one of two worksheets:
'       if the line contains no quotes or an even number of qutoes, it
'       is loaded to the main worksheet; if it contains an odd number of
'       quotes, it is loaded to the errors worksheet.

    Const QUOTE = """"
    
    Dim in2VBFileNum    As Integer
    
    Dim strLine     As String
    Dim in4Line     As Long
    Dim in4CharPosn As Long
    Dim in2Quotes   As Integer
    '
    Dim in4GoodLines    As Long
    Dim in4Errors   As Long
    
    '----   Process the file.
    '  --   Don't let Excel screen updating slow processing down:
    Application.ScreenUpdating = False
    '  --
    in2VBFileNum = FreeFile(0)
    Open "F:\Temp\BigFile.txt" For Input As in2VBFileNum
    Do Until EOF(in2VBFileNum)
        '  --   Read a line:
        Line Input #in2VBFileNum, strLine
        in4Line = in4Line + 1
        '  --
        'If in4Line > 750000 Then Exit Do
        'If in4Line <= 750000 Then GoTo NextLine
        
        '  --   Examine the content of the line:
        If InStr(1, strLine, QUOTE) = 0 Then
            '(An efficient way of determining that...)
            'The line contains no quotes.
            in2Quotes = 0
        Else
            'Count the quotes:
            in2Quotes = 0
            For in4CharPosn = 1 To Len(strLine)
                If Mid$(strLine, in4CharPosn, 1) = QUOTE Then
                    in2Quotes = in2Quotes + 1
                End If
            Next in4CharPosn
        End If
        '  --   Output this line, depending on that count:
        If in2Quotes Mod 2 = 1 Then
            in4Errors = in4Errors + 1
            Debug.Assert in4Errors <= 1048576
            With Sheets("Errors")
                .Cells(in4Errors, 1) = in4Line
                .Cells(in4Errors, 2) = strLine
            End With
        Else
            in4GoodLines = in4GoodLines + 1
            Debug.Assert in4GoodLines <= 1048576
            With Sheets("Main")
                .Cells(in4GoodLines, 1) = in4Line
                .Cells(in4GoodLines, 2) = strLine
            End With
        End If
NextLine:
    Loop
    Close in2VBFileNum
    '  --   Restore Excel screen updating:
    Application.ScreenUpdating = False
    
    '----   Report results:
    Call MsgBox("Processed " & Format$(in4Line, "##,###,##0") & " lines, with " _
            & Format$(in4Errors, "##,###,##0") & " unbalanced quotes" _
            , vbInformation)

End Sub

(You don't even have to save the result as a macro-enabled workbook; you can save the workbook without the VBA if it runs satisfactorily.)