Forum Discussion

learnerjane's avatar
learnerjane
Copper Contributor
Jul 15, 2022
Solved

Excel VBA code to break and read text file line by line

 

Hello,

 

I have a requirement. The below is a text file data.

 

In this, I have to ignore the lines starting with HEAD and count the number of lines starting with DET for each data group.

Eg. Count of PersonAddress data group is 2, Count of PersonEmail data group is 2, Count of PersonEmail data group is 2.

 

I require a Excel VBA code to read line by line and fetch the count of each data group ignoring the lines starting with HEAD. Thanks in advance.

 

HEAD|PersonAddress|AddressType|AddressLine1|AddressLine2|AddressLine3|TownOrCity|Region1|Region2|Region3|Country|PostalCode

DET|PersonAddress|HOME|1234 Cabrera Hill Lane|#NULL|#NULL|aabbcc|Fort Bend|TX|#NULL|US|77494

DET|PersonAddress|HOME|1234 Cabrera Hill Lane|#NULL|#NULL|aabbcc|Fort Bend|TX|#NULL|US|77494

HEAD|PersonEmail|EmailType|EmailAddress

DET|PersonEmail|W1|email address removed for privacy reasons

DET|PersonEmail|W1|email address removed for privacy reasons

HEAD|PersonPhone|PhoneType|CountryCodeNumber|AreaCode|PhoneNumber

DET|PersonPhone|HM|1|123|45678

DET|PersonPhone|HM|1|123|45678

 

 

  • learnerjane 

    Here you go.

    Sub ProcessTextFile()
        Dim sFile As String
        Dim f As Integer
        Dim sLines As String
        Dim aLines() As String
        Dim aLine() As String
        Dim i As Long
        Dim n As Long
        Dim sTitle As String
        ' *** Change sFile to the correct path and filename ***
        sFile = "Test.txt"
        f = FreeFile
        Open sFile For Input As #f
        sLines = Input(LOF(f), #f)
        Close #f
        aLines = Split(sLines, vbCrLf)
        For i = 0 To UBound(aLines)
            If aLines(i) <> "" Then
                aLine = Split(aLines(i), "|")
                Select Case aLine(0)
                    Case "HEAD"
                        If n > 0 Then
                            Debug.Print "Group " & sTitle & " has " & n & " items"
                        End If
                        sTitle = aLine(1)
                        n = 0
                    Case "DET"
                        n = n + 1
                End Select
            End If
        Next i
        Debug.Print "Group " & sTitle & " has " & n & " items"
    End Sub
  • learnerjane 

    I realise this is not an answer to the OP but, since HansVogelaar has written a significant block of VBA, I set out to see whether a similar result could be obtained using worksheet formulae.  The worst limitation is that most Lambda functions do not work correctly with 'nested arrays' which is something of a pain when the whole purpose of the calculation is to create an array of arrays.  Nevertheless, in this case

     

    CountByTypeλ = LAMBDA(aLines,
        LET(
            type,     MAP(aLines,IdentifyTypeλ),
            filtered, FILTER(type, ISTEXT(type)),
            distinct, UNIQUE(filtered),
            count,    MAP(distinct, LAMBDA(type, COUNT(IF(filtered = type, 1)))),
            HSTACK(distinct, count)
        )
    );
    
    IdentifyTypeλ = LAMBDA(aLine,
        LET(
            split, IFERROR(TEXTSPLIT(aLine, "|"), "HEAD"),
            IF(INDEX(split, 1, 1) <> "HEAD", INDEX(split, 1, 2))
        )
    );

     

    My thought on this are, whilst Lambda functions to not replace either VBA or DAX, they are capable of making significant inroads for problems of modest dimension.

    Note: the formula requires the text file to be imported to Excel which is not what was requested.  Power query would probably be my tool of choice for the data import.

  • learnerjane 

    Here you go.

    Sub ProcessTextFile()
        Dim sFile As String
        Dim f As Integer
        Dim sLines As String
        Dim aLines() As String
        Dim aLine() As String
        Dim i As Long
        Dim n As Long
        Dim sTitle As String
        ' *** Change sFile to the correct path and filename ***
        sFile = "Test.txt"
        f = FreeFile
        Open sFile For Input As #f
        sLines = Input(LOF(f), #f)
        Close #f
        aLines = Split(sLines, vbCrLf)
        For i = 0 To UBound(aLines)
            If aLines(i) <> "" Then
                aLine = Split(aLines(i), "|")
                Select Case aLine(0)
                    Case "HEAD"
                        If n > 0 Then
                            Debug.Print "Group " & sTitle & " has " & n & " items"
                        End If
                        sTitle = aLine(1)
                        n = 0
                    Case "DET"
                        n = n + 1
                End Select
            End If
        Next i
        Debug.Print "Group " & sTitle & " has " & n & " items"
    End Sub
    • learnerjane's avatar
      learnerjane
      Copper Contributor

      HansVogelaar below is the code which I modified to print the output in excel columns. Tried various things but in vain. Could you please help in getting the output like the one in the attached image? Note that I'm selecting multiple text files to read.

      Current output:

      Issue: The file name in second row should be test-Copy.dat

       

      Desired Output:

      Code:

      FileToOpen = Application.GetOpenFilename _
          (Filefilter:="HDL Files (*.dat), *.dat", Title:="Select HDL Files", MultiSelect:=True)
          
          r = 2
          If IsArray(FileToOpen) Then
              For FileCnt = 1 To UBound(FileToOpen)
                  Filename = fso.getfilename(FileToOpen(FileCnt))
                  f = FreeFile
                  Open FileToOpen(FileCnt) For Input As #f
                  sLines = Input(LOF(f), #f)
                  Close #f
                  
                  If InStr(sLines, vbCrLf) Then
                      delim = vbCrLf
                  ElseIf InStr(sLines, vbLf) Then
                      delim = vbLf
                  End If
                  
                  aLines = Split(sLines, delim)
                  
                  For i = 0 To UBound(aLines)
                      If aLines(i) <> "" Then
                          aLine = Split(aLines(i), "|")
                          Select Case aLine(0)
                              Case "HEAD"
                                  If n > 0 Then
                                      'r = r + 1
                                      Debug.Print "File Count: " & FileCnt
                                      Debug.Print r & "row"
                                      Debug.Print "In " & Filename & ", Group " & sTitle & " has " & n & " items"
                                      With shHDLRecordCountReport
                                          .Range("A" & r).Value = Filename
                                          .Range("B" & r).Value = sTitle
                                          .Range("C" & r).Value = n
                                          .UsedRange.Columns.AutoFit
                                      End With
                                      r = r + 1
                                  End If
                                  sTitle = aLine(1)
                                  n = 0
                              Case "DET"
                                  n = n + 1
                          End Select
                      End If
                  Next i
                  Debug.Print "Bottom"
                  Debug.Print "File Count: " & FileCnt
                  Debug.Print r & "row"
                  Debug.Print "In " & Filename & ", Group " & sTitle & " has " & n & " items"
                  With shHDLRecordCountReport
                      .Range("A" & r).Value = Filename
                      .Range("B" & r).Value = sTitle
                      .Range("C" & r).Value = n
                      .UsedRange.Columns.AutoFit
                  End With
              Next FileCnt
              MsgBox "Report is ready", vbInformation, "Good Job"
          End If

       

      • HansVogelaar's avatar
        HansVogelaar
        MVP

        learnerjane 

        Does this do what you want?

            FileToOpen = Application.GetOpenFilename _
               (Filefilter:="HDL Files (*.dat), *.dat", Title:="Select HDL Files", MultiSelect:=True)
            
            r = 1
            If IsArray(FileToOpen) Then
                For FileCnt = 1 To UBound(FileToOpen)
                    FileName = FSO.GetFileName(FileToOpen(FileCnt))
                    f = FreeFile
                    Open FileToOpen(FileCnt) For Input As #f
                    sLines = Input(LOF(f), #f)
                    Close #f
                    
                    If InStr(sLines, vbCrLf) Then
                        delim = vbCrLf
                    ElseIf InStr(sLines, vbLf) Then
                        delim = vbLf
                    End If
                    
                    aLines = Split(sLines, delim)
                    
                    For i = 0 To UBound(aLines)
                        If aLines(i) <> "" Then
                            aLine = Split(aLines(i), "|")
                            Select Case aLine(0)
                                Case "HEAD"
                                    If n > 0 Then
                                        r = r + 1
                                        Debug.Print "File Count: " & FileCnt
                                        Debug.Print r & "row"
                                        Debug.Print "In " & FileName & ", Group " & sTitle & " has " & n & " items"
                                        With shHDLRecordCountReport
                                            .Range("A" & r).Value = FileName
                                            .Range("B" & r).Value = sTitle
                                            .Range("C" & r).Value = n
                                            .UsedRange.Columns.AutoFit
                                        End With
                                    End If
                                    sTitle = aLine(1)
                                    n = 0
                                Case "DET"
                                    n = n + 1
                            End Select
                        End If
                    Next i
                    r = r + 1
                    Debug.Print "Bottom"
                    Debug.Print "File Count: " & FileCnt
                    Debug.Print r & "row"
                    Debug.Print "In " & FileName & ", Group " & sTitle & " has " & n & " items"
                    With shHDLRecordCountReport
                        .Range("A" & r).Value = FileName
                        .Range("B" & r).Value = sTitle
                        .Range("C" & r).Value = n
                        .UsedRange.Columns.AutoFit
                    End With
                    n = 0
                Next FileCnt
                MsgBox "Report is ready", vbInformation, "Good Job"
            End If

Resources