Forum Discussion
learnerjane
Jul 15, 2022Copper Contributor
Excel VBA code to break and read text file line by line
Hello,
I have a requirement. The below is a text file data.
In this, I have to ignore the lines starting with HEAD and count the number of lines starting with DET for each data group.
Eg. Count of PersonAddress data group is 2, Count of PersonEmail data group is 2, Count of PersonEmail data group is 2.
I require a Excel VBA code to read line by line and fetch the count of each data group ignoring the lines starting with HEAD. Thanks in advance.
HEAD|PersonAddress|AddressType|AddressLine1|AddressLine2|AddressLine3|TownOrCity|Region1|Region2|Region3|Country|PostalCode
DET|PersonAddress|HOME|1234 Cabrera Hill Lane|#NULL|#NULL|aabbcc|Fort Bend|TX|#NULL|US|77494
DET|PersonAddress|HOME|1234 Cabrera Hill Lane|#NULL|#NULL|aabbcc|Fort Bend|TX|#NULL|US|77494
HEAD|PersonEmail|EmailType|EmailAddress
DET|PersonEmail|W1|email address removed for privacy reasons
DET|PersonEmail|W1|email address removed for privacy reasons
HEAD|PersonPhone|PhoneType|CountryCodeNumber|AreaCode|PhoneNumber
DET|PersonPhone|HM|1|123|45678
DET|PersonPhone|HM|1|123|45678
Here you go.
Sub ProcessTextFile() Dim sFile As String Dim f As Integer Dim sLines As String Dim aLines() As String Dim aLine() As String Dim i As Long Dim n As Long Dim sTitle As String ' *** Change sFile to the correct path and filename *** sFile = "Test.txt" f = FreeFile Open sFile For Input As #f sLines = Input(LOF(f), #f) Close #f aLines = Split(sLines, vbCrLf) For i = 0 To UBound(aLines) If aLines(i) <> "" Then aLine = Split(aLines(i), "|") Select Case aLine(0) Case "HEAD" If n > 0 Then Debug.Print "Group " & sTitle & " has " & n & " items" End If sTitle = aLine(1) n = 0 Case "DET" n = n + 1 End Select End If Next i Debug.Print "Group " & sTitle & " has " & n & " items" End Sub
- PeterBartholomew1Silver Contributor
I realise this is not an answer to the OP but, since HansVogelaar has written a significant block of VBA, I set out to see whether a similar result could be obtained using worksheet formulae. The worst limitation is that most Lambda functions do not work correctly with 'nested arrays' which is something of a pain when the whole purpose of the calculation is to create an array of arrays. Nevertheless, in this case
CountByTypeλ = LAMBDA(aLines, LET( type, MAP(aLines,IdentifyTypeλ), filtered, FILTER(type, ISTEXT(type)), distinct, UNIQUE(filtered), count, MAP(distinct, LAMBDA(type, COUNT(IF(filtered = type, 1)))), HSTACK(distinct, count) ) ); IdentifyTypeλ = LAMBDA(aLine, LET( split, IFERROR(TEXTSPLIT(aLine, "|"), "HEAD"), IF(INDEX(split, 1, 1) <> "HEAD", INDEX(split, 1, 2)) ) );
My thought on this are, whilst Lambda functions to not replace either VBA or DAX, they are capable of making significant inroads for problems of modest dimension.
Note: the formula requires the text file to be imported to Excel which is not what was requested. Power query would probably be my tool of choice for the data import.
Here you go.
Sub ProcessTextFile() Dim sFile As String Dim f As Integer Dim sLines As String Dim aLines() As String Dim aLine() As String Dim i As Long Dim n As Long Dim sTitle As String ' *** Change sFile to the correct path and filename *** sFile = "Test.txt" f = FreeFile Open sFile For Input As #f sLines = Input(LOF(f), #f) Close #f aLines = Split(sLines, vbCrLf) For i = 0 To UBound(aLines) If aLines(i) <> "" Then aLine = Split(aLines(i), "|") Select Case aLine(0) Case "HEAD" If n > 0 Then Debug.Print "Group " & sTitle & " has " & n & " items" End If sTitle = aLine(1) n = 0 Case "DET" n = n + 1 End Select End If Next i Debug.Print "Group " & sTitle & " has " & n & " items" End Sub
- learnerjaneCopper Contributor
HansVogelaar below is the code which I modified to print the output in excel columns. Tried various things but in vain. Could you please help in getting the output like the one in the attached image? Note that I'm selecting multiple text files to read.
Current output:
Issue: The file name in second row should be test-Copy.dat
Desired Output:
Code:
FileToOpen = Application.GetOpenFilename _ (Filefilter:="HDL Files (*.dat), *.dat", Title:="Select HDL Files", MultiSelect:=True) r = 2 If IsArray(FileToOpen) Then For FileCnt = 1 To UBound(FileToOpen) Filename = fso.getfilename(FileToOpen(FileCnt)) f = FreeFile Open FileToOpen(FileCnt) For Input As #f sLines = Input(LOF(f), #f) Close #f If InStr(sLines, vbCrLf) Then delim = vbCrLf ElseIf InStr(sLines, vbLf) Then delim = vbLf End If aLines = Split(sLines, delim) For i = 0 To UBound(aLines) If aLines(i) <> "" Then aLine = Split(aLines(i), "|") Select Case aLine(0) Case "HEAD" If n > 0 Then 'r = r + 1 Debug.Print "File Count: " & FileCnt Debug.Print r & "row" Debug.Print "In " & Filename & ", Group " & sTitle & " has " & n & " items" With shHDLRecordCountReport .Range("A" & r).Value = Filename .Range("B" & r).Value = sTitle .Range("C" & r).Value = n .UsedRange.Columns.AutoFit End With r = r + 1 End If sTitle = aLine(1) n = 0 Case "DET" n = n + 1 End Select End If Next i Debug.Print "Bottom" Debug.Print "File Count: " & FileCnt Debug.Print r & "row" Debug.Print "In " & Filename & ", Group " & sTitle & " has " & n & " items" With shHDLRecordCountReport .Range("A" & r).Value = Filename .Range("B" & r).Value = sTitle .Range("C" & r).Value = n .UsedRange.Columns.AutoFit End With Next FileCnt MsgBox "Report is ready", vbInformation, "Good Job" End If
Does this do what you want?
FileToOpen = Application.GetOpenFilename _ (Filefilter:="HDL Files (*.dat), *.dat", Title:="Select HDL Files", MultiSelect:=True) r = 1 If IsArray(FileToOpen) Then For FileCnt = 1 To UBound(FileToOpen) FileName = FSO.GetFileName(FileToOpen(FileCnt)) f = FreeFile Open FileToOpen(FileCnt) For Input As #f sLines = Input(LOF(f), #f) Close #f If InStr(sLines, vbCrLf) Then delim = vbCrLf ElseIf InStr(sLines, vbLf) Then delim = vbLf End If aLines = Split(sLines, delim) For i = 0 To UBound(aLines) If aLines(i) <> "" Then aLine = Split(aLines(i), "|") Select Case aLine(0) Case "HEAD" If n > 0 Then r = r + 1 Debug.Print "File Count: " & FileCnt Debug.Print r & "row" Debug.Print "In " & FileName & ", Group " & sTitle & " has " & n & " items" With shHDLRecordCountReport .Range("A" & r).Value = FileName .Range("B" & r).Value = sTitle .Range("C" & r).Value = n .UsedRange.Columns.AutoFit End With End If sTitle = aLine(1) n = 0 Case "DET" n = n + 1 End Select End If Next i r = r + 1 Debug.Print "Bottom" Debug.Print "File Count: " & FileCnt Debug.Print r & "row" Debug.Print "In " & FileName & ", Group " & sTitle & " has " & n & " items" With shHDLRecordCountReport .Range("A" & r).Value = FileName .Range("B" & r).Value = sTitle .Range("C" & r).Value = n .UsedRange.Columns.AutoFit End With n = 0 Next FileCnt MsgBox "Report is ready", vbInformation, "Good Job" End If
- learnerjaneCopper ContributorWorked like a charm Thanks a lot!