Forum Discussion
Sandeeep
Oct 21, 2022Brass Contributor
VBA Optimization help for Large string manipulation related to memory management?
TLDR;
The issue I need to solve.
How do I optimize the code, so the runtime for each 250 is approximately the same?
i.e, about say 15 secs? consistently from 0 - 35,000 conversations.
and even if I re-run the same macro, it'll take about the same time, and not keep increasing as the code keep running.
What is my code about?
As a teacher (I am not but as an example), I have say 100 students, whom I have told to provide screenplay writing, which I have in each cell.
So for example in cell A1, the data is
Mary: good morning
John: Hello!, nice to meet you
Mary: The weather is great today, do you want to go to the movies and hang out?
John: Yes, I would like that.
and cell a2, the data is another large screenwriting play.
Now my macro creates 12 columns, each with unique useful info, like what are all the stuff Mary said, who spoke first, how many time did each person speak, what happened at the end, the last three sentences used, etc etc.
With the help online, and by asking questions here, with the help of you guys. I finally did it!!! Yay!.
The macro now runs for 25,000 such plays, gathering a frick-ton of information for my work.
What I did and observed with my VBA Code,
Now is the runtime problem, and I assume memory management problem.
So this code works great for about 500-1000 conversations.
but it started taking SUPER long.
I thus wrote a timer. (sample code)
Dim StartTime As Double
Dim LapTime As Double
Dim PrevTimer As Double
Cells(2, 9).Value2 = Beginrow & " - " & Lastrow & " is Done - " & Format((Timer - StartTime) / 86400, "hh:mm:ss")
This helped me know the time difference between each set of conversations.
What I did was make a for loop, and take 250 conversations at a time.
(sample code)
While Beginrow < Total_Conv_Count
If Beginrow + 250 > Total_Conv_Count Then
Lastrow = Total_Conv_Count
Else
Lastrow = Beginrow + 250
End If
Call My_CustomScript(Beginrow, Lastrow)
Cells(2, 9).Value2 = Beginrow & " - " & Lastrow & " is Done - " & Format((Timer - StartTime) / 86400, "hh:mm:ss")
Wend
I noticed when I first boot my Laptop.
For 250 conversations, it takes 11 secs
For 2,000 conversations, it takes 1.30 secs
Kay. I stopped it (ESC key), and made a change to VBA, etc etc.
Ran the VBA code again.
For 250 conversations, it takes 32 secs
For 2,000 conversations, it takes 3.24 secs
At around 8,000 conversations on my first run, it started taking 2 mins for every 250 conversations.
At around 9,000 conversations, the same run, the VBA code started taking 5 mins for every 250 conversations.
Great, I needed to Optimize.
So I went to this website https://vbacompiler.com/optimize-vba-code/
As well as googled a lot. I did what they all recommended, which is basically the same.
Now I use the status bar updating, cause I disabled everything else.
Application.StatusBar = Beginrow & " - " & Lastrow & " is Done - Time Elapsed: " & Format((Timer - StartTime) / 86400, "hh:mm:ss")
Cool, it runs faster now (or so I thought)
Cause I had just rebooted and restarted my laptop.
I find, that although faster, it still increases in time.
So the first run is faster than the third (re-running the same macro)
and the first 1,000 conversations is faster than the next subsequent 1,000
The times I recorded are, in seconds (for each 250)
11 - 13 - 15 - 16 - 17 - ............ - (at 2,500) 1min - 1.30min - 2min .......
Sigh, no progress. idky why.
I also tried
Set Search_Range = Nothing
Erase data_array
at the end of each private subroutine, for each 250, once done. I will clear the memory. No luck.
The issue I need to solve.
How do I optimize the code, so the runtime for each 250 is approximately the same?
i.e, about say 15 secs? consistently from 0 - 35,000 conversations.
and even if I re-run the same macro, it'll take about the same time, and not keep increasing as the code keep running.
Workaround solution I am currently using.
Split the 30,000 conversations into 5,000 each, thus making 6 sheets in the same workbook.
Run the macro for the first sheet.
Then reboot the Laptop, open the Excel workbook again, and do the second sheet - Repeat.
Finally, once all is done, copy and paste all 6 sheets into a single sheet by manually appending the data.
Clearly, I want to avoid this workaround.
- SandeeepBrass Contributor
I'm adding the code. (but removing almost every sensitive data I can)
Also, a related question of mine here, solve my Hans (file share was also something he made)
Sub SomethingCode() ' Main MasterCode Macro ' Obtains ' Main Macro ' ' Keyboard Shortcut: Ctrl+Shift+E ' ' SomeCode for Data Extraction .... with Colors and everything! Yay! ActiveSheet.Range("A1").Select Dim Beginrow As Integer Dim Lastrow As Integer Dim StartTime As Double Dim MinutesElapsed As String StartTime = Timer Total_Conv_Count = Cells(Rows.Count, "A").End(xlUp).Row - 1 Beginrow = 1 Columns("I:I").ColumnWidth = 28 Cells(2, 9).Value2 = "First 250 Starting" Cells(3, 9).Value2 = "100 Conversations Takes 1min" Call SomethingCode_Headers While Beginrow < Total_Conv_Count If Beginrow + 250 > Total_Conv_Count Then Lastrow = Total_Conv_Count Else Lastrow = Beginrow + 250 End If Call SomethingCode_Script(Beginrow, Lastrow) ' SubRoutine for the Script 'ThisWorkbook.Save Cells(2, 9).Value2 = Beginrow & " - " & Lastrow & " is Done!" Beginrow = Lastrow + 1 Cells(4, 9) = "Time Elapsed: " & Format((Timer - StartTime) / 86400, "hh:mm:ss") Wend Cells(3, 9).Value2 = "Now Coloring and Beautifying" Call ColorFormatting ' Subroutine for Making the whole thing Beautiful Cells(5, 9) = "Time Completed: " & Format((Timer - StartTime) / 86400, "hh:mm:ss") ActiveSheet.Range("I2").Select End Sub
Then I call the Sub-routine Private SomethingCode()
(Ton of info is removed, I have used ?????????? to mask stuff)
Sub SomethingCode_Script(Beginrow As Integer, Lastrow As Integer) ' ' Obtains Data Extracted ' Main ' ' ' ActiveSheet.Range("A1").Select Dim midText() As String Lastrow = Cells(Rows.Count, "A").End(xlUp).Row - 1 Set Search_Range = Range(Range("H1").Offset(Beginrow, 0), Range("H1").Offset(Lastrow, 0)) 'Set idk_Range = Range(Range("AA3"), Range("AA3").End(xlDown)) ' Has to be after the cells are populated ' Code Block for Step Counter of Conversations Count - Column JKL For Each cell In Search_Range cell(1, 3).Value2 = UBound(Split(cell, "||")) + 1 ' - Column J cell(1, 4).Value2 = UBound(Split(cell, "||?????????")) + 1 ' - Column K cell(1, 5).Value2 = cell(1, 3).Value2 - cell(1, 4).Value2 ' - Column L ' Empty the cells if we re-run the same Macro If IsEmpty(cell(1, 3).Value2) = False Then ' If IsEmpty(cell(1, 19).Value2) = False Or IsEmpty(cell(1, 18).Value2) = False Then cell(1, 8).Clear cell(1, 13).Clear cell(1, 15).Clear cell(1, 18).Clear cell(1, 19).Clear End If ' Code Block for ????????????? - Column R If InStr(1, cell, "?????????????") <> 0 Then cell(1, 11) = "!!!!!!!!!!!!!!!!!!" ElseIf InStr(1, cell, "@@@@@@@@@@@@@@@@@") <> 0 Then cell(1, 11) = "###########" ElseIf InStr(1, cell, "$$$$$$$$$$$$$$$$$$") <> 0 _ Or InStr(1, cell, "^^^^^^^^^^^^^") <> 0 Then cell(1, 11) = "(((((((((((((((" Else cell(1, 11) = "%&%$%^$^%$%^&^*&^(&" End If Next cell ' Code Block for Iterating over every Sentence in a conversation. For Each cell In Search_Range midText = Split(cell, "||") Counter = 0 SnippetStart = 0 ' Code Block for First Sentence via delimiter. - Column NYZ For i = LBound(midText) To UBound(midText) If InStr(1, midText(i), "?????????????") > 0 Then cell(1, 19).Value2 = cell(1, 19).Value2 & ":" & midText(i) & vbLf ' ????????????? Else cell(1, 18).Value2 = cell(1, 18).Value2 & ":" & midText(i) & vbLf ' All ????????????? If Counter = 0 Then ' If Block for First sentence cell(1, 7) = midText(i) Counter = 1 End If End If ' Find ????????????? - Columns TV If InStr(1, midText(i), "WHAT!", vbTextCompare) > 0 Then ' what came After sentence to WHAT! cell(1, 13).Value2 = cell(1, 13).Value2 & midText(i - 1) & vbLf ElseIf InStr(1, midText(I), "????????????? Ton of ElseIFs for different statements removed") cell(1, 15).Value2 = cell(1, 15).Value2 & midText(i - 1) & vbLf ' previous sentence End If Next ' Code Block ????????????? - Columns UW If UBound(Split(cell(1, 13).Value2, vbLf)) > 0 Then cell(1, 14) = UBound(Split(cell(1, 13).Value2, vbLf)) Else cell(1, 14).Clear End If If UBound(Split(cell(1, 15).Value2, vbLf)) > 0 Then cell(1, 16) = UBound(Split(cell(1, 15).Value2, vbLf)) Else cell(1, 16).Clear End If ' Code Block for Last 3 sentences-ish kinda ????????????? - Column OPQ If InStr(1, cell, "?????????????") = 0 Then For i = UBound(midText) To LBound(midText) Step -1 ' Reading the Array backwards If InStr(1, midText(i), "?????????????") = 0 Then If InStr(1, midText(i - 1), "?????????????") > 0 Then ' Checking if the word ?????????????is there, if not, it is ????????????? SnippetStart = i - 1 ' Then this is where the snippet starts Exit For ' So exit the for loop, we have the starting point End If End If Next Else For i = UBound(midText) To LBound(midText) Step -1 ' Reading the Array backwards If InStr(1, midText(i), "?????????????") > 0 Then For j = i To LBound(midText) Step -1 ' Reading the Array backwards ????????????? If InStr(1, midText(j), "?????????????") = 0 Then If InStr(1, midText(j - 1), "?????????????") > 0 Then ' ????????????? SnippetStart = j - 1 ' Then this is where the snippet starts Exit For ' So exit the for loop, we have the starting point End If End If Next End If Next End If For i = SnippetStart To UBound(midText) cell(1, 8).Value2 = cell(1, 8).Value2 & midText(i) & vbLf ' Final Snippet cell(1, 9).Value2 = midText(SnippetStart) ' ????????????? what caused the ending cell(1, 10).Value2 = midText(SnippetStart + 1) ' ????????????? Response Next ' Code Block for Date & Week Manipulation for student submission '(not yet added) ' End the Iteration Next cell ' Memory Management Set Search_Range = Nothing Erase midText End Sub
ColorFormatting subroutine is fine, no problems there.
and SomethingCode_Headers() is just a subroutine to add text to row1
Private Sub SomethingCode_Headers() ' Code Block for Header Titles Range("I1").Value2 = "null - 2 Progress Bar" ' Tons of others in the same way here, till cell AH1 End sub
Apologizes for most stuff being removed. Hope this gives an idea-ish?
- JKPieterseSilver ContributorIf you want us to comment on the performance of your code we need, well, your code. And some nonsense but relevant sample data.
- SandeeepBrass ContributorKay, kinda sent it.
I have to remove tons of stuff tho. But should give a rough idea.- JKPieterseSilver Contributor
If you need this speeded up, the best thing to do is process everything in memory first, by using arrays. Then after it has all been worked through, push out the entire array to the worksheet in a single line of code. This may be orders of magnitudes faster that what you currently have.
But a very simple first step would be to turn off screen-updating and calculation prior to your loop and turn those back on afterwards:
'...Code Application.ScreenUpdating = False Application.Calculation = xlCalculationManual 'Loop here Application.ScreenUpdating = True Application.Calculation = xlCalculationAutomatic 'Some more code