Oct 21 2022 05:13 AM
TLDR;
The issue I need to solve.
How do I optimize the code, so the runtime for each 250 is approximately the same?
i.e, about say 15 secs? consistently from 0 - 35,000 conversations.
and even if I re-run the same macro, it'll take about the same time, and not keep increasing as the code keep running.
What is my code about?
As a teacher (I am not but as an example), I have say 100 students, whom I have told to provide screenplay writing, which I have in each cell.
So for example in cell A1, the data is
Mary: good morning
John: Hello!, nice to meet you
Mary: The weather is great today, do you want to go to the movies and hang out?
John: Yes, I would like that.
and cell a2, the data is another large screenwriting play.
Now my macro creates 12 columns, each with unique useful info, like what are all the stuff Mary said, who spoke first, how many time did each person speak, what happened at the end, the last three sentences used, etc etc.
With the help online, and by asking questions here, with the help of you guys. I finally did it!!! Yay!.
The macro now runs for 25,000 such plays, gathering a frick-ton of information for my work.
What I did and observed with my VBA Code,
Now is the runtime problem, and I assume memory management problem.
So this code works great for about 500-1000 conversations.
but it started taking SUPER long.
I thus wrote a timer. (sample code)
Dim StartTime As Double
Dim LapTime As Double
Dim PrevTimer As Double
Cells(2, 9).Value2 = Beginrow & " - " & Lastrow & " is Done - " & Format((Timer - StartTime) / 86400, "hh:mm:ss")
This helped me know the time difference between each set of conversations.
What I did was make a for loop, and take 250 conversations at a time.
(sample code)
While Beginrow < Total_Conv_Count
If Beginrow + 250 > Total_Conv_Count Then
Lastrow = Total_Conv_Count
Else
Lastrow = Beginrow + 250
End If
Call My_CustomScript(Beginrow, Lastrow)
Cells(2, 9).Value2 = Beginrow & " - " & Lastrow & " is Done - " & Format((Timer - StartTime) / 86400, "hh:mm:ss")
Wend
I noticed when I first boot my Laptop.
For 250 conversations, it takes 11 secs
For 2,000 conversations, it takes 1.30 secs
Kay. I stopped it (ESC key), and made a change to VBA, etc etc.
Ran the VBA code again.
For 250 conversations, it takes 32 secs
For 2,000 conversations, it takes 3.24 secs
At around 8,000 conversations on my first run, it started taking 2 mins for every 250 conversations.
At around 9,000 conversations, the same run, the VBA code started taking 5 mins for every 250 conversations.
Great, I needed to Optimize.
So I went to this website https://vbacompiler.com/optimize-vba-code/
As well as googled a lot. I did what they all recommended, which is basically the same.
Now I use the status bar updating, cause I disabled everything else.
Application.StatusBar = Beginrow & " - " & Lastrow & " is Done - Time Elapsed: " & Format((Timer - StartTime) / 86400, "hh:mm:ss")
Cool, it runs faster now (or so I thought)
Cause I had just rebooted and restarted my laptop.
I find, that although faster, it still increases in time.
So the first run is faster than the third (re-running the same macro)
and the first 1,000 conversations is faster than the next subsequent 1,000
The times I recorded are, in seconds (for each 250)
11 - 13 - 15 - 16 - 17 - ............ - (at 2,500) 1min - 1.30min - 2min .......
Sigh, no progress. idky why.
I also tried
Set Search_Range = Nothing
Erase data_array
at the end of each private subroutine, for each 250, once done. I will clear the memory. No luck.
The issue I need to solve.
How do I optimize the code, so the runtime for each 250 is approximately the same?
i.e, about say 15 secs? consistently from 0 - 35,000 conversations.
and even if I re-run the same macro, it'll take about the same time, and not keep increasing as the code keep running.
Workaround solution I am currently using.
Split the 30,000 conversations into 5,000 each, thus making 6 sheets in the same workbook.
Run the macro for the first sheet.
Then reboot the Laptop, open the Excel workbook again, and do the second sheet - Repeat.
Finally, once all is done, copy and paste all 6 sheets into a single sheet by manually appending the data.
Clearly, I want to avoid this workaround.
Oct 21 2022 05:20 AM
Oct 21 2022 05:38 AM
I'm adding the code. (but removing almost every sensitive data I can)
Also, a related question of mine here, solve my Hans (file share was also something he made)
Sub SomethingCode()
' Main MasterCode Macro
' Obtains
' Main Macro
'
' Keyboard Shortcut: Ctrl+Shift+E
'
' SomeCode for Data Extraction .... with Colors and everything! Yay!
ActiveSheet.Range("A1").Select
Dim Beginrow As Integer
Dim Lastrow As Integer
Dim StartTime As Double
Dim MinutesElapsed As String
StartTime = Timer
Total_Conv_Count = Cells(Rows.Count, "A").End(xlUp).Row - 1
Beginrow = 1
Columns("I:I").ColumnWidth = 28
Cells(2, 9).Value2 = "First 250 Starting"
Cells(3, 9).Value2 = "100 Conversations Takes 1min"
Call SomethingCode_Headers
While Beginrow < Total_Conv_Count
If Beginrow + 250 > Total_Conv_Count Then
Lastrow = Total_Conv_Count
Else
Lastrow = Beginrow + 250
End If
Call SomethingCode_Script(Beginrow, Lastrow) ' SubRoutine for the Script
'ThisWorkbook.Save
Cells(2, 9).Value2 = Beginrow & " - " & Lastrow & " is Done!"
Beginrow = Lastrow + 1
Cells(4, 9) = "Time Elapsed: " & Format((Timer - StartTime) / 86400, "hh:mm:ss")
Wend
Cells(3, 9).Value2 = "Now Coloring and Beautifying"
Call ColorFormatting ' Subroutine for Making the whole thing Beautiful
Cells(5, 9) = "Time Completed: " & Format((Timer - StartTime) / 86400, "hh:mm:ss")
ActiveSheet.Range("I2").Select
End Sub
Then I call the Sub-routine Private SomethingCode()
(Ton of info is removed, I have used ?????????? to mask stuff)
Sub SomethingCode_Script(Beginrow As Integer, Lastrow As Integer)
'
' Obtains Data Extracted
' Main
'
'
'
ActiveSheet.Range("A1").Select
Dim midText() As String
Lastrow = Cells(Rows.Count, "A").End(xlUp).Row - 1
Set Search_Range = Range(Range("H1").Offset(Beginrow, 0), Range("H1").Offset(Lastrow, 0))
'Set idk_Range = Range(Range("AA3"), Range("AA3").End(xlDown)) ' Has to be after the cells are populated
' Code Block for Step Counter of Conversations Count - Column JKL
For Each cell In Search_Range
cell(1, 3).Value2 = UBound(Split(cell, "||")) + 1 ' - Column J
cell(1, 4).Value2 = UBound(Split(cell, "||?????????")) + 1 ' - Column K
cell(1, 5).Value2 = cell(1, 3).Value2 - cell(1, 4).Value2 ' - Column L
' Empty the cells if we re-run the same Macro
If IsEmpty(cell(1, 3).Value2) = False Then ' If IsEmpty(cell(1, 19).Value2) = False Or IsEmpty(cell(1, 18).Value2) = False Then
cell(1, 8).Clear
cell(1, 13).Clear
cell(1, 15).Clear
cell(1, 18).Clear
cell(1, 19).Clear
End If
' Code Block for ????????????? - Column R
If InStr(1, cell, "?????????????") <> 0 Then
cell(1, 11) = "!!!!!!!!!!!!!!!!!!"
ElseIf InStr(1, cell, "@@@@@@@@@@@@@@@@@") <> 0 Then
cell(1, 11) = "###########"
ElseIf InStr(1, cell, "$$$$$$$$$$$$$$$$$$") <> 0 _
Or InStr(1, cell, "^^^^^^^^^^^^^") <> 0 Then
cell(1, 11) = "((((((((((((((("
Else
cell(1, 11) = "%&%$%^$^%$%^&^*&^(&"
End If
Next cell
' Code Block for Iterating over every Sentence in a conversation.
For Each cell In Search_Range
midText = Split(cell, "||")
Counter = 0
SnippetStart = 0
' Code Block for First Sentence via delimiter. - Column NYZ
For i = LBound(midText) To UBound(midText)
If InStr(1, midText(i), "?????????????") > 0 Then
cell(1, 19).Value2 = cell(1, 19).Value2 & ":" & midText(i) & vbLf ' ?????????????
Else
cell(1, 18).Value2 = cell(1, 18).Value2 & ":" & midText(i) & vbLf ' All ?????????????
If Counter = 0 Then ' If Block for First sentence
cell(1, 7) = midText(i)
Counter = 1
End If
End If
' Find ????????????? - Columns TV
If InStr(1, midText(i), "WHAT!", vbTextCompare) > 0 Then ' what came After sentence to WHAT!
cell(1, 13).Value2 = cell(1, 13).Value2 & midText(i - 1) & vbLf
ElseIf InStr(1, midText(I), "????????????? Ton of ElseIFs for different statements removed")
cell(1, 15).Value2 = cell(1, 15).Value2 & midText(i - 1) & vbLf ' previous sentence
End If
Next
' Code Block ????????????? - Columns UW
If UBound(Split(cell(1, 13).Value2, vbLf)) > 0 Then
cell(1, 14) = UBound(Split(cell(1, 13).Value2, vbLf))
Else
cell(1, 14).Clear
End If
If UBound(Split(cell(1, 15).Value2, vbLf)) > 0 Then
cell(1, 16) = UBound(Split(cell(1, 15).Value2, vbLf))
Else
cell(1, 16).Clear
End If
' Code Block for Last 3 sentences-ish kinda ????????????? - Column OPQ
If InStr(1, cell, "?????????????") = 0 Then
For i = UBound(midText) To LBound(midText) Step -1 ' Reading the Array backwards
If InStr(1, midText(i), "?????????????") = 0 Then
If InStr(1, midText(i - 1), "?????????????") > 0 Then ' Checking if the word ?????????????is there, if not, it is ?????????????
SnippetStart = i - 1 ' Then this is where the snippet starts
Exit For ' So exit the for loop, we have the starting point
End If
End If
Next
Else
For i = UBound(midText) To LBound(midText) Step -1 ' Reading the Array backwards
If InStr(1, midText(i), "?????????????") > 0 Then
For j = i To LBound(midText) Step -1 ' Reading the Array backwards ?????????????
If InStr(1, midText(j), "?????????????") = 0 Then
If InStr(1, midText(j - 1), "?????????????") > 0 Then ' ?????????????
SnippetStart = j - 1 ' Then this is where the snippet starts
Exit For ' So exit the for loop, we have the starting point
End If
End If
Next
End If
Next
End If
For i = SnippetStart To UBound(midText)
cell(1, 8).Value2 = cell(1, 8).Value2 & midText(i) & vbLf ' Final Snippet
cell(1, 9).Value2 = midText(SnippetStart) ' ????????????? what caused the ending
cell(1, 10).Value2 = midText(SnippetStart + 1) ' ????????????? Response
Next
' Code Block for Date & Week Manipulation for student submission
'(not yet added)
' End the Iteration
Next cell
' Memory Management
Set Search_Range = Nothing
Erase midText
End Sub
ColorFormatting subroutine is fine, no problems there.
and SomethingCode_Headers() is just a subroutine to add text to row1
Private Sub SomethingCode_Headers()
' Code Block for Header Titles
Range("I1").Value2 = "null - 2 Progress Bar"
' Tons of others in the same way here, till cell AH1
End sub
Apologizes for most stuff being removed. Hope this gives an idea-ish?
Oct 21 2022 05:39 AM
Oct 21 2022 06:34 AM
If you need this speeded up, the best thing to do is process everything in memory first, by using arrays. Then after it has all been worked through, push out the entire array to the worksheet in a single line of code. This may be orders of magnitudes faster that what you currently have.
But a very simple first step would be to turn off screen-updating and calculation prior to your loop and turn those back on afterwards:
'...Code
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
'Loop here
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
'Some more code
Oct 21 2022 06:46 AM
Right, I did try that
Private Sub OptimizedMode(ByVal enable As Boolean)
Application.EnableEvents = Not enable
Application.Calculation = IIf(enable, xlCalculationManual, xlCalculationAutomatic)
Application.ScreenUpdating = Not enable
Application.EnableAnimations = Not enable
' Application.DisplayStatusBar = Not enable
Application.PrintCommunication = Not enable
End Sub
From the website here.
https://vbacompiler.com/optimize-vba-code/
The idea you are providing with arrays, is point 6 from this blog post correct?
Dim vArray As Variant
Dim iRow As Integer
Dim iCol As Integer
Dim dValue As Double
vArray = Range("A1:C10000").Value2 ‘ read all the values at once from the Excel cells, put into an array
For iRow = LBound(vArray, 1) To UBound(vArray, 1)
For iCol = LBound(vArray, 2) To UBound(vArray, 2)
dValue = vArray (iRow, iCol)
If dValue > 0 Then
dValue=dValue*dValue ‘ Change the values in the array, not the cells
vArray(iRow, iCol) = dValue
End If
Next iCol
Next iRow
Range("A1:C10000").Value2 = vArray ‘ writes all the results back to the range at once
So, I haven't tried this yet.
I'm confused with this part from above
For iRow = LBound(vArray, 1) To UBound(vArray, 1) For iCol = LBound(vArray, 2) To UBound(vArray, 2) dValue = vArray (iRow, iCol)
With regards to Lbound(vArray, 1) & again the loop Lbound(vArray, 2)
What is 1 & 2 referring to?
and when we put them back with
Range("A1:C10000").Value2 = vArray
is (vArray, 1) referring to A1? & the second loop is B1?
i.e a1 a2 a3 a4... then b1 b2 b3 b4....
So is C column not being looked at? but I'mma assume it is?
I understood the examples here
https://learn.microsoft.com/en-us/office/vba/language/reference/user-interface-help/lbound-function
and here,
https://www.automateexcel.com/vba/ubound-lbound-function/
but didn't understand what and where I need to modify.
as is.
should I make vArray = Search_Range( entire H column) with tons of conversations,
then, cell = split(vArray(iRow,iCol), "||") then run my for loops?
if so, how am I putting the values back into the 25 columns the specific data needs to be entered into?
Cause, I am using cell(1, 20) etc, to reference which column I need to put it in.
and idk how to loop and assign properly.
Oct 21 2022 08:52 AM
Oct 25 2022 12:31 PM
Oct 26 2022 01:07 AM