Forum Discussion

Sandeeep's avatar
Sandeeep
Brass Contributor
Oct 21, 2022

VBA Optimization help for Large string manipulation related to memory management?

TLDR;

The issue I need to solve.

How do I optimize the code, so the runtime for each 250 is approximately the same?

i.e, about say 15 secs? consistently from 0 - 35,000 conversations.

and even if I re-run the same macro, it'll take about the same time, and not keep increasing as the code keep running.

 

What is my code about?

As a teacher (I am not but as an example), I have say 100 students, whom I have told to provide screenplay writing, which I have in each cell.

 

So for example in cell A1, the data is

Mary: good morning

John: Hello!, nice to meet you

Mary: The weather is great today, do you want to go to the movies and hang out?

John: Yes, I would like that.

 

and cell a2, the data is another large screenwriting play.

 

Now my macro creates 12 columns, each with unique useful info, like what are all the stuff Mary said, who spoke first, how many time did each person speak, what happened at the end, the last three sentences used, etc etc.

 

With the help online, and by asking questions here, with the help of you guys. I finally did it!!! Yay!.

 

The macro now runs for 25,000 such plays, gathering a frick-ton of information for my work.

 

What I did and observed with my VBA Code,

Now is the runtime problem, and I assume memory management problem.

So this code works great for about 500-1000 conversations.

but it started taking SUPER long.

 

I thus wrote a timer. (sample code)

Dim StartTime As Double
Dim LapTime As Double
Dim PrevTimer As Double

Cells(2, 9).Value2 = Beginrow & " - " & Lastrow & " is Done - " & Format((Timer - StartTime) / 86400, "hh:mm:ss")

 

This helped me know the time difference between each set of conversations.

What I did was make a for loop, and take 250 conversations at a time.

(sample code)

While Beginrow < Total_Conv_Count
If Beginrow + 250 > Total_Conv_Count Then
Lastrow = Total_Conv_Count
Else
Lastrow = Beginrow + 250
End If

Call My_CustomScript(Beginrow, Lastrow)

Cells(2, 9).Value2 = Beginrow & " - " & Lastrow & " is Done - " & Format((Timer - StartTime) / 86400, "hh:mm:ss")

Wend

 

I noticed when I first boot my Laptop.

For 250 conversations, it takes 11 secs

For 2,000 conversations, it takes 1.30 secs

 

Kay. I stopped it (ESC key), and made a change to VBA, etc etc.

Ran the VBA code again.

 

For 250 conversations, it takes 32 secs

For 2,000 conversations, it takes 3.24 secs

 

At around 8,000 conversations on my first run, it started taking 2 mins for every 250 conversations.

At around 9,000 conversations, the same run, the VBA code started taking 5 mins for every 250 conversations.

 

Great, I needed to Optimize.

So I went to this website https://vbacompiler.com/optimize-vba-code/

As well as googled a lot. I did what they all recommended, which is basically the same.

 

Now I use the status bar updating, cause I disabled everything else.

Application.StatusBar = Beginrow & " - " & Lastrow & " is Done - Time Elapsed: " & Format((Timer - StartTime) / 86400, "hh:mm:ss")

 

Cool, it runs faster now (or so I thought)

Cause I had just rebooted and restarted my laptop.

 

I find, that although faster, it still increases in time.

So the first run is faster than the third (re-running the same macro)

and the first 1,000 conversations is faster than the next subsequent 1,000

 

The times I recorded are, in seconds (for each 250)

11 - 13 - 15 - 16 - 17 - ............ - (at 2,500) 1min - 1.30min - 2min .......

 

Sigh, no progress. idky why.

I also tried

Set Search_Range = Nothing

Erase data_array

 

at the end of each private subroutine, for each 250, once done. I will clear the memory. No luck.

 

 

The issue I need to solve.

How do I optimize the code, so the runtime for each 250 is approximately the same?

i.e, about say 15 secs? consistently from 0 - 35,000 conversations.

and even if I re-run the same macro, it'll take about the same time, and not keep increasing as the code keep running.

 

Workaround solution I am currently using.

Split the 30,000 conversations into 5,000 each, thus making 6 sheets in the same workbook.

Run the macro for the first sheet.

Then reboot the Laptop, open the Excel workbook again, and do the second sheet - Repeat.

Finally, once all is done, copy and paste all 6 sheets into a single sheet by manually appending the data.

 

Clearly, I want to avoid this workaround.

  • Sandeeep's avatar
    Sandeeep
    Brass Contributor

    Sandeeep 

     

    I'm adding the code. (but removing almost every sensitive data I can) 

    Also, a related question of mine here, solve my Hans (file share was also something he made)

    https://techcommunity.microsoft.com/t5/excel/obtain-data-from-a-cell-with-lots-of-paragraphs/m-p/3280768#M141815

     

    Sub SomethingCode()
    ' Main MasterCode Macro
    ' Obtains
    ' Main Macro
    '
    ' Keyboard Shortcut: Ctrl+Shift+E
    '
    
    ' SomeCode for Data Extraction .... with Colors and everything! Yay!
    
    ActiveSheet.Range("A1").Select
    Dim Beginrow As Integer
    Dim Lastrow As Integer
    
    Dim StartTime As Double
    Dim MinutesElapsed As String
    StartTime = Timer
    
    Total_Conv_Count = Cells(Rows.Count, "A").End(xlUp).Row - 1
    Beginrow = 1
    Columns("I:I").ColumnWidth = 28
    Cells(2, 9).Value2 = "First 250 Starting"
    Cells(3, 9).Value2 = "100 Conversations Takes 1min"
    
    Call SomethingCode_Headers
    
    While Beginrow < Total_Conv_Count
        If Beginrow + 250 > Total_Conv_Count Then
            Lastrow = Total_Conv_Count
        Else
            Lastrow = Beginrow + 250
        End If
        
        Call SomethingCode_Script(Beginrow, Lastrow) ' SubRoutine for the Script
        
        'ThisWorkbook.Save
        Cells(2, 9).Value2 = Beginrow & " - " & Lastrow & " is Done!"
        Beginrow = Lastrow + 1
        Cells(4, 9) = "Time Elapsed: " & Format((Timer - StartTime) / 86400, "hh:mm:ss")
    Wend
    
     Cells(3, 9).Value2 = "Now Coloring and Beautifying"
     Call ColorFormatting ' Subroutine for Making the whole thing Beautiful
     Cells(5, 9) = "Time Completed: " & Format((Timer - StartTime) / 86400, "hh:mm:ss")
     ActiveSheet.Range("I2").Select
        
    End Sub

     

     

    Then I call the Sub-routine Private SomethingCode()

    (Ton of info is removed, I have used ?????????? to mask stuff)

     

    Sub SomethingCode_Script(Beginrow As Integer, Lastrow As Integer)
    '
    ' Obtains Data Extracted 
    ' Main 
    '
    '
    '
    
    ActiveSheet.Range("A1").Select
    Dim midText() As String
    Lastrow = Cells(Rows.Count, "A").End(xlUp).Row - 1
    Set Search_Range = Range(Range("H1").Offset(Beginrow, 0), Range("H1").Offset(Lastrow, 0))
    
    'Set idk_Range = Range(Range("AA3"), Range("AA3").End(xlDown)) ' Has to be after the cells are populated
    
    ' Code Block for Step Counter of Conversations Count - Column JKL
    For Each cell In Search_Range
        cell(1, 3).Value2 = UBound(Split(cell, "||")) + 1 ' - Column J
        cell(1, 4).Value2 = UBound(Split(cell, "||?????????")) + 1 ' - Column K
        cell(1, 5).Value2 = cell(1, 3).Value2 - cell(1, 4).Value2 ' - Column L
    
    ' Empty the cells if we re-run the same Macro
        If IsEmpty(cell(1, 3).Value2) = False Then ' If IsEmpty(cell(1, 19).Value2) = False Or IsEmpty(cell(1, 18).Value2) = False Then
            cell(1, 8).Clear
            cell(1, 13).Clear
            cell(1, 15).Clear
            cell(1, 18).Clear
            cell(1, 19).Clear
        End If
    
    ' Code Block for ????????????? - Column R
        If InStr(1, cell, "?????????????") <> 0 Then
            cell(1, 11) = "!!!!!!!!!!!!!!!!!!"
        ElseIf InStr(1, cell, "@@@@@@@@@@@@@@@@@") <> 0 Then
            cell(1, 11) = "###########"
        ElseIf InStr(1, cell, "$$$$$$$$$$$$$$$$$$") <> 0 _
            Or InStr(1, cell, "^^^^^^^^^^^^^") <> 0 Then 
                cell(1, 11) = "(((((((((((((((" 
        Else
            cell(1, 11) = "%&%$%^$^%$%^&^*&^(&"
        End If
    Next cell
    
    ' Code Block for Iterating over every Sentence in a conversation.
    For Each cell In Search_Range
        midText = Split(cell, "||")
        Counter = 0
        SnippetStart = 0
    
    ' Code Block for First Sentence via delimiter. - Column NYZ
        For i = LBound(midText) To UBound(midText)
            If InStr(1, midText(i), "?????????????") > 0 Then
                cell(1, 19).Value2 = cell(1, 19).Value2 & ":" & midText(i) & vbLf ' ?????????????
            Else
                cell(1, 18).Value2 = cell(1, 18).Value2 & ":" & midText(i) & vbLf ' All ?????????????
                If Counter = 0 Then ' If Block for First sentence
                    cell(1, 7) = midText(i)
                    Counter = 1
                End If
            End If
    
    ' Find ????????????? - Columns TV
            If InStr(1, midText(i), "WHAT!", vbTextCompare) > 0 Then ' what came After sentence to WHAT!
                cell(1, 13).Value2 = cell(1, 13).Value2 & midText(i - 1) & vbLf
            ElseIf InStr(1, midText(I), "????????????? Ton of ElseIFs for different statements removed")
                    cell(1, 15).Value2 = cell(1, 15).Value2 & midText(i - 1) & vbLf ' previous sentence
            End If
        Next
        
    ' Code Block ????????????? - Columns UW
        If UBound(Split(cell(1, 13).Value2, vbLf)) > 0 Then
            cell(1, 14) = UBound(Split(cell(1, 13).Value2, vbLf))
        Else
            cell(1, 14).Clear
        End If
        If UBound(Split(cell(1, 15).Value2, vbLf)) > 0 Then
            cell(1, 16) = UBound(Split(cell(1, 15).Value2, vbLf))
        Else
            cell(1, 16).Clear
        End If
        
    
    ' Code Block for Last 3 sentences-ish kinda ????????????? - Column OPQ
        If InStr(1, cell, "?????????????") = 0 Then
            For i = UBound(midText) To LBound(midText) Step -1 ' Reading the Array backwards
                If InStr(1, midText(i), "?????????????") = 0 Then
                    If InStr(1, midText(i - 1), "?????????????") > 0 Then ' Checking if the word ?????????????is there, if not, it is ?????????????
                        SnippetStart = i - 1 ' Then this is where the snippet starts
                        Exit For ' So exit the for loop, we have the starting point
                    End If
                End If
            Next
        Else
            For i = UBound(midText) To LBound(midText) Step -1 ' Reading the Array backwards
                If InStr(1, midText(i), "?????????????") > 0 Then
                    For j = i To LBound(midText) Step -1 ' Reading the Array backwards ?????????????
                        If InStr(1, midText(j), "?????????????") = 0 Then
                            If InStr(1, midText(j - 1), "?????????????") > 0 Then ' ?????????????
                                SnippetStart = j - 1 ' Then this is where the snippet starts
                                Exit For ' So exit the for loop, we have the starting point
                            End If
                        End If
                    Next
                End If
            Next
        End If
    
        For i = SnippetStart To UBound(midText)
            cell(1, 8).Value2 = cell(1, 8).Value2 & midText(i) & vbLf ' Final Snippet
            cell(1, 9).Value2 = midText(SnippetStart) ' ????????????? what caused the ending
            cell(1, 10).Value2 = midText(SnippetStart + 1) ' ????????????? Response
        Next
        
    ' Code Block for Date & Week Manipulation for student submission
        '(not yet added)
       
    ' End the Iteration
    Next cell
    
    ' Memory Management
    Set Search_Range = Nothing
    Erase midText
    
    
    End Sub

     

    ColorFormatting subroutine is fine, no problems there.

    and SomethingCode_Headers() is just a subroutine to add text to row1

    Private Sub SomethingCode_Headers()
    
    
    ' Code Block for Header Titles
    Range("I1").Value2 = "null - 2 Progress Bar"
    ' Tons of others in the same way here, till cell AH1
    
    End sub

     

    Apologizes for most stuff being removed. Hope this gives an idea-ish?

  • JKPieterse's avatar
    JKPieterse
    Silver Contributor
    If you want us to comment on the performance of your code we need, well, your code. And some nonsense but relevant sample data.
    • Sandeeep's avatar
      Sandeeep
      Brass Contributor
      Kay, kinda sent it.
      I have to remove tons of stuff tho. But should give a rough idea.
      • JKPieterse's avatar
        JKPieterse
        Silver Contributor

        Sandeeep 

        If you need this speeded up, the best thing to do is process everything in memory first, by using arrays. Then after it has all been worked through, push out the entire array to the worksheet in a single line of code. This may be orders of magnitudes faster that what you currently have.

        But a very simple first step would be to turn off screen-updating and calculation prior to your loop and turn those back on afterwards:

        '...Code
        Application.ScreenUpdating = False
        Application.Calculation = xlCalculationManual
        'Loop here
        Application.ScreenUpdating = True
        Application.Calculation = xlCalculationAutomatic
        'Some more code

Resources