Mar 04 2020 11:25 PM - edited Mar 30 2020 11:51 PM
Hi, Does anyone have any suggestions for this?
Chris
---------------------------------------
Hi I
I have already posted this on the SQL community forum, but I am now thinking this may also be good place to post this question. I use PowerQuery however the job I am trying to do below just causes the application to crash, it seems to be memory related So I am hoping doing this in MS Access will be much easier and less strain on excel.
So my rawdata comes from an SQL View, I want to strip out all the corresponding codes in the column called 'IN_CODE', and create a new record for each corresponding code I find. Some columns can have 10 or more codes, so the routine needs to know when there are no more values to check or extract. I want to strip out any unnecessary text, for example, any commas, periods, names or words and even brackets used around any text etc. Each code is only 6 digits long, I thought if I could remove unwanted text, and try and remove every space, I could extract the code in 6 digits chunks; this roughly what I was doing in power query but the process or task was too memory intensive.
Step 1; my raw data, I am needing to extract all the 6 digit codes from the column called 'IN_CODE'.
Step1 | Codes are all 6 digits length, there also extra spaces between codes | |
PROJECT_ID | PROJECT_TITLE | IN_CODE |
1246 | NAME2 | R12345 and R45647, RA4564 & RB4561 |
1261 | NAME3 | R12478 |
1276 | NAME4 | R12457, |
1291 | NAME5 | RA4789 |
1306 | NAME6 | TY7894 |
1321 | NAME7 | FG5674 |
1336 | NAME8 | R45789 , TY7894 |
1351 | NAME9 | R13445 R78456 & R96325 (SCCA) R44557 (Dept) R77889 R11122 |
1366 | NAME10 | RT7845 |
1381 | NAME11 | RT1234 |
Step 2, I then want to remove all unwanted text, e.g. 'and' ',', '&' , '(SCCA)' etc There are also extra spaces just for a good measure.
Step2 | Codes are all 6 digits length, there also extra spaces between codes | |
PROJECT_ID | PROJECT_TITLE | IN_CODE |
1246 | NAME2 | R12345 R45647 RA4564 RB4561 |
1261 | NAME3 | R12478 |
1276 | NAME4 | R12457 |
1291 | NAME5 | RA4789 |
1306 | NAME6 | TY7894 |
1321 | NAME7 | FG5674 |
1336 | NAME8 | R45789 TY7894 |
1351 | NAME9 | R13445 R78456 R96325 R44557 R77889 R11122 |
1366 | NAME10 | RT7845 |
1381 | NAME11 | RT1234 |
Step 3, so I want to get the data from step 1 to look like this -
Step3 | ||
PROJECT_ID | PROJECT_TITLE | IN_CODE |
1246 | NAME2 | R12345 |
1246 | NAME2 | R45647 |
1246 | NAME2 | RA4564 |
1246 | NAME2 | RB4561 |
1261 | NAME3 | R12478 |
1276 | NAME4 | R12457 |
1291 | NAME5 | RA4789 |
1306 | NAME6 | TY7894 |
1321 | NAME7 | FG5674 |
1336 | NAME8 | R45789 |
1336 | NAME8 | TY7894 |
1351 | NAME9 | R13445 |
1351 | NAME9 | R78456 |
1351 | NAME9 | R96325 |
1351 | NAME9 | R44557 |
1351 | NAME9 | R77889 |
1351 | NAME9 | R11122 |
1366 | NAME10 | RT7845 |
1381 | NAME11 | RT1234 |
Any help appreciated.
Chris C
Sep 21 2020 04:59 AM
SolutionYou could solve this with a helper table and VBA in Access. The helper table is called UnwantedText and contains the (pieces of) text you want to remove from IN_CODE:
You can extend this table with any text you want/need.
Next, we define a table (ProjectParsed) for the parsed result:
Note that the three fields together form the primary key.
Assuming that the raw project data is in a (linked) table which is called ProjectRaw (you can name it anything you like, but you have to make a slight modification in the coming VBA code if you name it different), this VBA subroutine removes the unwanted text and parses the 6 digit occurrences in IN_CODE into separate rows/records in the table ProjectParsed:
Sub RemoveAndParse()
'Declare
Dim sInCode As String, sSQL As String
Dim i As Integer
'Switch off warnings
DoCmd.SetWarnings False
'Remove all earlier records from the result table
sSQL = "DELETE FROM ProjectParsed"
DoCmd.RunSQL sSQL
'Open tables Project and UnwantedText as recordsets to loop over
Dim rsProject As DAO.Recordset
Set rsProject = CurrentDb.OpenRecordset("ProjectRaw", dbOpenTable)
Dim rsUnwantedText As DAO.Recordset
Set rsUnwantedText = CurrentDb.OpenRecordset("UnwantedText", dbOpenTable)
'Loop over the (raw) projects
Do Until rsProject.EOF
sInCode = rsProject!IN_CODE
'Remove all the spaces from IN_CODE
sInCode = Replace(sInCode, " ", "")
'Remove the unwanted text from IN_CODE
rsUnwantedText.MoveFirst 'go back to first unwanted text for every cycle
'Loop over the unwanted text
Do Until rsUnwantedText.EOF
sInCode = Replace(sInCode, rsUnwantedText!Unwanted, "") 'remove text (replace it by a blank)
rsUnwantedText.MoveNext 'move to next unwanted text
Loop
'Parse the 6 digit codes
For i = 1 To Len(sInCode) Step 6
sSQL = "INSERT INTO ProjectParsed VALUES('" & rsProject!PROJECT_ID & "', '" & _
rsProject!PROJECT_TITLE & "', '" & Mid(sInCode, i, 6) & "')"
DoCmd.RunSQL sSQL
Next
rsProject.MoveNext 'move to next project
Loop
rsUnwantedText.Close
rsProject.Close
'Switch on warnings again
DoCmd.SetWarnings True
End Sub
I assume and hope that the comments in the code explain what the subroutine does. You can run the routine from the VBA editor or you can call it from a button in one of your Access forms.
After running it you'll get something in the table ProjectParsed like this:
Hope this does the job. And if you need more clarification, please let me know.
Best wishes,
Tieme
Sep 24 2020 01:14 PM
Sep 21 2020 04:59 AM
SolutionYou could solve this with a helper table and VBA in Access. The helper table is called UnwantedText and contains the (pieces of) text you want to remove from IN_CODE:
You can extend this table with any text you want/need.
Next, we define a table (ProjectParsed) for the parsed result:
Note that the three fields together form the primary key.
Assuming that the raw project data is in a (linked) table which is called ProjectRaw (you can name it anything you like, but you have to make a slight modification in the coming VBA code if you name it different), this VBA subroutine removes the unwanted text and parses the 6 digit occurrences in IN_CODE into separate rows/records in the table ProjectParsed:
Sub RemoveAndParse()
'Declare
Dim sInCode As String, sSQL As String
Dim i As Integer
'Switch off warnings
DoCmd.SetWarnings False
'Remove all earlier records from the result table
sSQL = "DELETE FROM ProjectParsed"
DoCmd.RunSQL sSQL
'Open tables Project and UnwantedText as recordsets to loop over
Dim rsProject As DAO.Recordset
Set rsProject = CurrentDb.OpenRecordset("ProjectRaw", dbOpenTable)
Dim rsUnwantedText As DAO.Recordset
Set rsUnwantedText = CurrentDb.OpenRecordset("UnwantedText", dbOpenTable)
'Loop over the (raw) projects
Do Until rsProject.EOF
sInCode = rsProject!IN_CODE
'Remove all the spaces from IN_CODE
sInCode = Replace(sInCode, " ", "")
'Remove the unwanted text from IN_CODE
rsUnwantedText.MoveFirst 'go back to first unwanted text for every cycle
'Loop over the unwanted text
Do Until rsUnwantedText.EOF
sInCode = Replace(sInCode, rsUnwantedText!Unwanted, "") 'remove text (replace it by a blank)
rsUnwantedText.MoveNext 'move to next unwanted text
Loop
'Parse the 6 digit codes
For i = 1 To Len(sInCode) Step 6
sSQL = "INSERT INTO ProjectParsed VALUES('" & rsProject!PROJECT_ID & "', '" & _
rsProject!PROJECT_TITLE & "', '" & Mid(sInCode, i, 6) & "')"
DoCmd.RunSQL sSQL
Next
rsProject.MoveNext 'move to next project
Loop
rsUnwantedText.Close
rsProject.Close
'Switch on warnings again
DoCmd.SetWarnings True
End Sub
I assume and hope that the comments in the code explain what the subroutine does. You can run the routine from the VBA editor or you can call it from a button in one of your Access forms.
After running it you'll get something in the table ProjectParsed like this:
Hope this does the job. And if you need more clarification, please let me know.
Best wishes,
Tieme