Forum Discussion
Merged subcells and displaced data
- Aug 03, 2022
Attached is a possible LAMBDA solution. Please note my solution accounts for 6 columns. I see there were a few entries in the column right of Skill so I included it in the calculations.
Fill-in gaps in the data:
=LAMBDA(arr,col_num,SCAN("",INDEX(arr,SEQUENCE(ROWS(arr)),col_num),LAMBDA(a,v,IF(v="",a,v))))Piece it all together:
=LAMBDA(arr,CHOOSE({1,2,3,4,5,6},FILL(arr,1),FILL(arr,2),FILL(arr,3),FILL(arr,4),INDEX(TEXT(arr,";;;General"),SEQUENCE(ROWS(arr)),5),INDEX(TEXT(arr,";;;General"),SEQUENCE(ROWS(arr)),6)))I avoided using insider functions which would have made it simpler.
The end result (all you need to give CLEANUP is the entire range, including header row):
JeanGoodrich It's difficult to visualise how the data exactly looks like, but you seem to get "garbage in" and you need make something pretty out of it. Try connecting to the date with Power Query. That will deal with merged cells, and perhaps it will also take care of the data that is shifted into column A rather than column E in Excel. But I can't tell for sure without seeing the data. Can you somehow share (Onedrive, Dropbox or similar) a part of the output file produced by the "other program". That is, not all 500K lines. By the way, what format is it? Does it come out as an xlsx file, or txt or perhaps csv?
Riny_van_Eekelen , thank you for your response. And hello!
I have uploaded a sample file to Dropbox:
https://www.dropbox.com/scl/fi/cxjfcb4elus6igo2afooe/Example.Problem.xlsx?dl=0&rlkey=7dbwhlpzrirjfedwkvuioiw09
It is definitely a garbage-in situation, and it will take longer to fix the third-party software than I have to provide useable data. I'm just starting to learn Pivot Tables but I don't trust my ability with that yet.
I need to keep the relationship of User to Skill without changing the number of Users. The merged cells keep me from sorting by Care Type.
If you have any suggestions, I would be very grateful!
Thank you,
Jean
- Riny_van_EekelenAug 03, 2022Platinum Contributor
JeanGoodrich Thanks for the file. It seems that in this small example, all is pretty neat up to and including row 34. Power Query (PQ) can resolve this with a few clicks, without the need to un-merge cells first and the use of complicated formulas.
Then the garbage kicks in. I can guess how this should have to be reorganised (which data should go where in order to create the same structure as the data in rows 2 to 34), but can't be sure until you explain it. Also important to know is if these "garbage rows" are consistent (for instance, there are always six garbage rows with data constantly in the wrong place), but I fear that neither PQ nor Patrick2788 's LAMBDA solution can easily fix that. By the way, how often does this garbage kick-in?
But, just to be sure that we are really in the beginning of the process. Does this xlsx file come directly from the 3rd-party system? Or did it perhaps produce a csv file that you just double-clicked on, then opened in Excel looking like the file you shared?