Forum Discussion
Category seperation
I have a dataset which has values in a column like the fig below. The number with colon " : " are the IDs which contains the values as sub category.
After cleaning it I want to represent the ID's of the values in a separate row for further analysis. the data set have 152 unique ID, and doing it manually for 1000000 rows will drive me crazy. Any help would be appreciated. Have attached the original dataset too.
Put this formula into cell B3 (after inserting an empty column B) and copy it down.
=IF(AND(ISNUMBER(A3),ISTEXT(A2)),A2,B2)
It works because fortunately the fields in Column A with the colon are text, whereas the others are numbers. What the logic is saying is
If the cell to the immediate left of the current cell (B3 at the start) is a number and the cell right above it is text, then put that text here in B3 (at the start). If, on the other hand, the cell to the immediate left is a number and the cell above is a number also, then repeat what is in Column B immediately above the current cell.
Then when, many rows later it suddenly encounters another text, it changes to that new text and continues with it.
It's not a perfect solution, but should serve.
Just out of curiosity, what does that data reflect?
For future reference, you really didn't need to upload that entire file; it is over 24 meg in size. It's good to have an actual file, but it could have been a representative subset and saved everybody a lot of bandwidth. That's why I'm only giving you the formula back.
6 Replies
- mathetesSilver Contributor
You say, "somewhat like" in your opening statement...which isn't as precise as a computer would need to be, and you're asking for a computer resolution.
It's also not clear whether that layout you give as the desired "clean" outcome is one column or two. If one, is the spacing a part of the desired outcome as well?
A few other questions:
- Where does this data come from in the first place?
- Is this a one-time clean up or are you going to need to be doing this on a regular basis?
- Is it possible to fix the "mess" at an earlier stage in the process--again, back to wherever it comes from?
- Is it possible for you to post the actual spreadsheet in which this column appears? That would be most helpful to anyone trying to help you.
- Eshan_pCopper ContributorThis is the first time asking a query, had no idea about the format. And yes it is in separate column. the data set has 1000000 rows. So have to keep doing it till every main category separated with a colon