How to selectively split cells

Hello, noob here. I have a giant list of data that I copied from a PDF online, and I need to split it into separate cells so that I can work with it. Right now it is all in one column with around 34,000 rows, so doing this one by one is impossible. Each row goes like this: a name (usually several words), identification and reference numbers (usually starting with 0), then a description, which can be a combination of words and numbers. See image below:

I need to split each line of this data into the following: the name, then each separate number in its own cell, and then everything after the long number in one cell. For example:

 

BROWNLEE CHARLIE JR | 01 | 00 | 0194 | 3 | 31 | 0123000310240040000000 | LOT 6 BLK 2 AIRPORT ESTS

 

I have tried the Text to Columns function with spaces as the delimiter, but this doesn't keep each field in the same column. For example, because the names have different numbers of words, the same reference number ends up out of line with the one above it (so I would need to somehow insert blank cells). How do I do this? It may be a tough problem, but I have no idea where to start, so anything helps.

4 Replies

@alaskanbear52 

If the data in the PDF is laid out in columns, you could try the Power Query From PDF connector. If not, I'm afraid there is no way to split it unless there is a well-defined rule for where each field begins and ends.

@alaskanbear52 

 

It appears the numbers are fixed width. If that's correct, then it looks like the hurdle is splitting the name from the rest of the text string.

Say the data is in Column A (starting in cell A4).

First, find the character position of the first number. This is an array formula, so you have to hit Ctrl+Shift+Enter after keying/copying it into the formula bar.
B4 =MIN(IF(ISNUMBER(--(MID(A4,ROW(INDIRECT("1:"&LEN(A4))),1))),ROW(INDIRECT("1:"&LEN(A4))),""))

Extract the name:
C4 =TRIM(LEFT(A4,B4-1))

Extract the remaining portion of the string:
D4 =MID(A4,B4,LEN(A4))
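
For anyone who would rather script this outside Excel, here is a minimal Python sketch of the same idea as the B4, C4, and D4 formulas: find the first digit, then cut the string there. The function name `split_name` is made up for illustration, and the sample row assumes the raw text is space-separated like the example in the question.

```python
def split_name(line: str) -> tuple[str, str]:
    """Split a row into (name, rest) at the first digit, like C4 and D4."""
    # Index of the first digit (0-based, whereas the B4 formula is 1-based).
    first_digit = next((i for i, ch in enumerate(line) if ch.isdigit()), None)
    if first_digit is None:
        # No digit in the row: treat the whole line as the name.
        return line.strip(), ""
    # Everything before the first digit is the name; the rest is kept as-is.
    return line[:first_digit].strip(), line[first_digit:]


# Assumed raw row, reconstructed from the example in the question.
row = "BROWNLEE CHARLIE JR 01 00 0194 3 31 0123000310240040000000 LOT 6 BLK 2 AIRPORT ESTS"
name, rest = split_name(row)
print(name)
print(rest)
```

Like the formulas, this will cut a name short if the name itself contains a digit.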

Copy columns C and D and paste them back in place as values (Paste Special > Values). Then it looks like you can use Text to Columns (fixed width) on Column D to separate the rest of the data into columns. I attached a file you can download to look at.
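
If you are scripting it instead, a rough Python equivalent of that last step might look like the sketch below. Note that it splits on spaces rather than on fixed widths, and it assumes every row has exactly six number fields before the description, as in the example row from the question. `split_rest` and `number_count` are hypothetical names.

```python
def split_rest(rest: str, number_count: int = 6) -> list[str]:
    """Split off the leading number fields; keep the description whole."""
    # split(None, n) splits on runs of whitespace at most n times, so
    # everything after the n-th number stays together as one last field.
    return rest.split(None, number_count)


# Assumed input: the part of the example row that follows the name.
rest = "01 00 0194 3 31 0123000310240040000000 LOT 6 BLK 2 AIRPORT ESTS"
for field in split_rest(rest):
    print(field)
```

Rows with a different number of leading number fields would need `number_count` adjusted, or a different rule altogether.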

Looking at the attachment, it appears your data may have two numbers on the back end after a variable length string that can have both numbers and letters (so my suggestion would still leave a piece at the end that would need to be split).

If that's the case, then let me know. I believe we can split the last piece by finding the character positions of the last and next to last spaces and using text functions to split the remaining three columns.
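
As a sketch of that idea in Python (not the Excel text functions themselves), splitting from the right at the last two spaces could look like this. The trailing numbers in the sample string are invented for illustration, since I don't know what the real ones look like, and `split_tail` is a hypothetical name.

```python
def split_tail(text: str) -> list[str]:
    """Split into [description, next-to-last token, last token]."""
    # rsplit(None, 2) splits on whitespace from the right at most twice,
    # i.e. at the last and next-to-last gaps between tokens.
    parts = text.rsplit(None, 2)
    if len(parts) < 3:
        raise ValueError("expected a description followed by two tokens")
    return parts


# Hypothetical sample: a description followed by two made-up numbers.
print(split_tail("LOT 6 BLK 2 AIRPORT ESTS 12 34"))
```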

 

Or, are the two numbers in a row by themselves (alternating rows of long text string then a string of two numbers)?


If you can share it, maybe it would help to upload the file.

Thanks everyone for your help, I got it now. I would have never figured this out on my own.