Forum Discussion

Andrew_Hinson's avatar
Andrew_Hinson
Brass Contributor
Oct 10, 2024

Analysing code in column B

Hi everyone!

 

Hope everyone is good! I could really use some help if anyone has the time...

 

So, I have information in column B that contains a reference number along with other information, maybe such as a date, initials of someone etc. The reference numbers either start with DO, 80, 10 or 11 and they could be anywhere within that cell. For example the cell could read DO1014534610/10 or maybe HTJ80256734 or 11546732-1010HTJ. It is data I am downloading from an accounting system and so have no control as to what order things will go in when they are downloaded and put into column B; it's a mix of a few things. 

 

80 numbers have 9 digits

10 and 11 numbers either have 9 digits or 12 digits

DO numbers have "DO" and then 6 digits

 

What I need to do is the following:

 

1) If cell B (B1 and then B2 etc) contains one of these numbers within the text, identify by the number of digits and whether it starts with DO, 80 etc what type of reference number it is. So in cell C (let's say) it will look at the entry in B and if it contains "80" that is 9 digits, it will say "Type 1", 10 or 11 numbers with 9 digits will say "Type 2" etc. 

2) Then, in column D lets say - based on the type that has been returned as above (type 1 etc), strip out the reference number from all of the other information in cell B# by maybe counting the digits, and then provide just the reference number in say cell D.

 

So it would be something like the following, which I could then drag down.

A                                  B                     C                D

Date/Time        AJ805639876K4    Type 1        805639876

 

I really hope this make sense to everyone! I have tried playing with ISNUMBER(SEARCH functions and all sorts to try and get to where I need to be but I'm quite stuck. I'm only an amateur really with Excel so I'm keen to find the answer to my problem and then understand how it works, so I can apply it to different conditions. I have more things to add to this model than just above, but having an understanding of how to do above will allow me (I hope!) to add more things in.

 

Can anyone help?

 

Thank you so much.

  • NikolinoDE's avatar
    NikolinoDE
    Gold Contributor

    Andrew_Hinson 

    It sounds like you're working with mixed data in column B, and you need to extract specific reference numbers based on patterns, then classify them according to type. I'll walk you through a solution for both identifying the type of reference and extracting the number.

    We can achieve this using a combination of Excel functions like IF, SEARCH, MID, LEN, and TEXT.

    Step 1: Classifying the Reference Type in Column C

    You can create a formula to classify the reference number based on the patterns you mentioned. Here's how you can approach this:

    • "80" with 9 digits: Type 1
    • "10" or "11" with 9 or 12 digits: Type 2
    • "DO" with "DO" followed by 6 digits: Type 3

    In Column C, use this formula to classify the reference type:

    =IF(AND(ISNUMBER(SEARCH("80",B1)),LEN(MID(B1,SEARCH("80",B1),9))=9),"Type 1",

    IF(AND(OR(ISNUMBER(SEARCH("10",B1)),ISNUMBER(SEARCH("11",B1))),OR(LEN(MID(B1,SEARCH("10",B1),9))=9,LEN(MID(B1,SEARCH("11",B1),12))=12)),"Type 2",

    IF(AND(ISNUMBER(SEARCH("DO",B1)),LEN(MID(B1,SEARCH("DO",B1)+2,6))=6),"Type 3","Unknown")))

    Explanation:

    • SEARCH("80",B1) finds the position of "80" in the cell (if present).
    • LEN(MID(B1,SEARCH("80",B1),9))=9 checks if the length of the number following "80" is exactly 9 digits.
    • The formula repeats similar logic for "10", "11", and "DO", checking their respective patterns and digit lengths.
    • The IF conditions are nested to handle the different types and patterns.

    Step 2: Extracting the Reference Number in Column D

    Once you have classified the reference type, the next step is to extract the reference number itself from the text. Here's how you can do it based on the type identified in Column C:

    In Column D, use this formula:

    =IF(C1="Type 1", MID(B1,SEARCH("80",B1),9),

    IF(C1="Type 2", IF(ISNUMBER(SEARCH("10",B1)),MID(B1,SEARCH("10",B1),IF(LEN(MID(B1,SEARCH("10",B1),12))=12,12,9)),MID(B1,SEARCH("11",B1),IF(LEN(MID(B1,SEARCH("11",B1),12))=12,12,9))),

    IF(C1="Type 3", MID(B1,SEARCH("DO",B1)+2,6),"")))

    Explanation:

    • For Type 1, the formula extracts exactly 9 digits following the "80".
    • For Type 2, it checks if the number starts with "10" or "11" and extracts either 9 or 12 digits accordingly.
    • For Type 3, it extracts 6 digits following "DO".

    Example Table:

    A

    B

    C

    D

    Date/Time

    AJ805639876K4

    Type 1

    805639876

    Date/Time

    HTJ80256734

    Type 1

    80256734

    Date/Time

    11546732-1010HTJ

    Type 2

    11546732

    Date/Time

    DO10145346X123

    Type 3

    10145346

    This should work as expected when you drag the formulas down.

    Breakdown of Key Functions:

    • SEARCH: Finds the position of a substring (like "80" or "DO").
    • MID: Extracts a specific number of characters from a string starting at a specific position.
    • LEN: Returns the length of the string, useful for checking the digit count.
    • IF: Nested to handle different conditions (types).

    Troubleshooting:

    • If you still get issues with special cases, it might be helpful to manually review and adjust the pattern matching to ensure there are no inconsistencies in the data structure.

    You can now drag these formulas down the entire column to handle all your data, and it will extract and classify the reference numbers as required. The text, steps and functions were created with the help of AI. All formulas are untested, it is always recommended to make a backup of the existing file in advance before using the above formulas.

     

    My answers are voluntary and without guarantee!

     

    Hope this will help you.

    Was the answer useful? Mark as best response and Like it!

    This will help all forum participants.

Resources