Forum Discussion

Joel_Bauer's avatar
Joel_Bauer
Copper Contributor
Sep 09, 2021

Excel Text Manipulation

Device: PC/Windows 10

Microsoft Office Home and Business 2013

Excel Product ID: 00196-20943-85146-AA714

 

In the enclosed spreadsheet there is a column headed ‘Animals’.
Each row contains the name of a person and, in the Animals’ column, their favorite animals.
Dogs and cats are listed by breed and prefixed with ‘a-‘ or ‘b-‘ respectively.
Is it possible to break out dog breeds and cat breeds into their own columns while still keeping them under ‘Animals’?
Thanks, Joel

7 Replies

  • Riny_van_Eekelen's avatar
    Riny_van_Eekelen
    Platinum Contributor

    Joel_Bauer I agree with mathetes (i.e. John) that collecting data as displayed in your example is sub-optimal. 

     

    Attached is a possible solution that combines PowerQuery, PowerPivot and a minimal amount of DAX coding (Windows only, I'm afraid). The output is not exactly as what you asked for, but it's close.

     

    Note that this only works because I put a comma between "Lion" and "a-Beagle" on the animal listing for John Smith. If you can't guarantee the consistent use of the comma to delimit animals, it's going to break.

     

    Having said that, it's probable better to collect your data along the lines as explained by John.

     

     

    • Joel_Bauer's avatar
      Joel_Bauer
      Copper Contributor
      Hi Riny, Yes the data is sub_optimal, but its what we have. Yes all 'animals' will be comma delimited. I have many questions, but your arrived at solution looks very promising. I need to be able to implement it myself on real data. I will be back to you after I have googled a bunch of things. Thank you so much. Joel
  • mathetes's avatar
    mathetes
    Silver Contributor

    Joel_Bauer  asks: In the enclosed spreadsheet there is a column headed ‘Animals’.
    Each row contains the name of a person and, in the Animals’ column, their favorite animals.
    Dogs and cats are listed by breed and prefixed with ‘a-‘ or ‘b-‘ respectively.
    Is it possible to break out dog breeds and cat breeds into their own columns while still keeping them under ‘Animals’?

     

    It no doubt is possible.

     

    But may I first answer your question with another question (or maybe two or three)?

    1. What is your ultimate purpose here? How will this data be used?
    2. At what stage are you in the creation of the real database (this is obviously just an example)?
    3. Are you open to a wholly different approach?

     

    Why do I ask those questions? Because in general, it's a terrible design idea to lump all kinds of disparate types of data into a single cell. A terrible design idea! As proof of that strong statement, the very fact that you're coming and asking "can I break these out into their own columns?" makes my point for me. 

     

    It is a LOT easier to take data from multiple columns, even multiple rows, and produce a report that puts them together, a LOT easier to do that than it is to take data all stuffed into a single cell and break it apart. Are there text manipulation functions that could be assembled to do that breaking apart? Yes. But they're not easy, and the more inconsistent and erratic that data is (e.g., the spelling of "Frnch Poodle" or is it "French Poodle"--in your example), the harder it is to make sense of things.

     

    So IF you're open to re-thinking the approach, I have some ideas:

    • Have one row per favorite per member--yes that means a LOT of rows, but stay with me
    • Use data validation to get the spelling of each breed consistent
    • If you have the most recent version of Excel, you can even have two or more levels of "cascading" data validation, such that first you'd select among "Jungle, Dog, Cat....", which selection would lead to a focused secondary list "Elephant, Lion, Tiger, Zebra..." or "French Poodle, German Shepherd, Bulldog...." or "Persian, Siamese, American Wirehair...."

     

    Creating a database such as I've alluded to there would enable:

    • lists of all members who've identified "Elephant" as a favorite (or whatever)
    • lists of all of John Smith's favorite dogs
    • lists of all of John Smith's favorites of all types
    • lists of all people who have "Horses" AND "Zebras"
    • of lists of all people who have "Horses" AND "Donkeys"
    • etc. [as noted, having separate data elements makes it a lot easier for Excel to do wonders with data retrieval]
    • Joel_Bauer's avatar
      Joel_Bauer
      Copper Contributor
      Hi Mathetes, Sorry I am not open to a different approach. Data is coming from a legacy system. We are lucky to get what we have, Thanks, Joel
      • Joel_Bauer's avatar
        Joel_Bauer
        Copper Contributor

        Hi again Rimy,

          I cannot follow your suggested solution.  It’s the first time of I’ve heard of PowerQuery, PowerPivot, or DAX coding.

          What I am looking for is a function or series of functions or something that I can plug into Cells E2 and F2 and then copy down the rest of the rows.

          These Functions will look at the text in C2 and extract the Dog Breeds and Cat Breeds.

          I have enclosed a revised spreadsheet. The data is now in a table.  The ‘Before’ tab shows what I have now. The ‘After’ tab shows what I want.

          Any thoughts would be appreciated.

        Thanks, Joel

Resources