Forum Discussion

Csaba73's avatar
Csaba73
Copper Contributor
Feb 25, 2022

Count string occurrences in semi-colon delimited data set

Hi all, this is my first post, so pls be nice.  I have a large survey dataset (80k IT developers).  One of the rows contains an unspecified number of programming languages per cell separated by semicolon (;) in alphabetical order.  I would like to build a pivot table to show for a subset of these IT developers what are the top-3 programming languages.

 

example

1 India C++; Java

2 Russia C++; Python; Swift

3 Israel C++; JavaScript; Ruby

4 Israel JavaScript

 

So how many IT developers in Israel use JavaScript (Answer = 2)

Across all developers how many use C++? (Answer = 3)

 

Ideally the pivot would look like below:

 

India C++ (100%); Java (100%)

Israel JavaScript (100%); C++ (50%);  Ruby (50%)

Russia C++ (100%); Python (100%); Swift (100%)

 

Does anyone know how to do this?  thanks a lot

2 Replies

  • Lorenzo's avatar
    Lorenzo
    Silver Contributor

    Hi Csaba73 

     

    Easily doable with Get & Transform aka Power Query assuming you run a recent version of Excel on Windows + there is always a space between the Country name and the 1st language, as in the few examples you provided

     

     

    Sample attached

    • Csaba73's avatar
      Csaba73
      Copper Contributor

      Hi Lorenzo - thanks so much for your kind reply. I am completely lost. I have the latest version of Excel as part of O365. I can do Data Import but I do not see how this is Power Query. When I connect to the external .csv I essentially get the same as any normal import into Excel (albeit a bit prettier). Once imported I have the same issue as before ... no clue how to manipulate the columns which contain data in this format

      C;C++;HTML/CSS;JavaScript;Matlab;Node.js;PHP;Python;SQL;TypeScript

      Your example above seems perfect but after 2 hours of trying to figure out what you did I have given up. Can you pls help and share how you went in your example from 'table 1' to 'By country'? thanks a lot

       

      p.s. I am using Excel for Mac (just discovered that it does NOT include Power Query Desktop)

Resources