Forum Discussion
Count string occurrences in semi-colon delimited data set
Hi all, this is my first post, so pls be nice. I have a large survey dataset (80k IT developers). One of the rows contains an unspecified number of programming languages per cell separated by semicolon (;) in alphabetical order. I would like to build a pivot table to show for a subset of these IT developers what are the top-3 programming languages.
example
1 India C++; Java
2 Russia C++; Python; Swift
3 Israel C++; JavaScript; Ruby
4 Israel JavaScript
So how many IT developers in Israel use JavaScript (Answer = 2)
Across all developers how many use C++? (Answer = 3)
Ideally the pivot would look like below:
India C++ (100%); Java (100%)
Israel JavaScript (100%); C++ (50%); Ruby (50%)
Russia C++ (100%); Python (100%); Swift (100%)
Does anyone know how to do this? thanks a lot
2 Replies
- Csaba73Copper Contributor
Hi Lorenzo - thanks so much for your kind reply. I am completely lost. I have the latest version of Excel as part of O365. I can do Data Import but I do not see how this is Power Query. When I connect to the external .csv I essentially get the same as any normal import into Excel (albeit a bit prettier). Once imported I have the same issue as before ... no clue how to manipulate the columns which contain data in this format
C;C++;HTML/CSS;JavaScript;Matlab;Node.js;PHP;Python;SQL;TypeScript
Your example above seems perfect but after 2 hours of trying to figure out what you did I have given up. Can you pls help and share how you went in your example from 'table 1' to 'By country'? thanks a lotp.s. I am using Excel for Mac (just discovered that it does NOT include Power Query Desktop)