Dec 04 2020 04:04 AM
I've tried to be very clear about what is being done and what has been tried, it's not a simple question so thankyou to anyone who is up for the challenge to help me tackle this problem!
I'm trying to count data across collumns and rows.
Here's the scenario:
The organisation I'm working with is a non-profit and wants to analyse their data. They can't do this from within their CRM system, so have to download it into Excel and then transform and analyse the data as follows.
At the moment, a full activities report from their CRM of ALL activities between two dates is downloaded. (For an idea of scale, this is about 5,000 rows and 90 collumns) (I've included a screenshot of the sample data for Outcome 1 below).
They are trying to then count the instances when certain Outcomes (there are about 50 Outcomes) are acheived. They need to count the number of activities (rows) AND the number of beneficiaries (unique values in one of the collumns) who have acheived that outcome.
For example: Outcome 1 is acheived when: Advice Given = Benefits OR Referred to = (Benefits OR housing support) OR Appointment attended = (GP OR Housing Support)
At the moment, the raw data is transformed by:
As you may be able to guess, for all 50 Outcomes (each one looks across 3 collums) this took a long time to build and, with all the collumns, it's really tricky to navigate. If anything changes in the raw data (e.g. a new collumn is added) it takes ages to re-structure the collumns to do that. The have asked me to look if I can do a process to make this more automated.
Solutions tried:
Attached is a sample of the raw data how it appears in Excel (bear in mind I've only included data for one Outcome out of the 50!)
Activity Name | Date | Beneficiary Name | Advice Given | Referred to | Appointment Attended |
Beneficiary meeting | 03/01/2020 | Jane Smith | Benefits, Housing support, GP | Benefits, Housing support, GP | |
Beneficiary meeting | 04/01/2020 | Shamyla Aslam | Benefits, GP | ||
Beneficiary meeting | 06/02/2020 | Jane Smith | Housing Support | Housing support | |
Beneficiary meeting | 04/03/2020 | Shamyla Aslam | Benefits, Housing support, GP | ||
Beneficiary meeting | 05/05/2020 | Jane Smith | Benefits, Housing support, GP | Benefits |
Here;s the criteria required for one of the outcomes:
Outcome achieved when: | |
Field | Field includes: |
Advice Given | Benefits |
Referred to | Benefits, Housing Support |
Appointment Attended | GP, Housing Support |
Required to Count | |
Number of beneficiaries who have achieved the outcome (unique count) | |
Number of activities where the outcome was achieved |
If you've got this far, thankyou **SO MUCH** for taking a look and I hope you enjoy the challenge of helping me to find alternative solutions!
Chloe
Dec 04 2020 05:48 AM
OK. I realize the actual data includes real names of real people, so although I'm going to ask you to attach an actual file--the small representations in your initial query just don't suffice to grasp it all--I'm not asking for all the real data.
Take a representative portion of the real data, all columns, for maybe 10-20 clients. Replace real names with those of Disney or Star Wars characters. Replace any other identifiable data (addresses, SSNs, emails) with nonsense data.
Then post that representative data set....and we'll see if we can make sense of it.
Dec 04 2020 11:12 PM
Dec 05 2020 05:18 AM - edited Dec 05 2020 06:53 AM
@ChloeCI I must admit that the description of your problem is so elaborate that it becomes hard to follow what you actually want. But, perhaps the attached file is a step towards a solution. I've taken your raw data into a structured table (blue), loaded it into "Get & Transform Date" a.k.a. Power Query, and added columns for each of the outcomes where its originating column contains either of the relevant key-words. Then I cleaned it up a bit. The outcome is in the green table with some totals to calculate what I believe you asked for.
Note that I have used a relatively new functions FILTER and UNIQUE to count the unique number of beneficiaries in each column. Also added a record for "John Doe" to test the model.
If this is not what you had in mind, it would be a good idea if you could manually create a table with the desired outcome for your example and share it here.
File attached.
Dec 05 2020 05:39 AM
You're a good man, with a lot more patience...to take the time to create your own sample table here! Well done.
Dec 05 2020 06:00 AM
@mathetes Copy/paste does wonders. Creating the blue table took about 5 seconds :)