Counting rows that contain certain data across 3 collumns

Copper Contributor

I've tried to be very clear about what is being done and what has been tried, it's not a simple question so thankyou to anyone who is up for the challenge to help me tackle this problem!

 

I'm trying to count data across collumns and rows.

 

Here's the scenario:
The organisation I'm working with is a non-profit and wants to analyse their data. They can't do this from within their CRM system, so have to download it into Excel and then transform and analyse the data as follows.

 

At the moment, a full activities report from their CRM of ALL activities between two dates is downloaded. (For an idea of scale, this is about 5,000 rows and 90 collumns) (I've included a screenshot of the sample data for Outcome 1 below).

 

They are trying to then count the instances when certain Outcomes (there are about 50 Outcomes) are acheived. They need to count the number of activities (rows) AND the number of beneficiaries (unique values in one of the collumns) who have acheived that outcome.


For example: Outcome 1 is acheived when: Advice Given = Benefits OR Referred to = (Benefits OR housing support) OR Appointment attended = (GP OR Housing Support)

 

At the moment, the raw data is transformed by:

  • Creating a new collumn for each Outcome that needs to be reported on. This looks for the data that meets the outcomes criteria (e.g. new collumn created that searches in the 'Referred to' collumn for 'Benefits' and marks that activities with a 1 if it exists. Then another collumn is created for 'Referred to' collumn for 'Housing Support' and marks that row with a 1 if it exists etc.).
  • A further collumn (called Outcome 1) is then created to look across the new collumns and marks with a 1 if, for that activity, one of the criteria is met.
  • A seperate sheet is then created to sum the outcomes (e.g. Number of activities with Outcome 1 achieved AND the number of Beneficiary Names with Outcome 1 achieved)

As you may be able to guess, for all 50 Outcomes (each one looks across 3 collums) this took a long time to build and, with all the collumns, it's really tricky to navigate. If anything changes in the raw data (e.g. a new collumn is added) it takes ages to re-structure the collumns to do that. The have asked me to look if I can do a process to make this more automated.

Solutions tried:

  • Building a Macros to do the data transformation. LIMITATIONS: the formulae would have to be hardcoded into the Macros which would mean if any additional fields are added or removed to the raw Civi data, the formulae would have to be re-edited in the Macros. At the moment don't have the skills to do this, so this would be quite unsustainable.
  • Creating Pivot tables for each of the outcomes without transforming the raw data first. LIMITATIONS. I need to create 3 pivot tables to look across the 3 collumns (the pivot table filter is an AND rather than an OR, as far as I'm aware so I can't do it in one pivot table), then use a lookup to see which individuals have met at least one of the field's criteria. This enables me to count Beneficiaries, but double counts the activities (e.g. if 'Benefits' appears in more than one collumn on the same row it counts it twice).
  • Countif on the raw data. I can make this work using a formula similar to =COUNTIF('Raw Civi Data'!$H$2:$H$4500,"Benefits") this counts activities but not unique contacts. I can't find a formula to help me count contacts. If anyone can suggest one that would be the BIGGEST help!!

Attached is a sample of the raw data how it appears in Excel (bear in mind I've only included data for one Outcome out of the 50!)

 

Activity NameDateBeneficiary NameAdvice GivenReferred toAppointment Attended
Beneficiary meeting03/01/2020Jane SmithBenefits, Housing support, GPBenefits, Housing support, GP 
Beneficiary meeting04/01/2020Shamyla Aslam  Benefits, GP
Beneficiary meeting06/02/2020Jane Smith Housing SupportHousing support
Beneficiary meeting04/03/2020Shamyla Aslam  Benefits, Housing support, GP
Beneficiary meeting05/05/2020Jane SmithBenefits, Housing support, GPBenefits 

 

Here;s the criteria required for one of the outcomes:

Outcome achieved when: 
FieldField includes:
Advice GivenBenefits
Referred toBenefits, Housing Support
Appointment AttendedGP, Housing Support
  
Required to Count 
Number of beneficiaries who have achieved the outcome (unique count)
Number of activities where the outcome was achieved


If you've got this far, thankyou **SO MUCH** for taking a look and I hope you enjoy the challenge of helping me to find alternative solutions!

Chloe

 

5 Replies

@ChloeCI 

 

OK. I realize the actual data includes real names of real people, so although I'm going to ask you to attach an actual file--the small representations in your initial query just don't suffice to grasp it all--I'm not asking for all the real data.

 

Take a representative portion of the real data, all columns, for maybe 10-20 clients. Replace real names with those of Disney or Star Wars characters. Replace any other identifiable data (addresses, SSNs, emails) with nonsense data.

 

Then post that representative data set....and we'll see if we can make sense of it.

Number of beneficiaries who have achieved the outcome (unique count) - Means ??

@ChloeCI I must admit that the description of your problem is so elaborate that it becomes hard to follow what you actually want. But, perhaps the attached file is a step towards a solution. I've taken your raw data into a structured table (blue), loaded it into "Get & Transform Date" a.k.a. Power Query, and added columns for each of the outcomes where its originating column contains either of the relevant key-words. Then I cleaned it up a bit. The outcome is in the green table with some totals to calculate what I believe you asked for.

 

Note that I have used a relatively new functions FILTER and UNIQUE to count the unique number of beneficiaries in each column. Also added a record for "John Doe" to test the model.

Screenshot 2020-12-05 at 14.17.19.png

 

If this is not what you had in mind, it would be a good idea if you could manually create a table with the desired outcome for your example and share it here.

 

File attached.

 

@Riny_van_Eekelen 

 

You're a good man, with a lot more patience...to take the time to create your own sample table here! Well done.

@mathetes Copy/paste does wonders. Creating the blue table took about 5 seconds :)