Restarting Formulas at Specified Data Points

Rich_100 · ‎Jun 22 2021

Hi,

I have a very large data set and need to move all relevant data onto the same row to complete the analysis I am working on. After playing around with it for a while, I have been unable to find a way to get the equations to restart when they reach a set reference cell (see below). This is a two-part problem, as follows:

Problem 1

I am trying to move everything in columns C and D to a single row (in this example, rows 2, 11, 16 and 30) and want the formula to reset every time it comes across a “1” reference (in this example, cells B2, B11, B16, B30), as each customer’s data set varies in length.

Problem 2

Similar to problem 1, I need the COUNTIF function to restart the equation every time it comes across a “1” reference (in this example, cells B2, B11, B16, B30), as, again, each customer’s data set varies in length.
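For illustration only, the "restart at every 1" logic behind both problems can be sketched outside Excel. This is a minimal Python sketch, not a worksheet formula: the sample rows, the `split_at_markers` helper and the `(marker, value)` layout are hypothetical stand-ins for column B and a data column.

```python
from collections import Counter

def split_at_markers(rows, is_start):
    """Split rows into blocks; a new block begins wherever is_start(row) is true."""
    blocks = []
    for row in rows:
        if is_start(row) or not blocks:
            blocks.append([])
        blocks[-1].append(row)
    return blocks

# Hypothetical sample: (marker, value) pairs; a marker of 1 opens a new customer block.
rows = [(1, "a"), (2, "b"), (3, "a"), (1, "c"), (2, "c")]
blocks = split_at_markers(rows, lambda r: r[0] == 1)

# Problem 1: all values of a block gathered onto one "row" per customer.
flattened = [[value for _, value in block] for block in blocks]

# Problem 2: a count that restarts with every block, similar in spirit to a COUNTIF per customer.
counts = [Counter(value for _, value in block) for block in blocks]
```

Each block covers one customer regardless of how many rows that customer has, which is the behaviour the formulas are meant to reproduce.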

Current

Desired Outcome (yellow cells, with all rows below appearing blank)

I have tried a bunch of different things and have hit a wall, so any help would be greatly appreciated!

Additionally, I am trying to complete this without VBA code at this stage; however, I am open to VBA or macros if the solution needs to go there.

General Information

Device platform: Windows 10
Excel Version: Office 365

Thanks in advance!

Riny_van_Eekelen · ‎Jun 22 2021

@Rich_100 Since you begin by mentioning "I have a very large data set", I'd suggest you look into Power Query (a.k.a. Get & Transform Data in MS365). But it would be helpful if you could upload a file containing some of your real data (replace any data that identifies real people or any other confidential information). 15 customers or so would do.

Rich_100 · ‎Jun 23 2021

Hi @Riny_van_Eekelen,

Thanks for getting back to me and suggesting the use of Power Query. I’m new to this feature, so just looking into how to use it now.

I've had to sign an NDA with this client, so don't think I'll be able to use anyone’s real data. In place of this, I have drafted some dummy data in the same format. Would this be OK for what you need?

To confirm, the data set is a few hundred thousand cells and the yellow cells in the attached Excel workbook are for the data I am hoping to calculate in the following format:

Age (column C) and gender (column D) – pull data from the corresponding columns to the top line for each client
Active member (column H)
- An active member is someone who has paid their membership within the last 2 months
For columns I – N, I just need to get a count of occurrences from the relevant columns for each client
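As a rough sketch of the two rules above (not the workbook's actual formulas), the "active member" test and the per-client counts could look like this in Python. The function names, the sample dates and the choice to read "last 2 months" as two calendar months back from a reference date are all assumptions.

```python
import calendar
from collections import Counter
from datetime import date

def minus_months(d, n):
    """Return the date n calendar months before d, clamping the day to the month's length."""
    total = d.year * 12 + (d.month - 1) - n
    year, month0 = divmod(total, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def is_active(payment_dates, today, months=2):
    """True if at least one payment falls between the cutoff and today (inclusive)."""
    cutoff = minus_months(today, months)
    return any(cutoff <= paid <= today for paid in payment_dates)

# Hypothetical data for one client.
today = date(2021, 6, 23)
payments = [date(2021, 3, 1), date(2021, 5, 1)]
active = is_active(payments, today)

# Count of occurrences per category, as wanted for columns I to N.
occurrences = Counter(["Gym", "Pool", "Gym"])
```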

Thank you for any advice or guidance you can provide, it’s greatly appreciated. Please let me know if you have any questions.

I’m going to have a play around with Power Query now and see if I can make any headway.

Riny_van_Eekelen · ‎Jun 23 2021

@Rich_100 Thanks for uploading. Very helpful. Not able to look at this right now. Welcoming others to jump in.

Rich_100 · ‎Jun 23 2021

All good, thanks. I need to deliver this work to the client by Monday 28th of June, so will continue to play around with Power Query and see where I get to. Any input, tips and tricks are definitely welcome!

Riny_van_Eekelen · ‎Jun 24 2021

@Rich_100 Forgive me for challenging your reporting request, but perhaps you could consider a more condensed approach. Rather than creating a large, rather unstructured list with subtotals of some kind, why not create one or more reports on the basis of cleaned and unpivoted data? A rough example is included in the attached workbook. I chose to load an intermediate table into Excel, just to demonstrate. In reality, you would probably load it into the Data Model and work from there.

Rich_100 · ‎Jun 24 2021

Thanks for the response @Riny_van_Eekelen, the data looks great! I’ve been asked to help on a project, which is outside of my usual area of expertise, so any advice or recommendations for the most efficient way to sort and display this data is definitely welcome.

Currently, I am learning Power Query as I go, which is a fairly steep learning curve. I’ve tried to reverse engineer what you’ve done and have made some headway, but can’t seem to get my outputs to look as clean as yours. Would you mind detailing your steps, please?

Riny_van_Eekelen · Jun 24 2021

@Rich_100 Okay, let's give it a try. You can follow the applied steps in my file. No need to reverse engineer them. But, ignore the query "Table1". I forgot to delete it.

Step 1 was to create a separate table with just the Customer, Age and Gender.

Step 2 is to go back to the same source (query "Table1 (2)") and remove some unwanted columns. Then merge the query from step 1 with the cleaned-up table in step 2. This will add the Age and Gender to each Customer record. Reorder the columns and then, probably the most important step, select the customer, age and gender columns and choose "Unpivot other columns". You then get a long list of "records" from which you can filter out the date fields.

Now you can merge the "Attribute" and "Value" columns, separated by a colon. Close and load to a table, to create the output that you see in columns M:P.

Step 3 is to create a pivot table from that table (i.e. the end result from Step 2) in order to give you the condensed view per customer.
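To make the shape of the three steps easier to follow, here is a small Python sketch of the same idea. It is not the Power Query code from the attached file: the column names other than Customer, Age and Gender, the sample rows and both helper functions are hypothetical.

```python
from collections import Counter

ID_COLS = ("Customer", "Age", "Gender")

def build_lookup(rows):
    """Step 1: one (Age, Gender) entry per customer, taken from the row where Age is filled in."""
    return {r["Customer"]: (r["Age"], r["Gender"]) for r in rows if r.get("Age") is not None}

def unpivot(rows, lookup):
    """Step 2: attach Age/Gender to every row, then turn the other columns into Attribute:Value items."""
    out = []
    for r in rows:
        age, gender = lookup[r["Customer"]]
        for attr, value in r.items():
            if attr in ID_COLS or value is None:
                continue  # keep the id columns out of the long list and skip empty cells
            out.append((r["Customer"], age, gender, f"{attr}:{value}"))
    return out

# Hypothetical source rows: Age and Gender only appear on each customer's first row.
rows = [
    {"Customer": "C001", "Age": 34, "Gender": "F", "Class": "Yoga", "Plan": "Gold"},
    {"Customer": "C001", "Age": None, "Gender": None, "Class": "Yoga", "Plan": None},
    {"Customer": "C002", "Age": 51, "Gender": "M", "Class": "Spin", "Plan": "Basic"},
]
long_rows = unpivot(rows, build_lookup(rows))

# Step 3: the pivot, here simply a count per customer, age, gender and Attribute:Value label.
pivot = Counter(long_rows)
```

The long list in `long_rows` plays the role of the intermediate table, and `pivot` corresponds to the condensed view per customer.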

As you noticed, PQ has quite a steep learning curve. But once you get over the first hurdles, you'll love it. Good luck!

Rich_100 · ‎Jun 24 2021

Perfect, thank you @Riny_van_Eekelen!

Apologies if this is a silly question, but how have you got the pivot table into the format where it is showing the age and gender on the same row, as I can only get it to display as seen in the attached document?

Riny_van_Eekelen · ‎Jun 24 2021

@Rich_100 Click anywhere inside the pivot table. On the "Design" ribbon, select "Report layout". First select "Show in tabular form" and then "Repeat all item labels". Still in the Design tab, select the "Subtotals" button and check "Don't show subtotals".

Rich_100 · ‎Jun 25 2021

Perfect, thanks! You're an absolute life saver @Riny_van_Eekelen. Thank you for all the help and taking the time, it's greatly appreciated!

Riny_van_Eekelen · ‎Jun 25 2021

@Rich_100 Most welcome!

