Filtering poorly formatted source data...

Hi,

I have a weirdly arranged data set that I am importing into Excel via Power Query.

My source has the structure I have indicated in the attached example (although the example is greatly simplified), i.e. the columns are divided by sub-headers (highlighted in the example), and the values that I am interested in – the Regional data – repeat multiple times in each subsection.

What I need are Subsection:Region:Value associations, i.e.

1A : Region1 : Value1

1A : Region2 : Value2

1A : Region3 : Value3

1B : Region1 : Value4

1B : Region2 : Value5

etc.

(They don't have to be represented like that neccessarily, but that's the data structure I need).

Note that the subsections (1A, 1B, 1C) are not necessarily the same length: some subsections might be missing data on one or more sites, meaning that the row count for each section is unpredictable… (so, I can’t just use Group by number of rows, in the Power Query import, for instance). This could also change between imports, so any solution has to be able to adapt to variable row counts.

The way I initially did it was to remove the 2 Site columns altogether in the Power Query import, leaving just Region and RegData, and then just call UNIQUE() on the resulting import table, but that only works if no region has the same value in 2 different subsections, which could happen, and would break downstream calculations if it did (because my UNIQUE() function would return a range missing a row). It would also be a problem if, by chance, one subsection had no values for a particular Region. I need the solution to return a null value in that case (e.g. 1B : Region2 : NULL), or in other words, the solution has to return a 3x3 range, even if some values are empty, and/or some values are the same...

I hope that makes sense.

Many thanks for any suggestions.

FilterChallenge.xlsx13 KB

BI & Data Analysis

excel

Formulas and Functions

power bi