Split Dataset

Hello Tech Community,

I want to split a data set that I have evenly, 50/50, but with one particular column distributed equally across the two halves. For instance, this column is a code column that contains the numbers 1 to 4. I want each half of the 50/50 split to have a similar distribution of that code column.

 

Thanks for the help!

1 Reply

Filter the first column to include the values less than or equal to the median, and the second column to include the values greater than or equal to the median. This should split the list into two similar columns.
=FILTER(D4:D7;C4:C7<=MEDIAN(C4:C7))

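Here C4:C7 is your list of codes 1 to 4 and D4:D7 is the data to return. If your regional settings use commas as the argument separator, replace the semicolons with commas.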

- Geir
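
If the goal is for both halves to contain a similar mix of the codes 1 to 4, one alternative to the median filter is to alternate rows within each code. What follows is only a minimal sketch of that different technique, not the method from the reply above. It assumes the codes are in C4:C7 and the data in D4:D7, as in the formula above, and that column E is free to use as a helper column (the helper column and its location are my assumption, not something from the original question). Extend the ranges to cover your real data.

```
Helper flag, entered in E4 and filled down to E7:
=MOD(COUNTIF(C$4:C4;C4)+C4;2)

First half (rows flagged 1):
=FILTER(C4:D7;E4:E7=1)

Second half (rows flagged 0):
=FILTER(C4:D7;E4:E7=0)
```

COUNTIF(C$4:C4;C4) is a running count of how many times the current row's code has appeared so far, so taking it modulo 2 sends the occurrences of each code alternately to one half and the other. Adding the code value itself means odd and even codes start on opposite halves, which keeps the two halves close in size when a code appears an odd number of times. The halves can still differ by a row or two, and FILTER returns a #CALC! error if no rows match, so this is best tried on a data set with several rows per code.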