Forum Discussion

venkat_nalla
Copper Contributor
Mar 21, 2023

Filter duplicate records in data flow expression builder

Hi Team,

Please help me filter duplicate records using an expression in the expression builder.

Ex:

In the above data, I want to filter only the highlighted records through the expression builder.

I wrote the expression below, but it does not give the expected results.

countif({ID}>1,{ID})

Thanks in advance,

Kind regards,

VN

  • Gunjan_Kanani
    Copper Contributor
    1) First, add an Aggregate transformation. In Aggregate settings:
    Group by: ID (select your ID column)
    Aggregates:
    1. Column name: count
    2. Expression: count()

    2) Then add a Conditional Split transformation with the following conditions:
    Stream name: Duplicates
    Condition: count>1
    Stream name: Distinct
    Distinct is the default stream, so it automatically receives every row that does not match the Duplicates condition, i.e. the IDs that occur only once. The Duplicates stream receives the IDs that occur more than once.

    After this step, you can branch the flow and add a separate sink for each stream: one for Duplicates and one for Distinct.

    For a better understanding, please see the video linked below:
    https://youtu.be/JK50gtmoUSo
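
The group-count-then-split logic described in this reply can also be sketched outside the data flow designer. The Python below is only an illustration of the idea, not data flow code: the function names, the `name` column, and the sample rows are hypothetical, since the example table from the original question is not available.

```python
from collections import Counter


def aggregate_counts(rows, key="ID"):
    """Mimic the Aggregate step: one output row per key with its row count."""
    counts = Counter(row[key] for row in rows)
    return [{key: value, "count": n} for value, n in counts.items()]


def conditional_split(aggregated):
    """Mimic the Conditional Split step: rows with count > 1 go to Duplicates,
    everything else falls through to the default Distinct stream."""
    duplicates = [r for r in aggregated if r["count"] > 1]
    distinct = [r for r in aggregated if not r["count"] > 1]
    return duplicates, distinct


# Hypothetical sample data (assumption, not taken from the original post).
SAMPLE_ROWS = [
    {"ID": 1, "name": "a"},
    {"ID": 2, "name": "b"},
    {"ID": 2, "name": "c"},
    {"ID": 3, "name": "d"},
]

aggregated = aggregate_counts(SAMPLE_ROWS)
duplicates, distinct = conditional_split(aggregated)
print("Duplicates:", duplicates)
print("Distinct:", distinct)
```

In this sketch the aggregated rows carry only the ID and its count, so the two streams identify which IDs are duplicated rather than returning the full original rows.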
