Forum Discussion
venkat_nalla
Mar 21, 2023 | Copper Contributor
Filter duplicate records in data flow expression builder
Hi Team, please help with filtering duplicate records using an expression in the expression builder. Ex: in the above data, I wanted to filter only the highlighted records through the ...
Gunjan_Kanani
Apr 08, 2023 | Copper Contributor
1) First, use the Aggregate transformation. Aggregate settings:
Group By: - ID (select your ID column)
Aggregates: -
1. Column name: count
2. Expression: count()
2) Then use the Conditional Split transformation with the following condition:
Stream name: Duplicates
Condition: count>1
Stream name: Distinct
Distinct is the default stream, so it automatically receives every row that does not match the Duplicates condition (the distinct records), while the Duplicates stream receives the duplicated ones.
After this step, you can branch and add a sink for each output, one for Duplicates and one for Distinct.
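If it helps to see the logic outside the data flow designer, here is a rough Python sketch of the same two steps: count rows per ID, then split on count>1. It is only an illustration and not data flow script; the function name split_by_count, the sample rows, and the "name" column are made up for the example. As far as I understand, the Aggregate output carries only the group-by column and the count, so the sketch returns ID-to-count mappings rather than full rows.

```python
from collections import Counter


def split_by_count(rows, key="ID"):
    """Group rows by `key`, count them, and split the keys on count > 1.

    Hypothetical helper that mirrors the Aggregate + Conditional Split steps.
    """
    # Step 1 (Aggregate): group by the key column and count rows per group.
    counts = Counter(row[key] for row in rows)

    # Step 2 (Conditional Split): count > 1 goes to "Duplicates";
    # everything else falls through to the default "Distinct" stream.
    duplicates = {k: n for k, n in counts.items() if n > 1}
    distinct = {k: n for k, n in counts.items() if n <= 1}
    return duplicates, distinct


# Made-up sample data: ID 1 appears twice, ID 2 once.
rows = [
    {"ID": 1, "name": "a"},
    {"ID": 2, "name": "b"},
    {"ID": 1, "name": "c"},
]

duplicates, distinct = split_by_count(rows)
print("Duplicates:", duplicates)
print("Distinct:", distinct)
```

Each returned mapping corresponds to one output stream, so each could be written to its own sink.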
For a better understanding, please see the video linked below:
https://youtu.be/JK50gtmoUSo