Forum Discussion
pulled datasets from CTEs but got stuck with joining them together
Hi - Thanks for your time, I would like to give the scenario clearly and kindly request for some ideas on the best route..
I am using CTEs to pull data ... say for example data that belongs to Parent Incident and this parent incident dataset KEY column has duplicates because some other columns in that dataset has more than 1 value for example
With CTE Parent Incident AS
(
),
Test1 AS
(
),
Test2 AS
(
),
Final_Parent_Data AS
(
data from the 3 CTEs above joined
)
SELECT * FROM Final_Parent_Data
say from the 3 CTEs above I have the parent incident dataset that i need which also has some other columns that are causing the KEY column values to be repeating
GroupID GroupName IncidentID CustomerImpacted
G123 GlobalSupport INC123 Customer111
G123 GlobalSupport INC123 Customer222
G345 North America INC345 Customer333
G345 North America INC345 Customer444
This is my parent kind of dataset now i have to get the child dataset which is based on some CTES for ex; how is the child dataset linked to the parent dataset based on IncidentID
CTE Child_Dataset AS
(
),
Test888 AS
(
),
Test999 AS
(
),
Final_Child_Dataset AS
(
CTEs joined
)
select * from Final_Child_Dataset
IncidentID Incdent_Task_ID Incident Task Name
INC123 ITASK1 TaskName1
INC123 ITASK2 TaskName2
INC123 ITASK3 TaskName3
INC123 ITASK4 TaskName4
INC123 ITASK5 TaskName5
INC345 ITASK888 TaskName888
INC345 ITASK999 TaskName999
INC345 ITASK111 TaskName111
Now I have to do Full Outer JOIN to get all the records from both sides (Why? to get anything that is not mapped) - I just gave an example simple to understand
I have to use the IncidentID to join - but as we can see this will cause the cartesian join
4 * 8 = 32 records
what are the best options to get the cartesian join removed -
Note: This is just an example of what i have been working on - my dataset is in this format, and iam using the CTE's but the final dataset that comes out of the CTE is exactly like above - the KEY field in both places is M:M - thanks for your time, what could be best options to remove the cartesian join that is caused by the join
Select
A.*m
B.*
FROM Final_Parent_Data A FULL OUTER JOIN Final_Child_Dataset B ON A.IncidentID=B.IncidentID
- rpreddiz123Copper Contributorany suggestions please..just looking for the ideas
- olafhelperBronze ContributorWe don't have your database to test anything.
lease post table design as DDL, some sample data as DML statement and the expected result.- rpreddiz123Copper Contributor
Hi olafhelper - Thanks for your time. I understand that it is very difficult to without looking at the data. It is kind of very messy since the CTEs are from multiple tables and finally these CTEs need to combined together - If we look at this dataset below - it is not a straight pull from a single table - this is the final output of the first dataset
GroupID GroupName IncidentID CustomerImpacted
G123 GlobalSupport INC123 Customer111
G123 GlobalSupport INC123 Customer222
G345 North America INC345 Customer333
G345 North America INC345 Customer444if we look at these rows in which the IncidentID is the KEy column we have multple customers, multiple groups - correct?
what do we need to do next? there is another dataset again this dataset is from the CTE which is from few tables - this includes IncidentID which is like parent and it can have multiple rows
cardinality between these 2 dataset - M: M - why?
the same IncidentID in both sides is repeating for example
dataset 1
4 rows for 1 IncidentID INC123 because it has different groups, different customers,
when i go to dataset 2 INC123 can have 50 rows why? INC123 is the parent for few records like
INC123 is the parent of INCHILD1, INCCHILD2, INCCHILD3, INCCHILD4 etc
what is happening now? cartesian join - 4 * 50 = 200
one option i know is
1).Take out the duplicates in dataset1 - so we get 1 row for INC123
does the bridge table or intermediate table logic works here?