Forum Discussion
How to create view joining two table of similar records without duplicating.
- Jul 24, 2024
SAGAR1985 I think I might see where the issue lies.
Your query for the [dbo].[ARC_Skill] table is selecting distinct Personal_IDs, meaning you are counting unique IDs. If there's duplicates, you're not seeing them.select distinct Personal_ID from [dbo].[ARC_Skill] where Calendar_day = '2024-05-31'; --42678 rows
If you do the following queries, I believe you'll find that you have the same amount of duplicates in both your tables:Select Personal_ID, /* The personal_ID */ Calendar_day, /* The calendar day */ Count(Personal_ID) as NumberOfDuplicates /* Number of rows with this given Personal_ID and Calendar_day */ from [dbo].[ARC_SKILL] where Calendar_day = '2024-05-31' Group By Personal_ID, Calendar_day /* Grouping enables counting rows pr. Personal_ID */ Order by NumberOfDuplicates Desc /* Largest number on top */ /* You may have to use 'order by COUNT(Personal_ID) desc' instead */ Select Personal_ID, /* The personal_ID */ Calendar_day, /* The calendar day */ Count(Personal_ID) as NumberOfDuplicates /* Number of rows with this given Personal_ID and Calendar_day */ from [dbo].[Skill_Analysis] where Calendar_day = '2024-05-31' Group By Personal_ID, Calendar_day /* Grouping enables counting rows pr. Personal_ID */ Order by NumberOfDuplicates Desc /* Largest number on top */ /* You may have to use 'order by COUNT(Personal_ID) desc' instead */
In your counting table at the end, we see the columnsRow Label View Count Actual RL1 x^2 x RL2 y^2 y
This means you've accidentally gotten a cartesian cross product where your tables contain the same amount of duplicates for the combination of [Personal_ID] and [Calendar_day].
For RL '51512295', that means all 11 rows with a given [Personal_ID] and [Calendar_day] in [ARC_Skill] has been matched separately with all 11 rows with the same [Personal_ID] and [Calendar_day] in [Skill_Analysis]. 11*11 = 121.
----
In conclusion, [Personal_ID] and [Calendar_day] are not sufficient to uniquely match between these tables.
I hope there's another column you can add in the join predicate in order to get the expected result.
SAGAR1985 I think I might see where the issue lies.
Your query for the [dbo].[ARC_Skill] table is selecting distinct Personal_IDs, meaning you are counting unique IDs. If there's duplicates, you're not seeing them.
select distinct Personal_ID from [dbo].[ARC_Skill] where Calendar_day = '2024-05-31';
--42678 rows
If you do the following queries, I believe you'll find that you have the same amount of duplicates in both your tables:
Select
Personal_ID, /* The personal_ID */
Calendar_day, /* The calendar day */
Count(Personal_ID) as NumberOfDuplicates /* Number of rows with this given Personal_ID and Calendar_day */
from [dbo].[ARC_SKILL]
where Calendar_day = '2024-05-31'
Group By Personal_ID, Calendar_day /* Grouping enables counting rows pr. Personal_ID */
Order by NumberOfDuplicates Desc /* Largest number on top */
/* You may have to use 'order by COUNT(Personal_ID) desc' instead */
Select
Personal_ID, /* The personal_ID */
Calendar_day, /* The calendar day */
Count(Personal_ID) as NumberOfDuplicates /* Number of rows with this given Personal_ID and Calendar_day */
from [dbo].[Skill_Analysis]
where Calendar_day = '2024-05-31'
Group By Personal_ID, Calendar_day /* Grouping enables counting rows pr. Personal_ID */
Order by NumberOfDuplicates Desc /* Largest number on top */
/* You may have to use 'order by COUNT(Personal_ID) desc' instead */
In your counting table at the end, we see the columns
| Row Label | View Count | Actual |
| RL1 | x^2 | x |
| RL2 | y^2 | y |
This means you've accidentally gotten a cartesian cross product where your tables contain the same amount of duplicates for the combination of [Personal_ID] and [Calendar_day].
For RL '51512295', that means all 11 rows with a given [Personal_ID] and [Calendar_day] in [ARC_Skill] has been matched separately with all 11 rows with the same [Personal_ID] and [Calendar_day] in [Skill_Analysis]. 11*11 = 121.
----
In conclusion, [Personal_ID] and [Calendar_day] are not sufficient to uniquely match between these tables.
I hope there's another column you can add in the join predicate in order to get the expected result.
Thanks a lot for your quick support and insight to resolve. It really helped and after adding Project_Code column into that view, the count is really checked and minor variation is coming. But still way better than the previous one. Thanks for all your effort. Really appreciated.
Regards,
Sagar