Forum Discussion
Materialized view with inner join over 2 tables skips data
Deyan_Petrov if it was my decision, I would try to avoid having this join in the materialized view definition. Instead, I would prepare a base table thich contains information of both tables. I would .append new data every 20 minutes in an ADF task from both tables. The materialized view would just refer to this base table.
The basetable could have a retention policy of a few days, so you would not keep the data twice.
- wernerzirkelJul 24, 2024Brass Contributor
Deyan_Petrov I like the concept of getting in data without any change before handling it, because the ingest is always a somewhat critical phase.
So I would start with two incoming tables: data1_raw and data2_raw. Both tables would forward ingested to a joint basetable in a batch mode:
.append basetable
<| data1_raw
| extend ingtime = ingestion_time()
| where ingtime > lastingest //lastingest could be a "watermark" variable in ADF
| project col1, col2,... //use common columns in your basetable
Now you have all the data in your basetable. Now you can put a materialized view on your base table which de-duplicates data and just keeps the columns you need.
Note: maybe it's also possible to use update policies instead of the ADF batch mode... I did not consider it