Forum Discussion
Deyan_Petrov
Jan 22, 2024Copper Contributor
Materialized view with inner join over 2 tables skips data
Hi,
I have 2 tables which are ingested from event hubs with standard injection policy (10 minutes etc).
I have created a materialized view which has roughly the following definition:
.create materialized-view TransactionsWithCustDataMatView on table Transactions
{
Transactions
| join kind=inner Customers on CustomerId
| where CreatedOn > LastModifiedOn1
| summarize arg_max(LastModifiedOn1, *) by TransactionId
}
The problem is that sometimes a customer registers and within 1 minute also performs a transactions. However, since Customers and Transactions are different tables, each with its own ingestion (batching), it happens that sometimes the transaction of a new customer is ingested before the corresponding customer is ingested, and then the above materialized view "skips" the transaction due to the inner join.
Is there a way how to control/sync the ingestions of the 2 source table, or schedule the materialized view to run with some delay of 10-20 minutes (when ingestions to both tables would have happened)?
Br,
Deyan
Deyan
4 Replies
- Deyan_PetrovCopper ContributorNote, the column CreatedOn belongs to Transactions table, and the column LastModifiedOn1 belongs to the Customers table. The purpose of this condition is to select the last data record for a customer at the time the transaction was executed (new customer data records may be written also after the transaction has been executed, but we need the customer data *at* the time the transaction was executed).
- wernerzirkelBrass Contributor
Deyan_Petrov if it was my decision, I would try to avoid having this join in the materialized view definition. Instead, I would prepare a base table thich contains information of both tables. I would .append new data every 20 minutes in an ADF task from both tables. The materialized view would just refer to this base table.
The basetable could have a retention policy of a few days, so you would not keep the data twice.- Deyan_PetrovCopper Contributorwernerzirkel, thanks for the advice but how do you "prepare a base table thich contains information of both tables"? I have 2 different event hubs (Customers and Transactions) which are mapped via ingestion mapping each to a separate table. How do you map to different event hubs to the same table?