Forum Discussion
RamonMA1360
Nov 06, 2020 | Copper Contributor
Setting up a new Data Factory
Hi all, I'm not a data expert/novice, but I do love data. I'm looking to up my company's game when it comes to ingesting and storing data from our various sources. We currently have a few on-prem da...
DennesTorres
Nov 17, 2020 | MVP
Hi, RamonMA1360 ,
Your explanation is a bit confusing, but I will try to answer by describing some common patterns. That doesn't mean they will fit your situation, and in my opinion you should hire a consultant to help solve your problem, either me or someone else you may find in the data community.
Starting from the beginning, the very first question is: what are you trying to build? Which of these definitions does it fit?
Production: Regular daily work of the company
Intelligence: Analytics data, never exactly the same, to help with the business plan
If you are trying to build something that fits the definition of "Production", then you have an architecture problem. Your microservices should be planned so that they cover the production needs without requiring this kind of data integration. They can call each other, of course, but they should be loosely coupled and mostly independent. If you need to join the data they produce in order to solve a production problem, they are not correctly designed, and that should be fixed first.
If what you are building is for data intelligence, then, starting from the production data, the first thing you need to build is a data warehouse. This is not a simple data transfer that a copy activity can handle. You need to build a data warehouse model, which is different from the relational model.
Fact and dimension tables, star and snowflake schemas, and de-normalization are just some of the concepts you may need to use to build the data warehouse model. The data will be imported from production using an ETL process. You don't need to call the APIs for the ETL; at least, you shouldn't. You can use Data Factory to read directly from each microservice's database and transform the data for the data warehouse.
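To make the fact/dimension idea a bit more concrete, here is a minimal sketch in Python using the built-in sqlite3 module. It is only an illustration of the modelling pattern, not a Data Factory pipeline: the table names (dim_customer, dim_date, fact_sales), the columns, and the shape of the source rows are all assumptions made up for this example.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny star schema: one fact table and two dimensions.

    All table and column names are hypothetical, chosen for illustration.
    """
    conn.executescript("""
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY,  -- surrogate key
            customer_id  TEXT UNIQUE,          -- business key from the source
            name         TEXT,
            country      TEXT
        );
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,      -- e.g. 20201117
            year     INTEGER,
            month    INTEGER,
            day      INTEGER
        );
        CREATE TABLE fact_sales (
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            date_key     INTEGER REFERENCES dim_date(date_key),
            amount       REAL
        );
    """)


def load(conn, source_rows):
    """Toy ETL step: split each source row into dimension rows and a fact row."""
    for row in source_rows:
        # Insert the customer only if this business key is not there yet.
        conn.execute(
            "INSERT OR IGNORE INTO dim_customer (customer_id, name, country) "
            "VALUES (?, ?, ?)",
            (row["customer_id"], row["name"], row["country"]),
        )
        customer_key = conn.execute(
            "SELECT customer_key FROM dim_customer WHERE customer_id = ?",
            (row["customer_id"],),
        ).fetchone()[0]

        # Derive the date dimension row from an ISO date string (YYYY-MM-DD).
        year, month, day = (int(part) for part in row["order_date"].split("-"))
        date_key = year * 10000 + month * 100 + day
        conn.execute(
            "INSERT OR IGNORE INTO dim_date VALUES (?, ?, ?, ?)",
            (date_key, year, month, day),
        )

        conn.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?)",
            (customer_key, date_key, row["amount"]),
        )
    conn.commit()


def sales_by_country(conn):
    """Typical analytic query: aggregate the fact table by a dimension attribute."""
    return conn.execute("""
        SELECT c.country, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON c.customer_key = f.customer_key
        GROUP BY c.country
        ORDER BY c.country
    """).fetchall()
```

In a real project the load step would be a Data Factory pipeline or data flow reading from the microservice databases, and the schema would live in your warehouse rather than in SQLite. The point is only the shape: descriptive attributes go into dimensions, measures go into the fact table, and analysis joins them.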
If any of the microservices uses unstructured data (should they? SQL or NoSQL, there is usually a structure), you may need to build a data lake, import the files into the data lake, and use a tool such as Synapse SQL On Demand Pool to read the data and integrate it as needed with the data warehouse, which may mean many different things.
So, in summary: your question was a bit confusing, so I'm describing some usual patterns. You should try to fit your case into them, but that doesn't mean your environment will match them precisely. As I said, you may need a consultant.
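As a rough illustration of what "read the data and integrate" can look like for semi-structured files, here is a small Python sketch. It assumes, purely for the example, that the files are JSON Lines (.jsonl) sitting in a local folder that stands in for the data lake; the function names flatten and read_lake_folder are hypothetical. In Azure this step would be done by the query tool itself, not by code like this.

```python
import json
from pathlib import Path


def flatten(record, prefix=""):
    """Turn a nested dict into a flat one, joining nested keys with '_'."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the parent key as a prefix.
            flat.update(flatten(value, prefix=name + "_"))
        else:
            flat[name] = value
    return flat


def read_lake_folder(folder):
    """Read every .jsonl file in a folder and return flat, table-like rows.

    The folder is a local stand-in for a data lake path (an assumption).
    """
    rows = []
    for path in sorted(Path(folder).glob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                if line.strip():  # skip blank lines
                    rows.append(flatten(json.loads(line)))
    return rows
```

Once the records are flat rows like these, they can be loaded into staging tables and joined with the warehouse dimensions in the usual way.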
Kind Regards,
Dennes