Timeseries resampling with Data Factory


Dear Community,
 
I am currently working on a data analytics project and face the challenge of handling time series data within Azure.
 
We have several thousand sensors that report their values to a control unit, which writes the data
as JSON files into an Azure Data Lake.

We're also considering writing this data directly to a database.
The sensors deliver their data at different intervals, but within our data platform we want one
uniform sampling rate across all sensors. We therefore need to upsample (interpolate) the signal if its frequency is too low, or downsample (average) it if the frequency is too high.
We want to keep this as simple as possible and thought about a scheduled Data Factory job that runs every 5 minutes on the data that has newly arrived at the source. The actual implementation can be in C#, Python, R or even JavaScript.
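
To make the resampling requirement concrete, here is a minimal sketch in Python (one of the candidate languages) using only the standard library. The function name `resample_uniform`, the `(timestamp, value)` tuple layout and the choice to align the grid to the first reading are assumptions made for illustration, not part of our actual setup. Buckets with several readings are averaged, and empty buckets between two filled ones are linearly interpolated.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Reading = Tuple[datetime, float]  # hypothetical layout: (timestamp, sensor value)


def resample_uniform(readings: List[Reading], step: timedelta) -> List[Reading]:
    """Resample irregular readings onto a uniform grid with spacing `step`.

    Assumption: the grid starts at the earliest reading, and each output
    timestamp marks the start of its bucket.
    """
    if not readings:
        return []
    readings = sorted(readings)
    start = readings[0][0]

    # Downsample: group readings into fixed-width buckets and average each one.
    buckets: Dict[int, List[float]] = {}
    for ts, value in readings:
        buckets.setdefault(int((ts - start) / step), []).append(value)
    means = {i: sum(v) / len(v) for i, v in buckets.items()}

    # Upsample: fill empty buckets by linear interpolation between the
    # nearest filled buckets on either side.
    filled = sorted(means)
    result: List[Reading] = []
    for left, right in zip(filled, filled[1:]):
        for i in range(left, right):
            frac = (i - left) / (right - left)
            value = means[left] + frac * (means[right] - means[left])
            result.append((start + i * step, value))
    last = filled[-1]
    result.append((start + last * step, means[last]))
    return result
```

A sensor reporting every 4 minutes would be interpolated onto a 1-minute grid, while a sensor reporting every 20 seconds would be averaged per minute. If pandas is available, `Series.resample(...).mean()` followed by `interpolate()` should cover the same two cases with less code.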
 
From what I've learnt about Data Factory so far, there are different ways to do that:
1) Use HDInsight
This looks like quite a lot of work to set up and get familiar with, so I am looking for an easier option.
 
2) Use a U-SQL query with C#, Python or R
As far as I understand, this is only possible via Data Lake Analytics, so only a Data Lake can serve as the input source, correct?
 
3) Create a custom activity
This is only available in Data Factory v2, which is currently in preview and not available in our preferred location, and its integration into Visual Studio 2017 is almost nonexistent. Moreover, the setup is pretty complex altogether, including the batch processing.
 
 
My question now is:
Is there any other possible setup that I have overlooked and that you would suggest? And if not, which of these solutions would you recommend?
 
 
Thank you in advance for any input on this.

0 Replies