Data Architecture Blog

DevOps for Data Science – Part 9 - Application Performance Monitoring

Microsoft

Jul 27, 2021

In this series on DevOps for Data Science, I’ve explained the concept of a DevOps “Maturity Model”: a list of things you can do, in order, that will set you on the path to implementing DevOps in Data Science. The first thing you can do in your projects is to implement Infrastructure as Code (IaC), and the second thing to focus on is Continuous Integration (CI). However, to set up CI, you need to have as much automated testing as you can, and in the case of Data Science programs, that’s difficult to do. From there, the next step in the DevOps Maturity Model is Continuous Delivery (CD). Once you have that maturity level down, you can focus on Release Management. And now we’re off to the next Maturity Level: Application Performance Monitoring (APM).

Now that the creation, testing and delivery of your organization’s solution is automated (along with the Data Science part of that application), you need a method to make it faster and more efficient. Once again, starting at the very beginning of the process, the Data Science team should think about building performance monitoring and reporting directly into the code. This isn’t as hard as you might think: simply take a methodical approach to performance, just as you would to the predictive or classification algorithm itself.

Start with the basics: instrumentation. Your code should support at least three levels of monitoring, each writing to a central log file or metrics collection system:

Minimal – Log the time the prediction/classification started, and when it ended
Standard – Log the start and stop times of the event, the name of the event, and the calling function(s)
Debug – Log the start and stop times, the name of the event, the calling function, as much of the call-stack as you can securely record, and any information you can securely and legally gather on the user and user environment
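As a rough illustration of those three levels, here is a minimal Python sketch. The decorator name `instrumented`, the level constants, the `sink` list, and the `predict` and `score_request` functions are all hypothetical names invented for this example, and the sketch deliberately records only function names from the call stack rather than any user information.

```python
import functools
import inspect
import logging
import time

logger = logging.getLogger("apm")

# Hypothetical level names mirroring the three levels in the list above.
MINIMAL, STANDARD, DEBUG = "minimal", "standard", "debug"


def instrumented(level=MINIMAL, sink=None):
    """Wrap a function so that every call emits one timing record."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Minimal: wall-clock start time (end time is added below).
            record = {"started": time.time()}
            if level in (STANDARD, DEBUG):
                # Standard: event name and the function that called us.
                record["event"] = func.__name__
                record["caller"] = inspect.stack()[1].function
            if level == DEBUG:
                # Debug: function names only, to avoid logging sensitive data.
                record["stack"] = [f.function for f in inspect.stack()[1:]]
            tick = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                record["elapsed_s"] = time.perf_counter() - tick
                record["ended"] = time.time()
                logger.info("apm %s", record)
                if sink is not None:
                    # Stand-in for a central metrics collection system.
                    sink.append(record)
        return wrapper
    return decorator


records = []


@instrumented(level=STANDARD, sink=records)
def predict(features):
    # Placeholder for a real model call.
    return sum(features) > 1.0


def score_request(features):
    return predict(features)


result = score_request([0.4, 0.9])
```

In a real system the `sink` would more likely be a log handler or a metrics client than an in-memory list, and the level would usually come from configuration so it can be raised to Debug without a code change.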

Next, devise a method to view the collected metrics. Finally, do a little Data Science over that data: standard reporting, five-number summaries (minimum, lower quartile, median, upper quartile, maximum), and even predictive and classification work to optimize the system.
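To make the five-number summary concrete, here is a small Python sketch over a list of recorded durations. The function name and the sample values are made up for illustration. Quartile conventions differ between tools (R's `fivenum`, for example, uses Tukey hinges), so the quartiles here, which come from the standard library's "inclusive" method, may not match another tool exactly on every data set.

```python
import statistics


def five_number_summary(values):
    """Return min, quartiles, and max for a list of numeric measurements."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two measurements")
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
    }


# Illustrative prediction latencies in milliseconds (not real measurements).
latencies_ms = [30, 10, 50, 20, 40]
summary = five_number_summary(latencies_ms)
```

A wide gap between the upper quartile and the maximum is often the first hint that a few slow calls deserve a closer look at the Debug level.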

And that brings us to the next point – if you think about performance monitoring and management at the beginning of the process, and then build it in as you go, you can use those metrics in analyzing your automated tests, including stress-testing the system. It’s a virtuous cycle, and the very point of DevOps.
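One way to feed those metrics back into automated tests is a latency budget check that a CI or stress-test run can assert on. The sketch below is only an assumption about how such a gate might look: the function names, the 95th-percentile choice, and the sample numbers are all hypothetical, and it works on already-recorded durations so the check itself is deterministic.

```python
def percentile_nearest_rank(values, percentile):
    """Nearest-rank percentile of a non-empty list (percentile is an int 1..100)."""
    if not values:
        raise ValueError("no measurements recorded")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be between 1 and 100")
    data = sorted(values)
    # Integer ceiling of percentile * n / 100 avoids floating-point rounding.
    rank = (percentile * len(data) + 99) // 100
    return data[rank - 1]


def within_latency_budget(elapsed_ms, budget_ms, percentile=95):
    """True if the chosen percentile of recorded latencies fits the budget."""
    return percentile_nearest_rank(elapsed_ms, percentile) <= budget_ms


# Illustrative durations collected during a test run: 1 ms through 20 ms.
samples_ms = list(range(1, 21))
p95 = percentile_nearest_rank(samples_ms, 95)
passes = within_latency_budget(samples_ms, budget_ms=19)
fails = within_latency_budget(samples_ms, budget_ms=18)
```

A test suite could call `within_latency_budget` after a load run and fail the build when it returns `False`, which turns the collected metrics into a regression signal.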

To be sure, there are more formal methods and concepts you should study to fully implement APM; there’s a good reference on that here: https://dzone.com/articles/what-is-application-performance-monitoring-apm-app . But if you simply measure and monitor the application, you’ll have a great start to this process. Want a complete series on APM? Here’s one that introduces Application Insights, which simplifies this process: https://channel9.msdn.com/Series/DevOps-Fundamentals/Application-Performance-Monitoring-and-Availability-Monitoring

See you in the next installment in the DevOps for Data Science series, where I’ll cover the next level in your DevOps Maturity Model for Data Science teams.

For Data Science, I find this progression works best, taking these one step at a time and building on the previous step. The entire series is here:

Infrastructure as Code (IaC)
Continuous Integration (CI) and Automated Testing
Continuous Delivery (CD)
Release Management (RM)
Application Performance Monitoring (This article)
Load Testing and Auto-Scale

In the articles that follow in this series, I’ll help you implement each of these in turn.

If you’d like to implement DevOps, Microsoft has a site to assist. You can even get a free offering for Open-Source and other projects: https://azure.microsoft.com/en-us/pricing/details/devops/azure-devops-services/

Need a quick introduction to DevOps? Check out this series
Here’s a complete, full course on DevOps on Microsoft Learn

Updated Aug 31, 2021

Version 2.0

data architecture

Data Science

BuckWoodyMSFT

3 Comments

BuckWoodyMSFT
Microsoft
Aug 31, 2021
Thanks for the bug-catch, Stephen! I'll fix that now. And yes, models absolutely degrade over time. Retraining is a huge part of the effort - I've called that out (and will again) in past articles. Thanks for reading!
Stephen Hayes
Microsoft
Aug 12, 2021
Because we are talking data science, how should we think about measuring the DS model performance? For instance, if the model were a text classifier, is there a method to measure how well it is classifying text? I have read that model performance can degrade over time as the data being sent to the model deviates from the original training data.
Stephen Hayes
Microsoft
Aug 12, 2021
Bug: the links to previous articles are not updated, this article is Application Performance Monitoring