Open-Source Repository of Forecasting Best Practices for Accelerating Solution Development
Published Apr 14 2020 10:17 AM

Chenhui Hu, Vanja Paunic, Hong Ooi, Tao Wu, Wee Hyong Tok


Time series forecasting is one of the most important topics in data science. If you are a business owner, you may want to predict various kinds of future events in order to make better decisions and optimize your resource allocation. Typical use cases include retail sales forecasting, package shipment delay forecasting, energy demand forecasting, and financial forecasting. Forecasting is everywhere! Given its ubiquity and wide-ranging business applications, we have developed an open-source forecasting repo that puts world-class models and forecasting best practices in the hands of data scientists and industry experts like you.



Figure 1: Visualization of training and testing iterations of a sales forecasting scenario using a LightGBM model
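The training and testing iterations shown in the figure can be illustrated with a rolling-origin split, where each round trains on all data up to a cutoff and tests on the next few periods. The sketch below is a minimal, standard-library illustration under assumed settings; the function name `rolling_origin_splits` and its parameters are hypothetical and are not taken from the repository.

```python
def rolling_origin_splits(n_obs, n_rounds, horizon, step=None):
    """Yield (train_indices, test_indices) for each evaluation round.

    The training window always starts at index 0 and grows by `step`
    observations per round; the test window is the `horizon` observations
    that immediately follow it.
    """
    step = horizon if step is None else step
    first_train_end = n_obs - horizon - step * (n_rounds - 1)
    if first_train_end <= 0:
        raise ValueError("not enough observations for the requested rounds")
    for r in range(n_rounds):
        train_end = first_train_end + r * step
        yield range(0, train_end), range(train_end, train_end + horizon)


# Hypothetical setup: 10 weekly observations, 3 rounds, 2-week test horizon.
splits = [
    (list(train), list(test))
    for train, test in rolling_origin_splits(n_obs=10, n_rounds=3, horizon=2)
]
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

Because every test window lies strictly after its training window, no future information leaks into model fitting.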


Forecasting Best Practices and Solution Accelerators

This repository provides examples of building forecasting solutions, presented as Python Jupyter notebooks and R Markdown files, along with a library of utility functions. Our goal is to help data scientists and machine learning engineers with varying levels of forecasting knowledge to:

  • Learn best practices for the development of forecasting solutions in a variety of languages.
  • Leverage recent advances in forecasting algorithms to build high-performance solutions and operationalize them.
  • Accelerate the solution development process for real-world forecasting problems. The provided examples simplify the path from defining the business problem to developing a solution, which can significantly reduce the “time to market”.

In the repository, you will find state-of-the-art (SOTA) forecasting models using traditional machine learning and deep learning approaches. Implementations of SOTA models in this release are centered around retail sales forecasting and are written in Python and R, two of the most popular programming languages in the forecasting domain. To enable high-throughput forecasting scenarios, we have included notebooks for forecasting multiple time series with distributed training techniques such as Ray in Python, the parallel package in R, and multi-threading in LightGBM. The following is a quick summary of the forecasting models covered in this repository.
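The idea behind forecasting multiple time series in parallel is that each series can be fit independently, so the work can be spread across workers. The sketch below illustrates that pattern with Python's standard `concurrent.futures` thread pool as a stand-in; it does not use Ray, and the helper names (`mean_forecast`, `forecast_all`) and the sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def mean_forecast(history, horizon):
    """Baseline model: repeat the historical mean for every future step."""
    level = mean(history)
    return [level] * horizon


def forecast_all(series_by_id, horizon, max_workers=4):
    """Forecast each series independently, submitting one task per series."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            series_id: pool.submit(mean_forecast, history, horizon)
            for series_id, history in series_by_id.items()
        }
        return {series_id: f.result() for series_id, f in futures.items()}


# Hypothetical weekly sales for three store/brand combinations.
sales = {
    "store1_brand1": [10, 12, 14, 12],
    "store1_brand2": [5, 5, 6, 4],
    "store2_brand1": [20, 22, 21, 25],
}
forecasts = forecast_all(sales, horizon=2)
print(forecasts)
```

For CPU-heavy models, a process pool or a distributed framework would be the more realistic choice; the thread pool is used here only to keep the sketch self-contained.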







  • Auto Regressive Integrated Moving Average (ARIMA) model that is automatically selected
  • Linear Regression: linear regression model trained on lagged features of the target variable and external features
  • Gradient boosting decision tree implemented with LightGBM package for high accuracy and fast speed
  • Dilated Convolutional Neural Network that captures long-range temporal flow with dilated causal connections
  • Mean Forecast: simple forecasting method based on historical mean
  • ARIMA model without or with external features
  • Exponential Smoothing algorithm with additive errors
  • Automated forecasting procedure based on an additive model with non-linear trends and Tidyverts framework
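To make the "lagged features" idea from the linear regression entry concrete, the sketch below builds lag features from a series and fits an ordinary least squares model on a single lag using the closed-form solution. It is a simplified illustration, not the repository's implementation: it omits external features, uses only one lag, and all function names are hypothetical.

```python
def make_lagged(series, n_lags):
    """Return (X, y) where each row of X holds the n_lags values preceding y."""
    X = [series[t - n_lags:t] for t in range(n_lags, len(series))]
    y = series[n_lags:]
    return X, y


def fit_lag1_regression(series):
    """Ordinary least squares of y_t on y_{t-1}; returns (intercept, slope)."""
    X, y = make_lagged(series, n_lags=1)
    x = [row[0] for row in X]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def forecast(series, horizon):
    """Recursive multi-step forecast: each prediction becomes the next lag."""
    intercept, slope = fit_lag1_regression(series)
    last, predictions = series[-1], []
    for _ in range(horizon):
        last = intercept + slope * last
        predictions.append(last)
    return predictions


# Hypothetical series generated exactly by y_t = 2 + 0.5 * y_{t-1}.
history = [10.0]
for _ in range(6):
    history.append(2 + 0.5 * history[-1])

intercept, slope = fit_lag1_regression(history)
next_values = forecast(history, horizon=3)
print(intercept, slope, next_values)
```

With several lags and external features, the same design matrix would typically be passed to a library regressor rather than solved by hand.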


The repository also comes with Azure Machine Learning (Azure ML)-themed notebooks and best-practice recipes to accelerate the development of scalable, production-grade forecasting solutions on Azure. You will find the following examples for forecasting with Azure AutoML, as well as for tuning and deploying a forecasting model on Azure.





  • Azure AutoML: Azure ML service that automates model development process and identifies the best machine learning pipeline
  • Azure ML service for tuning hyperparameters of machine learning models in parallel on cloud
  • Azure ML Web Service: Azure ML service for deploying a model as a web service on Azure Container Instance
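The parallel hyperparameter tuning mentioned above follows a simple pattern: score each candidate configuration independently, then keep the best one. The sketch below shows that pattern locally, tuning the smoothing parameter of simple exponential smoothing with a thread pool. It is only a local stand-in: it does not call any Azure ML API, and the function names and sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def ses_one_step_mse(series, alpha):
    """Mean squared one-step-ahead error of simple exponential smoothing."""
    level = series[0]
    squared_errors = []
    for actual in series[1:]:
        squared_errors.append((actual - level) ** 2)  # forecast is the current level
        level = alpha * actual + (1 - alpha) * level  # update the level after observing
    return sum(squared_errors) / len(squared_errors)


def tune_alpha(series, candidates, max_workers=4):
    """Score every candidate alpha concurrently; return (best_alpha, scores)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(lambda a: ses_one_step_mse(series, a), candidates))
    best = min(zip(scores, candidates))[1]
    return best, dict(zip(candidates, scores))


# Hypothetical demand series with a steady upward drift.
demand = [10, 11, 12, 13, 14, 15, 16, 17]
best_alpha, scores = tune_alpha(demand, candidates=[0.1, 0.5, 0.9])
print(best_alpha, scores)
```

On a steadily trending series like this one, larger smoothing values track the data more closely, so the search favors the largest candidate.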


Developing an accurate forecasting solution can be a complex and time-consuming process. We hope the forecasting repo will help shorten your development cycle.


To Learn More and Contribute

For more information, please visit:

Contributions from the open-source community are always welcome! Please check our contribution guide if you would like to contribute to the content and bring in the latest SOTA algorithms.







Last update: Apr 14 2020 04:30 PM