Model Lifecycle Management for Azure Digital Twins
Published May 07 2021

Author – Andy Cross (External), Director of Elastacloud Ltd, a UK-based cloud and data consultancy; Azure MVP, Microsoft RD.

 

Ten years ago, my business partner Richard Conway and I founded Elastacloud to operate as a consultancy that truly understood the value of the Cloud around data, elasticity and scale; building next generation systems on top of Azure that are innovative and impactful. For the last year, I’ve been leading the build of a Digital Twin based IoT product we call Elastacloud Intelligent Spaces.

 

When working with Azure Digital Twins, customers often ask what the best practice is for managing DTDL Versions. At Elastacloud, we have been working with Azure Digital Twins for some time and I’d like to share the approach we developed to manage our DTDL model lifecycles from .NET 5.0.

 

What is DTDL?

If you are not familiar with Azure Digital Twins and DTDL, Azure Digital Twins is a PaaS service for modelling related data of the kind you'd often find in real-world scenarios. It is a natural fit for IoT projects, since you can model how a sensor relates to a building, to a room, to a carbon intensity metric, to its enclosing electrical circuit, to an owner, to neighboring sensors and their respective metrics, owners, rooms and so on. It is a graph database, which focuses on the links that exist in the graph, giving it an edge over more commonly found relational databases: it can rapidly and concisely traverse data by its links across a whole data set.

 

Azure Digital Twins adopts the idea that the nodes on the graph (known as Digital Twins) can be typed. This means that the entities that hold the data conform to defined shapes, which are described in the Digital Twin Definition Language (DTDL). The definition language allows developers to constrain the data that an entity can store to a list of contents, broadly synonymous with the notion of columns in a traditional relational database. Just like in other database systems, when a development team iterates on a data structure to add, edit or remove a property, it has to consider how to keep the software and the data structure in sync.

 

What is the Version challenge?

Models in DTDL are stored in a JSON format, and therefore typically stored as a .json file. We store these in a git repository right alongside the code that interacts with the data shapes that they define.

 

The key question of the Version Challenge therefore is: “When I update my model definitions in my local dev environment, how do I automatically update the models that are available in Azure Digital Twin?”

 

There is one additional twist: when you want to use a model, for example to create a new digital twin, you have to know the version number of the model that you want to use. This means your software also needs to be kept in sync with your models and your deployment.

 

In order to keep track of all this, each Azure Digital Twin model has a model identifier. The structure of a Digital Twin Model Identifier (DTMI) is:

 

dtmi:[some:segmented:name];[version]

 

 

For example:

 

dtmi:com:elastacloud:intelligentspaces:room;168
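
Anything that automates versioning will need to read that version number back out of the identifier. As a minimal sketch (our own helper for illustration, not part of any SDK), splitting a DTMI into its name and version segments might look like this:

// Our own illustrative helper: split a DTMI into its name and version parts.
public static (string Name, int Version) ParseDtmi(string dtmi)
{
    var parts = dtmi.Split(';');
    return (parts[0], int.Parse(parts[1]));
}

// e.g. ParseDtmi("dtmi:com:elastacloud:intelligentspaces:room;168")
//      returns ("dtmi:com:elastacloud:intelligentspaces:room", 168)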

 

 

Our solution then needs to solve these top-level issues, whilst being developer friendly, and fitting into best practice for deployments.

We might consider this ideal workflow:

A developer workflow that includes continuous deployment of DTDL models, as described in the text.

Building Blocks

We want to be able to construct our approach to versioning without prejudicing our ability to use the fullness of ADT features. There are a few main options that present themselves to us:

  1. Hold the JSON representation of the DTDL on disk as a file
  2. Build the JSON representation from a software representation (for instance, a .NET class)

Both of these are valid cases. The JSON representation reflects the on-the-wire payload. The .NET class might give us the ability to later use this class to create instances of the DTDL defined Twin.

 

With this idea in mind, we might consider something like the following:

 

{
  "@id": "dtmi:elastacloud:core:NamedTwin;1",
  "@type": "Interface",
  "contents": [
    {
      "@type": "Property",
      "displayName": {
        "en": "name",
        "es": "nombre"
      },
      "name": "name",
      "schema": "string",
      "writable": true
    }
  ],
  "description": {
    "en": "This is a Twin object that holds a name.",
    "es": "Este es un objeto Twin que contiene un nombre."
  },
  "displayName": {
    "en": "Named Twin Object",
    "es": "Objeto Twin con nombre"
  },
  "@context": "dtmi:dtdl:context;2"
}

 

 

We might then want to create a Plain Old CLR Object (POCO) representation:

 

public class NamedTwinModel
{
  public string name { get; set; }
}

 

 

While we can see that this class aligns with the contents of the DTDL interface, it is not immediately apparent how we would manage displayName and its globalisation concerns within a POCO.

 

Note that from a purist’s perspective, a POCO should try to avoid attributes where possible, to boost readability. So a [DisplayName("en", "name")] annotated approach is possible, but not ideal.
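
As an illustration only (the attribute below is a hypothetical custom attribute we sketch for this post, not a type from the SDK or the base class library), that annotated approach might look something like this:

using System;

// Hypothetical custom attribute carrying a language code and a localised label.
[AttributeUsage(AttributeTargets.Property, AllowMultiple = true)]
public class DisplayNameAttribute : Attribute
{
    public DisplayNameAttribute(string language, string label)
    {
        Language = language;
        Label = label;
    }
    public string Language { get; }
    public string Label { get; }
}

public class NamedTwinModel
{
    [DisplayName("en", "name")]
    [DisplayName("es", "nombre")]
    public string name { get; set; }
}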

 

Furthermore, you'll note that the DTDL wraps the contents, which is the type definition, with a set of descriptors and globalization values. In order to achieve this, we might consider a wrapped generic POCO approach:

 

// Localised strings for the description and displayName sections.
public class Globalisation {
   public string En { get; set; }
   public string Es { get; set; }
}
// Wraps the type definition (contents) with its descriptors.
public class DtdlWrapper<TContents> {
    public TContents Contents { get; set; }
    public Globalisation Description { get; set; }
}
...
var namedDtdl = new DtdlWrapper<NamedTwinModel>();
namedDtdl.Contents = new NamedTwinModel();
namedDtdl.Contents.name = "what should I put here?";

 

 

The problem we start to face when expressing the DTDL definitions themselves in this way is that we are actually building a class hierarchy that is more akin to Azure Digital Twin instances than to the DTDL definitions. As such, we would have to create instances and then use Reflection over them while ignoring their values. We could use default values or look up the types more directly, but the problem remains the same: class definitions in .NET describe how you can create instances, and don't translate directly to DTDL in an easy-to-understand way.

 

Thus, from our perspective, we want to keep the DTDL descriptions as native JSON, since there are aspects that are not naturally amenable to encapsulation in a Plain Old CLR Object (POCO). We will use our POCOs to represent instances of Azure Digital Twins, i.e. the data itself, and not the schema.

 

This means we store the DTDL in JSON format on disk. But this isn't anywhere near the end of the story for versioning and .NET development.

 

We just established that POCOs can represent instances of Digital Twins quite effectively. If we're going to code with .NET, we will still need some kind of class to interact with in order to do CRUD operations on the Azure Digital Twins instance.
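
As a rough sketch of what that interaction looks like (assuming a DigitalTwinsClient called client, which we construct later in this post, the BasicDigitalTwin type from Azure.DigitalTwins.Core, and a twin id of our own choosing), creating an instance of the NamedTwin model might be as simple as:

using Azure.DigitalTwins.Core;

// Create a twin instance of the NamedTwin model; "room-101" is a hypothetical id.
var twin = new BasicDigitalTwin
{
    Id = "room-101",
    Metadata = { ModelId = "dtmi:elastacloud:core:NamedTwin;1" },
    Contents = { ["name"] = "Meeting Room 101" }
};
await client.CreateOrReplaceDigitalTwinAsync(twin.Id, twin);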

 

The building blocks are therefore:

  • Raw JSON held as a file
  • POCOs to describe instances of those DTDL defined classes

Versioning

Versioning models in DTDL is achieved in a DTMI using an integer value held in the identifier. From the DTDL v2 documentation:

In DTDL, interfaces are versioned by a single version number (positive integer) in the last segment of their identifier. The use of the version number is up to the model author. In some cases, when the model author is working closely with the code that implements and/or consumes the model, any number of changes from version to version may be acceptable. In other cases, when the model author is publishing an interface to be implemented by multiple devices or digital twins or consumed by multiple consumers, compatible changes may be appropriate.

 

Firstly, mapping POCOs to DTDL in the way we have discussed requires that we choose to actively validate against DTDL, passively validate, or not validate at all. Some options:

  • Active; we build a way to check whether a DTDL model exists in Azure Digital Twins on any CRUD activity, that the properties match in name and type
  • Passive; we do similarly to Active, but use JSON files as the validation target, and assume that the JSON files are in-line with the target database
  • None; we don't validate, but instead let Azure Digital Twins raise an error if we get something wrong, and react to that error.

In our approach, we want to be able to support either radical or compatible changes but we will have to consider some additional factors brought in by .NET type constraints:

  • if a DTDL interface changes a property's type, the existing .NET POCO property must be updated to match the DTDL type
  • if a DTDL interface changes its named properties, the .NET POCO needs to be updated to reflect this
  • if a DTDL interface adds a new property, we need to decide whether it's an error or not for the POCO to not have the property. This is a happy problem, as we're roughly compatible even if we don't add the property.
  • if the DTDL interface deletes a property, we need to decide whether our create and update methods should simply omit that value at runtime.
A workflow that shows the order of checking a Model Existence and the states that it may be in.
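
As a minimal sketch of the "passive" style of check described above (our own illustration; it compares property names only, whereas a real check would also compare schemas and types), we might compare a POCO against the contents of a local DTDL file like this:

using System.Linq;
using System.Text.Json;

// Returns true when the POCO's property names match the DTDL "contents" names.
public static bool PocoMatchesDtdl<T>(string dtdlJson)
{
    using var doc = JsonDocument.Parse(dtdlJson);
    var dtdlNames = doc.RootElement.GetProperty("contents")
        .EnumerateArray()
        .Select(c => c.GetProperty("name").GetString())
        .ToHashSet();
    var pocoNames = typeof(T).GetProperties().Select(p => p.Name).ToHashSet();
    return dtdlNames.SetEquals(pocoNames);
}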

Applying Versioning

Once we have our DTDL prepared in JSON, we still need to get these into Azure Digital Twins. We have a few choices again to make around how we want to handle versioning.

 

The absolute core of creating DTDL models in Azure Digital Twins from a .NET perspective is to use the Azure.DigitalTwins.Core package, available on NuGet. In short:

 

using System;
using Azure.DigitalTwins.Core;
using Azure.Identity;

// You need to set up three variables first: tenantId, clientId
// and adtInstanceUrl.
var credentials = new InteractiveBrowserCredential(tenantId, clientId);
DigitalTwinsClient client = new DigitalTwinsClient(new Uri(adtInstanceUrl), credentials);
await client.CreateModelsAsync(new string[] { "DTDL Model in JSON here..." });

 

 

That's the core of creating those DTDL models. We could just load the JSON files directly from disk as strings and add them to the array passed to CreateModelsAsync; however, there are some options we can employ that might help us out in the future.
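
For instance, a minimal sketch of that "load it all from disk" option (assuming a local Models folder holding the .json definitions) looks like this:

using System.IO;
using System.Linq;

// Read every local DTDL definition and upload them in a single call.
var modelJson = Directory.GetFiles("Models", "*.json")
    .Select(File.ReadAllText)
    .ToArray();
await client.CreateModelsAsync(modelJson);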

 

For example, we can get the existing models by calling client.GetModelsAsync. We can iterate over these models and check whether any of the new models we want to create shares an @id, including the version, with a model that is already deployed. If so, we can validate whether the contents are the same, and choose to throw an exception if they are not, should we be seeking to maintain a high level of compatibility.
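
A rough sketch of that check (our own illustration; localIds is assumed to come from parsing the @id field of each local JSON file) might be:

using System.Collections.Generic;

// Collect the ids (including version) of the models already deployed.
var deployedIds = new HashSet<string>();
await foreach (var model in client.GetModelsAsync())
{
    deployedIds.Add(model.Id);   // e.g. "dtmi:elastacloud:core:NamedTwin;1"
}

foreach (var localId in localIds)
{
    if (deployedIds.Contains(localId))
    {
        // The same id and version already exists: compare contents and
        // throw if they differ, if we want strict compatibility.
    }
}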

 

Should we find that a model exists for a previous version (i.e. our JSON file has a higher dtmi version), we can choose to decommission that model. This is a one-way operation, so we had better be careful to do it in a managed fashion. For instance, we might want to decommission a model only after it has been replaced for a period of time, so that we can live-update the system. If this is the case, we should be comfortable that all writers to the Azure Digital Twins instance have been upgraded.

 

When a model is decommissioned, new digital twins will no longer be able to be defined by this model. However, existing digital twins may continue to use this model. Once a model is decommissioned, it may not be recommissioned.

Should we choose to do that, once a new version of a model is created (say dtmi:elastacloud:core:NamedTwin;2) we might decommission the previous version:

 

await client.DecommissionModelAsync("dtmi:elastacloud:core:NamedTwin;1");

 

 

The key thought process around decommissioning relates to the choice you want to make around version compatibility with your code. The approach we take at Elastacloud is that the latest Git-held version of the DTDL model must be available, but previous versions should also remain available for a period of time that we treat as an SLA, until we are sure that all consumers have been updated to the latest version.

 

A strategy for decommissioning DTDL Models in Azure Digital Twins, shown as a workflow that checks an SLA.
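
A rough sketch of that strategy follows (our own illustration; the 30-day SLA window is an arbitrary choice, and we assume the Decommissioned and UploadedOn properties exposed by DigitalTwinsModelData to work out whether the replacement has been live long enough):

using System;
using System.Collections.Generic;
using System.Linq;
using Azure.DigitalTwins.Core;

var sla = TimeSpan.FromDays(30); // our assumed grace period

var models = new List<DigitalTwinsModelData>();
await foreach (var m in client.GetModelsAsync())
{
    models.Add(m);
}

// Group versions of the same model together by the name part of the DTMI.
foreach (var group in models.GroupBy(m => m.Id.Split(';')[0]))
{
    var ordered = group.OrderByDescending(m => int.Parse(m.Id.Split(';')[1])).ToList();
    var newest = ordered.First();
    foreach (var old in ordered.Skip(1))
    {
        // Only decommission once the replacement has been live longer than the SLA.
        if (old.Decommissioned != true && newest.UploadedOn < DateTimeOffset.UtcNow - sla)
        {
            await client.DecommissionModelAsync(old.Id);
        }
    }
}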

Other Considerations

Naming standards between .NET and JSON are different. We should name according to the framework that hosts the code, and use serialization techniques to convert between naming divergences. For example, properties in .NET start with a capital letter in many circumstances, whereas in JSON they tend to start with a lowercase letter.
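
For example, a small sketch with System.Text.Json (the NamedTwin class here is our own illustration) keeps the .NET property PascalCase while serializing to the lowercase name the DTDL contents use:

using System.Text.Json;
using System.Text.Json.Serialization;

public class NamedTwin
{
    [JsonPropertyName("name")]       // lowercase name used by the DTDL contents
    public string Name { get; set; } // idiomatic PascalCase in .NET
}

// Produces {"name":"Meeting Room 101"}
var json = JsonSerializer.Serialize(new NamedTwin { Name = "Meeting Room 101" });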

 

DTDL includes a set of standard semantic types that can be applied to Telemetries and Properties. When a Telemetry or Property is annotated with one of these semantic types, the unit property must be an instance of the corresponding unit type, and the schema type must be a numeric type (double, float, integer, or long).

.NET Tooling Approach

So far we have a few key components that we have to build in order to hit our best practice goal.

  • A .NET application that deploys the models to the Azure Digital Twins instance, understands which DTDL versions are already deployed and which are held locally, and helps assert compatibility.
  • A .NET application that holds POCOs that can represent DTDL deployed to Azure Digital Twins and can help marshal data between .NET and Azure Digital Twins.

This helps us define two main categories of error conditions; deployment and runtime.

A tooling approach to deploying Azure Digital Twin DTDL model changes.

CI/CD deployment

At Elastacloud we use our own `twinmigration` tool for managing this process. It is a dotnet global tool that we built, and it provides features designed for CI/CD purposes.

 

Since a dotnet global tool is a convenient way of distributing software into pipelines, we add a task to our CI/CD pipeline that takes the latest version of JSON files from a git repo, and validates them against what is already deployed in an ADT instance.

Following the output of a validation stage, we might choose to also run a deploy stage. This adds the models to the Azure Digital Twins instance.

 

Finally, we have a decommissioning step which makes "older" models unavailable for creating new twins, so that we can keep good data quality practices.

 

In Summary

For more information about what we're doing with Azure Digital Twins, visit our website at Intelligent Spaces — Elastacloud; we'll be updating it regularly with information on our approaches. We have some tools that are ready to go, such as Elastacloud.TwinMigration on the NuGet Gallery, which help you to do the things we've described here!

 

Thanks for reading. 
