Introducing the R and Python packages for Viva Insights
The Viva Insights suite of Power BI templates and custom queries provides opportunities for deeper analysis of workplace collaboration and trends for analysts who don’t have a coding background. Now, for more technical users, we’re excited to announce new complementary tools that add more detail, more visualizations, and more efficient workflows to these types of analyses.
The Viva Insights R package and the Viva Insights Python package are open-source repositories of functions that enable analysts to build visualizations and advanced analyses on top of existing queries. Armed with either of these packages, analysts can perform:
- data validation - e.g. checking for outliers, unusual gaps
- data visualization - direct from data to visualization, with a single line of code
- exploratory data analysis - automatically identify groups that are significant
- text mining - discover and quantify patterns from meeting subject lines
- organizational network analysis (ONA) - visualize networks, compute whether individuals/groups are more central, and detect clusters/communities using the network queries
Plots like the following that show how workplace metrics trend across different factors can be generated with a single line of code using the Person Query:
Why use R or Python to analyze data from Viva Insights?
A typical analysis cycle may involve the analyst developing a set of hypotheses and key observations quickly with the turnkey Power BI templates and moving on to either the R or the Python packages to validate and further explore the early findings.
R and Python are two of the most popular tools in data science, and analysts can use them to boost both their productivity and the quality of their work. For example, they can:
- Perform statistical/correlation tests
- Build regression models - for example, these models can be useful for predicting churn, or understanding the causes behind high engagement or productivity
- Create and automate reports that can be interactive and fully branded/customized - even PowerPoint decks can be fully automated!
- Create scores or KPIs tailored for their organization using a combination of metrics
- Identify groups of employees with meaningfully distinct behaviors, using clustering algorithms
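As a minimal illustration of the first item in the list above, the sketch below computes a Pearson correlation between two workplace metrics using only the Python standard library. It does not use the Viva Insights packages; the function name `pearson_r` and the sample figures are made up for illustration.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = mean(xs), mean(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly averages for five people (illustrative values only).
meeting_hours = [5.0, 8.0, 12.0, 15.0, 20.0]
after_hours = [1.0, 2.0, 2.5, 4.0, 5.5]

r = pearson_r(meeting_hours, after_hours)
print(f"Correlation between meeting hours and after-hours work: {r:.2f}")
```

A coefficient close to 1 or -1 suggests a strong linear relationship, which an analyst would typically follow up with a significance test or a regression model before drawing conclusions.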
Aside from enabling analysts to take their analyses with Viva Insights further, code-based tools like R and Python also encourage best practices around reproducibility. This means that manual steps such as copy-paste, point-click, or drag-drop operations, which can be error-prone and time-consuming, are minimized. This is particularly important for complex and long-term analyses.
Save time and produce better analyses with R and Python packages
Previously, to achieve the above, analysts had to write analysis code from scratch. Now, the R and Python packages will save analysts a significant amount of time by providing the functions to perform the most frequently used analyses.
Moreover, the functions in the R and Python packages incorporate the following best practices for analyzing Viva Insights data:
- Filtering out groups below a minimum size to protect privacy
- Displaying base sizes of the analyzed population
- Using color-blind friendly palettes
- Providing data validation functions to identify unusual work weeks, mailboxes, organization data values, and collaboration behavior, so that they can be properly interpreted and mitigated
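The first two practices in the list above (a minimum group size and visible base sizes) can be sketched in plain Python as follows. This is an illustrative stand-in rather than the packages' own implementation: the helper `mean_by_group`, its `min_group` default of 5, and the column names `PersonId`, `Organization`, and `Collaboration_hours` are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def mean_by_group(rows, group_col, metric_col, person_col="PersonId", min_group=5):
    """Average a metric per group, keeping only groups with enough distinct people.

    Returns one dict per retained group with the group name, the mean of the
    metric across that group's rows, and the base size `n` (distinct people).
    """
    values = defaultdict(list)
    people = defaultdict(set)
    for row in rows:
        values[row[group_col]].append(float(row[metric_col]))
        people[row[group_col]].add(row[person_col])

    summary = []
    for group in sorted(values):
        n = len(people[group])
        if n >= min_group:  # privacy threshold: small groups are not reported
            summary.append({"group": group, "mean": mean(values[group]), "n": n})
    return summary


# Hypothetical data: six people in Finance, two in Legal.
rows = (
    [{"PersonId": f"F{i}", "Organization": "Finance", "Collaboration_hours": 10 + i}
     for i in range(6)]
    + [{"PersonId": f"L{i}", "Organization": "Legal", "Collaboration_hours": 30}
       for i in range(2)]
)

result = mean_by_group(rows, "Organization", "Collaboration_hours")
print(result)
```

Here Legal is dropped because it has only two people, and the retained Finance row carries its base size so that readers can judge how much weight to give the average.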
Both libraries come with functions to identify outliers and anomalies in Viva Insights data, as shown below.
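Independently of the packages' own functions, the general idea behind flagging an unusual work week can be sketched with a simple z-score rule. The function name `flag_outliers`, the threshold of 2 standard deviations, and the sample values are assumptions for illustration; the libraries may use different rules.

```python
from statistics import mean, pstdev


def flag_outliers(values, threshold=2.0):
    """Return (index, value, z_score) for values far from the mean.

    A value is flagged when its absolute z-score exceeds `threshold`.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    flagged = []
    for i, v in enumerate(values):
        z = (v - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, v, z))
    return flagged


# Hypothetical weekly collaboration hours for one person; the sixth week
# is unusually low (for example, a holiday week).
weekly_hours = [20, 22, 19, 21, 20, 2, 21, 23]

flagged = flag_outliers(weekly_hours)
for index, value, z in flagged:
    print(f"week {index}: {value} hours (z = {z:.2f})")
```

Weeks flagged this way are candidates for review rather than automatic removal, since a low week may reflect a public holiday, leave, or a data issue.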
Organizational network analysis (ONA)
In the last month, the following cross-collaboration queries have become available:
- Between lists of individuals (person-to-person)
- Between an individual and a group (person-to-group)
- Between groups
In the latest releases, both the R (v0.4.1) and Python (v0.2.0) packages include features that allow analysts to perform organizational network analysis (ONA) with these cross-collaboration queries, providing a network-lens view into organizations. This type of analysis can surface unique insights into both problem spots and opportunities for business leaders.
ONA, for instance, can help analysts:
- Determine if managers are maintaining strong relationships with their teams
- Learn whether information silos exist in an organization
- See if employee networks are being impacted by real-world changes
- Decide with whom an organization should pilot an initiative to maximize adoption
Both libraries offer community detection features that can be deployed for scenarios such as workspace planning or identifying areas of the organization that are at risk of becoming siloed.
Using the packages, analysts can create and visualize communities on top of the person-to-person cross-collaboration query, as shown below.
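To make the underlying idea concrete, here is a deliberately simplified sketch that works on a person-to-person edge list in plain Python. It computes degree centrality and then groups people into connected components after dropping weak ties. Connected components are a crude stand-in for dedicated community detection algorithms such as Louvain or Leiden, and this is not a description of what the packages do; the function names, the `min_weight` cut-off, and the sample edges are all hypothetical.

```python
def build_graph(edges, min_weight=0.0):
    """Build an undirected adjacency map from (source, target, weight) tuples.

    Ties weaker than `min_weight` are ignored, but both people stay in the graph.
    """
    graph = {}
    for source, target, weight in edges:
        graph.setdefault(source, set())
        graph.setdefault(target, set())
        if weight >= min_weight and source != target:
            graph[source].add(target)
            graph[target].add(source)
    return graph


def degree_centrality(graph):
    """Share of the other people in the network that each person is tied to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def connected_groups(graph):
    """Group people who can reach each other through the retained ties."""
    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        members = []
        while stack:
            node = stack.pop()
            members.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(members))
    return groups


# Hypothetical weekly collaboration hours between pairs of people.
edges = [
    ("A", "B", 5.0), ("B", "C", 4.0), ("A", "C", 6.0),
    ("C", "D", 0.5),  # weak bridge between the two teams
    ("D", "E", 3.0), ("E", "F", 2.0), ("D", "F", 4.0),
]

graph = build_graph(edges, min_weight=1.0)
centrality = degree_centrality(graph)
groups = connected_groups(graph)
print(groups)
print(centrality)
```

With the weak C-D tie removed, the six people fall into two groups, which is the kind of pattern an analyst would investigate as a potential silo.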
How do I get started?
To use the R or Python packages, you must be assigned the Analyst role and be able to create and download queries from the analyst experience. Both packages operate on the .csv files downloaded from Viva Insights.
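Because the starting point is an ordinary .csv file, a downloaded query can also be inspected with the Python standard library alone. The sketch below is not the packages' loading function; `read_query` is a hypothetical helper, and the column names in the sample (`PersonId`, `MetricDate`, `Organization`, `Collaboration_hours`) are assumptions about what a person query might contain.

```python
import csv
import io


def read_query(handle, numeric_cols):
    """Read a query export into a list of dicts, converting chosen columns to float."""
    rows = []
    for record in csv.DictReader(handle):
        for col in numeric_cols:
            record[col] = float(record[col])
        rows.append(record)
    return rows


# A small in-memory stand-in for a downloaded file (illustrative values only).
sample = io.StringIO(
    "PersonId,MetricDate,Organization,Collaboration_hours\n"
    "p1,2024-01-07,Finance,18.5\n"
    "p2,2024-01-07,Legal,22.0\n"
)

rows = read_query(sample, numeric_cols=["Collaboration_hours"])
print(len(rows), "rows;", "columns:", list(rows[0]))
```

For a real export, the same function can be given a file opened with `open(path, newline="", encoding="utf-8-sig")`, which also tolerates a byte-order mark at the start of the file.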
For detailed installation instructions and usage examples, please visit the links below.
| R | Python | 
What’s coming next?
In the coming months, our collaborators will be working on creating more dynamic report templates within the packages and introducing functionality designed for analyzing survey data from Glint. We will also be working on publishing more examples to make it easier for analysts to run and customize the analyses.
As these are open-source projects, we welcome and rely on feature requests, bug reports, and code contributions from our users. If you’re interested in getting involved (or wish to find out how), please email martin.chan@microsoft.com or submit an Issue on either the R or the Python Issues page.
See here for a full list of contributors to our libraries: