Special thanks to Nicholas DiCola (SECURITY JEDI) and Mor Rubin, who collaborated with me on this blog post.
The GitHub online platform enables developers to find, share, build, and collaborate on software. Many organizations use GitHub for version control and source code management. The site hosts public and private repositories, through which remote developers can upload source code and share it with collaborators.
As GitHub usage has increased, so has the number of attacks against it. A typical attack campaign starts with a phishing email to users, which leads to a compromised user account accessing the organization's GitHub repositories, cloning private repositories, and exposing sensitive data.
There are multiple features to help you secure your GitHub organization, but in this blog we introduce a solution that uses Logic Apps to pull GitHub audit logs and ingest them into Sentinel. This helps SecOps gain visibility into their organization's GitHub repositories, which is often lacking, and provides the SOC team with hunting queries and detections that span the Mitre ATT&CK framework to protect their GitHub data, organization, and users.
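| Scenario | Type | Mitre Tactic | Mitre Techniques |
|---|---|---|---|
| Brute Force Attack against GitHub Account | Detection | Credential Access | T1110 |
| GitHub Activities from Infrequent Country | Detection | Initial Access | T1078 |
| Sign-in Burst from Multiple Locations | Detection | Credential Access | T1110 |
| Threat Intel Matches to GitHub Audit Logs | Detection | Mitigation | T1212 |
| Two Factor Authentication Disabled | Detection | Defense Evasion | T1089 |
| Security Vulnerability in Repo | Detection | Execution | T1203 |
| Inactive or New Account Usage | Hunting | Persistence | T1136 |
| Unusual Number of Repository Clones | Hunting | Collection | T1213 |
| User Grant Access and Grants Other Access | Hunting | Persistence, Privilege Escalation | T1098, T1078 |
| Oauth App Restrictions Disabled | Hunting | Persistence, Defense Evasion | T1089, T1100 |
| Mass Deletion of Repositories | Hunting | Impact | T1485 |
| Org Repositories Default Permissions Change | Hunting | Defense Evasion, Persistence | T1089 - Disabling Security Tools, T1098 |
| Repository Permission Switched to Public | Hunting | Collection | T1213 |
| Suspicious Fork Activity | Hunting | Exfiltration | T1537 |
| User First Time Repository Delete Activity | Hunting | Impact | T1485 |
| First Time User Invite and Add Member to Org | Hunting | Persistence | T1136 |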
Collecting GitHub Data
This section explains how to use the ARM template to deploy the Logic Apps playbooks, Key Vault and Storage Account to ingest GitHub logs into Azure Sentinel. This is done by using three Logic Apps Playbooks.
All three playbooks use Key Vault to read a secret which is needed for API authentication. They also use a storage account which holds two file types:
- ORGS.json - contains a list of the GitHub Orgs you want the playbooks to query.
- lastrun-*.json - stores the last run time and a cursor string. This string is used to calculate the last record that was received and only query newer records since then.
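The bookkeeping that these two files support can be sketched roughly as follows. This is an illustrative Python sketch, not the Logic App itself: it assumes ORGS.json is a JSON array of {"org": ...} objects, and the state field names `lastRun` and `cursor` are hypothetical, since the post does not show the lastrun file's schema.

```python
import json
from datetime import datetime, timezone

def load_orgs(orgs_json):
    """Return the org names from ORGS.json content (assumed: a list of {"org": name})."""
    return [entry["org"] for entry in json.loads(orgs_json)]

def next_state(state, page_end_cursor, now=None):
    """Build the state to save after a poll.

    If the poll returned no new records (no end cursor), the previous cursor
    is kept so the next run resumes from the same position.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "lastRun": now.isoformat(),                      # hypothetical field name
        "cursor": page_end_cursor or state.get("cursor"),  # hypothetical field name
    }
```

Keeping the old cursor when nothing new arrives is what lets each run query only records newer than the last one received.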
Playbooks
- Get-GitHubAuditEntry Playbook - runs every 5 minutes and uses the GitHub v4 API (GraphQL) to query for a set of AuditEntry objects. The GraphQL query doesn't pull all of the objects that implement the AuditEntry interface; it pulls the ones most relevant to the detections and hunting queries listed above. The returned data is written to the Azure Sentinel workspace in the GitHub_CL custom table.
- Get-GitHubRepoLogs Playbook - runs every hour and uses the GitHub v3 API to query each repo in each Org for Forks, Clones, Commits, Referrers, Paths, Views, and Collaborators. The returned data is written to the Azure Sentinel workspace in the GitHubRepoLogs_CL custom table.
- Get-GitHubVulnerabilityAlerts Playbook - runs every hour and uses the GitHub v4 API to query for SecurityVulnerabilities. The returned data is written to the Azure Sentinel workspace in the GitHubRepoLogs_CL custom table.
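To give a feel for what the audit playbook sends, here is a minimal Python sketch that builds a GraphQL request body for an organization's audit log. It is not the query the playbook actually uses: the selected fields are a small illustrative subset of the AuditEntry interface, and the helper name `build_audit_request` is hypothetical. The sketch only builds the payload and makes no network call.

```python
import json

# Illustrative subset of AuditEntry fields; the playbook's real query selects more.
AUDIT_QUERY = """
query($org: String!, $cursor: String) {
  organization(login: $org) {
    auditLog(first: 100, after: $cursor) {
      pageInfo { endCursor hasNextPage }
      edges {
        node {
          ... on AuditEntry {
            action
            actorLogin
            actorIp
            createdAt
            operationType
            userLogin
          }
        }
      }
    }
  }
}
"""

def build_audit_request(org, cursor=None):
    """Return the JSON body to POST to the GitHub GraphQL endpoint.

    `cursor` is the endCursor saved from the previous run; None starts
    from the beginning of the log.
    """
    return {"query": AUDIT_QUERY, "variables": {"org": org, "cursor": cursor}}

if __name__ == "__main__":
    print(json.dumps(build_audit_request("sampleorg")["variables"]))
```

The `endCursor` returned in `pageInfo` is the kind of cursor string a playbook can store between runs so that it only asks for newer entries.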
ARM Template
To deploy the ARM template containing all the playbooks, connections, storage account and key vault and configure the resources:
1. Generate a GitHub Personal Access Token at https://github.com/settings/tokens.
   - GitHub user settings -> Developer settings -> Personal access tokens.
2. Get the objectId for a user that the Logic App can use.
   - Azure Portal -> Azure Active Directory -> Users -> User. This user will be used to grant access to the Key Vault secret.
3. Deploy the ARM template and fill in the parameters.
   - Go to https://github.com/Azure/Azure-Sentinel/tree/master/DataConnectors/GitHub and click the Deploy to Azure button.
   - "PersonalAccessToken": The GitHub PAT from Step 1.
   - "UserName": A user that will be granted access to the key vault to read the PAT.
   - "principalId": The user object ID for the username above.
   - "workspaceId": The Azure Sentinel Workspace ID.
   - "workSpaceKey": The Azure Sentinel Workspace Key.
4. There are two JSON files (ORGS.json and lastrun-Audit.json).
   - Download and edit the ORGS.json file: in "org": "sampleorg", replace sampleorg with your org name. If you have additional orgs, add another line {"org": "sampleorg"} for each org you want to monitor.
   - Upload ORGS.json and lastrun-Audit.json to the storage account's githublogicapp container.
5. Go to the keyvault - GitHubPlaybooks connection resource.
   - Click Edit API Connection.
   - Click Authorize. Sign in as the user that was provided in the parameters. Click Save.
6. The playbooks are deployed as disabled because the JSON files have to be uploaded and the connection has to be authorized first. Go to each playbook and click Enable.
GitHub data will now be ingested into the GitHub_CL and GitHubRepoLogs_CL tables in Sentinel.
Monitoring GitHub
User identity is a key attack vector when it comes to GitHub, and it should be protected and monitored. Since you can implement SSO using Azure Active Directory (Azure AD) for authentication, you can collect Azure AD data into Azure Sentinel using the built-in connector and use our detections and hunting queries to monitor for suspicious identity events. But what about GitHub-specific threats? There are a number of scenarios that an attacker could attempt to exploit in order to gain access to your organization's sensitive data in GitHub that wouldn't appear in Azure AD logs. Below we will look at some of these, as well as ideas for how to hunt and monitor for them.
Parsing the Data
Before building detections or hunting queries on the GitHub data we collected we can use a KQL Function to parse and normalize the data to make it easier to use. For more background on Functions please read this blog.
In the case of GitHub data, a large number of fields are returned from the GitHub REST API and GraphQL API, so the parser helps us select the subset of fields relevant to monitoring GitHub. You can find our suggested parsers on the Azure Sentinel GitHub, but you can also modify these parsers to fit your needs and preferences.
Since we separated the data into two tables we created two parsers.
GitHub_CL Table Parser:
GitHub_CL
| project TimeGenerated=node_createdAt_t,
Organization=columnifexists('node_organizationName_s', ""),
Action=node_action_s,
OperationType=node_operationType_s,
Repository=columnifexists('node_repositoryName_s',""),
Actor=node_actorLogin_s,
IPaddress=node_actorIp_s,
City=node_actorLocation_city_s,
Country=node_actorLocation_country_s,
ImpactedUser=columnifexists('node_userLogin_s', ""),
ImpactedUserEmail=columnifexists('node_user_email_s', ""),
InvitedUserPermission=node_permission_s,
Visability=columnifexists('node_visibility_s',""),
OauthApplication=columnifexists('node_oauthApplicationName_s',""),
OauthApplicationUrl=columnifexists('node_applicationUrl_s',""),
OauthApplicationState=columnifexists('node_state_s',""),
UserCanInviteCollaborators=columnifexists('node_canInviteOutsideCollaboratorsToRepositories_b',""),
MembershipType=columnifexists('node_membershipTypes_s',""),
CurrentPermission=columnifexists('node_permission_s',""),
PreviousPermission=columnifexists('node_permissionWas_s',""),
TeamName=columnifexists('node_teamName_s',""),
Reason=columnifexists('node_reason_s',""),
BlockedUser=columnifexists('node_blockedUserName_s',""),
CanCreateRepositories=columnifexists('canCreateRepositories_b',"")
GitHubRepoLogs_CL Table Parser:
GitHubRepoLogs_CL
| project TimeGenerated = created_at_t,
Organization=columnifexists('Organization_s', ""),
Repository=columnifexists('Repository_s',""),
Action=columnifexists('LogType_s',""),
Actor=coalesce(login_s, owner_login_s),
ActorType=coalesce(owner_type_s, type_s),
IsPrivate=columnifexists('private_b',""),
ForksUrl=columnifexists('forks_url_s',""),
PushedAt=columnifexists('pushed_at_t',""),
IsDisabled=columnifexists('disabled_b',""),
AdminPermissions=columnifexists('permissions_admin_b',""),
PushPermissions=columnifexists('permissions_push_b',""),
PullPermissions=columnifexists('permissions_pull_b',""),
ForkCount=columnifexists('forks_count_d',""),
Count=columnifexists('count_d',""),
UniqueUsersCount=columnifexists('uniques_d',""),
DismmisedAt=columnifexists('dismissedAt_t',""),
Reason=columnifexists('dismissReason_s',""),
vulnerableManifestFilename = columnifexists('vulnerableManifestFilename_s',""),
For the queries we will look at in the following sections, we are going to save those parsers with the aliases GitHubAudit and GitHubRepo. Details on configuring and using a Function as a parser can be found in this blog.
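The parsers lean heavily on columnifexists, which returns a column when it exists and a default value otherwise. As a rough analogy only, the same normalization could be sketched in Python as below. KQL resolves column existence against the table schema rather than per record, and the helper names here are hypothetical.

```python
def column_if_exists(record, name, default=""):
    """Return record[name] when the field is present, otherwise the default."""
    return record.get(name, default)

def parse_audit_record(raw):
    """Map a raw GitHub_CL-style record to a few of the parser's friendly names."""
    return {
        "TimeGenerated": raw.get("node_createdAt_t"),
        "Action": raw.get("node_action_s"),
        "Actor": raw.get("node_actorLogin_s"),
        # Optional fields: not every audit entry type carries them.
        "Organization": column_if_exists(raw, "node_organizationName_s"),
        "Repository": column_if_exists(raw, "node_repositoryName_s"),
        "ImpactedUser": column_if_exists(raw, "node_userLogin_s"),
    }
```

Because missing fields come back as empty strings, downstream filters such as a check for a non-empty impacted user keep working even when an entry type never populates that field.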
Detection & Hunting Queries
The following queries are designed to help you find suspicious activity in your GitHub organization, and whilst many are likely to return legitimate activity as well as potentially malicious activity, they can be useful in guiding your hunting. If after running these queries you are confident with the results you could consider turning some or all of them into Azure Sentinel Analytics to alert on.
Brute Force Attack against GitHub Account
Mitre ATT&CK Tactic Credential Access technique T1110
Attackers may try to guess your users' passwords or use brute-force methods to get in. If your organization uses SSO with Azure Active Directory, sign-ins to GitHub.com generate Azure AD authentication logs. The following query can help you identify a sudden increase in failed logon attempts per user.
let LearningPeriod = 7d;
let BinTime = 1h;
let RunTime = 1h;
let StartTime = 1h;
let NumberOfStds = 3;
let MinThreshold = 10.0;
let EndRunTime = StartTime - RunTime;
let EndLearningTime = StartTime + LearningPeriod;
let GitHubFailedSSOLogins = (SigninLogs
| where AppDisplayName == "GitHub.com"
| where ResultType == 50056);
GitHubFailedSSOLogins
| where TimeGenerated between (ago(EndLearningTime) .. ago(StartTime))
| summarize FailedLoginsCountInBinTime = count() by User = Identity, bin(TimeGenerated, BinTime)
| summarize AvgOfFailedLoginsInLearning = avg(FailedLoginsCountInBinTime), StdOfFailedLoginsInLearning = stdev(FailedLoginsCountInBinTime) by User
| extend LearningThreshold = max_of(AvgOfFailedLoginsInLearning + StdOfFailedLoginsInLearning * NumberOfStds, MinThreshold)
| join kind=innerunique (
GitHubFailedSSOLogins
| where TimeGenerated between (ago(StartTime) .. ago(EndRunTime))
| summarize FailedLoginsCountInRunTime = count() by User = Identity
) on User
| where FailedLoginsCountInRunTime > LearningThreshold
| extend AccountCustomEntity = User
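The threshold arithmetic in this query can be checked by hand with a small Python sketch. It assumes the hourly failure counts for one user have already been collected into a list, and the function names are illustrative. Like the KQL stdev() aggregation, statistics.stdev computes the sample standard deviation.

```python
from statistics import mean, stdev

def learning_threshold(hourly_counts, number_of_stds=3, min_threshold=10.0):
    """Return max(avg + std * number_of_stds, min_threshold) for one user."""
    avg = mean(hourly_counts)
    # A single bin has no spread to measure; treat it as zero (an assumption).
    std = stdev(hourly_counts) if len(hourly_counts) > 1 else 0.0
    return max(avg + std * number_of_stds, min_threshold)

def is_burst(run_count, hourly_counts):
    """True when the run-window count exceeds the learned threshold."""
    return run_count > learning_threshold(hourly_counts)
```

For example, hourly counts of 10, 20 and 30 give an average of 20 and a standard deviation of 10, so the threshold is 50. A quiet user with counts of 1, 2 and 3 would compute 5, and the minimum threshold of 10 applies instead, which keeps low-volume accounts from alerting on tiny fluctuations.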
GitHub Activities from Infrequent Country
Mitre ATT&CK Tactic Initial Access technique T1078
This query detects activities from a location that an actor has never, or not recently, connected from.
// If you want to look at user added further than 7 days ago adjust this value
let LearningPeriod = 7d;
let RunTime = 1h;
let StartTime = 1h;
let EndRunTime = StartTime - RunTime;
let EndLearningTime = StartTime + LearningPeriod;
let GitHubCountryCodeLogs = (GitHubAudit
| where Country != "");
GitHubCountryCodeLogs
| where TimeGenerated between (ago(EndLearningTime) .. ago(StartTime))
| summarize makeset(Country) by Actor
| join kind=innerunique (
GitHubCountryCodeLogs
| where TimeGenerated between (ago(StartTime) .. ago(EndRunTime))
| distinct Country, Actor
) on Actor
| where set_Country !contains Country
| extend AccountCustomEntity = Actor
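The logic of this query is a per-actor set comparison between the learning window and the run window. A minimal Python sketch, assuming events have been reduced to (actor, country) pairs and using an illustrative function name:

```python
def new_country_activity(learning_events, run_events):
    """Return (actor, country) pairs seen in the run window but not in learning.

    Mirrors the inner join on Actor: actors with no history in the learning
    window are not reported, because there is nothing to compare against.
    """
    seen = {}
    for actor, country in learning_events:
        seen.setdefault(actor, set()).add(country)
    return sorted(
        {
            (actor, country)
            for actor, country in run_events
            if actor in seen and country not in seen[actor]
        }
    )
```

Note the consequence of the inner join: a brand-new actor will not appear in these results, so it is worth pairing this detection with the new-account hunting query.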
Sign-in Burst from Multiple Locations
Mitre ATT&CK Tactic Credential Access technique T1110
This query over Azure Active Directory sign-in activity to GitHub.com highlights accounts associated with multiple authentications from different geographical locations in a short space of time.
let RunTime = 1h;
SigninLogs
| where TimeGenerated > ago(RunTime)
| where AppDisplayName == "GitHub.com"
| where ResultType == 0
| summarize CountOfLocations = dcount(Location), Locations = make_set(Location) by User = Identity
| where CountOfLocations > 1
| extend AccountCustomEntity = User
Threat Intel Matches to GitHub Audit Logs
Mitre Mitigation Threat Intelligence Program technique T1212
Azure Sentinel integrates with Microsoft Graph Security API data sources for ingesting threat intelligence indicators. This query identifies a match in GitHub audit log data for any IP address IOC from TI.
ThreatIntelligenceIndicator
| where TimeGenerated >= ago(24h)
| where Active == true
// Picking up only IOC's that contain the entities we want
| where isnotempty(NetworkIP) or isnotempty(EmailSourceIpAddress) or isnotempty(NetworkDestinationIP) or isnotempty(NetworkSourceIP)
// Taking the first non-empty value based on potential IOC match availability
| extend TI_ipEntity = iff(isnotempty(NetworkIP), NetworkIP, NetworkDestinationIP)
| extend TI_ipEntity = iff(isempty(TI_ipEntity) and isnotempty(NetworkSourceIP), NetworkSourceIP, TI_ipEntity)
| extend TI_ipEntity = iff(isempty(TI_ipEntity) and isnotempty(EmailSourceIpAddress), EmailSourceIpAddress, TI_ipEntity)
| join (
GitHubAudit
| where TimeGenerated >= ago(24h)
| extend GitHubAudit_TimeGenerated = TimeGenerated
)
on $left.TI_ipEntity == $right.IPaddress
| summarize LatestIndicatorTime = arg_max(TimeGenerated, *) by IndicatorId
| project LatestIndicatorTime, Description, ActivityGroupNames, IndicatorId, ThreatType, Url, ExpirationDateTime, ConfidenceScore, GitHubAudit_TimeGenerated, TI_ipEntity, IPaddress, Actor, Action, Country, OperationType, NetworkIP, NetworkDestinationIP, NetworkSourceIP, EmailSourceIpAddress
| extend timestamp = GitHubAudit_TimeGenerated, IPCustomEntity = IPaddress, AccountCustomEntity = Actor
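The chain of iff() calls in this query picks the first non-empty IP field of each indicator and then joins on it. A Python sketch of that selection and match, assuming indicators and audit events are plain dictionaries and using illustrative function names:

```python
# Same precedence as the iff() chain in the query.
IP_FIELDS = ("NetworkIP", "NetworkDestinationIP", "NetworkSourceIP", "EmailSourceIpAddress")

def ti_ip_entity(indicator):
    """Return the first non-empty IP field of an indicator, or "" if none is set."""
    for field in IP_FIELDS:
        value = indicator.get(field)
        if value:
            return value
    return ""

def match_indicators(indicators, audit_events):
    """Return (IndicatorId, Actor, ip) for audit events whose IP matches an indicator."""
    events_by_ip = {}
    for event in audit_events:
        events_by_ip.setdefault(event["IPaddress"], []).append(event)
    matches = []
    for indicator in indicators:
        ip = ti_ip_entity(indicator)
        if not ip:
            continue  # indicator carries no usable IP
        for event in events_by_ip.get(ip, []):
            matches.append((indicator["IndicatorId"], event["Actor"], ip))
    return matches
```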
Two Factor Authentication Disabled
Mitre ATT&CK Tactic Defense Evasion technique T1089
Two-factor authentication is a process where a user is prompted during the sign-in process for an additional form of identification, such as entering a code on their cellphone or providing a fingerprint scan. Two-factor authentication reduces the risk of account takeover. An attacker may want to disable such security tools in order to go undetected.
let timeframe = 14d;
GitHubAudit
| where TimeGenerated > ago(timeframe)
| where Action == "org.disable_two_factor_requirement"
| project TimeGenerated, Action, Actor, Country, IPaddress, Repository
| extend AccountCustomEntity = Actor, IPCustomEntity = IPaddress
Security Vulnerability in Repo
Mitre ATT&CK Tactic Execution technique T1203
This alerts when a new security vulnerability is discovered in a GitHub repository.
let timeframe = 14d;
GitHubRepo
| where TimeGenerated > ago(timeframe)
| where Action == "vulnerabilityAlert"
| project TimeGenerated, DismmisedAt, Reason, vulnerableManifestFilename, Description, Link, PublishedAt, Severity, Summary
Inactive or New Account Usage
Mitre ATT&CK tactic Persistence technique T1136
This hunting query identifies accounts that are new or inactive and have accessed or used GitHub, which may be a sign of compromise.
// If you want to look at user added further than 7 days ago adjust this value
let LearningPeriod = 7d;
let RunTime = 1h;
let StartTime = 1h;
let EndRunTime = StartTime - RunTime;
let EndLearningTime = StartTime + LearningPeriod;
let GitHubActorLogin = (GitHub_CL
| where actorLogin_s != "");
let GitHubUser = (GitHub_CL
| where userLogin_s != "");
let GitHubNewActorLogin = ( GitHubActorLogin
| where TimeGenerated between (ago(EndLearningTime) .. ago(StartTime))
| summarize makeset(actorLogin_s)
| extend Dummy = 1
| join kind=innerunique
(
GitHubActorLogin
| where TimeGenerated between (ago(StartTime) .. ago(EndRunTime))
| distinct actorLogin_s
| extend Dummy = 1
)
on Dummy
| project-away Dummy
| where set_actorLogin_s !contains actorLogin_s);
let GitHubNewUser = ( GitHubUser
| where TimeGenerated between (ago(EndLearningTime) .. ago(StartTime))
| summarize makeset(userLogin_s)
| extend Dummy = 1
| join kind=innerunique
(
GitHubUser
| where TimeGenerated between (ago(StartTime) .. ago(EndRunTime))
| distinct userLogin_s
| extend Dummy = 1
)
on Dummy
| project-away Dummy
| where set_userLogin_s !contains userLogin_s);
union GitHubNewActorLogin, GitHubNewUser
Unusual Number of Repository Clones
Mitre ATT&CK Tactic Collection technique T1213
An attacker can exfiltrate data from your GitHub repository after gaining access to it by performing a clone action. This hunting query allows you to track the clone activity for each of your repositories. The visualization allows you to quickly identify anomalous or excessive cloning, and to further investigate repo access and permissions.
let min_t = toscalar(GitHubRepo
| summarize min(timestamp_t));
let max_t = toscalar(GitHubRepo
| summarize max(timestamp_t));
GitHubRepo
| where Action == "Clones"
| distinct TimeGenerated, Repository, Count
| make-series num=sum(Count) default=0 on TimeGenerated in range(min_t, max_t, 1h) by Repository
| extend (anomalies, score, baseline) = series_decompose_anomalies(num, 1.5, -1, 'linefit')
| render timechart
User Grant Access and Grants Other Access
Mitre ATT&CK Tactic Persistence, Privilege Escalation technique T1098, T1078
Identifies when a new user is granted access and starts granting access to other users. This can help you identify rogue or malicious user behavior.
GitHubAudit
| where Action == "org.invite_member" or Action == "org.add_member" or Action == "team.add_member" or Action == "repo.add_member"
| distinct ImpactedUser, TimeGenerated, Actor
| project-rename firstUserAdded = ImpactedUser, firstEventTime = TimeGenerated, firstAdderUser = Actor
| join kind= innerunique (
GitHubAudit
| where ImpactedUser != ""
| where Action == "org.invite_member" or Action == "org.add_member" or Action == "team.add_member" or Action == "repo.add_member"
| distinct ImpactedUser, TimeGenerated, Actor
| project-rename secondUserAdded = ImpactedUser, secondEventTime = TimeGenerated, secondAdderUser = Actor
) on $right.secondAdderUser == $left.firstUserAdded
| where secondEventTime between (firstEventTime .. (firstEventTime + 1h))
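The self-join in this query links a user being added to that same user adding someone else within one hour. A Python sketch of the same pairing, assuming events are (time, actor, impacted_user) tuples already filtered to the invite and add-member actions, with an illustrative function name:

```python
from datetime import datetime, timedelta

def chained_grants(events, window=timedelta(hours=1)):
    """Return (first_adder, first_added, second_added) chains.

    A chain exists when the user added in one event acts as the adder in
    another event that happens within `window` after the first.
    """
    chains = []
    for first_time, first_adder, first_added in events:
        for second_time, second_adder, second_added in events:
            if (
                second_adder == first_added
                and second_added  # mirrors the ImpactedUser != "" filter
                and first_time <= second_time <= first_time + window
            ):
                chains.append((first_adder, first_added, second_added))
    return chains
```

The nested loop is quadratic, which is acceptable for a sketch; the KQL join performs the equivalent matching on the key columns.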
Oauth App Restrictions Disabled
Mitre ATT&CK Tactic Defense Evasion, Persistence technique T1089, T1100
This hunting query identifies GitHub OAuth Apps that have had restrictions disabled, which may be a sign of compromise. An attacker may want to disable such security tools in order to go undetected.
let timeframe = 14d;
GitHubAudit
| where TimeGenerated > ago(timeframe)
| where Action == "org.disable_oauth_app_restrictions"
| project TimeGenerated, Action, Actor, Country
Mass Deletion of Repositories
Mitre ATT&CK Tactic Impact technique T1485
This hunting query identifies an unusual increase in repo deletion activity. Adversaries may want to disrupt availability or compromise integrity by deleting business data.
let LearningPeriod = 7d;
let BinTime = 1h;
let RunTime = 1h;
let StartTime = 1h;
let NumberOfStds = 3;
let MinThreshold = 10.0;
let EndRunTime = StartTime - RunTime;
let EndLearningTime = StartTime + LearningPeriod;
let GitHubRepositoryDestroyEvents = (GitHubAudit
| where Action == "repo.destroy");
GitHubRepositoryDestroyEvents
| where TimeGenerated between (ago(EndLearningTime) .. ago(StartTime))
| summarize count() by bin(TimeGenerated, BinTime)
| summarize AvgInLearning = avg(count_), StdInLearning = stdev(count_)
| extend LearningThreshold = max_of(AvgInLearning + StdInLearning * NumberOfStds, MinThreshold)
| extend Dummy = 1
| join kind=innerunique (
GitHubRepositoryDestroyEvents
| where TimeGenerated between (ago(StartTime) .. ago(EndRunTime))
| summarize CountInRunTime = count() by bin(TimeGenerated, BinTime)
| extend Dummy = 1
) on Dummy
| project-away Dummy
| where CountInRunTime > LearningThreshold
Org Repositories Default Permissions Change
Mitre ATT&CK Tactic Defense Evasion, Persistence technique T1089, T1098
This hunting query identifies a global change to the organization's permission policy for repositories. The default is read; an attacker may change this permission in order to gain persistence or evade detection.
GitHubAudit
| where Action == "org.update_default_repository_permission"
| project TimeGenerated, Action, Actor, Country, Repository, PreviousPermission, CurrentPermission
Repository Permission Switched to Public
Mitre ATT&CK Tactic Collection technique T1213
This hunting query identifies a change to the visibility state of a repository. This will help identify issues of data breaches.
let timeframe = 14d;
GitHubAudit
| where TimeGenerated > ago(timeframe)
| where Action == "repo.access"
| where OperationType == "MODIFY"
| where Visability == "PUBLIC"
| project TimeGenerated, Action, Actor, Country, Repository, Visability
Suspicious Fork Activity
Mitre ATT&CK Tactic Exfiltration technique T1537
This hunting query identifies a fork activity against a repository done by a user who is not the owner of the repo nor a contributor.
let RunTime = 1h;
let CollaboratorsUserToRepoMapping = (
GitHubRepo
| where TimeGenerated < ago(RunTime)
| where Action == "Collaborators"
| distinct Repository , Actor, Organization);
let UserCommitsInRepoMapping = (
GitHubRepo
| where Action == "Commits"
| where TimeGenerated < ago(RunTime)
| distinct Repository ,Actor, Organization);
union CollaboratorsUserToRepoMapping, UserCommitsInRepoMapping
| summarize ContributedToRepos = make_set(Repository) by Actor, Organization
| join kind=innerunique (
GitHubRepo
| where TimeGenerated > ago(RunTime)
| where Action == "Forks"
| distinct Repository , Actor, Organization
) on Actor, Organization
| project-away Actor1, Organization1
| where ContributedToRepos !contains Repository
User First Time Repository Delete Activity
Mitre ATT&CK Tactic Impact technique T1485
This hunting query identifies users who delete a repository in the run window but had no repository delete activity during the learning period.
let LearningPeriod = 7d;
let RunTime = 1h;
let StartTime = 1h;
let EndRunTime = StartTime - RunTime;
let EndLearningTime = StartTime + LearningPeriod;
let GitHubRepositoryDestroyEvents = (GitHubAudit
| where Action == "repo.destroy");
GitHubRepositoryDestroyEvents
| where TimeGenerated between (ago(EndLearningTime) .. ago(StartTime))
| distinct Actor
| join kind=rightanti (
GitHubRepositoryDestroyEvents
| where TimeGenerated between (ago(StartTime) .. ago(EndRunTime))
| distinct Actor
) on Actor
First Time User Invite and Add Member to Org
Mitre ATT&CK Tactic Persistence technique T1136
This hunting query identifies a user who adds or invites a member to the organization for the first time. Attackers can leverage this technique to add stealthy account access to the organization.
let LearningPeriod = 7d;
let RunTime = 1h;
let StartTime = 1h;
let EndRunTime = StartTime - RunTime;
let EndLearningTime = StartTime + LearningPeriod;
let GitHubOrgMemberLogs = (GitHubAudit
| where Action == "org.invite_member" or Action == "org.update_member" or Action == "org.add_member" or Action == "repo.add_member" or Action == "team.add_member");
GitHubOrgMemberLogs
| where TimeGenerated between (ago(EndLearningTime) .. ago(StartTime))
| distinct Actor
| join kind=rightanti (
GitHubOrgMemberLogs
| where TimeGenerated between (ago(StartTime) .. ago(EndRunTime))
| distinct Actor
) on Actor
The GitHub hunting queries detailed in this blog have been shared on the Azure Sentinel GitHub along with the parser, ARM template and a workbook. We will continue to develop detections and hunting queries for GitHub data over time, so make sure you keep an eye on GitHub. As always, if you have your own ideas for queries or detections, please feel free to contribute to the Azure Sentinel community.