O365 Activity Management API - Performance for huge audit log streams

Kristian Lindberg Vinther
Visitor

This question is regarding the O365 Activity Management API.

 

We are using the API to retrieve audit log events from multiple channels (Azure AD, SharePoint, etc.) for a very large tenant, meaning that we need to retrieve potentially millions of events over a relatively short time span.

 

O365 gathers audit events into a series of "blobs", each of which contains a number of individual events (JSON messages). To my understanding, which comes partly from correspondence with the API's dev team and partly from reading the docs, these blobs should contain a "considerable" number of events, so that they function as a sort of batching mechanism for the actual web requests.

 

In our approach, we request the blob URLs for a one-hour interval, and then make a request for each individual blob.
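A minimal sketch of this two-step flow, for readers who want something concrete. It is not our actual fetcher. The `http_get` helper is hypothetical (it stands for whatever authenticated HTTP call the service uses), and the `contentUri` field and `NextPageUri` header are assumptions taken from the API reference, not from anything stated in this post.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical helper signature: takes a URL, returns (response headers, body text).
HttpGet = Callable[[str], Tuple[Dict[str, str], str]]


def list_blob_uris(http_get: HttpGet, listing_url: str) -> Iterator[str]:
    """Step 1: yield the URL of every blob listed for the requested interval."""
    url: Optional[str] = listing_url
    while url:
        headers, body = http_get(url)
        for entry in json.loads(body):
            # Assumption: each listing entry carries the blob URL in "contentUri".
            yield entry["contentUri"]
        # Assumption: a further page, if any, is announced in a NextPageUri header.
        url = headers.get("NextPageUri")


def fetch_events(http_get: HttpGet, blob_uris: Iterable[str]) -> Iterator[dict]:
    """Step 2: request each blob and yield the individual events it contains."""
    for uri in blob_uris:
        _, body = http_get(uri)
        yield from json.loads(body)
```

Because the HTTP call is passed in, the same two functions can be exercised against canned responses before pointing them at a real tenant.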

 

However, having tested with a number of different tenants and different PublisherIdentifiers, we only seem to get around 2.5 messages per blob on average, no matter the total number of events "waiting" to be fetched.
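To make the scale of that figure concrete, here is a small sketch of how the average can be measured and what it implies. The function names are made up for illustration, and the one million events in the usage note is a round number chosen for the arithmetic, not a measurement.

```python
import json
import math
from typing import Iterable


def average_events_per_blob(blob_bodies: Iterable[str]) -> float:
    """Average number of events across blobs, each given as a JSON array string."""
    counts = [len(json.loads(body)) for body in blob_bodies]
    return sum(counts) / len(counts) if counts else 0.0


def requests_needed(total_events: int, avg_per_blob: float) -> int:
    """Number of blob requests needed to retrieve total_events at a given average."""
    return math.ceil(total_events / avg_per_blob)
```

At 2.5 events per blob, `requests_needed(1_000_000, 2.5)` comes to 400,000 blob requests, before counting the listing requests themselves.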

 

This becomes a major issue for the larger tenants, as it puts a strain on the SIEM solution running the fetcher logic (a Python service) due to the number of requests per second, and it also causes throttling issues with the API itself. In effect, we simply cannot fetch the audit events fast enough to keep up within the retention period.
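For context on the throttling side, a common way to cope is to retry with exponential backoff. The sketch below is a generic illustration only: the `Throttled` exception is hypothetical and stands for however the HTTP layer signals a throttled response, and the retry count and delays are arbitrary. It does not address the underlying problem of small blobs.

```python
import time
from typing import Callable


class Throttled(Exception):
    """Hypothetical: raised by the caller's HTTP helper when the API throttles."""


def with_backoff(fetch: Callable[[str], str], url: str, retries: int = 5,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Call fetch(url), doubling the wait after each throttled attempt."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(retries):
        try:
            return fetch(url)
        except Throttled:
            if attempt == retries - 1:
                raise  # out of attempts, let the caller decide what to do
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so the waiting can be replaced in tests; in a service it would be left at its default.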

 

A "funny" thing is that if we use the visual query tool within the Admin Center of the tenant, it searches and retrieves the log messages very fast.

 

Has anyone had any experience with this issue, or perhaps seen better "batch performance"?

 

As mentioned, we have been in direct contact with the dev team and the program manager in Redmond. They have been very helpful with other issues we had, but they referred us to support for this specific issue, who in turn referred us to the forums / community. We currently do not have access to premium support...

 

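Example request for content blobs for an hour: https://manage.office.com/api/v1.0/{tenantid}/activity/feed/subscriptions/content?contentType=Audit.Exchange&PublisherIdentifier={pub.id}&startTime=2017-12-03T10%3A31%3A24&endTime=2017-12-03T11%3A31%3A24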
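The listing request takes `contentType`, `PublisherIdentifier`, `startTime` and `endTime` as query parameters. A minimal sketch of building such a URL for a one-hour window follows. The function name and argument names are made up for illustration; only the URL path and parameter names come from the request above.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode


def content_listing_url(tenant_id: str, content_type: str, publisher_id: str,
                        start: datetime, hours: int = 1) -> str:
    """Build the blob listing URL for the window [start, start + hours)."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    end = start + timedelta(hours=hours)
    # urlencode percent-encodes the colons in the timestamps (":" becomes "%3A").
    query = urlencode({
        "contentType": content_type,
        "PublisherIdentifier": publisher_id,
        "startTime": start.strftime(fmt),
        "endTime": end.strftime(fmt),
    })
    return (f"https://manage.office.com/api/v1.0/{tenant_id}"
            f"/activity/feed/subscriptions/content?{query}")
```

Stepping `start` forward by one hour per call walks through the backlog window by window.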

 

Example request for an individual blob:
Here we just use the URLs given to us in the request above.

 
