Provide Better Error Code than 503 on Consistent timeouts

 Sep 23 2021

We have an application that uses Microsoft Graph's delta query API. Our use case is to fetch the last year and the next year of calendar data for a given user. A very small set of users see 503s with an "Unknown Error" string returned when using this query.

We raised this issue with Microsoft support [Case #:27172037]. They indicated that this is a timeout on the Microsoft side caused by too much data for some users, advised us to reduce our window, and suggested raising the issue here.
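For illustration, here is a minimal Python sketch of what "reducing the window" could look like on the client side. The helper names split_window and delta_url and the 90-day chunk size are assumptions made for this example, not something support specified. As far as we understand, each sub-window would need its own delta round and its own delta token, so this multiplies the state we have to track.

```python
from datetime import date, timedelta


def split_window(start, end, max_days):
    """Split [start, end) into consecutive sub-windows of at most max_days days."""
    if max_days <= 0:
        raise ValueError("max_days must be positive")
    windows = []
    cursor = start
    while cursor < end:
        upper = min(cursor + timedelta(days=max_days), end)
        windows.append((cursor, upper))
        cursor = upper
    return windows


def delta_url(start, end):
    """Build the calendarView delta URL for one sub-window (hypothetical helper)."""
    return (
        "https://graph.microsoft.com/v1.0/me/calendarView/delta"
        f"?startDateTime={start.isoformat()}&endDateTime={end.isoformat()}"
    )


if __name__ == "__main__":
    # 90 days is an arbitrary chunk size chosen for the example.
    for lo, hi in split_window(date(2020, 8, 6), date(2022, 8, 6), 90):
        print(delta_url(lo, hi))
```

The chunk size is a guess in any case: we have no way to know in advance how much data a given user has in a given window.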

Repro

1) Authenticate with O365 using OAuth.
2) Make a request to the API for a delta token that looks something like this:


curl -H "Accept: application/json" -H "Content-Type: application/json" -H "Authorization: Bearer $USER_OAUTH_TOKEN" -X GET "https://graph.microsoft.com/v1.0/me/calendarView/delta?StartDateTime=2020-08-06&endDateTime=2022-08-06"

Again, this type of request succeeds for 99.9% of our users but fails reliably for just a few. Normally, when the delta token request succeeds and we have a delta token, we can page through the data and use it in our system without problems.

We would love to see this issue resolved so that it works for all users or, failing that, to at least get a more reliable and informative error code. Because 503s can also happen when there are intermittent internal networking issues, it is difficult to condition logic on this error.
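To show why this is awkward, here is a minimal Python sketch of the kind of heuristic a client is left with today: retry with backoff and treat a 503 that survives every attempt as the "too much data" case. The function name classify_503, the label strings, and the attempt and delay values are all assumptions for this example. send_request stands for any callable that performs the delta request and returns the HTTP status code.

```python
import time


def classify_503(send_request, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send_request() up to `attempts` times with exponential backoff.

    Returns a (label, status) tuple:
      ("ok", status)            for a status below 400
      ("other_error", status)   for any other non-503 status
      ("persistent_503", 503)   when every attempt returned 503
    """
    for attempt in range(attempts):
        status = send_request()
        if status != 503:
            return ("ok" if status < 400 else "other_error", status)
        # Back off before the next try, but not after the final attempt.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return ("persistent_503", 503)
```

Even with this in place, "persistent_503" is only a guess: a longer outage on the service side looks exactly the same, which is why a distinct error code would help.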

Comments
Copper Contributor

I would like to add to this issue:

 

A different error code will not be enough.

We have experienced this problem when the API tried to deliver too much data in a single response. Large amounts of data that can be delivered through paging are no problem, but the two main scenarios we have seen where the API does not page and reliably runs into 503s are:

  • long event series with many exceptions
  • events with veeery long participant lists

For the second case, reducing the window is not possible at all.
For the first case there is no feasible way to handle this either. Even if we had a good estimate of how many events fit into a single response, we lack information about the recurrence data of the faulty series, so we cannot adjust the window accordingly.

 

Our only "solution" to this is currently to have users manually check for potentially blocking events and delete/change them.

 

This is by far our biggest problem with the API. Users are very annoyed. Unfortunately we have to keep telling them that this is a Microsoft issue and we can't do much about it.
It is easily reproducible too:

  1. Create a series with 250 repetitions
  2. Patch all repetitions into series exceptions (i.e. change something)
  3. Run a delta query for a window that will contain all 250 repetitions
  4. See 503s

Unfortunately I can only upvote once.

 

Regards
Demian