AATP sensors stopped communicating

Hi,

 

Since this morning (7:30 AM CET), all sensors have been showing the Disconnected status on the AATP portal. I checked the logs, and they show the following error message:

Error HttpResponseMessageExtension Microsoft.Tri.Infrastructure.ExtendedHttpRequestException: Response status code does not indicate success: 500 (Internal Server Error). ---> System.Net.Http.HttpRequestException: Response status code does not indicate success: 500 (Internal Server Error).

 

As far as I know, nobody changed any firewall settings in our environment. We have 35 sensors (US-East, US-West, EMEA, APAC) in two domains, and all of them stopped working at the same time. Around 10:00 AM CET I tried to log in to the AATP portal, but the site was not reachable; that issue has since gone away. I also opened the custom API URL in a browser, and it shows "HTTP Error 503. The service is unavailable." I am not sure whether that is expected in a browser. I tried restarting the sensor on a few DCs, with no luck.
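For anyone who wants to repeat the endpoint check outside a browser, here is a minimal Python sketch. It only reports which status code comes back and groups it into a rough category; it does not tell you whether a 503 is the expected browser response for this endpoint. `SENSOR_API_URL`, `classify_status`, and `probe` are hypothetical names made up for illustration, and the URL is a placeholder, not a real AATP address.

```python
import urllib.error
import urllib.request

# Hypothetical placeholder: substitute the sensor API URL of your own workspace.
SENSOR_API_URL = "https://example.invalid/"


def classify_status(code):
    """Map an HTTP status code to a rough diagnosis string."""
    if 200 <= code < 400:
        return "reachable"
    if 400 <= code < 500:
        return "reachable, request rejected"
    if 500 <= code < 600:
        return "service-side error"
    return "unexpected status"


def probe(url, timeout=10):
    """Return (status_code, diagnosis); status_code is None if no HTTP reply arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            code = resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx replies raise HTTPError; the status code is still available.
        code = err.code
    except OSError as err:
        # DNS failure, refused connection, timeout, TLS problems (URLError is an OSError).
        return None, f"no HTTP response: {err}"
    return code, classify_status(code)


# Usage (run from a DC that hosts a sensor):
#   print(probe(SENSOR_API_URL))
```

A 5xx result means an HTTP server answered and reported a failure on its side, while a `None` status means no HTTP reply arrived at all, which points more toward DNS, proxy, or firewall problems on the path.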

 

Appreciate any help!

Thanks,
David 

 

4 Replies

@dbalogh 
If your workspace is in the US East region (check the workspace About box), there was an issue that was mitigated only a few minutes ago. Do you still see it now?

@EliOfek 

Geolocation: North America / Central America / Caribbean

Still seeing the error. Tried to restart the sensor on one of the DCs, but the service cannot start (error 1067).

@dbalogh 
This is indeed the cluster that had issues.

Are you able to connect to the workspace portal without issues?

@EliOfek Yes, the connection to the portal is working fine. Had problems with it earlier this morning and I noticed the sensor error after I was able to log in.

 

Edit: The status page now shows this incident: "Partial Connectivity issues for sensors in East US region. Investigating - We are currently investigating this issue. Aug 6, 14:13 UTC"