Why did my availability test fail while my website is still available?

Published Jan 07 2021 12:00 PM
Microsoft

A common challenge for app developers, site reliability engineers (SREs), and DevOps engineers is that a synthetic availability test can fail while the application is still functioning perfectly. It can be extremely frustrating to determine whether the root cause of the failure was your application or a network issue.

 

Introducing the new Availability Troubleshooting Report

 

TroubleshooterGif.gif

 
NOTE: The troubleshooting report is only available for URL ping tests.

 

The Troubleshooting Report is intended to help you understand why your customers may have problems accessing your application, or to alert you to potential issues even while all metrics indicate it is healthy.

 

It can be accessed through the portal by selecting a test result from the scatter plot or Drill Into section. Each dependency will have an individual troubleshooting report attached.

 

casocha_0-1610049088239.png

If a step fails, then it will appear at the top of the availability result to give you instant insight into where the problem might be. If no step fails, then the troubleshooting report will be closed by default.

 

Common Test Failures & Potential Root Causes:

 
 

DNS.png

 

DNS lookup can fail if your DNS record is not publicly available. The ping test needs to resolve the record from outside your network in order to work.

 

If you need to test against a private DNS record, then use the TrackAvailability SDK. This enables you to run availability tests behind a firewall or in an isolated environment, expand your test region selection, and author more complex tests than are available in the portal UI.
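
As a rough way to reason about the DNS step from your own machine, the sketch below resolves a hostname and reports whether any returned address is publicly routable. It is only an illustration: the function name `classify_dns` and its return labels are made up for this example, and your local resolver may answer differently from the one the availability test uses (for instance with split-horizon DNS), so treat the result as a hint rather than a diagnosis.

```python
import ipaddress
import socket


def classify_dns(hostname, resolver=socket.getaddrinfo):
    """Resolve hostname and return "unresolved", "private-only", or "public".

    `resolver` is injectable so the logic can be exercised without a network.
    """
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        return "unresolved"

    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address string for both IPv4 and IPv6.
    # Strip any IPv6 scope id ("fe80::1%eth0") before parsing.
    addresses = {entry[4][0].split("%")[0] for entry in results}
    if not addresses:
        return "unresolved"
    if any(ipaddress.ip_address(addr).is_global for addr in addresses):
        return "public"
    return "private-only"
```

A result of "unresolved" or "private-only" suggests that a URL ping test would not be able to reach the endpoint by name, which is the case where the TrackAvailability SDK is the better fit.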

 

ConnectionFailed.png

 

Connection Failed indicates that there might be a firewall blocking our service from accessing your endpoints.
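
To see what a connection failure looks like at the TCP level, here is a minimal sketch that attempts a connection and distinguishes a timeout from an active refusal. The function name `check_connection` and its return labels are assumptions for this example. Keep in mind that a check run from your own machine uses a different source address than the availability testing engine, so a firewall can let this succeed while still blocking the real test.

```python
import socket


def check_connection(host, port=443, timeout=5.0):
    """Attempt a TCP connection and return a short diagnosis label."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # No answer at all: packets are often being dropped silently,
        # which is typical of a firewall rule.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered but nothing accepted the connection.
        return "refused"
    except OSError as exc:
        # Covers DNS errors, unreachable networks, and similar failures.
        return f"error: {exc}"
```

A "timeout" from a network that should have access is the pattern most consistent with traffic being filtered before it reaches your endpoint.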

 

You can add the Application Insights Availability service tag to your Network Security Group (NSG) or Azure Firewall to allow inbound traffic only from our testing engine. Service tags automatically update the list of allowed IP addresses for specific services, which minimizes the need to update network security rules by hand. You can also allow traffic by individual IP addresses.
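
For illustration, an inbound NSG rule that uses the service tag might look like the following, written here as a Python dictionary that serializes to the JSON shape of a security rule in an ARM template. The rule name, description, priority, and port are placeholder values chosen for this sketch, and the property names reflect the ARM security rule schema as we understand it, so check them against the current template reference before deploying anything.

```python
import json

# Hypothetical inbound rule; name, priority, and port are placeholders.
availability_rule = {
    "name": "AllowAppInsightsAvailability",
    "properties": {
        "description": "Allow Application Insights availability tests",
        "protocol": "Tcp",
        # The service tag stands in for the list of test-agent IP addresses.
        "sourceAddressPrefix": "ApplicationInsightsAvailability",
        "sourcePortRange": "*",
        "destinationAddressPrefix": "*",
        "destinationPortRange": "443",
        "access": "Allow",
        "priority": 200,
        "direction": "Inbound",
    },
}

print(json.dumps(availability_rule, indent=2))
```

Because the source is a service tag rather than a fixed address list, the rule does not need to be edited when the underlying IP ranges change.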

 

If you need to run tests without allowing any traffic into your virtual network, then we recommend using the TrackAvailability SDK.

 

StatusCode.png

 

Status Code & Content Validation checks that your webpage returns the expected response code and contains specific content.

 

Contact the application owners to investigate why the page returns an incorrect code or is missing content.
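
When handing an issue like this to the application owners, it can help to reproduce the check yourself. The sketch below is one possible approximation of a status code and content check, not the service's actual implementation: `validate_response` and `fetch` are names invented for this example, and the substring match is a simple case-sensitive comparison.

```python
import urllib.error
import urllib.request


def validate_response(status, body, expected_status=200, required_text=None):
    """Return a list of validation failures; an empty list means success."""
    failures = []
    if status != expected_status:
        failures.append(f"expected status {expected_status}, got {status}")
    if required_text is not None and required_text not in body:
        failures.append(f"required text {required_text!r} not found in response")
    return failures


def fetch(url, timeout=10.0):
    """Fetch url and return (status, body) without raising on 4xx/5xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # HTTPError still carries the status code and the response body.
        return err.code, err.read().decode("utf-8", errors="replace")
```

For example, `validate_response(*fetch(url), required_text="Welcome")` with your own URL and expected text returns an empty list when both checks pass, and a readable list of problems otherwise.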

 

See more:

Troubleshoot your Azure Application Insights availability tests - Azure Monitor | Microsoft Docs

 

 

Last update: Jan 07 2021 11:58 AM