SOLVED

Cannot install a new Light Gateway: TaskCanceledException

I get the following error when I try to install a Light Gateway on a new Domain Controller:

[11B4:0ACC][2018-11-26T11:47:17]i000: 2018-11-26 11:47:17.0800 4532 5 Debug [DeploymentModel] [DeploymentAction=Install]
[11B4:0ACC][2018-11-26T11:47:17]i000: 2018-11-26 11:47:17.2377 4532 5 Debug [DeploymentModel] [IsAfterRestartAndConfigured=False]
[11B4:0F30][2018-11-26T11:49:02]i000: 2018-11-26 11:49:02.5491 4532 11 Error [TaskAwaiter] System.Threading.Tasks.TaskCanceledException: A task was canceled.
at async Microsoft.Tri.Infrastructure.Extensions.HttpClientExtension.GetAsync[](?)
at async Microsoft.Tri.Common.Management.ManagementClient.<>c__DisplayClass9_0.<GetStatusAsync>b__0(?)
at async Microsoft.Tri.Infrastructure.Extensions.HttpClientExtension.RequestAsync[](?)
at async Microsoft.Tri.Common.Management.ManagementClient.GetStatusAsync(?)
[11B4:0F30][2018-11-26T11:49:02]i000: 2018-11-26 11:49:02.5491 4532 11 Error [DeploymentModel] Failed management authentication [CurrentlyLoggedOnUser=mydomain\myuserid Status=Failed Exception=System.Threading.Tasks.TaskCanceledException: A task was canceled.
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Tri.Infrastructure.Extensions.HttpClientExtension.<GetAsync>d__0`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Tri.Common.Management.ManagementClient.<>c__DisplayClass9_0.<<GetStatusAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Tri.Infrastructure.Extensions.HttpClientExtension.<RequestAsync>d__4`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Tri.Common.Management.ManagementClient.<GetStatusAsync>d__9.MoveNext()]
[11B4:0ACC][2018-11-26T11:49:02]i000: 2018-11-26 11:49:02.5491 4532 5 Debug [GatewayBootstrapperApplication] Engine.Quit [deploymentResultStatus=1602 isRestartRequired=False]
[11B4:16F4][2018-11-26T11:49:02]i500: Shutting down, exit code: 0x642

My setup is:

  1. Center server (v1.9.7312.32791) configured using 3rd Party certificate
  2. 2x Domain Controllers
  3. Group Policy containing the 3rd Party certificate public key + certificate chain, attached to the Domain Controllers

I had two Domain Controllers running the LGateway for a year, and I recently replaced one of the DCs with a new one. On the new one the LGateway install won't complete and fails with the above error. Both DCs are in the standard "Domain Controllers" OU and have the same GPOs applied. The DCs persisted through a few Center upgrades a few months ago, from v1.8 to the latest v1.9.

 

I can reach the Center console from my new DC, and I can see all the config, the timeline, etc. To run the LGateway install, I have some PowerShell that downloads the package straight from the Center server, so I always get the latest version:

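# Assumes $app_url (Center host name) and $destfile (local path for the zip) are already set.
$dlfile = "https://${app_url}/api/management/softwareUpdates/gateways/deploymentPackage"
# Prompt for the password of the logged-on user.
$creds = Get-Credential -Credential $env:USERNAME
$wc = New-Object System.Net.WebClient
$wc.Credentials = New-Object System.Net.NetworkCredential($env:USERNAME, $creds.Password)
$wc.DownloadFile($dlfile, $destfile)
# Remove any "downloaded from another computer" block on the file.
Unblock-File -Path $destfile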

This downloads the Gateway Setup zip file just fine, proving connectivity. My domain admin user ID is in the "Microsoft Advanced Threat Analytics Administrators" group on the Center server, and this is the same ID that I'm logged onto the DC with. If I don't provide credentials, I get an HTTP 401 as expected.

 

If I remove the GPO containing the 3rd Party cert from the new DC, I get the "Failed to validate certificate" error in the install log (and it lists my Center cert details).

 

I've checked the Troubleshooting page, which indicates that the ATA Lightweight Gateway could not successfully authenticate against the ATA Center, but it gives no advice for the case where you can access the Center console from the DC you're installing on. My existing DC is working perfectly with the Center server and is listed on the "Gateways" page as it has always been. I've checked the 3rd Party certificate installed in the GPO, and its thumbprint and the thumbprints of every certificate in the chain match between the Center server, the working DC, and the non-working DC.
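
For anyone repeating the thumbprint comparison described above, here is a minimal sketch of one way to do it in PowerShell. The function names are made up for illustration, and it assumes the chain certificates live in the LocalMachine Root and CA stores, which the thread does not confirm. Thumbprints copied from the certificate dialog often carry spaces or hidden characters, so they are normalized before comparing.

```powershell
# Strip everything that is not a hex digit (spaces, hidden characters) and uppercase.
function ConvertTo-NormalizedThumbprint {
    param([string]$Thumbprint)
    ($Thumbprint -replace '[^0-9A-Fa-f]', '').ToUpperInvariant()
}

# Return the expected thumbprints that are absent from the list of present ones.
function Find-MissingThumbprint {
    param([string[]]$Expected, [string[]]$Present)
    $have = @($Present | ForEach-Object { ConvertTo-NormalizedThumbprint $_ })
    $Expected |
        ForEach-Object { ConvertTo-NormalizedThumbprint $_ } |
        Where-Object { $have -notcontains $_ }
}

# Example use on a DC ($centerChainThumbprints is a hypothetical variable holding
# the thumbprints copied from the Center certificate's chain):
#   $present = Get-ChildItem Cert:\LocalMachine\Root, Cert:\LocalMachine\CA |
#       ForEach-Object { $_.Thumbprint }
#   Find-MissingThumbprint -Expected $centerChainThumbprints -Present $present
```

An empty result means every expected thumbprint was found in the stores that were searched.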

 

I'm wondering if the working DC has some remnants of the previous Gateway config installed which is causing it to work, because I know the cert requirements and processes have changed from our initial install of 1.7 last year to the current install. It might be that just installing the public Center cert + chain isn't enough?

 

What else can I check?

6 Replies

If the console UI already lists this GW in configuration, try to delete it from there first.

If it still does not work, most likely there is a problem with authentication.

Is there anything different on this machine compared to the other machines with regard to Kerberos/NTLM policy?

Is the center domain joined?

Are you using the same user for deployment as the one you are using to authenticate to the console UI from this machine?

It's not in the GW screen in the Console UI at all.

There's no obvious difference between the new machine and the working one, they're both in the same OU and were built using the same automation.

The Center is domain-joined.

I'm using the same user for deployment as the one I authenticate with on this machine to the Console UI.

Try to capture a Netmon 3.4 trace while you run the deployment, and look in the trace to see exactly what broke in the process: whether there really was an authentication issue, or some other issue that blocked communication.

 

Also, try to browse the console from the machine using Internet Explorer. Does IE display any warnings or errors?

 

If there is still no luck, contact support for more in-depth research.

We found the issue, although it's a weird one. We have a GPO which does stuff with the proxy settings, which seems to interfere with the LGateway registration task. The GPO is also applied to the other DC, which is sat happily connected to the Center, so it appears that our settings were only causing issues for the Registration of the new LGateway, not for the normal Running operations.

 

Thanks for the suggestions. If you need any more info for development or for your docs, let me know and I can supply.

best response confirmed by R B (Copper Contributor)
Solution

This makes sense.

The initial registration tries to use Negotiate authentication, with the proxy settings of the current user.

The running service uses certificate authentication, with the proxy settings of Local Service.

The policy might have been applied after the previous GWs were already installed, so it did not break them.
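
To see what the current user's proxy settings would do with a request to the Center, something like the following can be run in the same session that launches the installer. It is only a sketch: `Get-ProxyRoute` is a made-up helper name, the Center address is a placeholder, and it assumes the installer resolves its proxy the same way .NET's system proxy lookup does, which this thread does not confirm.

```powershell
# Sketch: report how the current user's proxy settings would route a request.
function Get-ProxyRoute {
    param([Parameter(Mandatory)][Uri]$Url)

    # System proxy as .NET resolves it for the user running this session.
    $proxy = [System.Net.WebRequest]::GetSystemWebProxy()

    [pscustomobject]@{
        Url      = $Url
        Bypassed = $proxy.IsBypassed($Url)   # True means a direct connection
        ProxyUri = $proxy.GetProxy($Url)     # Address the request would be sent to
    }
}

# Hypothetical Center address; replace with your own.
Get-ProxyRoute -Url 'https://atacenter.contoso.com'
```

If `Bypassed` is False and `ProxyUri` points at a proxy that cannot reach the Center, the registration call may time out in the way the log above shows. The machine-wide WinHTTP proxy can be checked separately with `netsh winhttp show proxy`.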

Yes, that makes sense. The DC that was working was already registered months before when the proxy GPO was added, so it wasn't affected. It's only when registering a new LGateway service on a new DC that we would see the issue.

 

Thanks again.
