OMS Gateway Errors in Event Log


Hi, 

 

I am regularly seeing the following error in the OMS Gateway Event Log:

 

2018-01-15 10:34:00 [34] ERROR GatewayServer - Error in worker thread
System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
--- End of inner exception stack trace ---
at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at Microsoft.HttpForwarder.Library.HttpRequestParser.ReadLine(TcpConnection connection)
at Microsoft.HttpForwarder.Library.HttpRequestParser.ParseHttpRequestLine(TcpConnection connection)
at Microsoft.HttpForwarder.Library.GatewayLogic.<RunAsync>d__0.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.HttpForwarder.Library.GatewayServer.<HandleClientAsync>d__e.MoveNext()

 

I also have an issue with several agents not being able to send Service Map data to OMS, and I see this in the Operations Manager event log:

 

A subscriber data source in management group XXXXXXXXX has posted items to the workflow, but has not received a response in 17 minutes. Data will be queued to disk until a response has been received. This indicates a performance or functional problem with the workflow.
Workflow Id : Microsoft.SystemCenter.CollectApplicationDependencyMonitorInformationToCloud
Instance : XXXXXXXXXXX
Instance Id : {F7AA8097-51AA-E6A8-421C-99C975AE4FC9}

 

We do have a large number of agents reporting correctly, but a number have the above issue. I can't work out why, as they are configured the same and are in the same subnet.

 

4 Replies

Hi Martin.

I've been able to collect some information on the errors.

The first error could indicate that the message being sent is too large. There is a limit on the POST body size; the workflows on the non-communicating agents are probably exceeding it.

The second warning happens when there are connectivity issues: we queue the data to disk and retry until it is sent. 

Put together, I would assume data has accumulated on some agents due to connectivity issues and has perhaps reached a size limit.

 

Has the issue been resolved yet?


Hi Noa, many thanks for the reply. We still have the issue, and I have a case open with support for it. I will update this thread when I have further news.


@Martin Lamb did you solve this? I'm facing the exact same problem now... 2 years later...


@vitordias No, we didn't solve it, but in the end we removed the OMS gateways as all servers were migrated to Azure. We then controlled the traffic using NSGs.