Is there a fix on the vSphere/ESXi network side for my cluster? It consists of five virtual machines running Microsoft Windows Server 2012 (64-bit), each located on a different physical ESXi host.
I am facing a problem where the servers randomly reboot with a blue screen. I checked the dump file (C:\Windows\MEMORY.DMP), and the analysis shows the following:
33: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
USER_MODE_HEALTH_MONITOR (9e)
One or more critical user mode components failed to satisfy a health check.
Hardware mechanisms such as watchdog timers can detect that basic kernel
services are not executing. However, resource starvation issues, including
memory leaks, lock contention, and scheduling priority misconfiguration,
may block critical user mode components without blocking DPCs or
draining the nonpaged pool.
Kernel components can extend watchdog timer functionality to user mode
by periodically monitoring critical applications. This BugCheck indicates
that a user mode health check failed in a manner such that graceful
shutdown is unlikely to succeed. It restores critical services by
rebooting and/or allowing application failover to other servers.
Arguments:
Arg1: ffffe001aae5f8c0, Process that failed to satisfy a health check within the
configured timeout
Arg2: 000000000000003c, Health monitoring timeout (seconds)
Arg3: 000000000000000a, WatchdogSourceClussvcIsAlive
Cluster service sends heartbeat to netft every 500 milliseconds.
By default netft expects at least 1 heartbeat per second.
If this watchdog was triggered that means clussvc is not getting
CPU to send heartbeats.
Arg4: 0000000000000000
Debugging Details:
------------------
*************************************************************************
*** ***
*** ***
*** Either you specified an unqualified symbol, or your debugger ***
*** doesn't have full symbol information. Unqualified symbol ***
*** resolution is turned off by default. Please either specify a ***
*** fully qualified symbol module!symbolname, or enable resolution ***
*** of unqualified symbols by typing ".symopt- 100". Note that ***
*** enabling unqualified symbol resolution with network symbol ***
*** server shares in the symbol path may cause the debugger to ***
*** appear to hang for long periods of time when an incorrect ***
*** symbol name is typed or the network symbol server is down. ***
*** ***
*** For some commands to work properly, your symbol path ***
*** must point to .pdb files that have full type information. ***
*** ***
*** Certain .pdb files (such as the public OS symbols) do not ***
*** contain the required information. Contact the group that ***
*** provided you with these symbols if you need this command to ***
*** work. ***
*** ***
*** Type referenced: netft!NETFT_WATCHDOG_SOURCE ***
*** ***
*************************************************************************
KEY_VALUES_STRING: 1
Key : Analysis.CPU.mSec
Value: 1203
Key : Analysis.Elapsed.mSec
Value: 1236
Key : Analysis.IO.Other.Mb
Value: 0
Key : Analysis.IO.Read.Mb
Value: 0
Key : Analysis.IO.Write.Mb
Value: 0
Key : Analysis.Init.CPU.mSec
Value: 484
Key : Analysis.Init.Elapsed.mSec
Value: 13536
Key : Analysis.Memory.CommitPeak.Mb
Value: 90
Key : Analysis.Version.DbgEng
Value: 10.0.27704.1001
Key : Analysis.Version.Description
Value: 10.2408.27.01 amd64fre
Key : Analysis.Version.Ext
Value: 1.2408.27.1
Key : Bugcheck.Code.KiBugCheckData
Value: 0x9e
Key : Bugcheck.Code.LegacyAPI
Value: 0x9e
Key : Bugcheck.Code.TargetModel
Value: 0x9e
Key : Failure.Bucket
Value: 0x9E_a_IMAGE_clussvc.exe
Key : Failure.Hash
Value: {987def74-31b2-9ff0-a34a-d22365e3b8e7}
Key : Hypervisor.Enlightenments.Value
Value: 17184
Key : Hypervisor.Enlightenments.ValueHex
Value: 4320
Key : Hypervisor.Flags.Value
Value: 17
Key : Hypervisor.Flags.ValueHex
Value: 11
Key : WER.OS.Branch
Value: winblue_ltsb_escrow
Key : WER.OS.Version
Value: 8.1.9600.21620
BUGCHECK_CODE: 9e
BUGCHECK_P1: ffffe001aae5f8c0
BUGCHECK_P2: 3c
BUGCHECK_P3: a
BUGCHECK_P4: 0
FILE_IN_CAB: MEMORY (1).DMP
VIRTUAL_MACHINE: VMware
FAULTING_THREAD: ffffd00144767280
PROCESS_NAME: clussvc.exe
IMAGE_NAME: clussvc.exe
MODULE_NAME: clussvc
FAULTING_MODULE: 0000000000000000
STACK_TEXT:
ffffd001`447751e8 fffff801`0a9de468 : 00000000`0000009e ffffe001`aae5f8c0 00000000`0000003c 00000000`0000000a : nt!KeBugCheckEx
ffffd001`447751f0 fffff801`0a9de0f2 : 00000000`00000000 00000000`00000001 ffffd001`44756180 00000000`00000000 : netft!NetftProcessWatchdogEvent+0xe4
ffffd001`44775230 fffff802`b4d02ac8 : ffffd001`447753a0 00000000`00000000 ffffe800`cc01d010 00000000`00000000 : netft!NetftWatchdogTimerDpc+0x36
ffffd001`44775260 fffff802`b4dd0c7a : ffffd001`44756180 ffffd001`44756180 ffffd001`44767280 ffffe800`cc665080 : nt!KiRetireDpcList+0x4f8
ffffd001`447754e0 00000000`00000000 : ffffd001`44776000 ffffd001`4476f000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x5a
STACK_COMMAND: .process /r /p 0xfffff802b4fd5300; .thread 0xffffd00144767280 ; kb
FAILURE_BUCKET_ID: 0x9E_a_IMAGE_clussvc.exe
OS_VERSION: 8.1.9600.21620
BUILDLAB_STR: winblue_ltsb_escrow
OSPLATFORM_TYPE: x64
OSNAME: Windows 8.1
FAILURE_ID_HASH: {987def74-31b2-9ff0-a34a-d22365e3b8e7}
Followup: MachineOwner
---------
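To make the bugcheck arguments above easier to read, here is a minimal Python sketch that decodes them. The hex values are copied from the dump. The function name `decode_bugcheck_9e` and the dictionary layout are my own and purely illustrative. The missed-heartbeat figure is only an estimate: it assumes the 500 ms heartbeat interval quoted in the dump text and that no heartbeat at all arrived during the timeout window.

```python
# Values copied verbatim from the "Arguments:" section of the dump above.
BUGCHECK_ARGS = {
    "Arg1": "ffffe001aae5f8c0",  # process that failed the health check
    "Arg2": "000000000000003c",  # health monitoring timeout (seconds)
    "Arg3": "000000000000000a",  # watchdog source
    "Arg4": "0000000000000000",
}

# Only the source named in this dump is listed; other values are not covered here.
WATCHDOG_SOURCES = {0xA: "WatchdogSourceClussvcIsAlive"}

# The dump text says clussvc sends a heartbeat to netft every 500 milliseconds.
HEARTBEAT_INTERVAL_MS = 500


def decode_bugcheck_9e(args):
    """Turn the raw hex arguments of a 0x9E bugcheck into readable values."""
    timeout_s = int(args["Arg2"], 16)
    source = int(args["Arg3"], 16)
    return {
        "process_address": "0x" + args["Arg1"],
        "timeout_seconds": timeout_s,
        "watchdog_source": WATCHDOG_SOURCES.get(source, f"unknown (0x{source:x})"),
        # Estimate: heartbeats that would normally be sent during the timeout.
        "missed_heartbeats": timeout_s * 1000 // HEARTBEAT_INTERVAL_MS,
    }


summary = decode_bugcheck_9e(BUGCHECK_ARGS)
for key, value in summary.items():
    print(f"{key}: {value}")
```

If I read this correctly, clussvc.exe went a full 60 seconds without getting a heartbeat through to netft, which is roughly 120 consecutive heartbeats at the stated interval. That looks more like the VM being starved of CPU or stalled for a long time than like a single dropped network packet.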