SCOM Failover Questions


I have very little SCOM experience and I am trying to understand how failover works across the few SCOM servers we have. We had an outage that took down a number of servers, and alerts were not sent, so I want to understand how it's supposed to work so I know what to fix.

 

Current setup

  • Two AD domains with a trust (in the process of migrating domains)
  • Two SCOM management servers in domain A
  • One SCOM gateway server in domain B that communicates with the management servers in domain A
  • One SCOM SQL database server in domain A
  • Servers in domain A are set to the two management servers (primary and failover)
  • Servers in domain B are set to the gateway server in domain B

Can someone help me understand the flow of alerts?

  • If management server 1 goes down, will management server 2 send email alerts?
  • If we lose the gateway server, will alerts from domain B be sent?
  • If the database server goes down, will alerts be sent?

Hopefully this makes sense. Thank you.

1 Reply

Hi @brentmattson,


The failover for the SCOM management servers happens automatically, so if management server 1 goes down, agents will automatically start communicating with management server 2, and management server 2 will start sending notifications.

 

Gateway servers work differently, however: you need to configure a failover gateway server for each agent; otherwise automatic failover will not happen.

 

How to set the primary and failover management / gateway server:

# Gateway reporting to "MS1", failing over to "MS2"
$PrimaryMS = Get-SCOMManagementServer | Where-Object {$_.Name -match "MS1"}
$FailoverMS = Get-SCOMManagementServer | Where-Object {$_.Name -match "MS2"}
$GatewayMS = Get-SCOMManagementServer | Where-Object {$_.IsGateway -eq $true}
# Primary and failover are set in separate calls
Set-SCOMParentManagementServer -GatewayServer $GatewayMS -PrimaryServer $PrimaryMS
Set-SCOMParentManagementServer -GatewayServer $GatewayMS -FailoverServer $FailoverMS

# Agents reporting to "Gateway 1", failing over to "Gateway 2"
$PrimaryMS = Get-SCOMManagementServer | Where-Object {$_.Name -eq "GW1"}
$FailoverMS = Get-SCOMManagementServer | Where-Object {$_.Name -eq "GW2"}
$Agent = Get-SCOMAgent | Where-Object {$_.PrimaryManagementServerName -eq "GW1"}
Set-SCOMParentManagementServer -Agent $Agent -PrimaryServer $PrimaryMS
Set-SCOMParentManagementServer -Agent $Agent -FailoverServer $FailoverMS
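After changing the assignments, it can help to check what each agent or gateway actually has configured. The following is only a rough sketch, assuming the OperationsManager module is loaded and you are connected to the management group. The function name Get-ParentServerReport is made up for this example, and it relies on the GetPrimaryManagementServer() and GetFailoverManagementServers() methods that the agent and management server objects expose in the SDK; confirm they behave this way in your version before relying on the output.

```powershell
# Hypothetical helper (not a built-in cmdlet): reports the primary and failover
# servers for each agent or gateway object piped in.
function Get-ParentServerReport {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        $InputObject   # object from Get-SCOMAgent or Get-SCOMGatewayManagementServer
    )
    process {
        $primary  = $InputObject.GetPrimaryManagementServer()
        # Wrap in @() so an empty failover list still yields an array with Count 0
        $failover = @($InputObject.GetFailoverManagementServers() |
                      ForEach-Object { $_.Name })

        [pscustomobject]@{
            Name        = $InputObject.Name
            Primary     = $primary.Name
            Failover    = $failover -join ', '
            HasFailover = $failover.Count -gt 0
        }
    }
}

# Agents that currently have no failover server configured:
# Get-SCOMAgent | Get-ParentServerReport | Where-Object { -not $_.HasFailover }

# Parent management servers of each gateway:
# Get-SCOMGatewayManagementServer | Get-ParentServerReport
```

Any agent listed with HasFailover set to False will not fail over on its own if its primary server is lost.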

 

For the SCOM database and data warehouse, you should have a stretch cluster or use SQL Always On to provide high availability.

If you only have one SCOM database & data warehouse server, and it goes down, you will not receive any alerts.

 


Best regards,
Leon