SQL pod may get stuck in "ContainerCreating" status when you stop the node instance on AKS

Published 12-17-2020 01:37 AM 1,677 Views
Microsoft

When we deploy SQL Server on AKS, we may sometimes find that SQL HA does not work as expected.

 

For example, when we deploy AKS using our default sample with 2 nodes:

https://docs.microsoft.com/en-us/azure/aks/tutorial-kubernetes-deploy-cluster#create-a-kubernetes-cl...

 

 

 

az aks create \
    --resource-group myResourceGroup \
    --name myAKSCluster \
    --node-count 2 \
    --generate-ssh-keys \
    --attach-acr <acrName>

 

 

 

There should be 2 instances deployed in the AKS virtual machine scale set:

Untitled picture.png
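The same check can be sketched from the command line instead of the portal. This is a minimal sketch, assuming the Azure CLI is logged in and that the cluster has a single node pool, so the first scale set in the node resource group is the right one. The function name list_aks_instances is hypothetical; the resource group and cluster names are the ones from the az aks create example above.

```shell
# Sketch: list the scale set instances behind an AKS cluster.
# list_aks_instances is a hypothetical helper name, not part of any CLI.
list_aks_instances() {
    rg="$1"        # e.g. myResourceGroup
    cluster="$2"   # e.g. myAKSCluster

    # AKS keeps the scale set in a separate, auto-generated node resource group.
    node_rg=$(az aks show --resource-group "$rg" --name "$cluster" \
        --query nodeResourceGroup --output tsv) || return 1

    # Assumption: a single node pool, so the first scale set is the one we want.
    vmss=$(az vmss list --resource-group "$node_rg" \
        --query "[0].name" --output tsv) || return 1

    # Print one row per instance (node) in the scale set.
    az vmss list-instances --resource-group "$node_rg" --name "$vmss" --output table
}
```

It could be called as list_aks_instances myResourceGroup myAKSCluster; with --node-count 2 we would expect two rows.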

 

According to the SQL Server documentation:

 

In the following diagram, the node hosting the mssql-server container has failed. The orchestrator starts the new pod on a different node, and mssql-server reconnects to the same persistent storage. The service connects to the re-created mssql-server.

 

Untitled picture1.png

 

However, this does not always hold when we manually stop an AKS node instance from the portal.

 

Before we stop any node, we can see that the status of the pod is Running.

 

Untitled picture2.png
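To see which node the pod is on before stopping anything, a small helper like the following may be handy. It is only a sketch: the function name sql_pod_placement is hypothetical, and the label app=mssql is taken from the pod description shown later in this post.

```shell
# Sketch: show the SQL pod(s) with their status and hosting node.
# sql_pod_placement is a hypothetical helper name.
sql_pod_placement() {
    # app=mssql is the label on the SQL pod; the wide output adds the NODE column.
    kubectl get pods --selector app=mssql --output wide
}
```

The NODE column tells us which scale set instance must stay up for the pod to keep running.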

 

If we stop node 0, nothing happens, because the SQL pod resides on node 1.

 

Untitled picture4.png

 

The status of the SQL pod remains Running.

Untitled picture5.png

 

However, if we stop node 1 instead of node 0, the issue appears.

Untitled picture6.png

The original SQL pod remains in the Terminating status, while the new SQL pod is stuck in the ContainerCreating status.

 

 

$ kubectl describe pod mssql-deployment-569f96888d-bkgvf
Name:           mssql-deployment-569f96888d-bkgvf
Namespace:      default
Priority:       0
Node:           aks-nodepool1-26283775-vmss000000/10.240.0.4
Start Time:     Thu, 17 Dec 2020 16:29:10 +0800
Labels:         app=mssql
                pod-template-hash=569f96888d
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/mssql-deployment-569f96888d
Containers:
  mssql:
    Container ID:
    Image:          mcr.microsoft.com/mssql/server:2017-latest
    Image ID:
    Port:           1433/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:
      MSSQL_PID:    Developer
      ACCEPT_EULA:  Y
      SA_PASSWORD:  <set to the key 'SA_PASSWORD' in secret 'mssql'>  Optional: false
    Mounts:
      /var/opt/mssql from mssqldb (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-jh9rf (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  mssqldb:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  mssql-data
    ReadOnly:   false
  default-token-jh9rf:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-jh9rf
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason              Age                  From                                        Message
  ----     ------              ----                 ----                                        -------
  Normal   Scheduled           <unknown>            default-scheduler                           Successfully assigned default/mssql-deployment-569f96888d-bkgvf to aks-nodepool1-26283775-vmss000000
  Warning  FailedAttachVolume  18m                  attachdetach-controller                     Multi-Attach error for volume "pvc-6e3d4aac-6449-4c9d-86d0-c2488583ec5c" Volume is already used by pod(s) mssql-deployment-569f96888d-d8kz7
  Warning  FailedMount         3m16s (x4 over 14m)  kubelet, aks-nodepool1-26283775-vmss000000  Unable to attach or mount volumes: unmounted volumes=[mssqldb], unattached volumes=[mssqldb default-token-jh9rf]: timed out waiting for the condition
  Warning  FailedMount         62s (x4 over 16m)    kubelet, aks-nodepool1-26283775-vmss000000  Unable to attach or mount volumes: unmounted volumes=[mssqldb], unattached volumes=[default-token-jh9rf mssqldb]: timed out waiting for the condition
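The relevant line in this output is the FailedAttachVolume event. As a rough sketch, the two filters below pull that event out of the describe output and print the pod that still holds the volume. The function names multi_attach_events and blocking_pods are hypothetical, and the text patterns simply mirror the event message shown above, so they may need adjusting for other Kubernetes versions.

```shell
# Sketch: filters for `kubectl describe pod` output read from stdin.
# Both function names are hypothetical helpers.

# Keep only the Multi-Attach warnings raised by the attach/detach controller.
multi_attach_events() {
    grep 'FailedAttachVolume' | grep 'Multi-Attach error'
}

# From those warnings, print the pod name(s) reported as still using the volume.
blocking_pods() {
    sed -n 's/.*Volume is already used by pod(s) \(.*\)$/\1/p'
}
```

A possible use is kubectl describe pod mssql-deployment-569f96888d-bkgvf | multi_attach_events | blocking_pods. Applied to the event line shown above, the filters yield mssql-deployment-569f96888d-d8kz7, the original pod that is still Terminating.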

 

 

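This issue is caused by a Multi-Attach error and is expected under the current AKS internal design: as the events above show, the volume is still reported as used by the original pod, so it cannot be attached to the node that hosts the new pod.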
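A Multi-Attach error is generally tied to volumes that can be attached to only one node at a time. The sketch below checks the access modes of the claim; it assumes the claim name mssql-data from the pod description above and that the claim is backed by an Azure disk, which we would expect to report ReadWriteOnce. The function name pvc_access_modes is hypothetical.

```shell
# Sketch: print the access modes of the persistent volume claim used by the SQL pod.
# pvc_access_modes is a hypothetical helper name.
pvc_access_modes() {
    # Default to the claim name shown in the pod description.
    claim="${1:-mssql-data}"
    kubectl get pvc "$claim" --output jsonpath='{.spec.accessModes[*]}'
}
```

If the result is ReadWriteOnce, the disk has to be released by the stopped node before another node can attach it, which matches the behavior seen here.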

 

If you restart the node instance that was shut down, the issue will be resolved.
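For those who prefer the command line over the portal, the restart could be sketched as follows. Everything passed in is a placeholder to look up in your own environment: the node resource group, the scale set name, and the instance ID of the stopped node. The deployment name mssql-deployment is inferred from the ReplicaSet name in the pod description, and the function name start_node_instance is hypothetical.

```shell
# Sketch: start a stopped scale set instance, then wait for the SQL deployment.
# start_node_instance is a hypothetical helper name.
start_node_instance() {
    node_rg="$1"       # node resource group of the AKS cluster (placeholder)
    vmss="$2"          # scale set name (placeholder)
    instance_id="$3"   # ID of the stopped instance, e.g. 1

    # Power the stopped instance back on.
    az vmss start --resource-group "$node_rg" --name "$vmss" \
        --instance-ids "$instance_id" || return 1

    # Assumption: the deployment is named mssql-deployment.
    # Block until its replica is available again, or give up after 10 minutes.
    kubectl rollout status deployment/mssql-deployment --timeout=600s
}
```

Once the node is back, the original pod can finish terminating and the disk can move to the node that hosts the new pod.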

1 Comment
Microsoft

Thanks for clarifying this. The tutorial includes a simple configuration where only 1 pod is configured.

However, even though Kubernetes/AKS can add some reliability by rescheduling the pod, this is not a silver bullet for availability.

 

Virtual machines can have multiple failure patterns. In some of them the PV (disk) could also be impacted, and there is a risk of the disk being stuck on the failing node. The approach assumes the disk can be successfully detached and attached to another VM during recovery, which is not always true. In addition, the SQL Server instance would not be zone redundant.

 

I'd suggest considering an active/backup deployment pattern for a more reliable deployment. It would require considerable effort to deploy in Kubernetes, but it would be great to have. See the kubedb project for an example of deploying reliable databases in Kubernetes; it demonstrates that all sorts of replication/standby patterns can be implemented in Kubernetes. The postgres-ha Helm chart also demonstrates an HA setup.

 

Last update: Feb 19 2021 10:20 AM