#azurepostgresql
3 TopicsAzure PostgreSQL Lesson Learned #13: Major Version Upgrade Failure Due to HypoPG Extension
Co-authored with angesalsaa Case Overview We investigated a customer case where a Major Version Upgrade (MVU) from PostgreSQL 11 to 17 on Azure Database for PostgreSQL Flexible Server consistently failed during the pre-check phase. The upgrade was triggered through the Azure Portal, but the process was blocked due to the presence of the HypoPG extension installed across multiple databases. Symptoms: How the Failure Appears When attempting a Major Version Upgrade in Azure Portal, customers typically encounter: Immediate failure during pre-check validation. Portal error indicating failed at pre-check. Upgrade does not proceed beyond validation. This is expected behavior because Azure validates extension compatibility before performing any in-place upgrade. Root Cause: HypoPG Extension Blocks MVU Azure Database for PostgreSQL Flexible Server supports only a specific list of extensions during major version upgrades. HypoPG is not part of the allow-listed extensions for MVU. If your source server contains HypoPG in any database, the upgrade will fail. Step-by-Step Troubleshooting & Resolution Guide STEP 1 — Audit All Extensions on the Server Azure PostgreSQL does not allow cross-database queries, so you must inspect each database individually. Option A: Check Extensions in Current Database SELECT current_database() AS database_name, extname AS extension_name, extversion AS version FROM pg_extension; 👉 In our case, HypoPG appeared in 4 databases, causing the pre-check failure. STEP 2 — Enable HypoPG Temporarily in Azure Portal Before removing the extension, we had to re-enable HypoPG in Server Parameters because it was previously disabled at Azure Portal level as customer disabled as precaution step but removing part was not enough here: Azure Portal → Server → Server Parameters → azure.extensions → Add hypopg STEP 3 — Remove HypoPG Extension from All Databases Once enabled the extension, remove the extension from each affected database using following command: DROP EXTENSION hypopg CASCADE; ⚠️ CASCADE will drop dependent objects verify impact before execution. After running this command on all affected databases, HypoPG was successfully removed. STEP 4 — Validate Allowed Extensions In Azure Portal: Server → Server Parameters → azure.extensions Ensure only supported extensions are listed. Remove unsupported entries. Make sure hypopg is removed STEP 5 — Re-Run MVU With HypoPG removed and supported extensions validated: Server was in Ready state. Trigger MVU again via Portal: Portal → Server → Overview → Upgrade → Select PostgreSQL 17 Upgrade completed successfully. a { text-decoration: none; color: #464feb; } tr th, tr td { border: 1px solid #e6e6e6; } tr th { background-color: #f5f5f5; } Final Outcome After removing HypoPG and validating the server parameters: MVU completed without errors. Server upgraded from PostgreSQL 11 to 17. Best Practices to Avoid Future Upgrade Failures Audit Extensions Regularly — Especially before major version upgrades. Limit Extensions via azure.extensions — Only enable what you actively use. Clean Up Deprecated Extensions — Remove unused or legacy extensions early. Align with Azure’s Supported Extension List — Check official documentation. Automate Extension Validation — Prevent last-minute surprises. Helpful References Major Version Upgrades | Microsoft Learn274Views1like0CommentsAzure PostgreSQL Lesson Learned #10: Why PITR Networking Rules Matter
Co‑authored with angesalsaa Symptoms Customer attempted to restore a server configured with public access into a private virtual network. Restore operation failed with an error indicating unsupported configuration. Root Cause Azure enforces strict networking rules during PITR to maintain security and consistency: Public access servers can only be restored to public access. Private access servers can be restored to the same virtual network or a different virtual network, but not to public access. Why This Happens Networking mode is tied to the original server configuration. Mixing public and private access during restore could expose sensitive data or break connectivity assumptions. Contributing Factors Customer assumed PITR could switch networking modes. No prior review of Azure documentation on restore limitations. Specific Conditions We Observed Source server: Private access with VNet integration. Target restore: Attempted to switch to public access. Operational Checks Before initiating PITR: Confirm the source server’s networking mode (Public vs Private). Review restore options in the Azure portal → Restore. Mitigation Goal: Align restore strategy with networking rules. If source is Public: Restore only to Public access. If source is Private: Restore to same or different VNet (within the same region). Post-Resolution Customer successfully restored to a different VNet after adjusting expectations. Prevention & Best Practices Document networking mode for all PostgreSQL servers. Train teams on PITR limitations before disaster recovery drills. Avoid assumptions always check official guidance. Why This Matters Ignoring these rules can delay recovery during critical incidents. Knowing the constraints upfront ensures faster restores and compliance with security policies. Key Takeaways Issue: PITR does not allow switching between Public and Private access. Fix: Restore within the same networking category as the source server. References Backup and Restore in Azure Database for PostgreSQL Flexible Server157Views0likes0CommentsAzure PostgreSQL Lesson Learned #8: Post-Upgrade Performance Surprises (The One-Step Fix)
Co‑authored with angesalsaa Symptoms Upgrade from PostgreSQL 12 → higher version succeeds. After migration, workloads show: Queries running slower than before. Unexpected CPU spikes during normal operations. No obvious errors in logs or connectivity issues. Root Cause Missing or stale statistics can lead to bad query plans, which in turn might degrade performance and consume excessive memory. After a major version upgrade, the query planner relies on outdated or default estimates because the optimizer’s learned patterns are not refreshed. This often results in: Sequential scans instead of index scans. Inefficient join strategies. Increased CPU and memory usage. Contributing Factors Large tables with skewed data distributions. Complex queries with multiple joins. Workloads dependent on accurate cost estimates. Specific Conditions We Observed Any source server version can be impacted once you upgrade to higher version. No ANALYZE or VACUUM run post-upgrade. Operational Checks Before troubleshooting, confirm: Query plans differ significantly from pre-upgrade. pg_stats indicates outdated or missing statistics. Mitigation Goal: Refresh statistics so the planner can optimize queries. Run ANALYZE on all tables: ANALYZE; Important Notes: These commands are safe and online. For very large datasets, consider running during low-traffic windows. We recommend running the ANALYZE command in each database to refresh the pg_statistic table. Post-Resolution Queries return to expected performance. CPU utilization stabilizes. Execution plans align with indexes and cost-based optimization. Prevention & Best Practices Always schedule ANALYZE immediately after major upgrades. Automate stats refresh in your upgrade runbook. Validate plans for critical queries before going live. Why This Matters Skipping this step can lead to: Hours of degraded performance. Emergency escalations and customer dissatisfaction. Misdiagnosis as engine regression when it’s just missing stats. Key Takeaways Issue: Post-upgrade query slowness and CPU spikes due to stale/missing statistics. Fix: Run ANALYZE immediately after upgrade. Pro Tip: Automate this in CI/CD or maintenance scripts for zero surprises. References Major Version Upgrades - Azure Database for PostgreSQL | Microsoft Learn383Views0likes0Comments