Connectivity
41 TopicsLesson Learned #515:Recommendations for Troubleshooting - Login failed for user 'XXX'. (18456)
During a recent support case, a customer encountered the error:pyodbc.InterfaceError: ('28000', "[28000] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]Login failed for user 'XXX'. (18456) (SQLDriverConnect); ") using a Python code. Following, I would like to share my lessons learned to fix this issue. The error code18456 is typically caused by login issues, such as incorrect or missing credentials, rather than connectivity or networking problems. In our investigation, we identified the root cause and suggested recommendations to prevent and resolve similar issues. Root Cause The application was configured to retrieve the database server and host name from environment variables. However: Missing Environment Variables: One or more variables were not set. Default Value Misconfiguration: The code defaulted to a hardcoded value when variables were missing. For example, the server defaulted to "localhost", which was not the intended database server. As a result, the application attempted to connect to an unintended server with incorrect or missing credentials, leading to the Login failed error. Recommendations 1. Validate Environment Variables Always validate critical environment variables like server, username, and password. If a required variable is missing or empty, the application should raise an explicit error or log a clear warning. 2. Avoid Misleading Defaults Use placeholder values, such as "NOT_SET", as defaults for critical variables. This approach ensures that misconfigurations are immediately visible and do not silently fail. 3. Log Connection Details Log critical details like the server and database being accessed. Ensure this information is included in application logs to make troubleshooting easier. Avoid logging sensitive information such as passwords. Python Solution I was thinking how to improve the Python code, implementing a validation of environment variables, handle errors and log critical connection details: import os def get_env_variable(var_name, default_value=None, allow_empty=False): """ Retrieves and validates an environment variable. :param var_name: The name of the environment variable. :param default_value: The default value if the variable is missing. :param allow_empty: If False, raises an error for empty variables. :return: The value of the environment variable or default_value. Example: server = get_env_variable("DB_SERVER", default_value="NOT_SET") """ value = os.getenv(var_name, default_value) if value is None or (not allow_empty and (value.strip() == "" or value.strip() == "NOT_SET" or default_value is None)): raise ValueError(f"Environment variable '{var_name}' is required but not set.") return value120Views0likes0CommentsLesson Learned #512: Handling Connection Reuse in ODBC After a Critical Error
Working on several connectivity cases with ODBC across different programming languages, I’ve noticed a behavior that I would like to share, as it can lead to incorrect conclusions about whether a connection has actually been closed. In one of these cases, I found that once a connection is established, if a query execution results in a code error, timeout, or any other critical issue, reusing the same connection to execute another query may produce Connection closed messages. This behavior occurs because, when a critical error happens, ODBC drivers set an internal flag indicating that the connection is no longer in a 'clean' state. This flag prevents further commands from being executed on the connection until it is fully closed and reopened. In these situations, the best approach is to open a new connection to avoid further issues. def TestConnectionErrorReuse(): try: thread_id = threading.get_ident() conn, db_name = ConnectToTheDB(querytimeout=10) if conn is None: logging.info(f'(TestConnectionErrorReuse) - Thread: {thread_id} - Failed to establish initial connection. Exiting...') return cursor = conn.cursor() # Step 1: Execute a query that will cause an error (division by zero) try: logging.info(f'(TestConnectionErrorReuse) - Thread: {thread_id} - Executing a query that will cause an error (SELECT 1/0)...') cursor.execute("SELECT 1/0") # This should trigger a division by zero error except pyodbc.Error as e: logging.error(f'(TestConnectionErrorReuse) - Thread: {thread_id} - Error during query execution: {e}') # Step 2: Attempt to reuse the connection by running another query try: logging.info(f'(TestConnectionErrorReuse) - Thread: {thread_id} - Attempting to execute a follow-up query after error...') cursor.execute("SELECT 1") # Simple query to test reuse of connection row = cursor.fetchone() print(f"(TestConnectionErrorReuse) - Thread: {thread_id} - Follow-up query result: {row[0]}") except pyodbc.Error as e: logging.error(f'(TestConnectionErrorReuse) - Thread: {thread_id} - Error reusing connection after previous failure: {e}') except Exception as e: logging.error(f'(TestConnectionErrorReuse) - Thread: {thread_id} - Unexpected error occurred: {e}') finally: if conn: conn.close() logging.info(f'(TestConnectionErrorReuse) - Thread: {thread_id} - Connection closed.')Lesson Learned #509: KeepAliveTime parameter in HikariCP
Today, I have been working on a service request where, at certain times, we observed that connections could be disconnected due to external factors such as firewalls or other components due to inactivity policies. For this reason, I would like to share my experience using the KeepAliveTime parameter.Lesson Learned #511: Timeout Attempting to Open the Connection in High-Thread Applications
Recently, I worked on a service request that a customer application reported the following error connecting to the database:"Timeout attempting to open the connection. The time period elapsed prior to attempting to open the connection has been exceeded. This may have occurred because of too many simultaneous non-pooled connection attempts.". Following, I would like to share the experience learned here.Lesson Learned #396: Fixing 'Invalid value for key 'authentication' using System.Data.SqlClient
Our customer is getting the following error message:Application Error System.ArgumentException: Invalid value for key 'authentication'. at System.Data.Common.DbConnectionStringBuilderUtil.ConvertToAuthenticationType(String keyword, Object value) at System.Data.SqlClient.SqlConnectionString.ConvertValueToAuthenticationType() at System.Data.SqlClient.SqlConnectionString..ctor(String connectionString) at System.Data.SqlClient.SqlConnectionFactory.CreateConnectionOptions(String connectionString, DbConnectionOptions previous) at System.Data.ProviderBase.DbConnectionFactory.GetConnectionPoolGroup(DbConnectionPoolKey key, DbConnectionPoolGroupOptions poolOptions, DbConnectionOptions& userConnectionOptions) at System.Data.SqlClient.SqlConnection.ConnectionString_Set(DbConnectionPoolKey key) using in theauthentication keyword in the connection string the valueActive Directory Managed Identity.Lesson Learned #503: Lessons Learned Using JDBC Logging with Microsoft JDBC Driver for SQL Server
A few days ago, our customer encountered connectivity issues while using HikariCP in their Java application. To troubleshoot the problem, we enabled detailed logging for the Microsoft JDBC Driver for SQL Server. This logging gave me a lot of information and details, following I would like to share my lessons learned here.1.5KViews1like0CommentsLesson Learned #499: HikariCP Retry Policy - The connection is closed
Today, I worked on a service request that our customer got the following error message: 19:28:43.232 [main] WARN com.zaxxer.hikari.pool.PoolBase - AppExample-ConnectionPooling - Failed to validate connection ConnectionID:3 ClientConnectionId: 8351bddc-acbb-4669-xxxx-xxxxxxxxxxxx (The connection is closed.). Possibly consider using a shorter maxLifetime value. 19:28:43.234 [AppExample-ConnectionPooling connection closer] DEBUG com.zaxxer.hikari.pool.PoolBase - DotNetExample-ConnectionPooling - Closing connection ConnectionID:3 ClientConnectionId: 8351bddc-acbb--xxxx-xxxxxxxxxxxx: (connection is dead). Following I would like to share my experience with this error.12KViews0likes0CommentsAzure SQL DB Private Link / Private Endpoint - Connectivity Troubleshooting
In the articleAzure SQL DB Connectivity Troubleshootingwe have explored regular connectivity troubleshooting to Azure SQL DB, on this one we will explore connectivity troubleshooting using Private Link In this article we are going to explore What is the Private Endpoint for Azure DB? Creation of Private Endpoint Azure VM > Private Link OnPrem VM > VPN (P2S) > Private Link (Hosts File) OnPrem VM > VPN (P2S) > Private Link (Custom DNS) Azure Function > VNET integration > Private Endpoint Failover Groups with Private Link 1 - What is the Private Endpoint for Azure DB? Azure Private Link enables you to access Azure PaaS Services (for example, Azure Storage and SQL Database) and Azure hosted customer-owned/partner services over aprivate endpointin your virtual network. Traffic between your virtual network and the service travels the Microsoft backbone network. Exposing your service to the public internet is no longer necessary Some important information Private Link service can be accessed from approved private endpoints in the same region. The private endpoint can be reached from the same virtual network, regionally peered VNets, globally peered VNets and on premises using private VPN or ExpressRoute connections. When creating a Private Link Service, a network interface is created for the lifecycle of the resource. This interface is not manageable by the customer. The Private Link Service must be deployed in the same region as the virtual network. A single Private Link Service can be accessed from multiple Private Endpoints belonging to different VNets, subscriptions and/or Active Directory tenants. The connection is established through a connection workflow. In the image, below we can see how the connection using private endpoint works. And as mentioned on the other articleAzure SQL DB Connectivity Troubleshootingthe Azure SQL DB clients by default will all go to the Shared public IP of the regional gateway. With the private endpoint you can close this public path and users can only connect from private endpoint. If you have multiple Azure VNETs you will need to use VNET peering or VNET VPN between two Azure VNETs, or P2S,S2S or Express Route to connect your onprem to Azure Network 2 - Creation of Private Endpoint To create a Private endpoint just follow up the procedure documented athttps://docs.microsoft.com/en-us/azure/private-link/create-private-endpoint-portal You may also want to close all public network access to make sure all connection must flow from Private Endpoint. Just need to go to Azure SQL DB instance and "Firewall and Virtual Networks" and Deny public network access. Find below the procedure I've used to create a Private Endpoint Just need to go to "Private Endpoint Connections" and then add a Private endpoint Select the region that should be the same as the VNET region as mentioned above. Select the resource type "Microsoft.Sql/servers" for Azure SQL DB instance Select the Azure SQL DB instance you want to connect Select the VNET / Subnet. Notice also that during the creation you can already create a private DNS zone, that will work for Azure resources that uses the Azure DNS. We will talk more about that when doing the tests 3 - Azure VM > Private Endpoint From an Azure VM deployed to same VNET, if we test command below on command prompt before you create the Private Endpoint. nslookup fonsecanet-westeu.database.windows.net You will get result like below that shows that this server is using public gateway IP:40.68.37.158. So NOT using any private IP Server: UnKnown Address: 168.63.129.16 Non-authoritative answer: Name: westeurope1-a.control.database.windows.net Address: 40.68.37.158 Aliases: fonsecanet-westeu.database.windows.net After you create the Private Endpoint using the same command above you are expected to see the results below Server: UnKnown Address: 168.63.129.16 Non-authoritative answer: Name: fonsecanet-westeu.privatelink.database.windows.net Address: 10.0.1.4 Aliases: fonsecanet-westeu.database.windows.net Be sure also to open outbound communication from Azure VM VNET to Private Endpoint on Local Firewall, Corporate Firewall, or Azure NSGs. For this test I've opened to allow all communication inside VNET. *Currently(Status on 2020-04-06) redirect is not supported, so only needed 1433 port You must use the FQDN to connect to Azure SQL DB as documented athttps://docs.microsoft.com/en-us/azure/sql-database/sql-database-private-endpoint-overview#check-connectivity-using-sql-server-management-studio-ssms Use theFully Qualified Domain Name (FQDN)of the server in connection strings for your clients. Any login attempts made directly to the IP address shall fail. This behavior is by design, since private endpoint routes traffic to the SQL Gateway in the region and the FQDN needs to be specified for logins to succeed. You can also check if connection is correct using TSQL below. And we will see the client IP is the private IP assigned to Azure VM. SELECT client_net_address FROM sys.dm_exec_connections where session_id = @@SPID You must use the FQDN to connect to Azure SQL DB. Azure SQL DB gateway use the name to route correctly your connection to the SQL host, when information is not provided it will fail If you try to connect using private endpoint IP you are going to get error like below =================================== Cannot connect to 10.0.1.4. =================================== A connection was successfully established with the server, but then an error occurred during the login process. (provider: SSL Provider, error: 0 - The target principal name is incorrect.) (.Net SqlClient Data Provider) ------------------------------ Server Name: 10.0.1.4 Error Number: -2146893022 Severity: 20 State: 0 You should also NOT use DB.privatelink.database.windows.net =================================== Cannot connect to fonsecanet-westeu.privatelink.database.windows.net. =================================== A connection was successfully established with the server, but then an error occurred during the login process. (provider: SSL Provider, error: 0 - The target principal name is incorrect.) (.Net SqlClient Data Provider) ------------------------------ Server Name: fonsecanet-westeu.privatelink.database.windows.net Error Number: -2146893022 Severity: 20 State: 0 4 - OnPrem VM > VPN (P2S) > Private Link (Hosts File) You can use some template like below to create the VPN Point to Site https://docs.microsoft.com/en-us/azure/sql-database/sql-database-managed-instance-configure-p2s https://docs.microsoft.com/en-us/azure/vpn-gateway/vpn-gateway-howto-point-to-site-resource-manager-portal After creating the VPN, downloading and installing the VPN client and connecting to it If I check name resolution nslookup fonsecanet-westeu.database.windows.net We can see that it is still using Microsoft internal Corporate DNS where this VM is located Server: XXXXXXXXXXX.corp.microsoft.com Address: 10.221.xx.xx Non-authoritative answer: Name: westeurope1-a.control.database.windows.net Address: 40.68.37.158 Aliases: fonsecanet-westeu.database.windows.net fonsecanet-westeu.privatelink.database.windows.net For this scenario you will need to use your corporate DNS to have the name resolution https://docs.microsoft.com/en-us/azure/private-link/private-endpoint-overview#dns-configuration When connecting to a private link resource using a fully qualified domain name (FQDN) as part of the connection string, it's important to correctly configure your DNS settings to resolve to the allocated private IP address. Existing Azure services might already have a DNS configuration to use when connecting over a public endpoint. This needs to be overridden to connect using your private endpoint. The network interface associated with the private endpoint contains the complete set of information required to configure your DNS, including FQDN and private IP addresses allocated for a given private link resource. You can use the following options to configure your DNS settings for private endpoints: Use the Host file (only recommended for testing). You can use the host file on a virtual machine to override the DNS. Use a private DNS zone. You can use private DNS zones to override the DNS resolution for a given private endpoint. A private DNS zone can be linked to your virtual network to resolve specific domains. Use your custom DNS server. You can use your own DNS server to override the DNS resolution for a given private link resource. If yourDNS serveris hosted on a virtual network, you can create a DNS forwarding rule to use a private DNS zone to simplify the configuration for all private link resources. Important It's not recommended to override a zone that is actively in use to resolve public endpoints. Connections to resources won't be able to resolve correctly without DNS forwarding to the public DNS. To avoid issues, create a different domain name or follow the suggested name for each service below. Private Link resource type Subresource Zone name SQL DB (Microsoft.Sql/servers) Sql Server (sqlServer) privatelink.database.windows.net Azure will create a canonical name DNS record (CNAME) on the public DNS to redirect the resolution to the suggested domain names. You'll be able to override the resolution with the private IP address of your private endpoints. Your applications don't need to change the connection URL. When attempting to resolve using a public DNS, the DNS server will now resolve to your private endpoints. The process does not impact your applications. For this test will use HOST file solution And we are able to connect fine. And as before you will be able to check you get the VPN ip range using TSQL below SELECT client_net_address FROM sys.dm_exec_connections where session_id = @@SPID 5 - OnPrem VM > VPN (P2S) > Private Link (Custom DNS) If you do not want to use local hosts file as mentioned above you can add Azure DNS (168.63.129.16) as a forwarder on your custom DNS For more information about more complex DNS relationships follow up the official documentationhttps://docs.microsoft.com/en-us/azure/private-link/private-endpoint-dns On Azure Virtual Network settings I've replaced default Azure DNS with your custom DNS (On my tests 10.0.0.5). You can also add168.63.129.16 (Azure DNS) as secondary DNS just in case 1st is off If your client lives in Azure, the DNS on VM can be a forwarder to Azure DNS - Go to DNS > Domain > Properties > Forwarders - AddAzure DNS (168.63.129.16) Do not forget to add virtual network link from Private Zone to the VNET where your Azure VM with DNS lives. If you are have onprem DNS you may need a Conditional forwarder Go to DNS > Domain > Conditional Forwarders > New DNS Domain should be database.windows.net. DO NOT USE privatelink.database.windows.net And you can check if name resolution works fine nslookup fonsecanet-westeu.database.windows.net Server: fonsecanetad.internal.cloudapp.net Address: 10.0.0.5 Non-authoritative answer: Name: fonsecanet-westeu.privatelink.database.windows.net Address: 10.0.1.4 Aliases: fonsecanet-westeu.database.windows.net If do not want to use forwarder you can also create aforward lookup zone and added manually the hostto match the FQDN 6 - Azure Function > VNET integration > Private Endpoint To make Azure Function to connect to private endpoint you will need to use VNET integration https://docs.microsoft.com/en-us/azure/azure-functions/functions-create-vnet https://docs.microsoft.com/en-us/azure/app-service/web-sites-integrate-with-vnet *Status on (2020-04-27) There was some limitation on webapp + private DBS that are now gone and just need to set some configurationsas documented athttps://docs.microsoft.com/en-us/azure/app-service/web-sites-integrate-with-vnet#azure-dns-private-zones After your app integrates with your VNet, it uses the same DNS server that your VNet is configured with. By default, your app won't work with Azure DNS Private Zones. To work with Azure DNS Private Zones you need to add the following app settings: WEBSITE_DNS_SERVER with value 168.63.129.16 WEBSITE_VNET_ROUTE_ALL with value 1 These settings will send all of your outbound calls from your app into your VNet in addition to enabling your app to use Azure DNS private zones. And I've also created a sample Azure Function app that accept the connection string as parameter and was able to connect fine You can check the source code athttps://github.com/FonsecaSergio/AzureFunctionAppTestConnectivity 7 - Failover Groups with Private Link For this scenario I will just reference link from Product Group articlehttps://techcommunity.microsoft.com/t5/azure-sql-database/using-failover-groups-with-private-link-for-azure-sql-database/ba-p/1272902 Be sure to config private zone network link to both networks https://docs.microsoft.com/en-us/azure/private-link/private-endpoint-dns#virtual-network-workloads-without-custom-dns-server This model can be extended to multiple peered virtual networks that are associated to the same private endpoint. This can be done byadding new virtual network linksto the private DNS zone for all peered virtual networks. Result for my tests C:\Users\FonsecaSergio>nslookup SERVER1.database.windows.net Server: UnKnown Address: 168.63.129.16 Non-authoritative answer: Name: SERVER1.privatelink.database.windows.net Address: 12.1.1.6 Aliases: SERVER1.database.windows.net C:\Users\FonsecaSergio>nslookup SERVER2-westus.database.windows.net Server: UnKnown Address: 168.63.129.16 Non-authoritative answer: Name: SERVER2-westus.privatelink.database.windows.net Address: 11.0.1.4 Aliases: SERVER2-westus.database.windows.net C:\Users\FonsecaSergio>nslookup FAILOVERGROUP.database.windows.net Server: UnKnown Address: 168.63.129.16 Non-authoritative answer: Name: SERVER1.privatelink.database.windows.net Address: 12.1.1.6 Aliases: FAILOVERGROUP.database.windows.net SERVER1.database.windows.net ForDNS on more complex environments, please follow up other scenarios as mentioned at https://docs.microsoft.com/en-us/azure/private-link/private-endpoint-dns REF and LINKS https://docs.microsoft.com/en-us/azure/private-link/private-link-service-overview https://docs.microsoft.com/en-us/azure/private-link/private-endpoint-overview https://docs.microsoft.com/en-us/azure/sql-database/sql-database-private-endpoint-overview https://docs.microsoft.com/en-us/azure/private-link/ https://docs.microsoft.com/en-us/azure/app-service/web-sites-integrate-with-vnet#enable-vnet-integration https://channel9.msdn.com/Shows/Data-Exposed/Private-Link-for-Azure-SQL-Database-Part-1?term=Private%20Link%20for%20Azure%20SQL%20Database&lang-en=true https://channel9.msdn.com/Shows/Data-Exposed/Private-Link-for-Azure-SQL-Database-Part-2?term=Private%20Link%20for%20Azure%20SQL%20Database&lang-en=true82KViews4likes8CommentsLesson Learned #359: TCP Provider: Error code 0x68 (104) (SQLExecDirectW)
Today, we got a service request that our customer faced the following error message connecting to the database:(pyodbc.OperationalError) ('08S01', '[08S01] [Microsoft][ODBC Driver 18 for SQL Server]TCP Provider: Error code 0x68 (104) (SQLExecDirectW)'). This customer is using Python in a Linux environment. Following I would like to share my lessons learned about this error message.20KViews2likes3CommentsLesson Learned #455:Optimizing Connection Pooling for Application Workloads: Thread Contention
This week, I have been working on a support case where our client experienced a 5-second delay in database connectivity. They were measuring this using StopWatch. As they informed me, they were launching a series of processes, 50-100 concurrent processes to be precise. I would like to share the experience gained in understanding the reasons behind this 5-second delay.3.2KViews1like0Comments