In today's data-driven landscape, we have numerous options for distributing data across multiple databases, such as Elastic Query, Data Sync, Geo-Replication, and Read Scale-Out. In this article, however, I'd like to explore a slightly different path: creating two separate databases containing data for the years 2021 and 2022, respectively, and querying them simultaneously to fetch results. This method introduces a different perspective on data distribution, partitioning by database, which can lead to more efficient resource utilization and better performance for each database. While partitioning within a single database is common practice, this idea ventures into partitioning across databases.
Background:
As data accumulates over time, the strain on a single database intensifies, often leading to slower query responses and potential bottlenecks. Load balancing, a critical concept in database management, offers a remedy by evenly distributing the data load across multiple servers or databases, thus enhancing performance and ensuring scalability.
Solution Overview:
Our approach entails a distributed database architecture where separate Azure SQL databases are designated for different years (e.g., 2021 and 2022). This setup not only simplifies data management but also strategically distributes the read load. The technology stack for this solution involves C#, Azure SQL Database, and the .NET framework.
Implementation:
- Database Setup: We established two Azure SQL databases, each storing data for a specific year. This division allows for a focused and organized data structure.
- C# Application Structure: The core of our application is the ClsLoadData class, which is responsible for connecting to and querying the databases. This class demonstrates effective organization and clarity in its purpose.
- Connection String Mapping: A crucial aspect of our implementation is the mapping of different years to specific database connection strings within the C# application. This mapping ensures that queries are directed to the correct database.
- Asynchronous Data Retrieval: We employed asynchronous programming in C#, using async and await, for efficient and non-blocking data retrieval. This approach is particularly beneficial in maintaining application responsiveness.
- Retry Logic: To enhance robustness, we implemented a retry mechanism with exponential backoff, which is vital in handling transient failures and ensuring reliable database connectivity (a minimal sketch follows this list).
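To make the retry idea concrete before the full listing at the end of the article, here is a minimal sketch of the exponential-backoff pattern used inside FilterByYear. The helper name ExecuteWithRetryAsync and its defaults are illustrative assumptions, not part of the application itself:

using System;
using System.Threading.Tasks;

static class RetrySketch
{
    // Minimal exponential-backoff helper; the name and the defaults
    // (3 attempts, 1 second initial delay) are illustrative assumptions.
    public static async Task<T> ExecuteWithRetryAsync<T>(
        Func<Task<T>> operation, int maxRetries = 3, int initialDelayMs = 1000)
    {
        int delay = initialDelayMs;
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return await operation(); // Success: hand the result back immediately
            }
            catch (Exception) when (attempt < maxRetries - 1)
            {
                // Transient failure: wait, double the delay, and try again.
                await Task.Delay(delay);
                delay *= 2;
            }
        }
    }
}

Any Task-returning operation can be wrapped this way; on the final failed attempt the exception is allowed to propagate to the caller, which mirrors the behavior of the retry loop in the full listing.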
Load Balancing in Action:
- Parallel Execution: By querying both databases simultaneously, our application effectively distributes the read load. This parallel execution is key in maximizing performance and reducing the overall strain on each database.
- Performance Benefits: The observed performance benefits were significant. We noted faster response times and a marked reduction in load on each individual database, confirming the efficacy of our load balancing strategy (a timing sketch follows this list).
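To see the parallel-execution benefit for yourself, the following is a minimal sketch, assuming two query delegates, that times the per-year queries sequentially and then concurrently with Task.WhenAll. The numbers it prints are whatever your environment produces, not the results quoted above:

using System;
using System.Diagnostics;
using System.Threading.Tasks;

static class TimingSketch
{
    // Illustrative timing harness (an assumption, not the article's measured numbers):
    // runs the two per-year queries one after the other, then concurrently.
    public static async Task CompareAsync(Func<Task> query2021, Func<Task> query2022)
    {
        var sw = Stopwatch.StartNew();
        await query2021();                            // Sequential: one database after the other
        await query2022();
        Console.WriteLine($"Sequential: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        await Task.WhenAll(query2021(), query2022()); // Parallel: both databases at the same time
        Console.WriteLine($"Parallel:   {sw.ElapsedMilliseconds} ms");
    }
}

For example, it could be called as await TimingSketch.CompareAsync(() => FilterByYear(2021, cs2021), () => FilterByYear(2022, cs2022)); where cs2021 and cs2022 stand for the corresponding connection strings.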
Lessons Learned:
- Scalability: This approach scales seamlessly with additional databases and larger datasets. It proves that load balancing is not just a theoretical concept but a practical solution for growing data demands.
- Maintainability: The ease of maintaining and updating separate databases was another critical takeaway. This architecture simplifies data management and enhances overall system maintainability.
- Best Practices: Key best practices include thorough exception handling and secure management of connection strings (a configuration sketch follows this list). These practices are essential in safeguarding the application and ensuring its smooth operation.
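As one possible way to keep credentials out of source code, here is a minimal sketch, assuming environment variable names such as SQL_CONN_2021 and SQL_CONN_2022 (these names are my own, not part of the article's setup), that builds the year-to-connection-string dictionary at startup:

using System;
using System.Collections.Generic;

static class ConnectionStringSketch
{
    // Illustrative sketch: load the year-to-connection-string map from environment
    // variables instead of hardcoding credentials in source code.
    public static Dictionary<int, string> LoadMappings(params int[] years)
    {
        var mappings = new Dictionary<int, string>();
        foreach (int year in years)
        {
            // The variable naming pattern SQL_CONN_<year> is an assumption.
            string value = Environment.GetEnvironmentVariable($"SQL_CONN_{year}");
            if (string.IsNullOrEmpty(value))
                throw new InvalidOperationException($"Missing connection string for {year}.");
            mappings[year] = value;
        }
        return mappings;
    }
}

The mapping used in Main could then be created with ConnectionStringSketch.LoadMappings(2021, 2022) rather than with hardcoded literals; a secret store such as Azure Key Vault would serve the same purpose.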
Conclusion:
In summary, this implementation of load balancing using C# and Azure SQL Database not only addressed the challenge of managing large datasets but also offered insights into scalable, maintainable database architecture. The lessons learned and best practices identified here serve as valuable guides for similar scenarios.
using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.Data.SqlClient;
using System.Threading.Tasks;
using System.Threading;
namespace DockerHealth
{
class ClsLoadData
{
        public static async Task Main()
{
// Define a dictionary mapping years to specific connection strings
var yearConnectionMappings = new Dictionary<int, string>
{
{ 2021, "Server=tcp:servername1.database.windows.net,1433; User Id=usrname;Password=Pwd1!;Initial Catalog=db1;Persist Security Info=False;MultipleActiveResultSets=False;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;Pooling=true;Max Pool size=100;Min Pool Size=1;ConnectRetryCount=3;ConnectRetryInterval=10;Application Name = ConnTest 2021"},
{ 2022, "Server=tcp:servername2.database.windows.net,1433; User Id=usrname;Password=Pwd2!;Initial Catalog=db2;Persist Security Info=False;MultipleActiveResultSets=False;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;Pooling=true;Max Pool size=100;Min Pool Size=1;ConnectRetryCount=3;ConnectRetryInterval=10;Application Name = ConnTest 2022" }
};
var tasks = new List<Task<List<MyEntity>>>();
foreach (var mapping in yearConnectionMappings)
{
int year = mapping.Key;
string connectionString = mapping.Value;
tasks.Add(FilterByYear(year, connectionString));
}
var results = await Task.WhenAll(tasks);
// Combine all results into a single set
var combinedResults = new List<MyEntity>();
foreach (var result in results)
{
combinedResults.AddRange(result);
}
// Display the results
DisplayResults(combinedResults);
Console.WriteLine("end");
}
static async Task<List<MyEntity>> FilterByYear(int year, string connectionString)
{
string query = "SELECT [Year], ID, Cost, Unit FROM [Values] WHERE [Year] = @year";
List<MyEntity> results = new List<MyEntity>();
int maxRetries = 3; // Maximum number of retries
int delay = 1000; // Initial delay in milliseconds (1 second)
for (int retry = 0; retry < maxRetries; retry++)
{
try
{
using (SqlConnection connection = new SqlConnection(connectionString))
{
await connection.OpenAsync();
using (SqlCommand command = new SqlCommand(query, connection))
{
command.Parameters.AddWithValue("@year", year);
using (SqlDataReader reader = await command.ExecuteReaderAsync())
{
while (await reader.ReadAsync())
{
int readYear = (int)reader["Year"];
int id = (int)reader["ID"];
decimal cost = (decimal)reader["Cost"];
int unit = (int)reader["Unit"];
var entity = new MyEntity(readYear, id, cost, unit);
results.Add(entity);
}
}
}
}
break; // Break the loop on successful execution
}
                catch (Exception ex)
                {
                    if (retry == maxRetries - 1) throw; // Rethrow the exception on the last retry
                    Console.WriteLine($"Query for {year} failed (attempt {retry + 1}): {ex.Message}");
                    await Task.Delay(delay); // Wait before retrying
                    delay *= 2; // Double the delay for the next retry
                }
}
return results;
}
static void DisplayResults(List<MyEntity> results)
{
Console.WriteLine("Year | ID | Cost | Unit | TotalCost");
Console.WriteLine("-----------------------------------");
foreach (var entity in results)
{
Console.WriteLine($"{entity.Year} | {entity.ID} | {entity.Cost} | {entity.Unit} | {entity.TotalCost}");
}
}
}
class MyEntity
{
public int Year { get; set; }
public int ID { get; set; }
public decimal Cost { get; set; }
public int Unit { get; set; }
public decimal TotalCost
{
get { return Cost * Unit; }
}
// Constructor for initializing properties
public MyEntity(int year, int id, decimal cost, int unit)
{
Year = year;
ID = id;
Cost = cost;
Unit = unit;
}
}
}
T-SQL Scripts:
-- Run on each database (db1 and db2): create the table that holds the yearly data
CREATE TABLE [Values] (
    [Year] INT,
    [ID] INT IDENTITY(1,1) PRIMARY KEY,
    [Cost] DECIMAL(10, 2),
    [Unit] INT
);

-- Run on db1: seed 100 rows of 2021 data
DECLARE @i INT = 0;
WHILE @i < 100
BEGIN
    INSERT INTO [Values] ([Year], [Cost], [Unit])
    VALUES (2021, CAST(RAND() * 100 AS DECIMAL(10, 2)), CAST(RAND() * 10 AS INT));
    SET @i = @i + 1;
END;

-- Run on db2: seed 100 rows of 2022 data
DECLARE @i INT = 0;
WHILE @i < 100
BEGIN
    INSERT INTO [Values] ([Year], [Cost], [Unit])
    VALUES (2022, CAST(RAND() * 100 AS DECIMAL(10, 2)), CAST(RAND() * 10 AS INT));
    SET @i = @i + 1;
END;