Azure Cache for Redis
Redis Keys Statistics
Redis key statistics, including key Time-to-Live (TTL) statistics and key sizes, are useful for troubleshooting cache usage and performance from the client side.

Key Time-to-Live (TTL)

TTL has an impact on memory usage and on the memory available on a Redis service. Data loss on a Redis service may happen unexpectedly due to an issue on the backend, but it may also happen because of the memory eviction policy or because a key's Time-to-Live (TTL) expired. The memory eviction policy may remove some keys from the Redis service, but only when the used capacity (the space occupied by Redis keys) reaches 100% of the available memory. If there is no unexpected issue on the Redis backend side and the maximum available memory has not been reached, the only reason for keys being removed from the cache is their TTL value.

- TTL may not be defined at all; in that case the key remains in the cache forever (it is persistent).
- TTL can be set while setting a new key.
- TTL can be set or reset later, after key creation.
- TTL is expressed in seconds or milliseconds, or returned as a negative value:
  - -1: the key exists but has no expiration (it is persistent); this happens when the TTL was never defined or was removed using the PERSIST command.
  - -2: the key does not exist.

Related commands:

- SET key1 value1 EX 60 - defines a TTL of 60 seconds
- SET key1 value1 PX 60000 - defines a TTL of 60000 milliseconds (60 seconds)
- EXPIRE key1 60 - sets a timeout of 60 seconds on key1
- TTL key1 - returns the current TTL value, in seconds
- PTTL key1 - returns the current TTL value, in milliseconds
- PERSIST key1 - removes the TTL from the key and makes it persistent

Notes: TTL counts down in real time, but Redis expiration is lazy + active, so exact timing isn't guaranteed to the millisecond. A TTL of 0 is effectively never observed, because the key expires immediately: EXPIRE key1 0 deletes the key right away. There is no guarantee that deletion happens exactly at expiration time. Redis lazy + active expiration means a key is checked only when someone touches it (lazy); to avoid memory filling up with expired junk, Redis also runs a background job that periodically scans a subset of keys and deletes the expired ones (active). As a result, some expired keys may survive a bit longer: no longer accessible, but still in memory.

Example of lazy expiration: at 11:59:00, SET key1 value1 EX 60 gives key1 a 60-second expiration, so key1 expires at 12:00:00. If no one accesses it until 12:00:05, then when someone tries to access key1 at 12:00:05, Redis identifies that key1 has expired and deletes it.

Example of active expiration: for the same key1, after 12:00:00, if the periodic background job scans the subset of keys containing key1, key1 is actively deleted.

For that reason, we may see somewhat higher memory usage than the memory actually used by live keys in the cache. For more information about Redis commands, check Redis Inc - Commands.

Key Sizes

Large key value sizes in the cache can have a high impact on Redis performance. Redis is designed around response sizes of roughly 1 KB, and Microsoft recommends keeping keys up to 100 KB at most on Azure Redis services to get better performance. The Redis response size may not be exactly the same as the key size: the response size is the sum of the responses from each operation sent to Redis. While a response can be the size of a single requested key (as with GET), the response size is very often the sum of more than one key, as the result of multi-key operations (such as MGET and others).
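As a quick, hedged illustration of the TTL commands and the key-size inspection described above, here is a spot check of a single key with redis-cli; the host, port, access key variable, and key name are all placeholders for your own cache:

```bash
# Set a key with a 60-second TTL, inspect it, then make it persistent.
# The connection details and the key name are placeholders.
redis-cli -h mycache.redis.cache.windows.net -p 6380 --tls \
          -a "$REDIS_ACCESS_KEY" --no-auth-warning <<'EOF'
SET key1 value1 EX 60
TTL key1
MEMORY USAGE key1
PERSIST key1
TTL key1
EOF
```

The first TTL call should report a value close to 60; MEMORY USAGE reports the serialized size in bytes (slightly more than the raw value length, because of per-key overhead); and the final TTL returns -1 once PERSIST removes the expiration.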
The scope of this article is the size of each individual key, so we will not discuss the implications of multi-key commands here.

By design, Redis is a single-threaded system per shard; this is not a Microsoft/Azure limitation but a Redis design feature. To process requests quickly, Redis is optimized to work with small keys, and for that it is more efficient to use a single thread than to pay the cost of context switching. In a multi-threaded system, context switching happens when the processor stops executing one thread and starts executing another: the OS saves the current thread's state (registers, program counter, stack pointer, and so on) and restores the state of the next thread. To avoid spending time on that process, Redis is designed to run on a single thread.

Because of this single-threaded nature, all operations sent to the Redis service wait in a queue to be processed. To minimize latency, all keys must remain small so they can be processed efficiently and responses can be transmitted to the client quickly over the network. For that reason, it's important to understand the key sizes in our Redis service and keep all keys as small as possible.

Scripts Provided

To help identify specific TTL values and key sizes in a Redis cache, two solutions are provided below:

1. Get Key Statistics - scans the whole cache and returns only counts of Redis keys:
   - Number of keys with no TTL set
   - Number of keys with TTL higher than or equal to a user-defined TTL threshold
   - Number of keys with TTL lower than a user-defined TTL threshold
   - Number of keys with value size higher than or equal to a user-defined size threshold
   - Number of keys with value size lower than a user-defined size threshold
   - Total number of keys in the cache
   It also includes the start and end time, and the total time spent on the key scan.

2. List Key Names - returns a list of Redis key names, based on the parameters provided:
   - No TTL set, or TTL higher than or equal to a user-defined TTL threshold, or TTL lower than a user-defined TTL threshold
   - Key value size higher than or equal to a user-defined size threshold, or key value size lower than a user-defined size threshold
   - Total number of keys in the cache
   It also includes the start and end time, and the total time spent on the key scan.

WARNING: Because they need to read all keys in the cache, both solutions can cause a high workload on the Redis side, especially for large datasets with a high number of keys. Both solutions use a Lua script that runs on the Redis side and, depending on the number of keys in the cache, may block all other commands from being processed while the script is running. The duration reported in the output of each run can help you gauge the impact of the scripts. Run them carefully, and test them first in your development environment before using them in production.
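For orientation before diving into the Lua-based solutions, this is a minimal client-side sketch of the same idea: it iterates the keyspace with SCAN from Bash instead of running Lua on the server, so it is much slower but lets Redis serve other commands between batches. The connection details are placeholders:

```bash
# Client-side SCAN loop: print the TTL and serialized size of every key.
# Slower than the server-side Lua scripts below, but non-blocking between
# batches. REDIS_HOST and REDIS_ACCESS_KEY are placeholders.
cursor=0
while :; do
  reply=$(redis-cli -h "$REDIS_HOST" -p 10000 --tls -a "$REDIS_ACCESS_KEY" \
          --no-auth-warning SCAN "$cursor" COUNT 1000)
  cursor=$(head -n 1 <<< "$reply")           # first line is the next cursor
  while read -r key; do
    [ -z "$key" ] && continue
    ttl=$(redis-cli -h "$REDIS_HOST" -p 10000 --tls -a "$REDIS_ACCESS_KEY" \
          --no-auth-warning TTL "$key")
    size=$(redis-cli -h "$REDIS_HOST" -p 10000 --tls -a "$REDIS_ACCESS_KEY" \
           --no-auth-warning MEMORY USAGE "$key")
    echo "$key: TTL=${ttl}s, Size=${size}B"
  done <<< "$(tail -n +2 <<< "$reply")"      # remaining lines are key names
  [ "$cursor" = "0" ] && break
done
```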
1 - Get Key Statistics

To get Redis key statistics, we use a Linux Bash shell and the redis-cli tool to run a Lua script on the Redis side that reads the TTL value and size of each key. This solution is very fast, but it needs to scan all keys in the cache while the Lua script runs, which may block Redis from processing other requests due to its single-threaded nature.

The script below scans the whole cache and returns only counts of Redis keys:

- Number of keys with no TTL set
- Number of keys with TTL higher than or equal to a user-defined TTL threshold
- Number of keys with TTL lower than a user-defined TTL threshold
- Number of keys with value size higher than or equal to a user-defined size threshold
- Number of keys with value size lower than a user-defined size threshold
- Total number of keys in the cache

It also includes the start and end time, and the total time spent on the key scan.

Output:

```
========================================================
Scanning number of keys with TTL threshold 100 Seconds, and Key size threshold 500 Bytes
Start time: dd-mm-YY 18:12:15
------------------------
Total keys scanned: 1227
------------
TTL not set       : 2
TTL >= 100 seconds: 1225
TTL < 100 seconds : 0
TTL invalid/error : 0
Non existent key  : 0
------------
Keys with Size >= 500 Bytes: 1225
Keys with Size < 500 Bytes : 2
Keys with invalid Size     : 0
------------------------
End time: dd-mm-YY 19:12:16
Duration : 0 days 00:00:00.630
========================================================
```

How to run:

- Create the getKeyStats.sh and getKeyStats.lua files below in the same folder, in your Linux environment (Ubuntu 20.04.6 LTS was used).
- Give the shell script permission to run, with the command chmod 700 getKeyStats.sh.
- Call the script using the syntax: ./getKeyStats.sh host password [port] [ttl_threshold] [size_threshold]

Script parameters:

- host (mandatory): the URI of the cache
- password (mandatory): the Redis access key of the cache
- port (optional - default 10000): TCP port used to access the cache
- ttl_threshold (optional - default 600, i.e. 10 minutes): key TTL threshold (in seconds) to be used in the results
- size_threshold (optional - default 102400, i.e. 100 KB): key size threshold (in bytes) to be used in the results

Tested with:

- Ubuntu 20.04.6 LTS
- redis-cli 7.4.2
- Redis services: Azure Managed Redis Balanced B0 (OSS mode), Azure Cache for Redis Standard C1

getKeyStats.sh

```bash
#!/usr/bin/env bash
#============================== LUA script version =================
# Linux Bash script to get statistics on Redis key TTL values and key value sizes.
# It returns the number of:
# - keys with no TTL set
# - keys with TTL higher than or equal to ttl_threshold
# - keys with TTL lower than ttl_threshold
# - keys with value size higher than or equal to size_threshold
# - keys with value size lower than size_threshold
# - total keys in the cache
#-------------------------------------------------------
# WARNING:
# It uses a Lua script that runs on the Redis server side.
# Use it carefully, during low Redis workloads.
# Test it first in a dev environment before using it in production.
#-------------------------------------------------------
# It requires:
# redis-cli v7 or above
#--------------------------------------------------------
# Usage:
# getKeyStats.sh <cacheuri> <cacheaccesskey> [<accessport>(10000)] [<ttl_threshold>(600)] [<size_threshold>(102400)]
#========================================================
#------------------------------------------------------
# To use the non-SSL port, remove the --tls parameter from the redis-cli command below
#------------------------------------------------------

# Parameters
REDIS_HOST="${1:?Usage: $0 <host> <password> [port] [ttl_threshold] [size_threshold]}"
REDISCLI_AUTH="${2:?Usage: $0 <host> <password> [port] [ttl_threshold] [size_threshold]}"
REDIS_PORT="${3:-10000}"             # 10000 / 6380 / 6379
REDIS_TTL_THRESHOLD="${4:-600}"      # 10 minutes
REDIS_SIZE_THRESHOLD="${5:-102400}"  # 100 KB

# Port number must be numeric
if ! [[ "$REDIS_PORT" =~ ^[0-9]+$ ]]; then
  echo "ERROR: Redis port must be numeric"
  exit 1
fi

# TTL threshold must be numeric
if ! [[ "$REDIS_TTL_THRESHOLD" =~ ^[0-9]+$ ]]; then
  echo "ERROR: TTL threshold must be numeric"
  exit 1
fi

# Size threshold must be numeric
if ! [[ "$REDIS_SIZE_THRESHOLD" =~ ^[0-9]+$ ]]; then
  echo "ERROR: Size threshold must be numeric"
  exit 1
fi

echo ""
echo "========================================================"
echo "Scanning number of keys with TTL threshold $REDIS_TTL_THRESHOLD Seconds, and Key size threshold $REDIS_SIZE_THRESHOLD Bytes"

# Start time
start_ts=$(date +%s.%3N)
echo "Start time: $(date "+%d-%m-%Y %H:%M:%S")"
echo "------------------------"
echo ""

# Processing
result=$(redis-cli \
  -h "$REDIS_HOST" \
  -a "$REDISCLI_AUTH" \
  -p "$REDIS_PORT" \
  --tls \
  --no-auth-warning \
  --raw \
  --eval getKeyStats.lua , "$REDIS_TTL_THRESHOLD" "$REDIS_SIZE_THRESHOLD" \
  | tr '\n' ' ')

read no_ttl nonexist ttl_high ttl_low ttl_invalid size_high size_low size_nil total <<< "$result"

if [[ $result == ERR* ]]; then
  echo "Redis Lua error:"
  echo "$result"
else
  echo "Total keys scanned: $total"
  echo "------------"
  echo "TTL not set       : $no_ttl"
  echo "TTL >= $REDIS_TTL_THRESHOLD seconds: $ttl_high"
  echo "TTL < $REDIS_TTL_THRESHOLD seconds: $ttl_low"
  echo "TTL invalid/error : $ttl_invalid"
  echo "Non existent key  : $nonexist"
  echo "------------"
  echo "Keys with Size >= $REDIS_SIZE_THRESHOLD Bytes: $size_high"
  echo "Keys with Size < $REDIS_SIZE_THRESHOLD Bytes: $size_low"
  echo "Keys with invalid Size : $size_nil"
fi

echo ""
echo "------------------------"
end_ts=$(date +%s.%3N)
echo "End time: $(date "+%d-%m-%Y %H:%M:%S")"

# Duration - extract days, hours, minutes, seconds, milliseconds
duration=$(awk "BEGIN {print $end_ts - $start_ts}")
days=$(awk "BEGIN {print int($duration/86400)}")
hours=$(awk "BEGIN {print int(($duration%86400)/3600)}")
minutes=$(awk "BEGIN {print int(($duration%3600)/60)}")
seconds=$(awk "BEGIN {print int($duration%60)}")
milliseconds=$(awk "BEGIN {printf \"%03d\", ($duration - int($duration))*1000}")
echo "Duration : ${days} days $(printf "%02d" "$hours"):$(printf "%02d" "$minutes"):$(printf "%02d" "$seconds").$milliseconds"
echo "========================================================"
```

getKeyStats.lua

```lua
local ttl_threshold = tonumber(ARGV[1])
local size_threshold = tonumber(ARGV[2])
local cursor = "0"

-- Counters
local no_ttl = 0
local nonexist = 0
local ttl_high = 0
local ttl_low = 0
local ttl_invalid = 0
local size_high = 0
local size_low = 0
local size_nil = 0
local total = 0

repeat
  local scan = redis.call("SCAN", cursor, "COUNT", 1000)
  cursor = scan[1]
  local keys = scan[2]

  for _, key in ipairs(keys) do
    local ttl = redis.call("TTL", key)
    local size = redis.call("MEMORY", "USAGE", key)
    total = total + 1

    if ttl == -1 then
      no_ttl = no_ttl + 1
    elseif ttl == -2 then
      nonexist = nonexist + 1
    elseif type(ttl) ~= "number" then
      ttl_invalid = ttl_invalid + 1
    elseif ttl >= ttl_threshold then
      ttl_high = ttl_high + 1
    else
      ttl_low = ttl_low + 1
    end

    -- MEMORY USAGE returns a nil (false) reply if the key vanished mid-scan
    if not size then
      size_nil = size_nil + 1
    elseif size >= size_threshold then
      size_high = size_high + 1
    else
      size_low = size_low + 1
    end
  end
until cursor == "0"

return { no_ttl, nonexist, ttl_high, ttl_low, ttl_invalid, size_high, size_low, size_nil, total }
```

Performance:

Redis service used: Azure Managed Redis - Balanced B0 - OSS mode

```
Scanning number of keys with TTL threshold 600 Seconds, and Key size threshold 102400 Bytes
Total keys scanned: 46161
TTL not set : 0
TTL >= 600 seconds: 46105
TTL < 600 seconds: 56
TTL invalid/error : 0
Non existent key : 0
Keys with Size >= 102400 Bytes: 0
Keys with Size < 102400 Bytes: 46161
Keys with invalid Size : 0
Duration : 0 days 00:00:00.602
```

Redis service used: Azure Cache for Redis - Standard - C1

```
Scanning number of keys with TTL threshold 100 Seconds, and Key size threshold 500 Bytes
Total keys scanned: 1227
TTL not set : 2
TTL >= 100 seconds: 1225
TTL < 100 seconds: 0
TTL invalid/error : 0
Non existent key : 0
Keys with Size >= 500 Bytes: 1225
Keys with Size < 500 Bytes: 2
Keys with invalid Size : 0
Duration : 0 days 00:00:00.630
```

WARNING: The script above uses a Lua script that runs on the Redis side and may block your normal workload. Use it carefully when you have a large number of keys in the cache, and during low-workload periods.
2 - List Key Names

Once we have identified some number of keys in the cache matching a given threshold, we may want to list those key names. The script below helps with that, returning a list of Redis key names with:

- No TTL set
- TTL higher than or equal to a user-defined TTL threshold
- TTL lower than a user-defined TTL threshold
- Key value size higher than or equal to a user-defined size threshold
- Key value size lower than a user-defined size threshold
- Total number of keys in the cache

It also includes the start and end time, and the total time spent on the key scan.

Output:

```
List all key names with TTL above 100 Seconds, and Key size larger 500 Bytes
Start time: dd-mm-YY 18:30:22
------------------------
   1) "--------------------------------------"
   2) "Key_1787_1022: TTL: 461837 seconds, Size: 1336 Bytes"
(...)
1551) "Key_1173_1022: TTL: 389795 seconds, Size: 1336 Bytes"
1552) "--------------------------------------"
1553) "Scan completed."
1554) "Total of 1550 keys scanned."
1555) "1225 keys found with TTL >= 100 seconds, and size larger than 500 Bytes"
1556) "--------------------------------------"
End time: dd-mm-YY 18:30:22
Duration : 0 days 00:00:00.545
========================================================
```

How to run:

- Create the listKeys.sh file below in a folder in your Linux environment (Ubuntu 20.04.6 LTS was used).
- Give the shell script permission to run, with the command chmod 700 listKeys.sh.
- Call the script using the syntax: ./listKeys.sh host password [port] [+/-][ttl_threshold] [+/-][size_threshold]

Script parameters:

- host (mandatory): the URI of the cache
- password (mandatory): the Redis access key of the cache
- port (optional - default 10000): TCP port used to access the cache
- [+/-] (optional) before ttl_threshold: whether to return keys with a TTL lower ("-") or higher ("+" or no sign) than ttl_threshold
- ttl_threshold (optional - default -1, i.e. keys with no TTL set): key TTL threshold (in seconds) to be used in the results
- [+/-] (optional) before size_threshold: whether to return keys with a size smaller ("-") or larger ("+" or no sign) than size_threshold
- size_threshold (optional - default 102400, i.e. 100 KB): key size threshold to be used in the results

Tips:

- Use ttl_threshold = -1 to return key names with no TTL (ex: ./listKeys.sh host password [port] -1 [+/-][size_threshold])
- Use ttl_threshold = 0 to return key names with any TTL (ex: ./listKeys.sh host password [port] 0 [+/-][size_threshold])
- Use ttl_threshold = -500 to return key names with TTL below 500 seconds (ex: ./listKeys.sh host password [port] -500 [+/-][size_threshold])
- Use ttl_threshold = 500 to return key names with TTL above or equal to 500 seconds (ex: ./listKeys.sh host password [port] 500 [+/-][size_threshold])
- Use size_threshold = 0 to return key names with any size (ex: ./listKeys.sh host password [port] [+/-][ttl_threshold] 0)
- Use size_threshold = -1000 to return key names with size below 1000 bytes (ex: ./listKeys.sh host password [port] [+/-][ttl_threshold] -1000)
- Use size_threshold = 1000 to return key names with size above or equal to 1000 bytes (ex: ./listKeys.sh host password [port] [+/-][ttl_threshold] 1000)
- Use ttl_threshold = 0 AND size_threshold = 0 to return all key names with any TTL and any size (ex: ./listKeys.sh host password [port] 0 0)
- Use ttl_threshold = -1 AND size_threshold = 0 to return all key names with no TTL and any size (ex: ./listKeys.sh host password [port] -1 0)

Tested with:

- Ubuntu 20.04.6 LTS
- redis-cli 7.4.2
- Redis services: Azure Managed Redis Balanced B0 (OSS mode), Azure Cache for Redis Standard C1

listKeys.sh

```bash
#!/usr/bin/env bash
set -euo pipefail
#============================== LUA script version =================
# Linux Bash script to list Redis key names.
# It returns key names with:
# - no TTL set
# - TTL higher than or equal to ttl_threshold
# - TTL lower than ttl_threshold
# - value size higher than or equal to size_threshold
# - value size lower than size_threshold
# - total number of keys in the cache
#-------------------------------------------------------
# WARNING:
# It uses a Lua script (embedded in the Bash code) that runs on the Redis
# server side. Use it carefully, during low Redis workloads.
# Test it first in a dev environment before using it in production.
#-------------------------------------------------------
# It requires:
# redis-cli v7 or above
#--------------------------------------------------------
# Usage:
# listKeys.sh <cacheuri> <cacheaccesskey> [<accessport>(10000)] [+/-][<ttl_threshold>(-1)] [+/-][<size_threshold>(102400)]
#========================================================
#------------------------------------------------------
# Using the non-SSL port requires removing the --tls parameter from the redis-cli command below
#------------------------------------------------------

usage="<redis_host> <password> [redis_port] [+/-][ttl_threshold] [+/-][size_threshold]"

REDIS_HOST="${1:?Usage: $0 $usage}"
REDISCLI_AUTH="${2:?Usage: $0 $usage}"
REDIS_PORT="${3:-10000}"            # Redis port (10000, 6380, 6379)
KEYTTL_THRESHOLD=${4:-"-1"}         # -1, +ttl_threshold, ttl_threshold, -ttl_threshold
KEYSIZE_THRESHOLD="${5:-102400}"    # +size_threshold, size_threshold, -size_threshold

# Port number must be numeric
if ! [[ "$REDIS_PORT" =~ ^[0-9]+$ ]]; then
  echo "ERROR: Redis port must be numeric"
  exit 1
fi

# Check if KEYTTL_THRESHOLD is a valid integer
if ! [[ "$KEYTTL_THRESHOLD" =~ ^[-+]?[0-9]+$ ]]; then
  echo "Error: ttl_threshold $KEYTTL_THRESHOLD is not an integer"
  exit 1
fi

# Check if KEYSIZE_THRESHOLD is a valid integer
if ! [[ "$KEYSIZE_THRESHOLD" =~ ^[-+]?[0-9]+$ ]]; then
  echo "Error: size_threshold $KEYSIZE_THRESHOLD is not an integer"
  exit 1
fi

# Check whether the TTL threshold is positive (or zero), or negative
if [ "$KEYTTL_THRESHOLD" -ge 0 ]; then
  TTLSIGN="+"
else
  TTLSIGN="-"
fi

# Check whether the size threshold is positive (or zero), or negative
if [ "$KEYSIZE_THRESHOLD" -ge 0 ]; then
  SIZESIGN="+"
  size_text="larger"
else
  SIZESIGN="-"
  size_text="smaller"
fi

# Special case: no TTL set
if [ "$KEYTTL_THRESHOLD" -eq -1 ]; then
  ttl_text="No TTL set"
fi
if [ "$KEYTTL_THRESHOLD" -ge 0 ]; then
  ttl_text="TTL above $KEYTTL_THRESHOLD Seconds"
fi
if [ "$KEYTTL_THRESHOLD" -lt -1 ]; then
  ttl_text="TTL below ${KEYTTL_THRESHOLD#[-+]} Seconds"
fi

# Remove any sign
KEYTTL_THRESHOLD="${KEYTTL_THRESHOLD#[-+]}"
KEYSIZE_THRESHOLD="${KEYSIZE_THRESHOLD#[-+]}"

echo "========================================================"
echo "List all key names with $ttl_text, and Key size $size_text $KEYSIZE_THRESHOLD Bytes"

# Start time
start_ts=$(date +%s.%3N)
echo "Start time: $(date "+%d-%m-%Y %H:%M:%S")"
echo "------------------------"
echo ""

# Processing
redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" -a "$REDISCLI_AUTH" --tls --no-auth-warning EVAL "
local cursor = '0'
local ttl_threshold = tonumber(ARGV[1])   -- KEYTTL_THRESHOLD
local ttl_sign = ARGV[2]                  -- TTLSIGN
local size_threshold = tonumber(ARGV[3])  -- KEYSIZE_THRESHOLD
local size_sign = ARGV[4]                 -- SIZESIGN
local output = {}
local count = 0
local totalKeys = 0
local strKeyTTL = ''
local strKeySize = ''

-- Scan the keys in the cache
table.insert(output, '--------------------------------------')
repeat
  local res = redis.call('SCAN', cursor, 'COUNT', 100)
  cursor = res[1]
  for _, k in ipairs(res[2]) do
    local ttl = redis.call('TTL', k)
    -- MEMORY USAGE replies nil if the key vanished mid-scan; treat as 0
    local size = redis.call('MEMORY','USAGE', k) or 0
    totalKeys = totalKeys + 1
    if (size_sign == '+' and size >= size_threshold) or (size_sign == '-' and size < size_threshold) then
      -- TTL == -1 means no expiration; ttl_threshold arrives with its sign
      -- stripped, so a user-supplied -1 shows up here as sign '-' and value 1
      if ttl_sign == '-' and ttl_threshold == 1 then
        if ttl == -1 then
          table.insert(output, k .. ': TTL: -1, Size: ' .. size .. ' Bytes')
          count = count + 1
        end
      -- TTL comparisons (excluding -1 and -2): every size-matching key with a
      -- TTL is listed; the counter only tracks keys matching the TTL threshold
      else
        if ttl >= 0 then
          table.insert(output, k .. ': TTL: ' .. ttl .. ' seconds, Size: ' .. size .. ' Bytes')
          if ttl_sign == '-' and ttl < ttl_threshold then
            count = count + 1
          elseif ttl_sign == '+' and ttl >= ttl_threshold then
            count = count + 1
          end
        end
      end
    end
  end
until cursor == '0'

-- Add the summary to the output
table.insert(output, '--------------------------------------')
if (size_sign == '+') then
  strKeySize = 'larger'
else
  strKeySize = 'smaller'
end
strKeySize = 'size ' .. strKeySize .. ' than ' .. size_threshold .. ' Bytes'
if ttl_sign == '-' and ttl_threshold == 1 then
  strKeyTTL = 'No TTL'
elseif ttl_sign == '-' then
  strKeyTTL = 'TTL < ' .. ttl_threshold .. ' seconds'
elseif ttl_sign == '+' then
  strKeyTTL = 'TTL >= ' .. ttl_threshold .. ' seconds'
end
strKeyTTL = ' keys found with ' .. strKeyTTL
table.insert(output, 'Scan completed.')
table.insert(output, 'Total of ' .. totalKeys .. ' keys scanned.')
table.insert(output, count .. strKeyTTL .. ', and ' .. strKeySize)
table.insert(output, '--------------------------------------')
return output
" 0 "$KEYTTL_THRESHOLD" "$TTLSIGN" "$KEYSIZE_THRESHOLD" "$SIZESIGN"

echo " "
end_ts=$(date +%s.%3N)
echo "End time: $(date "+%d-%m-%Y %H:%M:%S")"

# Duration - extract days, hours, minutes, seconds, milliseconds
duration=$(awk "BEGIN {print $end_ts - $start_ts}")
days=$(awk "BEGIN {print int($duration/86400)}")
hours=$(awk "BEGIN {print int(($duration%86400)/3600)}")
minutes=$(awk "BEGIN {print int(($duration%3600)/60)}")
seconds=$(awk "BEGIN {print int($duration%60)}")
milliseconds=$(awk "BEGIN {printf \"%03d\", ($duration - int($duration))*1000}")
echo "Duration : ${days} days $(printf "%02d" "$hours"):$(printf "%02d" "$minutes"):$(printf "%02d" "$seconds").$milliseconds"
echo "========================================================"
```

Performance:

This script is much cleaner and more connection-efficient than the previous one, for the same results: it creates only one connection to the Redis service, and all processing happens on the Redis side in the Lua script. Although much more efficient, the Lua script may still block the normal workload on Redis, especially with a large dataset and a high number of keys in the cache.

Redis service used: Azure Managed Redis Balanced B0 (OSS mode)

```
Scan completed. Total keys listed: 46005
Duration : 0 days 00:00:01.437
```

Redis service used: Azure Cache for Redis - Standard - C1

```
Scan completed. Total keys listed: 1225
Duration : 0 days 00:00:00.545
```

WARNING: The script above uses a Lua script that runs on the Redis side and may block your normal workload. Use it carefully when you have a large number of keys in the cache, and during low-workload periods.

References

- Azure Managed Redis
- Azure Best Practice for Development
- Redis Inc - Commands
- Redis Lua - Lua API reference
- Redis Inc - How Redis expires keys
- Redis CLI
- Bash Script
- xargs man page
- awk man page

I hope this can be useful!

Find the Alerts You Didn't Know You Were Missing with Azure SRE Agent
I had 6 alert rules. CPU. Memory. Pod restarts. Container errors. OOMKilled. Job failures. I thought I was covered.

Then my app went down. I kept refreshing the Azure portal, waiting for an alert. Nothing.

That's when it hit me: my alerts were working perfectly. They just weren't designed for this failure mode. Sound familiar?

The Problem Every Developer Knows

If you're a developer or DevOps engineer, you've been here: a customer reports an issue, you scramble to check your monitoring, and then you realize you don't have the right alerts set up. By the time you find out, it's already too late. You set up what seems like reasonable alerting and assume you're covered. But real-world failures are sneaky. They slip through the cracks of your carefully planned thresholds.

My Setup: AKS with Redis

I love to vibe code apps using GitHub Copilot Agent mode with Claude Opus 4.5. It's fast, it understands context, and it lets me focus on building rather than boilerplate. For this project, I built a simple journal entry app:

- AKS cluster hosting the web API
- Azure Cache for Redis storing journal data
- Azure Monitor alerts for CPU, memory, pod restarts, container errors, OOMKilled, and job failures

Seemed solid. What could go wrong?

The Scenario: Redis Password Rotation

Here's something that happens constantly in enterprise environments: the security team rotates passwords. It's best practice. It's on the compliance checklist. And it breaks things when apps don't pick up the new credentials.

I simulated exactly this. The pods came back up, but they couldn't connect to Redis (as expected). The readiness probes started failing. The LoadBalancer had no healthy backends. The endpoint timed out. And not a single alert fired.

Using SRE Agent to Find the Alert Gaps

Instead of manually auditing every alert rule and trying to figure out what I missed, I turned to Azure SRE Agent. I asked it a simple question: "My endpoint is timing out. What alerts do I have, and why didn't any of them fire?" Within minutes, it had diagnosed the problem. Here's what it found:

| My Existing Alerts | Why They Didn't Fire |
| --- | --- |
| High CPU/Memory | No resource pressure, just auth failures |
| Pod Restarts | Pods weren't restarting, just unhealthy |
| Container Errors | App logs weren't being written |
| OOMKilled | No memory issues |
| Job Failures | No K8s jobs involved |

The gaps SRE Agent identified:

❌ No synthetic URL availability test
❌ No readiness/liveness probe failure alerts
❌ No "pods not ready" alerts scoped to my namespace
❌ No Redis connection error detection
❌ No ingress 5xx/timeout spike alerts
❌ No per-pod resource alerts (only node-level)

SRE Agent didn't just tell me what was wrong; it created a GitHub issue with:

- KQL queries to detect each failure type
- Bicep code snippets for new alert rules
- Remediation suggestions for the app code
- Exact file paths in my repo to update

Check it out: GitHub Issue

How I Built It: Step by Step

Let me walk you through exactly how I set this up inside SRE Agent.

Step 1: Create an SRE Agent

I created a new SRE Agent in the Azure portal. Since this workflow analyzes alerts across my subscription (not just one resource group), I didn't configure any specific resource groups. Instead, I gave the agent's managed identity Reader permissions on my entire subscription. This lets it discover resources, list alert rules, and query Log Analytics across all my resource groups.
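If you prefer to script that grant, a minimal az cli sketch is below; the principal and subscription IDs are placeholders you would look up for your own agent:

```bash
# Grant the SRE Agent's managed identity Reader on the whole subscription
# so it can discover resources, list alert rules, and query Log Analytics.
# Both IDs are placeholders.
AGENT_PRINCIPAL_ID="00000000-0000-0000-0000-000000000000"
SUBSCRIPTION_ID="11111111-1111-1111-1111-111111111111"

az role assignment create \
  --assignee-object-id "$AGENT_PRINCIPAL_ID" \
  --assignee-principal-type ServicePrincipal \
  --role "Reader" \
  --scope "/subscriptions/$SUBSCRIPTION_ID"
```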
Step 2: Connect GitHub to SRE Agent via MCP

I added a GitHub MCP server to give the agent access to my source code repository. MCP (Model Context Protocol) lets you bring any API into the agent: if your tool has an API, you can connect it. I use GitHub for both source code and tracking dev tickets, but you can connect to wherever your code lives (GitLab, Azure DevOps) or your ticketing system (Jira, ServiceNow, PagerDuty).

Step 3: Create a Subagent inside SRE Agent for Managing Azure Monitor Alerts

I created a focused subagent with a specific job and only the tools it needs: an Azure Monitor Alerts Expert.

Prompt:

"You are expert in managing operations related to azure monitor alerts on azure resources including discovering alert rules configured on azure resources, creating new alert rules (with user approval and authorization only), processing the alerts fired on azure resources and identifying gaps in the alert rules. You can get the resource details from azure monitor alert if triggered via alert. If not, you need to ask user for the specific resource to perform analysis on. You can use az cli tool to diagnose logs, check the app health metrics. You must use the app code and infra code (bicep files) files you have access to in the github repo <insert your repo> to further understand the possible diagnoses and suggest remediations. Once analysis is done, you must create a github issue with details of analysis and suggested remediation to the source code files in the same repo."

Tools enabled:

- az cli - list resources, alert rules, and action groups
- Log Analytics workspace querying - run KQL queries for diagnostics
- GitHub MCP - search repositories, read file contents, create issues

Step 4: Ask the Subagent About Alert Gaps

I gave the agent context and asked a simple question:

"@AzureAlertExpert: My API endpoint http://132.196.167.102/api/journals/john is timing out. What alerts do I have configured in rg-aks-journal, and why didn't any of them fire?"

The agent did the analysis autonomously and summarized its findings, with suggestions to add new alert rules, in a GitHub issue. This is the agentic workflow for performing Azure Monitor alert operations.
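To give a flavor of what closing one of those gaps can look like (this is my own hedged sketch, not the agent's exact output), a "pods not ready" log alert scoped to the app's namespace could be created like this; it assumes Container Insights is enabled, and the subscription/workspace IDs and the 'journal' namespace are placeholders:

```bash
# Hypothetical gap-closer: alert when pods in the app namespace are not ready.
# Requires the Log Analytics workspace backing Container Insights.
# (az extension add --name scheduled-query, if not already installed.)
az monitor scheduled-query create \
  --name "journal-pods-not-ready" \
  --resource-group "rg-aks-journal" \
  --scopes "/subscriptions/<sub-id>/resourceGroups/rg-aks-journal/providers/Microsoft.OperationalInsights/workspaces/<workspace>" \
  --condition "count 'NotReadyPods' > 0" \
  --condition-query NotReadyPods="KubePodInventory | where Namespace == 'journal' | where PodStatus !in ('Running', 'Succeeded')" \
  --evaluation-frequency 5m \
  --window-size 5m \
  --severity 2
```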
Why This Matters

- Faster response times. Issues get diagnosed in minutes, not hours of manual investigation.
- Consistent analysis. No more "I thought we had an alert for that" moments. The agent systematically checks what's covered and what's not.
- Proactive coverage. You don't have to wait for an incident to find gaps. Ask the agent to review your alerts before something breaks.

The Bottom Line

Your alerts have gaps. You just don't know it until something slips through. I had 6 alert rules and still missed a basic failure. My pods weren't restarting; they were just unhealthy. My CPU wasn't spiking; the app was just returning errors. None of my alerts were designed for this.

You don't need to audit every alert rule manually. Give SRE Agent your environment, describe the failure, and let it tell you what's missing. Stop discovering alert gaps from customer complaints. Start finding them before they matter.

A Few Tips

- Give the agent Reader access at the subscription level so it can discover all resources
- Use a focused subagent prompt; don't try to do everything in one agent
- Test your MCP connections before running workflows

What Alert Gaps Have Burned You?

What's the alert you wish you had set up before an incident? Credential rotation? Certificate expiry? DNS failures? Let us know in the comments.

Reimagining AI Ops with Azure SRE Agent: New Automation, Integration, and Extensibility features
Azure SRE Agent offers intelligent and context-aware automation for IT operations. Enhanced by customer feedback from our preview, the SRE Agent has evolved into an extensible platform to automate and manage tasks across Azure and other environments. Built on an Agentic DevOps approach, drawing from proven practices in internal Azure operations, the Azure SRE Agent has already saved over 20,000 engineering hours across Microsoft product teams' operations, delivering strong ROI for teams seeking sustainable AIOps.

An Operations Agent that Adapts to Your Playbooks

Azure SRE Agent is an AI-powered operations automation platform that empowers SREs, DevOps, IT operations, and support teams to automate tasks such as incident response, customer support, and developer operations from a single, extensible agent. Its value proposition and capabilities have evolved beyond diagnosing and mitigating Azure issues to automating operational workflows and integrating seamlessly with the standards and processes used in your organization.

SRE Agent is designed to automate operational work and reduce toil, enabling developers and operators to focus on high-value tasks. By streamlining repetitive and complex processes, SRE Agent accelerates innovation and improves reliability across cloud and hybrid environments. In this article, we will look at what's new and what has changed since the last update.

What's New: Automation, Integration, and Extensibility

Azure SRE Agent just got a major upgrade. From no-code automation to seamless integrations and expanded data connectivity, here's what's new in this release:

- No-code Sub-Agent Builder: rapidly create custom automations without writing code.
- Flexible, event-driven triggers: instantly respond to incidents and operational changes.
- Expanded data connectivity: unify diagnostics and troubleshooting across more data sources.
- Custom actions: integrate with your existing tools and orchestrate end-to-end workflows via MCP.
- Prebuilt operational scenarios: accelerate deployment and improve reliability out of the box.

Unlike generic agent platforms, Azure SRE Agent comes with deep integrations, prebuilt tools, and frameworks built specifically for IT, DevOps, and SRE workflows. This means you can automate complex operational tasks faster and more reliably, tailored to your organization's needs.

Sub-Agent Builder: Custom Automation, No Code Required

Empower teams to automate repetitive operational tasks without coding expertise, dramatically reducing manual workload and development cycles. This feature addresses the need for targeted automation, letting teams solve specific operational pain points without relying on one-size-fits-all solutions.

- Modular Sub-Agents: easily create custom sub-agents tailored to your team's needs. Each sub-agent can have its own instructions, triggers, and toolsets, letting you automate everything from outage response to customer email triage.
- Prebuilt System Tools: eliminate the inefficiency of creating basic automation from scratch, and choose from a rich library of hundreds of built-in tools for Azure operations, code analysis, deployment management, diagnostics, and more.
- Custom Logic: align automation with your unique business processes by defining your own automation logic and prompts, teaching the agent to act exactly as your workflow requires.

Flexible Triggers: Automate on Your Terms

Invoke the agent to respond automatically to mission-critical events rather than waiting for manual commands.
This capability speeds up incident response and eliminates missed opportunities for efficiency.

- Multi-Source Triggers: go beyond chat-based interactions, and trigger the agent to respond automatically to incident management and ticketing systems like PagerDuty and ServiceNow, observability alerting systems like Azure Monitor alerts, or even a cron-based schedule for proactive monitoring and best-practices checks. Additional trigger sources such as GitHub issues, Azure DevOps pipelines, email, and more will be added over time. This means automation can start exactly when and where you need it.
- Event-Driven Operations: integrate with your CI/CD, monitoring, or support systems to launch automations in response to real-world events, like deployments, incidents, or customer requests. Vital for reducing downtime, this ensures that business-critical actions happen automatically and promptly.

Expanded Data Connectivity: Unified Observability and Troubleshooting

Integrate data to enable comprehensive diagnostics and troubleshooting, and faster, more informed decision-making by eliminating silos and speeding up issue resolution.

- Multiple Data Sources: the agent can now read data from Azure Monitor, Log Analytics, and Application Insights based on its Azure role-based access control (RBAC). Additional observability data sources such as Dynatrace, New Relic, and Datadog can be added via the remote Model Context Protocol (MCP) servers for these tools. This gives you a unified view for diagnostics and automation.
- Knowledge Integration: rather than manually detailing every instruction in your prompt, you can upload your Troubleshooting Guide (TSG) or runbook directly, allowing the agent to automatically create an execution plan from the file. You may also connect the agent to resources like SharePoint, Jira, or documentation repositories through remote MCP servers, enabling it to retrieve needed files on its own. This approach leverages your organization's existing knowledge base, streamlining onboarding and enhancing consistency in managing incidents.

Azure SRE Agent is also building multi-agent collaboration by integrating with PagerDuty and Neubird, enabling advanced, cross-platform incident management and reliability across diverse environments.

Custom Actions: Automate Anything, Anywhere

Extend automation beyond Azure and integrate with any tool or workflow, solving the problem of limited automation scope and enabling end-to-end process orchestration.

- Out-of-the-Box Actions: instantly automate common tasks like running az cli or kubectl, creating GitHub issues, or updating Azure resources, reducing setup time and operational overhead.
- Communication Notifications: the SRE Agent now features built-in connectors for Outlook, enabling automated email notifications, and for Microsoft Teams, allowing it to post messages directly to Teams channels for streamlined communication.
- Bring Your Own Actions: drop in your own remote MCP servers to extend the agent's capabilities to any custom tool or workflow. Future-proof your agentic DevOps by automating proprietary or emerging processes with confidence.

Prebuilt Operations Scenarios

Address common operational challenges out of the box, saving teams time and effort while improving reliability and customer satisfaction.

- Incident Response: minimize business impact and reduce operational risk by automating detection, diagnosis, and mitigation across your workload stack.
  The agent has built-in runbooks for common issues related to many Azure resource types, including Azure Kubernetes Service (AKS), Azure Container Apps (ACA), Azure App Service, Azure Logic Apps, Azure Database for PostgreSQL, Azure Cosmos DB, and Azure VMs. Support for additional resource types is being added continually; please see the product documentation for the latest information.
- Root Cause Analysis & IaC Drift Detection: instantly pinpoint incident causes with AI-driven root cause analysis, including automated source code scanning via GitHub and Azure DevOps integration. Proactively detect and resolve infrastructure drift by comparing live cloud environments against source-controlled IaC, ensuring configuration consistency and compliance.
- Handle Complex Investigations: enable the deep investigation mode, which uses a hypothesis-driven method to analyze possible root causes. It collects logs and metrics, tests hypotheses with iterative checks, and documents findings. The process delivers a clear summary and actionable steps to help teams accurately resolve critical issues.
- Incident Analysis: the integrated dashboard offers a comprehensive overview of all incidents managed by the SRE Agent. It presents essential metrics, including the number of incidents reviewed, assisted, and mitigated by the agent, as well as those awaiting human intervention. Users can leverage aggregated visualizations and AI-generated root cause analyses to gain insight into incident processing, identify trends, enhance response strategies, and detect areas for improvement in incident management.
- Built-in Agent Memory: the new SRE Agent memory system transforms incident response by institutionalizing the expertise of top SREs, capturing, indexing, and reusing critical knowledge from past incidents, investigations, and user guidance. Benefit from faster, more accurate troubleshooting as the agent learns from both successes and mistakes, surfacing relevant insights, runbooks, and mitigation strategies exactly when needed. This system leverages advanced retrieval techniques and a domain-aware schema to ensure every on-call engagement is smarter than the last, reducing mean time to resolution (MTTR) and minimizing repeated toil. You automatically gain a continuously improving agent that remembers what works, avoids past pitfalls, and delivers actionable guidance tailored to the environment.
- GitHub Copilot and Azure DevOps Integration: automatically triage, respond to, and resolve issues raised in GitHub or Azure DevOps. Integration with modern development platforms such as the GitHub Copilot coding agent increases efficiency and ensures that issues are resolved faster, reducing bottlenecks in the development lifecycle.

Ready to get started?

- Azure SRE Agent home page
- Product overview
- Pricing page
- Pricing calculator
- Pricing blog
- Demo recordings
- Deployment samples

What's Next?

Give us feedback: your feedback is critical. You can thumbs-up / thumbs-down each interaction or thread, or use the "Give Feedback" button in the agent to give us in-product feedback; you can also create issues or simply share your thoughts in our GitHub repo at https://github.com/microsoft/sre-agent.

We're just getting started. In the coming months, expect even more prebuilt integrations, expanded data sources, and new automation scenarios. We anticipate continuous growth and improvement throughout our agentic AI platforms and services to effectively address customer needs and preferences.
Let us know what Ops toil you want to automate next!

AI Resilience: Strategies to Keep Your Intelligent App Running at Peak Performance
Stay Online

Reliability is one of the five pillars of the Azure Well-Architected Framework. When you implement and take to market a new product that integrates with Azure OpenAI Service, you can face usage spikes in your workload, and even with everything scaling correctly on your side, if your Azure OpenAI Service deployment uses PTU you can hit the PTU threshold and start to see 429 response codes. The response also carries important information in its headers about when you can retry the request, and with that information you can implement a retry solution in your business logic. In this article I will show how to use an API Management policy to handle this, and also explore the native cache to save some tokens!

Architecture Reference

The Azure Function on the left of the diagram just represents an app making requests and can be any kind of resource (even in an on-premises environment). Our goal in this article is to show one of many possibilities for handling 429 responses. We are going to use an API Management policy to automatically redirect the backend to another Azure OpenAI Service instance in another region, deployed in Standard mode, which means you are charged only for what you use.

First we need to create an API in our API Management instance to forward requests to your main Azure OpenAI Service (region 1 in the diagram). Then we create this policy on the API request (the retry is placed in the backend section so it can re-forward the request to region 2 on a 429):

```xml
<policies>
    <inbound>
        <base />
        <set-backend-service base-url="<your_open_ai_region1_endpoint>" />
    </inbound>
    <backend>
        <!-- On a 429 from region 1, switch to region 2 and forward again -->
        <retry condition="@(context.Response != null && context.Response.StatusCode == 429)" count="1" interval="5" first-fast-retry="false">
            <choose>
                <when condition="@(context.Response != null && context.Response.StatusCode == 429)">
                    <set-backend-service base-url="<your_open_ai_region2_endpoint>" />
                </when>
            </choose>
            <forward-request buffer-request-body="true" />
        </retry>
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>
```

The first part of our job is done! Now we have an automatic redirect to the Azure OpenAI Service deployed in region 2 whenever the PTU threshold is reached.

Cost Consideration

So now you may ask: what about the added cost of using API Management? Even if you don't want to use any other API Management feature, you can take advantage of the built-in Redis* cache and, once again using policy and AI, store some questions and answers in the cache using semantic caching for Azure OpenAI Service. Let's change our policy to take this into account:

```xml
<policies>
    <inbound>
        <base />
        <azure-openai-semantic-cache-lookup
            score-threshold="0.05"
            embeddings-backend-id="azure-openai-backend"
            embeddings-backend-auth="system-assigned">
            <vary-by>@(context.Subscription.Id)</vary-by>
        </azure-openai-semantic-cache-lookup>
        <set-backend-service base-url="<your_open_ai_region1_endpoint>" />
    </inbound>
    <backend>
        <retry condition="@(context.Response != null && context.Response.StatusCode == 429)" count="1" interval="5" first-fast-retry="false">
            <choose>
                <when condition="@(context.Response != null && context.Response.StatusCode == 429)">
                    <set-backend-service base-url="<your_open_ai_region2_endpoint>" />
                </when>
            </choose>
            <forward-request buffer-request-body="true" />
        </retry>
    </backend>
    <outbound>
        <base />
        <azure-openai-semantic-cache-store duration="60" />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>
```

Now API Management handles the input tokens, uses semantic equivalence to decide whether the request matches cached information, and only redirects the request to your OpenAI endpoint when it doesn't. Sometimes this can also help you avoid reaching the PTU threshold in the first place!

* Check the tier / cache capabilities to validate your business solution's needs against the API Management cache feature: Compare API Management features across tiers and cache size across tiers.
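To sanity-check the setup end to end, you can fire the same prompt twice at the gateway and compare status codes and timings; everything below (gateway URL, subscription key header, deployment name, API version) is a placeholder for your own configuration:

```bash
# Call the APIM-fronted OpenAI API twice with the same prompt.
# With semantic caching enabled, the second call should be noticeably
# faster; a 429 passed through would carry a Retry-After header.
APIM_URL="https://my-apim.azure-api.net/openai"
BODY='{"messages":[{"role":"user","content":"Say hello"}],"max_tokens":20}'

for i in 1 2; do
  curl -s -o /dev/null \
    -w "attempt $i -> status: %{http_code}, total time: %{time_total}s\n" \
    -H "Ocp-Apim-Subscription-Key: $APIM_SUBSCRIPTION_KEY" \
    -H "Content-Type: application/json" \
    -d "$BODY" \
    "$APIM_URL/deployments/my-gpt-deployment/chat/completions?api-version=2024-02-01"
done
```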
Conclusion

API Management offers key capabilities for AI. We explored some of them in this article, and there are others you can leverage for your intelligent applications: check them out in this awesome AI Gateway HUB repository. Last but not least, dive into API Management features with experts in the field inside the API Management HUB.

Thanks for reading and Happy Coding!

Cut Costs and Speed Up AI API Responses with Semantic Caching in Azure API Management
This article is part of a series of articles on API Management and Generative AI. We believe that adding Azure API Management to your AI projects can help you scale your AI models, make them more secure, and make them easier to manage. We previously covered the hidden risks of AI APIs in today's AI-driven technological landscape. In this article, we dive deeper into one of the supported Gen AI policies in API Management, which allows you to minimize Azure OpenAI costs and make your applications more performant by reducing the number of calls sent to your LLM service.

How does it currently work without the semantic caching policy?

For simplicity, let's look at a scenario where we only have a single client app, a single user, and a single model deployment. This of course does not represent most real-world use cases, as you often have multiple users talking to different services. Take the following cases into consideration:

- A user lands on your application and sends in a query (query 1),
- They then send the exact same query again, with similar verbiage, in the same session (query 2),
- The user changes the wording of the query, but it is still relevant and related to the original query (query 3),
- The last query (query 4) is completely different and unrelated to the previous queries.

In a normal implementation, all these queries will cost you tokens (TPM), resulting in higher cuts in your billing. Your users are also likely to experience some latency as they wait for the LLM to build a response with each call. As the user base grows, you can anticipate that expenses will grow exponentially, making the system increasingly expensive to run.

How does semantic caching in Azure API Management fix this?

Let's look at the same scenario as described above (at a high level first), with a flow diagram representing how you can cut costs and boost your app's performance with the semantic cache policy. When the user sends in the first query, the LLM is used to generate a response, which is then stored in the cache. Queries 2 and 3 are somewhat related to query 1, whether through semantic similarity, an exact match, or a specified keyword, e.g. price. In all these cases, a lookup is performed, and the appropriate response is retrieved from the cache without waiting on the LLM to regenerate a response. Query 4, which is completely different from the previous prompts, requires the call to be passed through to the LLM; the generated response is then grabbed and stored in the cache for future searches.

Okay. Tell me more - How does this work and how do I set it up?

Think about this: how likely are your users to ask related or exactly comparable questions in your app? I'd argue that the odds are quite high.

Semantic caching for Azure OpenAI API requests

To start, you will need to add Azure OpenAI Service APIs to your Azure API Management instance with semantic caching enabled. Luckily, this has been reduced to a one-click step. I'll link a tutorial on this in the 'Resources' section. Before you configure the policies, you first need to set up a backend for the embeddings API. As part of your deployments, you will need an embedding model to convert your input to its corresponding vector representation, allowing Azure Cache for Redis to perform the vector similarity search. This step also allows you to set a score_threshold, a parameter used to determine how similar user queries need to be in order to retrieve responses from the cache.
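If you are curious what the embeddings backend returns during a lookup, you can call the embedding deployment directly; the endpoint, key, deployment name, and API version below are placeholders. Two prompts with nearly the same meaning yield vectors whose similarity clears a tight score_threshold, which is what produces a cache hit:

```bash
# Fetch the embedding vector that the similarity search is based on.
# Endpoint, key, and deployment name are placeholders.
curl -s "https://my-openai.openai.azure.com/openai/deployments/my-embedding-deployment/embeddings?api-version=2023-05-15" \
  -H "api-key: $AZURE_OPENAI_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "How much does the premium plan cost?"}' \
  | head -c 300; echo   # print only the start of the (long) vector
```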
Next is to add the two policies that you need: azure-openai-semantic-cache-store / llm-semantic-cache-store and azure-openai-semantic-cache-lookup / llm-semantic-cache-lookup.

The azure-openai-semantic-cache-store policy caches completions and requests to the configured cache service. You can use the internal Azure Redis Enterprise cache or another external cache in Azure API Management, as long as it is Redis-compatible.

The second policy, azure-openai-semantic-cache-lookup, performs a cache lookup through the collection of cached requests and completions, based on the proximity result of the similarity search and the score_threshold. In addition to the score_threshold attribute, you also specify the id of the embeddings backend created in an earlier step, and you can choose to omit the system messages from the prompt at this stage.

These two policies enhance your system's efficiency and performance by reusing completions, increasing response speed, and making your API calls much cheaper.

Alright, so what should be my next steps?

This article just introduced you to one of the many Generative AI capabilities supported in Azure API Management. There are more policies that you can use to better manage your AI APIs, covered in other articles in this series. Do check them out.

Do you have any resources I can look at in the meantime to learn more?

Absolutely! Check out:

- Using external Redis-compatible cache in Azure API Management documentation
- Use Azure Cache for Redis as a semantic cache tutorial
- Enable semantic caching for Azure OpenAI APIs in Azure API Management article
- Improve the performance of an API by adding a caching policy in Azure API Management Learn module

Take full control of your AI APIs with Azure API Management Gateway
This article is part of a series of articles on API Management and Generative AI. We believe that adding Azure API Management to your AI projects can help you scale your AI models, make them more secure, and make them easier to manage. In this article, we shed some light on capabilities in API Management designed to help you govern and manage Generative AI APIs, ensuring that you are building resilient and secure intelligent applications.

But why exactly do I need API Management for my AI APIs?

Common challenges when implementing Gen AI-powered solutions include:

- Quota (calculated in tokens-per-minute, or TPM) allocation across multiple client apps,
- How to control and track token consumption for all users,
- Mechanisms to attribute costs to specific client apps, activities, or users,
- Your system's resiliency to backend failures when hitting one or more limits.

And the list goes on with more challenges and questions. Well, let's find some answers, shall we?

Quota allocation

Take a scenario where you have more than one client application, and they are talking to one or more models from Azure OpenAI Service or Azure AI Foundry. With this complexity, you want to have control over the quota distribution for each of the applications.

Tracking token usage & security

I bet you agree with me that it would be unfortunate if one of your applications (most likely the one receiving the highest traffic) hogged the entire TPM quota, leaving zero tokens for your other applications, right? If this occurs, there is also a chance it is a DDoS attack, with bad actors trying to bombard your system with purposeless traffic and cause service downtime. Yet another reason why you need more control and tracking mechanisms to ensure this doesn't happen.

Token metrics

As a data-driven company, having additional insights with the flexibility to dissect and examine usage data down to dimensions like subscription ID or API ID is extremely valuable. These metrics go a long way in informing capacity and budget planning decisions.

Automatic failovers

This is a common one. You want to ensure that your users experience zero service downtime; so if one of your backends is down, does your system architecture allow automatic rerouting and forwarding to healthy services?

So, how will API Management help address these challenges?

API Management has a set of policies and metrics called Generative AI (Gen AI) gateway capabilities, which empower you to manage and keep full control of all these moving pieces and components of your intelligent systems.

Minimize cost with token-based limits and semantic caching

How can you minimize operational costs for AI applications as much as possible? By leveraging the llm-token-limit policy in Azure API Management, you can enforce token-based limits per user on identifiers such as subscription keys and requesting IP addresses. When a caller surpasses their allocated tokens-per-minute quota, they receive an HTTP "Too Many Requests" error along with retry-after instructions. This mechanism ensures fair usage and prevents any single user from monopolizing resources.

To optimize cost consumption for Large Language Models (LLMs), it is crucial to minimize the number of API calls made to the model. Implementing the llm-semantic-cache-store and llm-semantic-cache-lookup policies allows you to store and retrieve similar completions.
This method involves performing a cache lookup for reused completions, thereby reducing the number of calls sent to the LLM backend. Consequently, this strategy helps significantly lower operational costs.

Ensure reliability with load balancing and circuit breakers

Azure API Management allows you to leverage load balancers to distribute the workload effectively across various prioritized LLM backends. Additionally, you can set up circuit breaker rules that redirect requests to a responsive backend if the prioritized one fails, thereby minimizing recovery time and enhancing system reliability. Implementing the semantic caching policy not only saves costs but also reduces system latency by minimizing the number of calls processed by the backend.

Okay. What next?

This article mentions these capabilities at a high level, but in the coming weeks, we will publish articles that go deeper into each of these Generative AI capabilities in API Management, with examples of how to set up each policy. Stay tuned!

Do you have any resources I can look at in the meantime to learn more?

Absolutely! Check out:

- Manage your Azure OpenAI APIs with Azure API Management: http://aka.ms/apimlove

Custom scaling on Azure Container Apps based on Redis Streams
Custom scaling on Azure Container Apps based on Redis Streams

ACA's autoscaling feature internally leverages KEDA and lets you configure the number of replicas to deploy based on rules (event triggers). Beyond HTTP and TCP rule-based scaling, Container Apps also supports custom rules, which adds flexibility and opens up many more configuration options, since all of KEDA's event-based scalers are supported.
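As an illustration, KEDA's redis-streams scaler watches a consumer group's backlog of pending (unacknowledged) entries and adjusts the replica count against a configured threshold. The commands below sketch the pieces a scaled consumer relies on; the stream name (orders), group name (workers), and entry ID are illustrative:

XGROUP CREATE orders workers $ MKSTREAM - create the consumer group (and the stream, if it does not exist yet)
XADD orders * sku 4711 qty 2 - a producer appends a work item to the stream
XREADGROUP GROUP workers worker-1 COUNT 10 BLOCK 5000 STREAMS orders > - a replica claims up to 10 new entries, waiting up to 5 seconds
XACK orders workers 1526985054069-0 - acknowledge an entry once it has been processed
XPENDING orders workers - summarizes the claimed-but-unacknowledged backlog the scaler reacts to

As the backlog grows past the threshold, more replicas are started to drain it; as it shrinks, replicas are removed again.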
Unlock New AI and Cloud Potential with .NET 9 & Azure: Faster, Smarter, and Built for the Future

.NET 9, now available to developers, marks a significant milestone in the evolution of the .NET platform, pushing the boundaries of performance, cloud-native development, and AI integration. This release, shaped by contributions from over 9,000 community members worldwide, introduces thousands of improvements that set the stage for the future of application development. With seamless integration with Azure and a focus on cloud-native development and AI capabilities, .NET 9 empowers developers to build scalable, intelligent applications with unprecedented ease.

Expanding Azure PaaS Support for .NET 9
With the release of .NET 9, a comprehensive range of Azure Platform as a Service (PaaS) offerings now fully supports the platform's new capabilities, including the latest .NET SDK, for any Azure developer. This extensive support allows developers to build, deploy, and scale .NET 9 applications with optimal performance and adaptability on Azure. Additionally, developers can access a wealth of architecture references and sample solutions to guide them in creating high-performance .NET 9 applications on Azure's cloud services:
- Azure App Service: run, manage, and scale .NET 9 web applications efficiently. Check out this blog to learn more about what's new in Azure App Service.
- Azure Functions: leverage serverless computing to build event-driven .NET 9 applications with improved runtime capabilities.
- Azure Container Apps: deploy microservices and containerized .NET 9 workloads with integrated observability.
- Azure Kubernetes Service (AKS): run .NET 9 applications in a managed Kubernetes environment with expanded ARM64 support.
- Azure AI Services and Azure OpenAI Service: integrate advanced AI and OpenAI capabilities directly into your .NET 9 applications.
- Azure API Management, Azure Logic Apps, Azure Cognitive Services, and Azure SignalR Service: ensure seamless integration and scaling for .NET 9 solutions.
These services provide developers with a robust platform for building high-performance, scalable, cloud-native applications in an environment optimized for .NET.

Streamlined Cloud-Native Development with .NET Aspire
.NET Aspire is a game-changer for cloud-native applications, enabling developers to build distributed, production-ready solutions efficiently. Available in preview with .NET 9, Aspire streamlines app development, with cloud efficiency and observability at its core. The latest updates in Aspire include secure defaults, Azure Functions support, and enhanced container management. Key capabilities include:
- Optimized Azure integrations: Aspire works seamlessly with Azure, enabling fast deployments, automated scaling, and consistent management of cloud-native applications.
- Easier deployments to Azure Container Apps: designed for containerized environments, .NET Aspire integrates with Azure Container Apps (ACA) to simplify the deployment process. Using the Azure Developer CLI (azd), developers can quickly provision and deploy .NET Aspire projects to ACA, with built-in support for Redis caching, application logging, and scalability (see the azd sketch after this section).
- Built-in observability: a real-time dashboard provides insights into logs, distributed traces, and metrics, enabling local and production monitoring with Azure Monitor.
With these capabilities, .NET Aspire allows developers to deploy microservices and containerized applications effortlessly on ACA, streamlining the path from development to production in a fully managed, serverless environment.
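As a sketch of how short that path can be, assuming the Azure Developer CLI is installed and the commands are run from the directory of the Aspire solution (the exact prompts azd shows may vary):

azd auth login - sign in to your Azure subscription
azd init - let azd inspect the project and generate deployment configuration
azd up - provision the Azure Container Apps environment and deploy all services in one step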
Integrating AI into .NET: A Seamless Experience
In our ongoing effort to empower developers, we've made integrating AI into .NET applications simpler than ever. Our strategic partnerships, including collaborations with OpenAI, LlamaIndex, and Qdrant, have enriched the AI ecosystem and strengthened .NET's capabilities. This year alone, usage of Azure OpenAI services has surged to nearly a billion API calls per month, illustrating the growing impact of AI-powered .NET applications.

Real-World AI Solutions with .NET
.NET has been pivotal in driving AI innovations, from internal teams like Microsoft Copilot creating AI experiences with .NET Aspire to tools like GitHub Copilot, developed with .NET to enhance productivity in Visual Studio and VS Code. KPMG Clara is a prime example, developed to enhance audit quality and efficiency for 95,000 auditors worldwide. By leveraging .NET and scaling securely on Azure, KPMG implemented robust AI features aligned with strict industry standards, underscoring .NET and Azure as the backbone for high-performing, scalable AI solutions.

Performance Enhancements in .NET 9: Raising the Bar for Azure Workloads
.NET 9 introduces substantial performance upgrades, with over 7,500 merged pull requests focused on speed and efficiency, ensuring .NET 9 applications run optimally on Azure. These improvements reduce cloud costs and provide a high-performance experience across Windows, Linux, and macOS. To see how significant these gains can be for cloud services, consider what past .NET upgrades achieved for Microsoft's high-scale internal services:
- Bing achieved a major reduction in startup times, enhanced efficiency, and decreased latency across its high-performance search workflows.
- Microsoft Teams improved efficiency by 50%, reduced latency by 30-45%, and achieved up to 100% gains in CPU utilization for key services, resulting in faster user interactions.
- Microsoft Copilot and other AI-powered applications benefited from optimized runtime performance, enabling scalable, high-quality experiences for users.
Upgrading to the latest .NET version offers similar benefits for cloud apps, optimizing both performance and cost efficiency. For more information on updating your applications, check out the .NET Upgrade Assistant. For additional details on ASP.NET Core, .NET MAUI, NuGet, and more enhancements across the .NET platform, check out the full Announcing .NET 9 blog post.

Conclusion: Your Path to the Future with .NET 9 and Azure
.NET 9 isn't just an upgrade; it's a leap forward, combining cutting-edge AI integration, cloud-native development, and unparalleled performance. Paired with Azure's scalability, these advancements provide a trusted, high-performance foundation for modern applications. Get started by downloading .NET 9 and exploring its features. Leverage .NET Aspire for streamlined cloud-native development, deploy scalable apps with Azure, and embrace new productivity enhancements to build for the future. For additional insights on ASP.NET, .NET MAUI, NuGet, and more, check out the full Announcing .NET 9 blog post. Explore the future of cloud-native and AI development with .NET 9 and Azure: your toolkit for creating the next generation of intelligent applications.
Azure Managed Redis (Preview): The Next Generation of Redis on Azure, announced at Microsoft Ignite 2024

We were excited to announce the preview of Azure Managed Redis at Microsoft Ignite 2024: a first-party, in-memory database solution designed for developers building the next generation of GenAI applications.