Azure PaaS Blog

[Azure Service Bus] JMS messages getting dead-lettered

davidqiu
Microsoft
Aug 09, 2024

This article discusses a problem where many messages end up in the dead-letter queue (DLQ) when a JMS consumer receives messages from Azure Service Bus queues or topic subscriptions. The messages are dead-lettered because they have reached the maximum delivery count.

 

The root cause is message prefetching, which the Qpid library enables by default. With prefetch on, Qpid fetches messages from Azure Service Bus ahead of time and holds them in a local buffer until the consumer asks for them. Each message is locked when it is fetched, not when the consumer starts processing it. If Qpid prefetches more messages than the consumer can process within the lock duration, the locks on the messages still waiting in the buffer expire before the consumer can acknowledge them. Service Bus then makes those messages available again and increments their delivery count, and once the maximum delivery count is exceeded it moves them to the DLQ.
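To make the effect concrete, here is a rough back-of-the-envelope sketch. It assumes a strictly sequential consumer, a fixed processing time per message, and that every lock starts at the moment the batch is prefetched. The class name, method names, and the 200 ms processing time are illustrative assumptions; the prefetch of 1000 and the 60-second lock duration are used as typical default values and should be checked against your own setup.

```java
import java.time.Duration;

// Illustrative estimate only: how many prefetched messages a sequential
// consumer can settle before their locks expire. All names are hypothetical.
final class PrefetchLockEstimate {

    /** Messages that can be processed within the lock duration, capped at the prefetch size. */
    static long settledBeforeLockExpiry(int prefetch, Duration lockDuration, Duration perMessage) {
        long perMessageMillis = perMessage.toMillis();
        if (perMessageMillis <= 0) {
            throw new IllegalArgumentException("perMessage must be at least 1 ms");
        }
        long capacity = lockDuration.toMillis() / perMessageMillis;
        return Math.min(prefetch, capacity);
    }

    /** Messages left in the local buffer whose locks expire before they are processed. */
    static long expiredInBuffer(int prefetch, Duration lockDuration, Duration perMessage) {
        return prefetch - settledBeforeLockExpiry(prefetch, lockDuration, perMessage);
    }

    public static void main(String[] args) {
        int prefetch = 1000;                          // assumed prefetch size
        Duration lock = Duration.ofSeconds(60);       // assumed lock duration
        Duration perMessage = Duration.ofMillis(200); // assumed processing time

        System.out.println("settled in time: " + settledBeforeLockExpiry(prefetch, lock, perMessage));
        System.out.println("locks expired:   " + expiredInBuffer(prefetch, lock, perMessage));
    }
}
```

With these assumed numbers, roughly 300 messages are settled in time and the other 700 go back to the queue with their delivery count incremented. If the same thing happens on every redelivery, those messages eventually hit the maximum delivery count.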

 

To address this problem, you can either turn off prefetching or reduce the prefetch count. To disable prefetching, set jms.prefetchPolicy.all=0 in the JMS client, for example as an option on the connection URI. With prefetch disabled, the client no longer holds messages in Qpid's local buffer; it pulls a message from Azure Service Bus only when the consumer asks for one. The consumer can then work at its own pace, and as long as a single message can be processed within the lock duration, it can complete each message before the lock expires.
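As a minimal sketch of where this option goes, the helper below appends jms.prefetchPolicy.all to a connection URI string. The class name, the method name, and the example namespace are hypothetical; only the option name comes from the article. The example uses the standard library alone, so it builds the URI but does not open a connection.

```java
// Sketch: build a Qpid JMS connection URI with an explicit prefetch setting.
// The class, method, and namespace host are illustrative placeholders.
final class PrefetchUri {

    // Qpid JMS URI option that sets the prefetch for all consumer types.
    static final String PREFETCH_ALL = "jms.prefetchPolicy.all";

    /** Appends the prefetch option, respecting any query string already on the URI. */
    static String withPrefetch(String baseUri, int prefetch) {
        if (prefetch < 0) {
            throw new IllegalArgumentException("prefetch must be >= 0");
        }
        String separator = baseUri.contains("?") ? "&" : "?";
        return baseUri + separator + PREFETCH_ALL + "=" + prefetch;
    }

    public static void main(String[] args) {
        // Hypothetical namespace; replace with your own.
        String uri = withPrefetch("amqps://example-namespace.servicebus.windows.net", 0);
        System.out.println(uri);
        // The resulting string would be passed as the remote URI to
        // org.apache.qpid.jms.JmsConnectionFactory, which requires the
        // qpid-jms-client dependency and is not exercised here.
    }
}
```

Passing a small positive value instead of 0 keeps some buffering while limiting how many messages sit locked in the client, which may be a reasonable middle ground for consumers that are only occasionally slow.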

 

Further reading: Why is Prefetch not the default option in Microsoft .NET/Java/Python libs?

 

 

Updated Aug 19, 2024
Version 3.0
  • CSharpArtisan
    Copper Contributor

    Hi David. I am not familiar with the JMS service bus connection, but I do know my way around Azure Service Bus reasonably well and I wanted to share a few thoughts with you that might be helpful....

     

    When a client of any sort receives messages from an Azure Service Bus queue, you can specify a "Receive Mode" of either Peek-Lock (which is the default) or Receive-and-Delete. With Peek-Lock, as you have experienced, the receive operation takes out a transient lock on the message, and bad things can happen when this lock expires. You can control the timeout of this lock with the Lock Duration setting on the queue.

    Service Bus caps the lock duration at 5 minutes, so one option is simply to raise this timeout, thereby giving your app more time to process the batch of messages.

    Another thing to be aware of is that when a message lock times out, the message doesn't go to the Dead Letter Queue immediately. Instead, it goes back into the active queue, where it is retried a set number of times. Only after it fails that many times does it go to the Dead Letter Queue. This, too, is configurable, through the Max Delivery Count setting on the queue.

    My worry with this is that not only are your messages timing out, but your app may be churning away, trying them multiple times before they end up in the Dead Letter Queue. Setting Max Delivery Count to 1 would at least prevent the churn. I guess it depends on whether you have legitimate reasons to retry the messages.

    If you really don't need to retry messages, you might then consider using Receive-and-Delete. With this option, messages are deleted from the queue as soon as they are received. There is no lock at all and no need to Complete the message. It is simple and clean. The challenge for you may be that this receiver option is set in the client code, not through configuration, so you'd have to dig around and see whether the JMS Service Bus consumer supports it. Here is a snippet of C# code to show what it would look like if you were writing your own queue reader:

     

    I hope that helps! Please let me know if you have follow-up questions. -Matthew

  • CSharpArtisan, thanks for your comments. You are correct about peek-lock mode, lock duration, and max delivery count. Users are more likely to see messages in the DLQ when prefetch is turned on, especially when the consumer is slow. Many users are not aware that prefetch is enabled by default in Qpid. You can read more about prefetch by following the link in the article.