Increase System Responsiveness and Reliability With Queueing
An intermediate message broker makes it easy to offload and retry work
Have you ever added a feature to your product only to find that it degraded the user experience because users now wait for the server to return an answer? Unacceptable delays in your systems can often be solved by separating the collection of what needs to be done from its actual execution. There are many approaches to distributing work in computer systems, and today’s article focuses on using message queues as an intermediate step between publishers and subscribers.
This article is part of a series that will end with a detailed, working example of long-running image processing using a message queue.
Terminology
Responsiveness - how quickly a system provides user feedback.
Reliability - how a system responds to the failure of one of its parts.
Producer / Publisher - sends work onto the queue.
Consumer / Subscriber / Worker - pulls work off and processes it.
Queue - the buffer holding pending work.
Topic - a named channel work is published to (pub/sub).
Broker - the middleware that stores and routes messages (e.g., RabbitMQ, Kafka, SQS).
Message / Job / Task - a single unit of work.
Payload - the actual data carried by a message.
When to Consider a Queue
A queue is a great option when an operation does not have to happen immediately.
Requests to perform work come in faster than they can be processed
  A surge in submissions before a deadline
  Bulk or batch data processing
  Requests from AI agents
Operations that can be slow
  Complex LLM prompts
  Media processing
  Report generation
Operations that can fail and be retried
  Interactions with third party systems
  Downtime in other systems
  Errors in other systems
Core Ideas
Publishers working over HTTP typically follow a flow like the following:
Any large files are stored before a request begins
A publisher submits a request
A message is added to a queue managed by a message broker
The request is accepted, typically with a 202 Accepted HTTP response. Usually the publisher is given some type of ID or token so other parts of the system can check on in-progress work.
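The publisher flow above can be sketched in a few lines. This is a minimal illustration, not a production handler: the standard library `queue.Queue` stands in for a real message broker, and `submit_job`, `work_queue`, and the file path are hypothetical names chosen for the example.

```python
import queue
import uuid

# Stand-in for a real message broker (assumption for this sketch).
work_queue: "queue.Queue[dict]" = queue.Queue()


def submit_job(file_path: str) -> tuple[int, dict]:
    """Enqueue a job and return an HTTP-style (status, body) pair."""
    job_id = str(uuid.uuid4())
    # The large file is assumed to already be in storage; only its path is queued.
    work_queue.put({"job_id": job_id, "file_path": file_path})
    # 202 Accepted: the work has been queued, not completed.
    return 202, {"job_id": job_id}


status, body = submit_job("uploads/photo-123.png")
print(status, body["job_id"])
```

The returned `job_id` is what a status page or polling endpoint would later use to look up the in-progress work.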
Subscribers working over HTTP typically follow a flow like the following:
Subscribers check for new messages
If there are messages, the message broker returns one or more messages to the subscriber and hides those messages from other subscribers for a certain period of time
The subscriber attempts to complete the work in the message, potentially accessing large files associated with the message.
The subscriber tells the message broker to delete the message when the operation is successful. When the operation fails, the message is kept so it can be delivered again after the reservation ends.
The message broker does the following:
Collects messages from publishers
Distributes messages to subscribers
Determines how long a message is reserved after being distributed
Determines retry logic
Determines when a message is deemed impossible to process and what to do with the failed message
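The subscriber and broker responsibilities above can be sketched together with a toy in-memory broker. Everything here is an assumption made for illustration: `InMemoryBroker`, `work_once`, and the `reserve_seconds` parameter are hypothetical names, and a real broker would also persist messages and handle concurrent subscribers safely.

```python
import time
import uuid


class InMemoryBroker:
    """Toy broker: stores messages and reserves them for a fixed period."""

    def __init__(self, reserve_seconds: float = 30.0, clock=time.monotonic):
        self.reserve_seconds = reserve_seconds
        self.clock = clock
        self.messages = {}  # message id -> {"body": ..., "visible_at": ...}

    def publish(self, body) -> str:
        message_id = str(uuid.uuid4())
        self.messages[message_id] = {"body": body, "visible_at": self.clock()}
        return message_id

    def receive(self):
        """Return (id, body) for one visible message and reserve it, or None."""
        now = self.clock()
        for message_id, message in self.messages.items():
            if message["visible_at"] <= now:
                # Hide the message from other subscribers until the reservation ends.
                message["visible_at"] = now + self.reserve_seconds
                return message_id, message["body"]
        return None

    def delete(self, message_id: str) -> None:
        self.messages.pop(message_id, None)


def work_once(broker: InMemoryBroker, handler) -> bool:
    """Process at most one message; return True if one was completed."""
    received = broker.receive()
    if received is None:
        return False
    message_id, body = received
    try:
        handler(body)
    except Exception:
        # Keep the message; it becomes visible again when the reservation ends.
        return False
    broker.delete(message_id)
    return True
```

Because a failed message is simply left in place, the retry happens on its own once the reservation period passes, which is why the length of that period is a broker-level decision.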
HTTP is not the only protocol that can be used in queueing. Socket approaches that combine bidirectional traffic with custom protocols are common in mature message brokers.
Key Decisions
Payload Contents
Many message brokers cap message payload size in order to maintain high performance. If all the information required to process a message fits in the message itself, subscribers need fewer round trips to retrieve data. For content like media and large data sets, the content must be stored in file storage, and the message payload contains only a path or pointer to the data needed to complete the request.
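One way to picture this choice is a small helper that inlines small content and falls back to a pointer for large content. This is a sketch under stated assumptions: `MAX_INLINE_BYTES` is an arbitrary threshold rather than any broker's real limit, and `save_to_storage` is a hypothetical callback that writes to file storage and returns a path.

```python
import base64
import json

# Hypothetical threshold; check your broker's documented payload cap.
MAX_INLINE_BYTES = 64 * 1024


def build_message(job_id: str, content: bytes, save_to_storage) -> str:
    """Return a JSON payload that carries the content inline or by pointer."""
    if len(content) <= MAX_INLINE_BYTES:
        # Small content travels in the message, so the subscriber needs no extra fetch.
        body = {"job_id": job_id, "inline": base64.b64encode(content).decode("ascii")}
    else:
        # Large content goes to storage; the message carries only its location.
        body = {"job_id": job_id, "pointer": save_to_storage(job_id, content)}
    return json.dumps(body)
```

Base64 encoding grows the content by about a third, so the inline threshold should sit comfortably below the broker's actual cap.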
Idempotent Operations
The nature of distributed systems means that multiple subscribers could potentially receive the same message, for example, if a second request arrives before a lease on the message is established. System authors need to think about what happens if the same message is processed more than once. For some systems this may not be a problem because operations are idempotent. An idempotent operation is one that produces the same result or system state no matter how many times it is applied. For example, liking a social media post multiple times still produces only one like for that person.
Other operations are difficult to make idempotent, such as physical transformations, certain financial transactions, or hardware operations. If a message must be processed exactly once, additional effort is needed at the message broker level to ensure exactly-once delivery of messages. These extra measures typically add overhead in exchange for reliability.
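The difference can be illustrated with two small handlers. The first mirrors the like example and is idempotent by construction. The second is a consumer-side guard that skips message IDs it has already seen, which is a different technique from broker-level exactly-once delivery. All names (`apply_like`, `handle_once`, the message fields) are hypothetical, and the in-memory sets stand in for a durable, shared store.

```python
# Likes per post, stored as sets so repeated likes from one user collapse into one.
likes: dict[str, set[str]] = {}

# IDs of messages that have already been handled (assumed durable in a real system).
processed_ids: set[str] = set()


def apply_like(message: dict) -> int:
    """Record a like; applying the same message again leaves the same state."""
    post_likes = likes.setdefault(message["post_id"], set())
    post_likes.add(message["user_id"])
    return len(post_likes)


def handle_once(message: dict, handler) -> bool:
    """Run handler unless this message ID was already processed."""
    if message["message_id"] in processed_ids:
        return False  # duplicate delivery, skip the side effect
    handler(message)
    # Recorded only after success, so a failed attempt can still be retried.
    processed_ids.add(message["message_id"])
    return True
```

In a real deployment the check and the record would need to be atomic with the side effect, otherwise two subscribers could both pass the check before either records the ID.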
Retry Conditions
Some interruptions in a system are temporary. Calls to a third-party service can receive a rate limit response with a retry time. Individual pieces of a system can experience temporary downtime that causes a failure at one point in time but success later. The message broker needs to be configured with a policy for when and how many times to retry distributing a message. An operations team might want to know if many retries are happening so they can tune infrastructure and prevent internal systems from accidentally triggering a denial of service against other systems.
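A retry policy of this kind is often expressed as a small function. The sketch below uses exponential backoff, which is one common choice and not something the article prescribes. `MAX_ATTEMPTS`, `retry_delay`, and its parameters are hypothetical, and `retry_after` represents a wait time supplied by a rate-limited third party.

```python
from typing import Optional

# Hypothetical limit on how many times a message is attempted.
MAX_ATTEMPTS = 5


def retry_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base_seconds: float = 1.0,
    cap_seconds: float = 60.0,
) -> Optional[float]:
    """Return seconds to wait after failed attempt number `attempt`, or None to stop."""
    if attempt >= MAX_ATTEMPTS:
        return None  # give up; the caller decides what happens to the message
    if retry_after is not None:
        # Honor the wait time the other system asked for.
        return retry_after
    # Exponential backoff: 1s, 2s, 4s, ... limited by cap_seconds.
    return min(cap_seconds, base_seconds * 2 ** (attempt - 1))
```

Growing the delay between attempts is one way to keep a struggling downstream system from being hit with a burst of retries.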
Dead Letter Handling
Other types of interruptions are long lived: an internal system with a deployment problem, an expired subscription, or a third-party outage. Messages that keep failing in this way can be sent to a separate queue called a dead letter queue. Operations teams may need to be alerted when fatal errors are happening during message processing. The team may decide to reprocess the dead letter queue when service is restored or to handle the messages manually.
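Dead letter handling can be sketched with two in-memory queues. This is an illustration only: `MAX_RECEIVES`, `drain`, and `redrive` are hypothetical names, the `deque` objects stand in for broker-managed queues, and a real broker would typically apply this policy itself instead of leaving it to the subscriber.

```python
from collections import deque

# Hypothetical limit on deliveries before a message is considered dead.
MAX_RECEIVES = 3


def drain(main_queue: deque, dead_letters: deque, handler) -> None:
    """Process every message, moving repeated failures to the dead letter queue."""
    while main_queue:
        message = main_queue.popleft()
        message["receive_count"] = message.get("receive_count", 0) + 1
        try:
            handler(message["body"])
        except Exception as error:
            if message["receive_count"] >= MAX_RECEIVES:
                # Keep the failure reason so operators can investigate.
                message["last_error"] = repr(error)
                dead_letters.append(message)
            else:
                main_queue.append(message)  # try again later


def redrive(dead_letters: deque, main_queue: deque) -> None:
    """Move dead letters back to the main queue once service is restored."""
    while dead_letters:
        message = dead_letters.popleft()
        message["receive_count"] = 0
        main_queue.append(message)
```

The length of the dead letter queue is a natural metric to alert on, since any growth means messages are failing in a way retries did not fix.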
Conclusion
We’ll talk about more topics related to queues in future articles. What workloads do you have that could benefit from introducing a queue?


