Optimizing Google Cloud Pub/Sub Messaging with Batching Techniques
Google's Pub/Sub is an effective solution for developing loosely-coupled cloud services, facilitating asynchronous communication. However, utilizing cloud solutions like Pub/Sub presents specific challenges, especially related to network interactions.
Challenges such as latency, network partitions, and potential message loss can seriously hinder performance. These issues may often be obscured by the abstraction levels of client libraries, leading to inconsistent and unpredictable service performance.
Thus, it is essential to grasp and address the effects of network interactions to uphold optimal system performance. One viable approach is to reduce the frequency of these calls, particularly during high-traffic periods.
In this article, we will examine how message batching in Pub/Sub can improve performance, leading to a more dependable, efficient, and cost-effective messaging process.
No Premature Optimization
Before diving into optimization, it is crucial to identify whether a genuine problem exists.
To evaluate the performance of Pub/Sub without batching, I created a Kotlin script, which is accessible on GitHub. This script sends a set number of messages to Pub/Sub, waits for their processing, and logs the time taken.
To better mimic a real-world environment, the script adds a delay of 0 to 5 milliseconds between each publish, generally leaning towards less than 2 milliseconds.
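The core loop of such a benchmark might be sketched as follows. This is a simplified illustration, not the actual script from GitHub; the `GCP`-related setup is assumed to happen elsewhere, and the delay logic is condensed (the real script biases delays toward under 2 ms).

```kotlin
import com.google.cloud.pubsub.v1.Publisher
import com.google.protobuf.ByteString
import com.google.pubsub.v1.PubsubMessage
import kotlin.random.Random
import kotlin.system.measureTimeMillis

// Sketch of a benchmark loop: publish `count` messages with small random
// delays, wait for every publish to complete, and return the elapsed time.
fun benchmark(publisher: Publisher, count: Int): Long = measureTimeMillis {
    val futures = (1..count).map { i ->
        val message = PubsubMessage.newBuilder()
            .setData(ByteString.copyFromUtf8("message-$i"))
            .build()
        Thread.sleep(Random.nextLong(0, 6)) // 0-5 ms pause between publishes
        publisher.publish(message)
    }
    futures.forEach { it.get() } // block until every message is acknowledged
}
```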
These tests have their limitations, influenced by various factors such as my computer's performance and internet connection quality, resulting in variable outcomes. However, they provide a general sense of performance and highlight potential areas for improvement.
Results for No Batching:

| Msgs | Med. Time (ms) | Msgs/sec | Time to Publish 1M |
| ---- | -------------- | -------- | ------------------ |
| 1K   | 4,873          | 205      | ~1h 21m            |
| 10K  | 27,773         | 360      | ~46m 17s           |
| 25K  | 67,342         | 371      | ~44m 54s           |
Performance Metrics Breakdown:

- Msgs (Messages): the number of messages published in each trial.
- Med. Time (ms): the median time taken to publish all messages in a single trial, across 25 attempts.
- Msgs/sec: the rate of successful message publications per second.
- Time to Publish 1M: the estimated time to publish one million messages, extrapolated from the median time and assuming a consistent publishing rate.
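The derived columns follow directly from the measured medians. As a quick sanity check, the extrapolation can be expressed in a few lines (the function names here are my own, not from the benchmark script):

```kotlin
// Throughput: messages published per second, from a trial's median time.
fun msgsPerSec(messages: Int, medianMs: Long): Double =
    messages * 1000.0 / medianMs

// Extrapolated time (in seconds) to publish one million messages.
fun timeToOneMillionSec(messages: Int, medianMs: Long): Double =
    1_000_000.0 / msgsPerSec(messages, medianMs)

fun main() {
    // 1K trial: median 4,873 ms -> ~205 msgs/sec, ~81 minutes for 1M
    println(msgsPerSec(1_000, 4_873).toInt())                  // 205
    println((timeToOneMillionSec(1_000, 4_873) / 60).toInt())  // 81
}
```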
This analysis assumes a steady flow of messages; however, if your system encounters traffic bursts, be prepared for variations in actual results.
Notably, a warming-up phenomenon seems to occur. When the number of messages sent increases (10,000 instead of 1,000), the messages per second show a significant rise. However, the performance difference between 10,000 and 25,000 messages is less pronounced, indicating diminishing returns at higher loads. This observation aligns with Google Pub/Sub’s architecture, which is designed to be horizontally scalable, allowing it to handle more topics, subscriptions, or messages by increasing server instances.
This implies that the system likely allocates additional resources as message loads increase. To account for this, I conducted each experiment 25 times to minimize the warming-up phase's influence.
Maintaining a steady message flow in Pub/Sub ensures better resource allocation compared to sporadic bursts of 1,000 messages, which may result in resource scaling down.
For further insights, refer to the Google Cloud Pub/Sub architecture documentation.
Batching
Setting Up Batching
To send messages in batches, we need to configure our Publisher with suitable BatchSettings. Here's an example setup using Google’s Java client library:
```kotlin
val batchingSettings =
    BatchingSettings.newBuilder()
        .setIsEnabled(true)
        .setDelayThreshold(...)
        .setElementCountThreshold(...)
        .setRequestByteThreshold(...)
        .build()

val publisher =
    Publisher.newBuilder("projects/$GCP_PROJECT/topics/$GCP_TOPIC")
        .setBatchingSettings(batchingSettings)
        .build()
```
Key settings include:

- isEnabled: activates the batching functionality.
- elementCountThreshold: the maximum number of messages allowed in a single batch.
- requestByteThreshold: the maximum batch size in bytes.
- delayThreshold: the maximum wait time before sending an incomplete batch when neither the elementCountThreshold nor the requestByteThreshold has been reached.
These settings are generally consistent across all client libraries, though naming may vary slightly.
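For a concrete example, the thresholds could be filled in like this. The values below are illustrative only, not recommendations; note that the Java client's `setDelayThreshold` takes an `org.threeten.bp.Duration`:

```kotlin
import com.google.api.gax.batching.BatchingSettings
import org.threeten.bp.Duration

// Illustrative values; tune these for your own workload.
val batchingSettings = BatchingSettings.newBuilder()
    .setIsEnabled(true)
    .setDelayThreshold(Duration.ofMillis(10)) // flush an incomplete batch after 10 ms
    .setElementCountThreshold(5L)             // ...or once 5 messages are buffered
    .setRequestByteThreshold(4096L)           // ...or once the batch reaches 4 KiB
    .build()
```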
Important Note for Spring Boot Users with `PubSubTemplate`: It is essential to set all batching properties explicitly in your `application.yaml` file, as no defaults are provided. Omitting any property causes `PubSubTemplate` to silently operate without batching; no error message or warning will indicate this. Here's an example configuration:
```yaml
spring:
  cloud:
    gcp:
      pubsub:
        enabled: true
        publisher:
          batching:
            enabled: true
            element-count-threshold: 10
            delay-threshold-seconds: 1
            request-byte-threshold: 100000
```
Does Batching Help?
Let's rerun the previous experiment with batching enabled, focusing on the 1,000-message scenario.
I performed a grid search to optimize the batch parameters, testing various combinations to identify the most effective settings. Here are the results:
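The grid search itself is straightforward to sketch: enumerate every combination of the three thresholds and keep the one with the lowest trial time. The candidate values and the `runTrial` function below are hypothetical placeholders standing in for the actual benchmark:

```kotlin
import com.google.api.gax.batching.BatchingSettings
import org.threeten.bp.Duration

// Hypothetical grid search over the three batching thresholds.
// `runTrial` publishes the test messages under the given settings and
// returns the median time in milliseconds.
fun gridSearch(runTrial: (BatchingSettings) -> Long): BatchingSettings {
    val delaysMs = listOf(10L, 100L, 1000L)
    val elementCounts = listOf(5L, 10L, 50L)
    val byteThresholds = listOf(1024L, 2048L, 4096L)

    return delaysMs.flatMap { delay ->
        elementCounts.flatMap { count ->
            byteThresholds.map { bytes ->
                BatchingSettings.newBuilder()
                    .setIsEnabled(true)
                    .setDelayThreshold(Duration.ofMillis(delay))
                    .setElementCountThreshold(count)
                    .setRequestByteThreshold(bytes)
                    .build()
            }
        }
    }.minByOrNull { settings -> runTrial(settings) }!! // best = fastest trial
}
```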
Best Performance:

- Settings: delayThreshold 10 ms, elementCountThreshold 5, requestByteThreshold 4096
- Time to Publish 1M: approximately 52 minutes and 24 seconds
- Performance Improvement: about 35.79% faster than the baseline

Worst Performance:

- Settings: delayThreshold 1000 ms, elementCountThreshold 50, requestByteThreshold 2048
- Time to Publish 1M: approximately 1 day, 2 hours, and 43 minutes
- Performance Decline: about 200.47% slower than the baseline
While a more comprehensive grid search may yield better parameters, this level of analysis serves our demonstration needs.
It is vital to carefully manage batch parameter settings, as incorrect configurations can significantly degrade performance, leading to inefficiencies and extended processing times.
The challenge lies in balancing these three parameters to achieve the best trade-off between latency and network traffic. Setting the delay threshold too low may result in frequent dispatches of incomplete batches, increasing requests, while a high delay threshold could lead to idle periods and unnecessary message-sending delays.
Managing Message Flow in Batching
To optimize message batching in Pub/Sub, implementing flow control is crucial, particularly at high message generation rates. Without effective flow control, you risk exhausting memory or publishing stale messages, and you may see DEADLINE_EXCEEDED errors when messages are produced faster than the publisher can send them.
Flow control can be fine-tuned using three main parameters:

1. Buffer size in bytes (`setMaxOutstandingRequestBytes`): the maximum buffer size for batched messages in bytes, which helps manage memory usage.
2. Buffer size in elements (`setMaxOutstandingElementCount`): caps the number of messages in the buffer so it doesn't exceed a specified count.
3. Behavior when limits are exceeded (`setLimitExceededBehavior`): the action taken when a limit is surpassed, with options to ignore the limit, throw an exception, or block until sufficient space becomes available.
These parameters collectively enhance the batching process's efficiency, preventing bottlenecks and ensuring a smooth flow of messages.
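In the Java/Kotlin client, these parameters are attached to the batching settings via `FlowControlSettings`. A minimal sketch, with illustrative values rather than recommendations:

```kotlin
import com.google.api.gax.batching.BatchingSettings
import com.google.api.gax.batching.FlowControlSettings
import com.google.api.gax.batching.FlowController

// Illustrative flow-control limits; tune to your memory budget and throughput.
val flowControlSettings = FlowControlSettings.newBuilder()
    .setMaxOutstandingRequestBytes(10L * 1024 * 1024) // at most ~10 MiB buffered
    .setMaxOutstandingElementCount(1_000L)            // at most 1,000 messages buffered
    .setLimitExceededBehavior(FlowController.LimitExceededBehavior.Block)
    .build()

val batchingSettings = BatchingSettings.newBuilder()
    .setIsEnabled(true)
    .setFlowControlSettings(flowControlSettings)
    .build()
```

Blocking is often the safest default for a steady producer, since it applies backpressure instead of dropping messages or throwing exceptions.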
Conclusion
Batching messages in Pub/Sub can significantly improve application performance, but it requires careful parameter tuning. Before enabling batching, profile your application and rule out other constraints that may be limiting performance.
Finding optimal batching settings in my test scenario was a time-consuming endeavor, and real-world situations can be notably more complex.
Additionally, batching is generally more effective with consistent event streams; it may not perform as well during sudden traffic spikes.