Show HN: Fluvio 38.8x faster than Kafka
11 points by debadyutirc 4 months ago | 2 comments
- lvboudre 4 months ago
Hi, I appreciate the effort you put into running benchmarks and writing a blog post about it.
However, I wanted to share some issues I found with your benchmarking approach that I believe are worth addressing:
1. Testing on a MacBook laptop is not a good idea due to thermal throttling. At some point, the numbers become meaningless.
2. I am not very familiar with Graviton CPUs, and after checking the AWS website, it is not clear to me whether they are virtualized. Since they are labeled as "vCPUs," I assume they are virtualized. Virtualized CPUs are not ideal for benchmarking because they can suffer from CPU steal time and noisy-neighbor effects.
3. The replication factor in Kafka's "Getting Started" guide is set to 1, which is also the case for Fluvio. However, in real-world scenarios, RF=3 is typically used. A more representative benchmark should include RF=3.
4. You mentioned: "Given that Apache Kafka is the standard in distributed streaming, and it’s possible for intelligent builders to extrapolate the comparable RedPanda performance." However, this is not accurate. RedPanda uses a one-thread-per-core model with Direct I/O, which results in significantly better performance than Kafka, so its numbers cannot be extrapolated from Kafka's.
How to Address These Issues:
1. It would be preferable to test on a bare-metal server-grade CPU rather than virtualized hardware, such as i3.metal instances on AWS.
2. Run the benchmark with RF=3 to reflect real-world usage more accurately.
3. It would be more insightful to compare against RedPanda, as both Fluvio and RedPanda use non-garbage-collected programming languages. The goal should be to evaluate how well Fluvio scales with increasing CPU counts.
Cheers.
- debadyutirc 4 months ago
Agree with you 100%. We are working on more elaborate benchmarking on bare-metal instances. This was just an initial run to exercise the benchmarking tool, which is available to all Fluvio users.
We will do a full setup and run benchmarks comparing Kafka, Pulsar, and RedPanda using a real dataset on bare-metal servers soon.