We just released Apache Kafka 0.8.1. There are a couple of interesting things in this release.
The biggest new feature is log compaction. This allows de-duplicating the data in the partitions of a Kafka topic by primary key. This makes Kafka an excellent backend for really massive partitioned event sourcing, database data replication, and other log-centric architectures that model mutable data. This feature also acts as the basis for stateful stream processing in Samza. More details on how you can use this here.
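To make the de-duplication idea concrete, here is a minimal conceptual sketch of compacting a partition by key. It is only a model of the end result, not how the broker implements it: the `compact` function, the `(offset, key, value)` tuple layout, and the sample keys are all made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout for illustration: (offset, key, value).
Record = Tuple[int, str, str]


def compact(log: List[Record]) -> List[Record]:
    """Keep only the most recent record for each key, in offset order."""
    latest: Dict[str, Record] = {}
    for record in log:
        _, key, _ = record
        # A later record with the same key replaces the earlier one.
        latest[key] = record
    # Surviving records keep their original offsets and relative order.
    return sorted(latest.values(), key=lambda r: r[0])


# Sample data, invented for this example.
log = [
    (0, "user-1", "alice@old.example"),
    (1, "user-2", "bob@old.example"),
    (2, "user-1", "alice@new.example"),
    (3, "user-3", "carol@example"),
    (4, "user-2", "bob@new.example"),
]

compacted = compact(log)
for offset, key, value in compacted:
    print(offset, key, value)
```

After compaction, each key appears once with its latest value, so a consumer reading the partition from the beginning can rebuild the current state of every key without replaying each intermediate update.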
This release also includes a big cleanup of the log layer. Most noticeably, it does a much better job of managing fsyncs. For those who like smooth latency graphs under load with no performance tuning, this is a big win.
We also improved a lot of operational activities that previously required bouncing the brokers. We added commands for adding partitions and deleting topics online. Documentation for using this is here.
We also made all per-topic configurations dynamically managed. This means you can change the retention on a topic, or change its segment file size, with a simple command and without bouncing the brokers. These configs are documented here.
We added functionality to automatically balance leadership amongst brokers (previously you had to run a special command to make this happen). You can enable this by setting
auto.leader.rebalance.enable=true. We also added code to have leaders proactively transfer leadership during intentional shutdowns, which is more graceful than the transfer that happens with a hard kill. You can enable this with
controlled.shutdown.enable=true. We will be enabling both by default once we have a little more experience using them.
There are also dozens of bug fixes and minor improvements.
This is an in-place, no-downtime release. You shouldn’t need to do much other than push out the updated code and do a rolling bounce on your servers. However, you may want to glance over the new configs and tools first.
As usual we have been running pre-release versions of this code at LinkedIn for several months now so it should be pretty stable. But if you see anything unexpected please let us know.
So what’s next?
The Kafka committers have been working on a bunch of exciting new stuff. There will probably be a 0.8.2 release in the next month or so with improved consumer offset management (built on top of our new log compaction support) as well as a beta version of a completely rewritten Kafka producer. The final version of the producer and consumer will be in 0.9. If you have thoughts on the API or feature set for the producer or consumer, we have been actively discussing both on the mailing list and would love to hear them.