New Features in Kafka 0.8.2

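We are gradually rolling out kafka-0.8.2.2 in production. According to the release notes there are no major changes; the release consists mainly of bug fixes and feature improvements. The notes below summarize some of the new features described in "What's coming in Apache Kafka 0.8.2", an article Neha Narkhede shared two years ago, supplemented by a read-through of the 0.8.2.x documentation.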

1、New Producer

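The producer no longer distinguishes between synchronous (sync) and asynchronous (async) modes. All requests are sent asynchronously, which improves client efficiency. Each send returns a response object that carries either the offset of the record or an error. Because sends are asynchronous, messages can be delivered to the Kafka brokers in batches, which reduces resource overhead on the server side. All network communication between the new producer and the servers is asynchronous, so in acks=-1 mode, where a request has to wait until all in-sync replicas have finished replicating, the time spent waiting can be reduced substantially.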
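The asynchronous, batched send path described above can be sketched roughly as follows. This is a toy model in Python, not the Kafka client API: `ToyProducer`, `batch_size`, `requests_sent` and the in-memory `log` are all hypothetical names introduced for illustration.

```python
from concurrent.futures import Future


class ToyProducer:
    """Hypothetical stand-in for an async producer; not the Kafka client API."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.pending = []        # (value, future) pairs waiting to be sent
        self.log = []            # stands in for the broker's partition log
        self.requests_sent = 0   # number of simulated network round trips

    def send(self, value):
        """Queue a record and return immediately with a future."""
        future = Future()
        self.pending.append((value, future))
        if len(self.pending) >= self.batch_size:
            self.flush()
        return future

    def flush(self):
        """Send all queued records as one batch and complete their futures."""
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        self.requests_sent += 1
        for value, future in batch:
            if value is None:
                # The response object carries an error instead of an offset.
                future.set_exception(ValueError("record value must not be None"))
            else:
                self.log.append(value)
                future.set_result(len(self.log) - 1)  # offset of the record


producer = ToyProducer(batch_size=3)
futures = [producer.send(v) for v in ["a", "b", "c", "d"]]
producer.flush()  # push out the last, partially filled batch
offsets = [f.result() for f in futures]
print(offsets, producer.requests_sent)
```

Four records reach the simulated broker in two requests, and the caller only blocks when it asks a future for its result.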

2、Delete Topic

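Before 0.8.2, topic deletion in Kafka was buggy. The command is shown below; as its output notes, the deletion only takes effect when delete.topic.enable is set to true on the brokers.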

/apps/svr/kafka/bin/kafka-topics.sh --zookeeper chenqun-zookeeper-001.idc.vip.com:/vdpkafka --delete --topic chenqun
Topic chenqun is marked for deletion.
Note: This will have no impact if delete.topic.enable is not set to true.

3、Offset Management

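Before 0.8.2, consumers periodically committed the offsets of the Kafka messages they had consumed to ZooKeeper. For ZooKeeper every write is expensive, and a ZooKeeper ensemble cannot scale its write capacity.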

Starting with 0.8.2, the offsets committed by consumers can be stored in a compacted topic (__consumer_offsets). Writes to this topic use the strongest durability setting, i.e. acks=-1.

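Each message in __consumer_offsets consists of a key, the triple <consumer group, topic, partition>, and the offset as its value. An in-memory view of the latest offsets is also maintained, so reads are fast.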

With this feature Kafka can checkpoint offsets frequently, even committing an offset after every single consumed message.
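As a rough illustration of why a compacted, keyed topic suits offset storage, the sketch below models the log and its in-memory view in Python. It is not how the broker is implemented; `OffsetStore` and its methods are hypothetical names, and the group name `group-a` is made up for the example.

```python
class OffsetStore:
    """Toy model of a compacted offsets topic plus an in-memory view."""

    def __init__(self):
        self.log = []    # append-only list of (key, offset) commits
        self.view = {}   # key -> latest committed offset, used for fast reads

    def commit(self, group, topic, partition, offset):
        key = (group, topic, partition)
        self.log.append((key, offset))  # every commit is an append
        self.view[key] = offset         # the view always holds the latest value

    def fetch(self, group, topic, partition):
        # Reads never scan the log; they hit the in-memory view.
        return self.view.get((group, topic, partition))

    def compact(self):
        """Keep only the latest entry per key, as log compaction would."""
        latest = {}
        for key, offset in self.log:
            latest[key] = offset
        self.log = list(latest.items())


store = OffsetStore()
for offset in range(1, 6):               # commit after every message
    store.commit("group-a", "chenqun", 0, offset)
store.commit("group-a", "chenqun", 1, 42)

entries_before = len(store.log)
store.compact()
print(entries_before, len(store.log), store.fetch("group-a", "chenqun", 0))
```

Frequent commits only grow the log until the next compaction, after which one entry per <consumer group, topic, partition> key remains.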

4、Automated Leader Rebalancing

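This feature was added experimentally in 0.8.1 and is ready for general use in 0.8.2. Auto rebalancing addresses the uneven distribution of partition leaders across brokers after a broker restarts, which can, for example, leave some brokers with much higher network traffic and load than the others. The main configuration options are: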

auto.leader.rebalance.enable(false), If this is enabled the controller will automatically try to balance leadership for partitions among the brokers by periodically returning leadership to the “preferred” replica for each partition if it is available.

leader.imbalance.per.broker.percentage(10), The percentage of leader imbalance allowed per broker. The controller will rebalance leadership if this ratio goes above the configured value per broker.

leader.imbalance.check.interval.seconds(300), The frequency with which to check for leader imbalance.
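One way to read the per-broker imbalance check is sketched below in Python. It assumes the "preferred" replica is the first replica in a partition's assignment list and that a broker's imbalance is the share of partitions it is preferred for but does not currently lead; the function names and the sample assignment are hypothetical.

```python
from collections import defaultdict


def imbalance_by_broker(assignment, leaders):
    """Return broker -> percentage of its preferred partitions it does not lead.

    assignment: partition -> replica list (first entry = preferred replica)
    leaders:    partition -> broker id of the current leader
    """
    preferred_total = defaultdict(int)
    not_leading = defaultdict(int)
    for partition, replicas in assignment.items():
        preferred = replicas[0]
        preferred_total[preferred] += 1
        if leaders[partition] != preferred:
            not_leading[preferred] += 1
    return {
        broker: 100.0 * not_leading[broker] / total
        for broker, total in preferred_total.items()
    }


def brokers_to_rebalance(assignment, leaders, threshold_percent=10):
    """Brokers whose imbalance exceeds the allowed percentage."""
    ratios = imbalance_by_broker(assignment, leaders)
    return sorted(b for b, ratio in ratios.items() if ratio > threshold_percent)


# Broker 1 was restarted, so it lost leadership of t-0 to broker 2.
assignment = {"t-0": [1, 2, 3], "t-1": [2, 3, 1], "t-2": [3, 1, 2], "t-3": [1, 3, 2]}
leaders = {"t-0": 2, "t-1": 2, "t-2": 3, "t-3": 1}

ratios = imbalance_by_broker(assignment, leaders)
print(ratios[1], brokers_to_rebalance(assignment, leaders))
```

Here broker 1 leads only one of the two partitions it is preferred for, so it exceeds the 10 percent threshold and would be a candidate for having leadership returned to it.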

5、Controlled Shutdown

controlled.shutdown.enable: whether to proactively migrate partition leadership away from a broker when it is shut down.

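The basic idea is that whenever Kafka receives a request to shut down a broker process, it proactively moves the partition leaders on that broker to other live brokers, i.e. a follower replica is promoted to be the new partition leader.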

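If this option is not enabled, the cluster waits until the stopped broker's session times out before the controller elects new partition leaders, and the affected partitions cannot be read or written in the meantime. If the cluster is very large or has many partitions, they can stay unavailable for a fairly long time.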
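The leadership hand-off can be pictured with the simplified Python sketch below. It assumes a new leader is picked from the remaining in-sync replicas of each partition; `controlled_shutdown` and the sample data are hypothetical and leave out the controller protocol entirely.

```python
def controlled_shutdown(broker, leaders, isr):
    """Move leadership off `broker` before it stops.

    leaders: partition -> broker id of the current leader (updated in place)
    isr:     partition -> list of in-sync broker ids (updated in place)
    Returns the partitions that had no other in-sync replica to take over.
    """
    stuck = []
    for partition, leader in leaders.items():
        if leader != broker:
            continue
        candidates = [b for b in isr[partition] if b != broker]
        if candidates:
            leaders[partition] = candidates[0]  # promote a follower
            isr[partition] = candidates         # the leaving broker drops out
        else:
            stuck.append(partition)             # nobody else can take over
    return stuck


leaders = {"t-0": 1, "t-1": 2, "t-2": 1}
isr = {"t-0": [1, 2], "t-1": [2, 3], "t-2": [1]}

stuck = controlled_shutdown(1, leaders, isr)
print(leaders, stuck)
```

Partition t-0 gets a new leader before broker 1 goes away, while t-2 has no other in-sync replica and would still become unavailable.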


6、Stronger Durability Guarantee

There are two main improvements:

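1) Unclean leader election can be disabled, meaning that a replica which is not in the ISR (in-sync replica) list will not be promoted to partition leader. With unclean.leader.election.enable=false, the cluster favors durability over availability: if no other replica is left in the ISR, the partition can be neither read nor written.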

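2) Setting min.insync.replicas (default 1) and having the producer use acks=-1 improves the durability of writes. When the producer uses acks=-1 and the broker finds that the number of replicas in the ISR is smaller than min.insync.replicas, the broker rejects the producer's write request.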

min.insync.replicas, When a producer sets acks to "all" (or "-1"), min.insync.replicas specifies the minimum number of replicas that must acknowledge a write for the write to be considered successful. If this minimum cannot be met, then the producer will raise an exception (either NotEnoughReplicas or NotEnoughReplicasAfterAppend). When used together, min.insync.replicas and acks allow you to enforce greater durability guarantees. A typical scenario would be to create a topic with a replication factor of 3, set min.insync.replicas to 2, and produce with acks of "all". This will ensure that the producer raises an exception if a majority of replicas do not receive a write.
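The interaction between acks and min.insync.replicas can be sketched as a simple admission check. This Python model only illustrates the rule stated above; `append`, `NotEnoughReplicasError` and the list-based log are hypothetical, not broker code.

```python
class NotEnoughReplicasError(Exception):
    """Raised when the ISR is smaller than min.insync.replicas (toy model)."""


def append(log, record, isr, acks, min_insync_replicas=1):
    """Append `record` to `log` and return its offset, or reject the write."""
    # The minimum only applies to producers that asked for acks=-1 ("all").
    if acks == -1 and len(isr) < min_insync_replicas:
        raise NotEnoughReplicasError(
            f"ISR size {len(isr)} is below min.insync.replicas={min_insync_replicas}"
        )
    log.append(record)
    return len(log) - 1


log = []
# Replication factor 3, min.insync.replicas=2, one replica has fallen behind.
first = append(log, "m0", isr=[1, 2], acks=-1, min_insync_replicas=2)

try:
    # A second replica drops out of the ISR: acks=-1 writes are now rejected.
    append(log, "m1", isr=[1], acks=-1, min_insync_replicas=2)
    rejected = False
except NotEnoughReplicasError:
    rejected = True

# A producer with acks=1 is not subject to the check.
second = append(log, "m2", isr=[1], acks=1, min_insync_replicas=2)
print(first, rejected, second)
```

With a replication factor of 3 and a minimum of 2, the topic tolerates one lagging replica for acks=-1 producers and refuses their writes once a second one falls out of the ISR.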

7、Connection Quotas

max.connections.per.ip limits the number of connections that each client IP address may open, which keeps a broker from running out of file descriptors.
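A per-IP limit of this kind amounts to a counter per address. The Python sketch below shows the idea only; `ConnectionQuota` and its methods are hypothetical, and the IP addresses are made up for the example.

```python
from collections import Counter


class ConnectionQuota:
    """Toy per-IP connection limiter."""

    def __init__(self, max_connections_per_ip):
        self.max_connections_per_ip = max_connections_per_ip
        self.open_connections = Counter()  # ip -> currently open connections

    def accept(self, ip):
        """Return True and count the connection if the IP is under its limit."""
        if self.open_connections[ip] >= self.max_connections_per_ip:
            return False
        self.open_connections[ip] += 1
        return True

    def close(self, ip):
        """Release one connection slot for the IP."""
        if self.open_connections[ip] > 0:
            self.open_connections[ip] -= 1


quota = ConnectionQuota(max_connections_per_ip=2)
results = [quota.accept("10.0.0.1") for _ in range(3)]  # third one is refused
other_client = quota.accept("10.0.0.2")                 # other IPs are unaffected
quota.close("10.0.0.1")
after_close = quota.accept("10.0.0.1")                  # a slot is free again
print(results, other_client, after_close)
```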

Compatibility and Upgrading
Upgrading from 0.8.1 to 0.8.2.0
0.8.2.0 is fully compatible with 0.8.1. The upgrade can be done one broker at a time by simply bringing it down, updating the code, and restarting it.

Upgrading from 0.8.0 to 0.8.1
0.8.1 is fully compatible with 0.8. The upgrade can be done one broker at a time by simply bringing it down, updating the code, and restarting it.
