Channel: Kafka Timeline

Message with Custom Partition Logic

Hi Kafka Dev Team,

We have to aggregate events (count) per DC and across DCs for one of our
topics. We have the standard LinkedIn data pipeline: producers --> local
brokers --> MM --> central brokers.

So I would like to know how MM handles messages when custom partitioning
logic is used as below, and when the number of partitions in the target DC
is the SAME as vs. different from the source DC.

We have keyed messages and custom partitioning logic (hash(key) % number of
partitions of the source topic), so that similar events hash to the same
partition, where we count them. But when the same event is mirrored by MM to
the target DC, will it go to the same partition even though the number of
partitions is different in the target DC (i.e., will MM use hash(message key)
% number of partitions)?
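
For context, this is a minimal sketch of the hash(key) % numPartitions logic
described above, written against the 0.8 producer's kafka.producer.Partitioner
interface (class name and details are illustrative, not our exact code):

import kafka.producer.Partitioner;
import kafka.utils.VerifiableProperties;

// Illustrative sketch of hash-based partitioning on the message key.
public class KeyHashPartitioner implements Partitioner {

    // The producer instantiates partitioners reflectively with its properties.
    public KeyHashPartitioner(VerifiableProperties props) {
    }

    @Override
    public int partition(Object key, int numPartitions) {
        // numPartitions is the partition count of the topic in the *local*
        // cluster, so the same key can land on a different partition in a
        // cluster whose topic has a different partition count.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }
}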

According to this reference, I do not have a way to configure this or to
control which partitioning logic MM uses when mirroring data:

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330

Thanks,

Bhavesh

Issue with 240 topics per day

Folks,
Is there any potential issue with creating 240 topics every day? Although
the retention of each topic is set to 2 days, I am a little concerned that,
since right now there is no delete-topic API, ZooKeeper might become
overloaded.
Thanks,
Chen

Consumer Parallelism

Hi,

We are using the following method on ConsumerConnector to get multiple
streams per topic, and we have multiple partitions per topic. It looks like
only one of the runnables is active over a relatively long time period. Is
there anything we could possibly have missed?

public <K,V> Map<String, List<KafkaStream<K,V>>>
createMessageStreams(Map<String, Integer> topicCountMap, Decoder<K>
keyDecoder, Decoder<V> valueDecoder);

Then we loop through the streams and create multiple runnables to consume
the data.

for (KafkaStream<Object, Object> kafkaStream : streams) {
    ConsumerRunnable consumerRunnable = runnableBuilder.get()
            .setMessageConsumer(consumer)
            .setKafkaStream(kafkaStream)
            .build();

    executor.submit(consumerRunnable);
}
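
Each runnable essentially just iterates its stream, roughly along these lines
(a simplified sketch, not the exact ConsumerRunnable implementation; the class
name here is made up):

import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;

class StreamWorker implements Runnable {
    private final KafkaStream<Object, Object> stream;

    StreamWorker(KafkaStream<Object, Object> stream) {
        this.stream = stream;
    }

    public void run() {
        // Each stream gets its own iterator, consumed by exactly one thread.
        ConsumerIterator<Object, Object> it = stream.iterator();
        while (it.hasNext()) {
            Object message = it.next().message();
            // hand the message off to the real processing logic here
        }
    }
}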

Best Regards,
Mingtao

kafka TestUtils createBrokerConfig issue

Trying to write a unit test case for Kafka, I am stuck with a strange
createBrokerConfig issue.
On TestUtils.createBrokerConfigs(1) I get a compilation error:

The method createBrokerConfigs(int, boolean) in the type TestUtils is
not applicable for the arguments (int)

When I looked into the Scala source code, it contains only two methods:

createBrokerConfigs(int) //
https://github.com/apache/kafka/blob/0.8.1/core/src/test/scala/unit/kafka/utils/TestUtils.scala#L125
createBrokerConfigs(int, int)
//https://github.com/apache/kafka/blob/0.8.1/core/src/test/scala/unit/kafka/utils/TestUtils.scala#L137

I don't understand where this is coming from.

Using the kafka dissector in wireshark/tshark 1.12

I'd seen references to there being a Kafka protocol dissector built into wireshark/tshark 1.12, but what I could find on that was a bit light on the specifics as to how to get it to do anything -- at least for someone (like me) who might use tcpdump a lot but who doesn't use tshark a lot.

I got this working, so I figured I'd post a few pointers here on the off-chance that they save someone else a bit of time.

Note that I'm using tshark, not wireshark; this might be easier and/or different in wireshark, but I don't feel like moving many gigabytes of data to a place where I can use wireshark. (-:

If you're reading traffic live, you'll want to do something like this:

tshark -V -i eth1 -o 'kafka.tcp.port:9092' -d tcp.port=9092,kafka -f 'dst port 9092' -Y (kafka options)

For example, if you want to see output only for ProduceRequests and ProduceResponses, and only for the topic "mytopic", you can do:

tshark -V -i eth1 -o 'kafka.tcp.port:9092' -d tcp.port=9092,kafka -f 'dst port 9092' -Y 'kafka.topic_name==mytopic & kafka.request_key==0'

You can get a complete list of Kafka-related fields by doing:

tshark -G fields | grep -i kafka

There is a very significant downside to processing packets live: tshark uses dumpcap to generate the actual packets, and unless I'm missing some obscure tshark option (which is possible!) it won't toss old data. So if you run this for a few hours, you'll end up with a ginormous file.

By default (under Linux, at least) tshark is going to put that file in /tmp, so if your /tmp is small and/or a tmpfs that can make things a little exciting. You can get around that by doing:

(export TMPDIR=/big/damn/filesystem ; tshark bla bla bla)

which I figure given typical Kafka data volumes is probably pretty important to know, and which doesn't seem to be documented in the tshark man pages. It is at least not all that hard to search for.

In theory, you can use the tshark "-b" option to specify a ring buffer of files, even for real-time processing, though:

* adding -b anything (e.g., "-b files:1 -b filesize:1024") seems to want to force you to use -w (filename)

* just adding -b and -w to the invocation above gets a warning about display filters not being supported when capturing and saving packets

* changing -Y to -2 -R and/or adding -P doesn't seem to help

(though again someone with more tshark experience might know the magic combination of arguments to get this to do what it's told).

So instead, you can capture packets somewhere, e.g.:

tcpdump -n -s 0 -w /var/tmp/kafka.tcpd -i eth1 'port 9092'

and then decode them later:

tshark -V -r /var/tmp/kafka.tcpd -o 'kafka.tcp.port:9092' -d tcp.port=9092,kafka -R 'kafka.topic_name==mytopic & kafka.request_key==0' -2

Anyway, if you're seeing protocol-related weirdness, hopefully this will be at least of some help to you.

-Steve
(Yes, the email address is a joke. Just not on you! It does work.)

Blocking Recursive parsing from kafka.consumer.TopicCount$.constructTopicCount

Hi All,
We have a typical cluster of 3 Kafka instances backed by 3 ZooKeeper instances (Kafka 0.8.1.1, Scala 2.10.3, Java 1.7.0_65). On the consumer end, when some of our consumers were being recycled, we found a troubling recursion that was holding a busy lock and blocking our consumer thread pool. I'd appreciate it if anyone could provide insight on how to mitigate the blocking logic. On the ZooKeeper util front (ZkUtils.scala), is there a possibility of switching to a better JSON parser, based on this finding: http://engineering.ooyala.com/blog/comparing-scala-json-libraries?
Thanks, jsh
------------------------- Recursive BLOCKING thread -------------------------

"Sa863f22b1e5hjh6788991800900b34545c_profile-a-prod1-s-140789080845312-c397945e8_watcher_executor"
  prio=10 tid=0x00007f24dc285800 nid=0xda9 runnable [0x00007f249e40b000]
  java.lang.Thread.State: RUNNABLE
    at scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.p$7(Parsers.scala:722)
    at scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.continue$1(Parsers.scala:726)
    at scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.apply(Parsers.scala:737)
    at scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.apply(Parsers.scala:721)
    [... several hundred further scala.util.parsing.combinator.Parsers frames
     (append/map/flatMap/Scanners combinators) elided for readability ...]
    at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
    at scala.util.parsing.json.JSON$.parseRaw(JSON.scala:54)
    at scala.util.parsing.json.JSON$.parseFull(JSON.scala:68)
    at kafka.utils.Json$.liftedTree1$1(Json.scala:37)
    at kafka.utils.Json$.parseFull(Json.scala:36)
    - locked <0x00000000c5a7cdd8> (a java.lang.Object)
    at kafka.consumer.TopicCount$.constructTopicCount(TopicCount.scala:56)
    at kafka.utils.ZkUtils$$anonfun$getConsumersPerTopic$1.apply(ZkUtils.scala:678)
    at kafka.utils.ZkUtils$$anonfun$getConsumersPerTopic$1.apply(ZkUtils.scala:677)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at kafka.utils.ZkUtils$.getConsumersPerTopic(ZkUtils.scala:677)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(ZookeeperConsumerConnector.scala:437)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcVI$sp(ZookeeperConsumerConnector.scala:408)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:402)
    - locked <0x00000000d3a7e1d0> (a java.lang.Object)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:355)

------------- many BLOCKED threads had this stack trace -------------

"application-my-context-105" prio=10 tid=0x00007f24e45aa800 nid=0x3494 waiting for monitor entry [0x00007f24afe26000]
  java.lang.Thread.State: BLOCKED (on object monitor)
    at kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:161)
    - waiting to lock <0x00000000d3a7e1d0> (a java.lang.Object)
    at kafka.javaapi.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:110)
    at com.custom1.Consumer.cancel(Consumer.java:312)
    at com.custom1.Consumer.run(Consumer.java:302)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:42)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Correct way to handle ConsumerTimeoutException

Folks,
I am using consumer.timeout.ms to force a consumer to jump out of the hasNext
call, which will throw ConsumerTimeoutException. It seems that upon receiving
this exception, the consumer is no longer usable and I need to call
.shutdown() and recreate it:

try {
    // ... consume from the stream ...
} catch (ConsumerTimeoutException ex) {
    logger.info("consumer timeout, we consider the topic is drained");
    this.consumer.shutdown();
    this.consumer = kafka.consumer.Consumer
            .createJavaConsumerConnector(new ConsumerConfig(this.consumerProperties));
}

Is this the expected behavior? I call

this.consumer = kafka.consumer.Consumer
        .createJavaConsumerConnector(new ConsumerConfig(this.consumerProperties));

in the thread initialization phase, and hope to reuse it upon
ConsumerTimeoutException.
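
For completeness, here is a minimal sketch of the whole drain-then-recreate
pattern I'm currently using (process() is just a placeholder for our handling
logic):

ConsumerIterator<byte[], byte[]> it = stream.iterator();
try {
    while (it.hasNext()) {
        process(it.next().message()); // placeholder for the real processing
    }
} catch (ConsumerTimeoutException ex) {
    logger.info("consumer timeout, we consider the topic is drained");
    this.consumer.shutdown();
    // recreate the connector; streams also have to be recreated from it
    this.consumer = kafka.consumer.Consumer
            .createJavaConsumerConnector(new ConsumerConfig(this.consumerProperties));
}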

Thanks,

Chen

Consuming messages from Kafka and pushing on to a JMS queue

Hi,

We have an application that currently uses Camel to forward a JMS message
from HornetQ to multiple consumers (JMS queues).

The one challenge we have is handling failure of one of the consumers.
Kafka seems to provide a solution here by allowing different consumers to
keep track of their own offset.

However, we'd prefer to ultimately "push" the messages to each endpoint and
not use Kafka's "pull" model, as it requires implementing Kafka's Consumer
API. One approach could be to write intermediary consumers which forward each
message on to the appropriate JMS queue. Is this a good approach, or is there
a better way to do this (e.g. using Apache Storm)?

Thanks

Andrew

consumer read from specific partition

Hi,

Suppose I have N partitions. I would like to have X different consumer
threads (X < N) read from a specified set of partitions. How can I achieve
this?
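
I assume this would require something like the SimpleConsumer API, along
these lines (rough sketch only; offset management and error handling are
omitted, and the host, port, sizes, and topic name are made up):

import kafka.api.FetchRequest;
import kafka.api.FetchRequestBuilder;
import kafka.javaapi.FetchResponse;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.message.MessageAndOffset;

public class PartitionReader {
    // Rough sketch: one thread fetching from one explicitly chosen partition.
    public static void readOnce(String topic, int partition, long startOffset) {
        SimpleConsumer consumer = new SimpleConsumer(
                "broker1.example.com", 9092, 100000, 64 * 1024, "partition-reader");
        FetchRequest req = new FetchRequestBuilder()
                .clientId("partition-reader")
                .addFetch(topic, partition, startOffset, 100000)
                .build();
        FetchResponse resp = consumer.fetch(req);
        for (MessageAndOffset mo : resp.messageSet(topic, partition)) {
            // handle mo.message(); remember mo.nextOffset() for the next fetch
        }
        consumer.close();
    }
}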

Thanks,

Josh

How many threads should I use per topic

Hey, Guys,
I am using the high-level consumer. I have a daemon process that checks the
lag for a topic.
Suppose I have a topic with 5 partitions, and partitions 0 and 1 have a lag
of 0, while the other 3 all have lag. In this case, should I start 3 threads
or 5 threads to read from this topic again to achieve the best performance?
I am currently using 3 threads in this case, but it seems that each thread
still first tries to get hold of partitions 0 and 1 (which seems unnecessary
in my case).

Another question is that I am currently using a signal thread to spawn
different threads to read from Kafka. So if a topic has 5 partitions, 5
signals will be sent, and 5 different threads will start polling the topic
in the following manner:

Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
topicCountMap.put(kafkaTopic, new Integer(1));

Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer
        .createMessageStreams(topicCountMap);
List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(kafkaTopic);
KafkaStream<byte[], byte[]> stream = streams.get(0);
ConsumerIterator<byte[], byte[]> it = stream.iterator();

As you can see, I specify Integer(1) to ensure there is only one stream in
the polling thread.

But in my testing, I am using:

Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
topicCountMap.put(topic, new Integer(numOfPartitions));

Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer
        .createMessageStreams(topicCountMap);
List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);

executor = Executors.newFixedThreadPool(numOfPartitions);
for (final KafkaStream stream : streams) {
    executor.submit(new ConsumerTest(stream, threadNumber, this.targetTopic));
    threadNumber++;
}

Are these two methods fundamentally the same, or is the latter one
preferred?

Thanks much!

Chen

erlang client for 0.8?

Is anyone aware of a 0.8-compatible Erlang client? Thanks.

How to perform a controlled shutdown for rolling bounce?

I'm running 0.8.1 and am unable to do a controlled shutdown as part of a
rolling bounce.

Is this the primary reference for this task?

https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-1.ControlledShutdown

I've set the config to enable controlled shutdown.

controlled.shutdown.enable=true
controlled.shutdown.max.retries=3
controlled.shutdown.retry.backoff.ms=5000

Before shutting down the first broker, topics looks like:

Topic:events PartitionCount:2 ReplicationFactor:1 Configs:
Topic: events Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: events Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic:failure PartitionCount:2 ReplicationFactor:1 Configs:
Topic: failure Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: failure Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic:retry PartitionCount:2 ReplicationFactor:1 Configs:
Topic: retry Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: retry Partition: 1 Leader: 3 Replicas: 3 Isr: 3

I then executed the bin/kafka-server-stop.sh program.

After that, the topics look like:

Topic:events PartitionCount:2 ReplicationFactor:1 Configs:
Topic: events Partition: 0 Leader: -1 Replicas: 1 Isr:
Topic: events Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic:failure PartitionCount:2 ReplicationFactor:1 Configs:
Topic: failure Partition: 0 Leader: -1 Replicas: 1 Isr:
Topic: failure Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic:retry PartitionCount:2 ReplicationFactor:1 Configs:
Topic: retry Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: retry Partition: 1 Leader: 3 Replicas: 3 Isr: 3

What do the -1 for Leader and the blank Isr indicate? Do I need to run
something else for leader election to occur? I thought that was automatic
with controlled shutdown enabled. Is there a different shutdown command to
issue?

Thanks!
Ryan

Writing to Kafka

Hi,

I started using Kafka with Samza. I'm trying to run a test that is supposed to create messages and write them to a Kafka topic.
In my test I start by writing a small number of messages, and the rate should grow to 33,200/sec.

1. With this message rate, can one broker handle the load, or should I use more than one?
2. If I use more than one broker, should the producer be given a list of brokers, or can it point to just one broker while messages still end up on the other brokers? (See the sketch below.)
3. I have a Java producer that just creates numbers and sends them to Kafka; can Kafka handle many producers writing to the same topic at once?
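
For reference, here is roughly what my test producer looks like (a simplified
sketch; the broker hostnames and topic name are placeholders):

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class NumberProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Brokers listed here are only used to bootstrap metadata; the list
        // does not have to contain every broker in the cluster.
        props.put("metadata.broker.list", "broker1:9092,broker2:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "1");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
        for (int i = 0; i < 100000; i++) {
            producer.send(new KeyedMessage<String, String>("numbers", Integer.toString(i)));
        }
        producer.close();
    }
}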

Thanks,

Consumer sensitive expiration of topic

Hi,

I'm playing around with Kafka with the idea of implementing a general-purpose message exchanger for a distributed application with high throughput requirements (multiple hundred thousand messages per sec).

In this context, I would like to be able to use a topic as a form of private mailbox for a single consumer group. In that situation, once the single consumer group has committed its offset on its private topic, the messages there won't be used anymore and can be safely discarded. Therefore, I was wondering whether you see a way (in the current release or in the future) to have a topic whose expiration policy is based on consumer offsets.

Thanks,

JMSXGroupID clustered message grouping

Hi,

When using multiple partitions and consumers, can Kafka ensure that JMS
messages with the same JMSXGroupID are handled by the same consumer? Or
does this have to be implemented when writing the Producers and Consumers?

For reference, this is in comparison to the clustered message grouping
feature provided by HornetQ:
http://docs.jboss.org/hornetq/2.4.0.Final/docs/user-manual/html/message-grouping.html
"Message groups are useful when you want all messages for a certain value
of the property to be processed serially by the same consumer."
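
To make the comparison concrete, my understanding is that the closest
Kafka-side equivalent would be keying each message with the group id on the
producer side, so the default hash partitioner keeps every message of a group
on one partition (and hence, within a consumer group, with one consumer).
Something like the following sketch (topic and names are made up):

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;

public class GroupedSender {
    // Sketch: use the JMSXGroupID value as the Kafka message key so the
    // default hash partitioner keeps each group on a single partition.
    static void send(Producer<String, String> producer, String groupId, String payload) {
        producer.send(new KeyedMessage<String, String>("orders", groupId, payload));
    }
}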

Thanks

Andrew

How try out the new producer?

Hi,

I want to try out the new producer API
(org.apache.kafka.clients.producer.KafkaProducer) but found that it's not
in the published jar.

What's the best way to get it? Build from source from the 0.8.1.1 tag?
Are there any flags I need to set to include the new producer in the jar?

Thanks,

Roger

Embedded Kafka/Zookeeper for unit testing

Hi all,

If anyone is searching for up-to-date examples of embedded
Kafka/ZooKeeper servers, e.g. for unit testing, they may find this
short blog post useful:
http://pannoniancoder.blogspot.com/2014/08/embedded-kafka-and-zookeeper-for-unit.html

Cheers,
Vjeran

ZkClient bug can bring down broker/consumer on zookeeper push in EC2 environment

Recently, we found a serious ZkClient bug (actually an Apache ZooKeeper
client bug) which can bring down brokers and consumers on a ZooKeeper push.

We're running Kafka and ZooKeeper in an AWS EC2 environment. ZooKeeper
instances are bound to an EIP to give each instance a static hostname, which
means that even if an EC2 instance is terminated and replaced with a new
one, it will keep the same hostname, but the private IP bound to that
hostname can change.

The scenario is: if we do a rolling push of all ZooKeeper server instances,
terminating each one and waiting until the new instance joins the quorum,
one by one, then ZkClient will eventually try to connect to the old IP
addresses, which no longer exist, because of DNS caching on the Apache
ZooKeeper client side; please refer to
https://issues.apache.org/jira/browse/ZOOKEEPER-338

So we need to restart Kafka brokers and consumers to refresh the DNS cache.
To solve this problem, I sent the following pull request to ZkClient:
https://github.com/sgroschupf/zkclient/pull/26

Please review the above PR. If a new version of ZkClient with the fix is not
released in time for the Kafka 0.8.2 release, I'd like Kafka to ship an
internally built ZkClient with the fix. I would really appreciate it.

Thank you
Best, Jae
