Channel: Kafka Timeline

Using the kafka dissector in wireshark/tshark 1.12

I'd seen references to a Kafka protocol dissector built into wireshark/tshark 1.12, but what I could find on it was a bit light on the specifics of how to get it to do anything -- at least for someone (like me) who uses tcpdump a lot but doesn't use tshark much.

I got this working, so I figured I'd post a few pointers here on the off-chance that they save someone else a bit of time.

Note that I'm using tshark, not wireshark; this might be easier and/or different in wireshark, but I don't feel like moving many gigabytes of data to a place where I can use wireshark. (-:

If you're reading traffic live, you'll want to do something like this:

tshark -V -i eth1 -o 'kafka.tcp.port:9092' -d tcp.port=9092,kafka -f 'dst port 9092' -Y '<kafka display filter>'

For example, if you want to see output only for ProduceRequests and ProduceResponses, and only for the topic "mytopic", you can do:

tshark -V -i eth1 -o 'kafka.tcp.port:9092' -d tcp.port=9092,kafka -f 'dst port 9092' -Y 'kafka.topic_name==mytopic && kafka.request_key==0'

You can get a complete list of Kafka-related fields by doing:

tshark -G fields | grep -i kafka

There is a very significant downside to processing packets live: tshark uses dumpcap to do the actual packet capture, and unless I'm missing some obscure tshark option (which is possible!) it won't toss old data. So if you run this for a few hours, you'll end up with a ginormous capture file.

By default (under Linux, at least) tshark is going to put that file in /tmp, so if your /tmp is small and/or a tmpfs, that can make things a little exciting. You can get around that by doing:

(export TMPDIR=/big/damn/filesystem ; tshark bla bla bla)

which I figure given typical Kafka data volumes is probably pretty important to know, and which doesn't seem to be documented in the tshark man pages. It is at least not all that hard to search for.

In theory, you can use the tshark "-b" option to specify a ring buffer of files, even for real-time processing, though:

* adding -b anything (e.g., "-b files:1 -b filesize:1024") seems to want to force you to use -w (filename)

* just adding -b and -w to the invocation above gets a warning about display filters not being supported when capturing and saving packets

* changing -Y to -2 -R and/or adding -P doesn't seem to help

(though again someone with more tshark experience might know the magic combination of arguments to get this to do what it's told).

So instead, you can capture packets somewhere, e.g.:

tcpdump -n -s 0 -w /var/tmp/kafka.tcpd -i eth1 'port 9092'

and then decode them later:

tshark -V -r /var/tmp/kafka.tcpd -o 'kafka.tcp.port:9092' -d tcp.port=9092,kafka -R 'kafka.topic_name==mytopic && kafka.request_key==0' -2

Anyway, if you're seeing protocol-related weirdness, hopefully this will be at least of some help to you.

-Steve
(Yes, the email address is a joke. Just not on you! It does work.)

Blocking Recursive parsing from kafka.consumer.TopicCount$.constructTopicCount

Hi All,
We have a typical cluster of 3 Kafka instances backed by 3 ZooKeeper instances (Kafka version 0.8.1.1, Scala version 2.10.3, Java version 1.7.0_65). On the consumer end, when some of our consumers were getting recycled, we found a troubling recursion that was holding a busy lock and blocking our consumer thread pool. I'd appreciate it if anyone could provide insight on how to mitigate the blocking logic. On the ZooKeeper util front (ZkUtils.scala), is there a possibility of switching to a better JSON parser, based on this finding: http://engineering.ooyala.com/blog/comparing-scala-json-libraries ?
Thanks,
jsh
------------------------- Recursive BLOCKING thread -------------------------

"Sa863f22b1e5hjh6788991800900b34545c_profile-a-prod1-s-140789080845312-c397945e8_watcher_executor" prio=10 tid=0x00007f24dc285800 nid=0xda9 runnable [0x00007f249e40b000]
   java.lang.Thread.State: RUNNABLE
at scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.p$7(Parsers.scala:722)
at scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.continue$1(Parsers.scala:726)
at scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.apply(Parsers.scala:737)
at scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.apply(Parsers.scala:721)
[... several hundred recursive scala.util.parsing.combinator frames elided: Parsers$$anon$3.apply (Parsers.scala:222), Parsers$Parser$$anonfun$append$1.apply (254), Parsers$Parser$$anonfun$map$1.apply (242), Parsers$Parser$$anonfun$flatMap$1.apply (239), Parsers$Success.flatMapWithNext (142), rep1 (721-737), acceptIf (606-608), and Scanners$Scanner frames (Scanners.scala:44-60), repeating ...]
at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
at scala.util.parsing.json.JSON$.parseRaw(JSON.scala:54)
at scala.util.parsing.json.JSON$.parseFull(JSON.scala:68)
at kafka.utils.Json$.liftedTree1$1(Json.scala:37)
at kafka.utils.Json$.parseFull(Json.scala:36)
- locked <0x00000000c5a7cdd8> (a java.lang.Object)
at kafka.consumer.TopicCount$.constructTopicCount(TopicCount.scala:56)
at kafka.utils.ZkUtils$$anonfun$getConsumersPerTopic$1.apply(ZkUtils.scala:678)
at kafka.utils.ZkUtils$$anonfun$getConsumersPerTopic$1.apply(ZkUtils.scala:677)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at kafka.utils.ZkUtils$.getConsumersPerTopic(ZkUtils.scala:677)
at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(ZookeeperConsumerConnector.scala:437)
at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcVI$sp(ZookeeperConsumerConnector.scala:408)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:402)
- locked <0x00000000d3a7e1d0> (a java.lang.Object)
at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:355)

------------------------- many BLOCKED threads had this stack trace -------------------------

"application-my-context-105" prio=10 tid=0x00007f24e45aa800 nid=0x3494 waiting for monitor entry [0x00007f24afe26000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:161)
- waiting to lock <0x00000000d3a7e1d0> (a java.lang.Object)
at kafka.javaapi.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:110)
at com.custom1.Consumer.cancel(Consumer.java:312)
at com.custom1.Consumer.run(Consumer.java:302)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:42)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Correct way to handle ConsumerTimeoutException

Folks,
I am using consumer.timeout.ms to force a consumer to jump out of the
hasNext call, which then throws ConsumerTimeoutException. It seems that
upon receiving this exception the consumer is no longer usable, and I need
to call shutdown() and recreate it:

try {
    // ... iterate over the stream with hasNext()/next() ...
} catch (ConsumerTimeoutException ex) {
    logger.info("consumer timeout, we consider the topic is drained");
    this.consumer.shutdown();
    this.consumer = kafka.consumer.Consumer
            .createJavaConsumerConnector(new ConsumerConfig(this.consumerProperties));
}

Is this the expected behavior? I call

this.consumer = kafka.consumer.Consumer
        .createJavaConsumerConnector(new ConsumerConfig(this.consumerProperties));

in the thread initialization phase, and hope to reuse it upon
ConsumerTimeoutException.
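
For reference, here is a minimal self-contained sketch of the pattern I am describing (the ZooKeeper address, group id, and topic name are made-up placeholders; consumer.timeout.ms is the setting that makes hasNext() throw):

import java.util.Collections;
import java.util.Properties;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.ConsumerTimeoutException;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class DrainTopic {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "zk1:2181");  // placeholder
        props.put("group.id", "mygroup");            // placeholder
        props.put("consumer.timeout.ms", "5000");    // hasNext() throws after 5s without data

        ConsumerConnector consumer =
                kafka.consumer.Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        KafkaStream<byte[], byte[]> stream = consumer
                .createMessageStreams(Collections.singletonMap("mytopic", 1))
                .get("mytopic").get(0);
        ConsumerIterator<byte[], byte[]> it = stream.iterator();
        try {
            while (it.hasNext()) {
                byte[] message = it.next().message(); // consume the message
            }
        } catch (ConsumerTimeoutException ex) {
            // no data for consumer.timeout.ms; treat the topic as drained
        } finally {
            consumer.shutdown();
        }
    }
}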

Thanks,

Chen

Consuming messages from Kafka and pushing on to a JMS queue

Hi,

We have an application that currently uses Camel to forward a JMS message
from HornetQ to multiple consumers (JMS queues).

The one challenge we have is handling failure of one of the consumers.
Kafka seems to provide a solution here by allowing different consumers to
keep track of their own offset.

However, we'd prefer to ultimately "push" the messages to each endpoint
rather than use Kafka's "pull" model, as the latter requires each endpoint
to implement Kafka's consumer API. One approach could be to write
intermediary consumers that forward each message onto the appropriate JMS
queue. Is this a good approach, or is there a better way to do this (e.g.,
use Apache Storm)?
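
For illustration, a rough sketch of such an intermediary consumer (assumptions: the Kafka stream, JMS session, and queue are created elsewhere; offset management and transactions are ignored):

import javax.jms.BytesMessage;
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;

// One bridge per destination: drains a Kafka stream and republishes each
// message body onto a JMS queue.
public class KafkaToJmsBridge implements Runnable {
    private final KafkaStream<byte[], byte[]> stream;
    private final Session session;
    private final Queue queue;

    public KafkaToJmsBridge(KafkaStream<byte[], byte[]> stream, Session session, Queue queue) {
        this.stream = stream;
        this.session = session;
        this.queue = queue;
    }

    public void run() {
        try {
            MessageProducer producer = session.createProducer(queue);
            ConsumerIterator<byte[], byte[]> it = stream.iterator();
            while (it.hasNext()) {
                BytesMessage msg = session.createBytesMessage();
                msg.writeBytes(it.next().message());
                producer.send(msg);
            }
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
    }
}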

Thanks

Andrew

consumer read from specific partition

Hi,

Suppose I have N partitions. I would like to have X different consumer
threads (X < N) read from a specified set of partitions. How can I achieve
this?
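
In case it helps frame the question: a minimal sketch of one such thread using the 0.8 SimpleConsumer API, which fetches from one explicit partition (the broker host/port, topic, and client id are made-up placeholders; leader lookup and error handling are omitted):

import java.nio.ByteBuffer;
import kafka.api.FetchRequest;
import kafka.api.FetchRequestBuilder;
import kafka.javaapi.FetchResponse;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.message.MessageAndOffset;

public class PartitionReader {
    public static void main(String[] args) {
        // Each thread owns one SimpleConsumer and one explicit partition.
        SimpleConsumer consumer =
                new SimpleConsumer("broker1.example.com", 9092, 100000, 64 * 1024, "partitionReader");
        String topic = "mytopic";
        int partition = 3;  // the partition this thread is responsible for
        long offset = 0L;   // the caller tracks and advances this

        FetchRequest req = new FetchRequestBuilder()
                .clientId("partitionReader")
                .addFetch(topic, partition, offset, 100000)
                .build();
        FetchResponse resp = consumer.fetch(req);

        for (MessageAndOffset mao : resp.messageSet(topic, partition)) {
            ByteBuffer payload = mao.message().payload();
            byte[] bytes = new byte[payload.limit()];
            payload.get(bytes);
            System.out.println(mao.offset() + ": " + new String(bytes));
            offset = mao.nextOffset();
        }
        consumer.close();
    }
}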

Thanks,

Josh

How many threads should I use per topic

Hey, Guys,
I am using the high-level consumer. I have a daemon process that checks
the lag for a topic.
Suppose I have a topic with 5 partitions, and partitions 0 and 1 have a
lag of 0, while the other 3 all have lag. In this case, should I start 3
threads or 5 threads to read from this topic again to achieve the best
performance?
I am currently using 3 threads in this case, but it seems that each thread
still tries to get hold of partitions 0 and 1 first (which seems
unnecessary in my case).

Another question is that I am currently using a signal thread to spawn
the threads that read from Kafka. So if a topic has 5 partitions, 5
signals will be sent, and 5 different threads will start polling the topic
in the following manner:

Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
topicCountMap.put(kafkaTopic, new Integer(1));
Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer
        .createMessageStreams(topicCountMap);
List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(kafkaTopic);
KafkaStream<byte[], byte[]> stream = streams.get(0);
ConsumerIterator<byte[], byte[]> it = stream.iterator();

As you can see, I specify Integer(1) to ensure there is only one stream in
the polling thread.

But in my testing, I am using:

Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
topicCountMap.put(topic, new Integer(numOfPartitions));
Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer
        .createMessageStreams(topicCountMap);
List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);
executor = Executors.newFixedThreadPool(numOfPartitions);
for (final KafkaStream stream : streams) {
    executor.submit(new ConsumerTest(stream, threadNumber, this.targetTopic));
    threadNumber++;
}

Are these two methods fundamentally the same, or is the latter one
preferred?

Thanks much!

Chen

erlang client for 0.8?

Anyone aware of a 0.8-compatible Erlang client? Thanks

How to perform a controlled shutdown for rolling bounce?

Running 0.8.1 and am unable to do a controlled shutdown as part of a
rolling bounce.

Is this the primary reference for this task?

https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-1.ControlledShutdown

I've set the config to enable controlled shutdown.

controlled.shutdown.enable=true
controlled.shutdown.max.retries=3
controlled.shutdown.retry.backoff.ms=5000

Before shutting down the first broker, topics looks like:

Topic:events PartitionCount:2 ReplicationFactor:1 Configs:
Topic: events Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: events Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic:failure PartitionCount:2 ReplicationFactor:1 Configs:
Topic: failure Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: failure Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic:retry PartitionCount:2 ReplicationFactor:1 Configs:
Topic: retry Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: retry Partition: 1 Leader: 3 Replicas: 3 Isr: 3

I then executed the bin/kafka-server-stop.sh script.

After that, the topics look like:

Topic:events PartitionCount:2 ReplicationFactor:1 Configs:
Topic: events Partition: 0 Leader: -1 Replicas: 1 Isr:
Topic: events Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic:failure PartitionCount:2 ReplicationFactor:1 Configs:
Topic: failure Partition: 0 Leader: -1 Replicas: 1 Isr:
Topic: failure Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic:retry PartitionCount:2 ReplicationFactor:1 Configs:
Topic: retry Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: retry Partition: 1 Leader: 3 Replicas: 3 Isr: 3

What does the -1 for Leader and blank Isr indicate? Do I need to run
something else for the leader election to occur? I thought that was
automatic with the controlled shutdown enabled. Is there a different
shutdown command to issue?

Thanks!
Ryan

Writing to Kafka

Hi,

I started using Kafka with Samza. I'm trying to run a test that creates messages and writes them to a Kafka topic.
In my test I start by writing a small number of messages, and the rate should grow to about 33,200 messages/sec.

1. With this message volume, can one broker handle the messages, or should I use more than one?
2. When using more than one broker, should the producer be given a list of brokers, or can it point at just one broker and still have messages reach the others?
3. I have a Java producer that just creates numbers and sends them to Kafka (roughly the sketch below); can Kafka handle many producers writing to the same topic at once?
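
For question 3, a minimal sketch of the kind of producer I mean, using the 0.8 producer API (broker names and the topic name are made-up placeholders):

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class NumberProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // List more than one broker here; the producer bootstraps its view of
        // the whole cluster from whichever of these it can reach.
        props.put("metadata.broker.list", "broker1:9092,broker2:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "1");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
        for (int i = 0; i < 1000; i++) {
            // No key: messages are spread across the topic's partitions.
            producer.send(new KeyedMessage<String, String>("numbers", Integer.toString(i)));
        }
        producer.close();
    }
}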

Thanks,

Consumer sensitive expiration of topic

Hi,

I'm playing around with Kafka with the idea of implementing a general-purpose message exchanger for a distributed application with high throughput requirements (multiple hundred thousand messages per second).

In this context, I would like to be able to use a topic as a form of private mailbox for a single consumer group. In this situation, once the single consumer group has committed its offset on its private topic, the messages there won't be used anymore and can be safely discarded. Therefore, I was wondering whether you see a way (in the current release or in the future) to have a topic whose expiration policy is based on consumer offsets.

Thanks,

JMSXGroupID clustered message grouping

Hi,

When using multiple partitions and consumers, can Kafka ensure that JMS
messages with the same JMSXGroupID are handled by the same consumer? Or
does this have to be implemented when writing the Producers and Consumers?

For reference, this is in comparison to the clustered message grouping
feature provided by HornetQ:
http://docs.jboss.org/hornetq/2.4.0.Final/docs/user-manual/html/message-grouping.html
"Message groups are useful when you want all messages for a certain value
of the property to be processed serially by the same consumer."
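
For what it's worth, here is a minimal sketch of how this could be implemented on the producer side with the 0.8 API, assuming the JMSXGroupID value is used as the Kafka message key (an assumption on my part, not a built-in JMS feature of Kafka): the default partitioner hashes the key, so all messages with the same key land on the same partition, and within a consumer group a partition is consumed by at most one consumer.

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;

public class GroupedForwarder {
    private final Producer<String, String> producer;

    public GroupedForwarder(Producer<String, String> producer) {
        this.producer = producer;
    }

    // Using the group id as the message key: the default partitioner hashes
    // the key to pick a partition, so messages sharing a JMSXGroupID stay in
    // order on one partition and are handled by a single consumer.
    public void forward(String topic, String jmsxGroupId, String body) {
        producer.send(new KeyedMessage<String, String>(topic, jmsxGroupId, body));
    }
}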

Thanks

Andrew

How to try out the new producer?

Hi,

I want to try out the new producer api
(org.apache.kafka.clients.producer.KafkaProducer) but found that it's not
in the published jar.

What's the best way to get it? Build from source from the 0.8.1.1 tag?
Are there any flags I need to set to include the new producer in the jar?

Thanks,

Roger

Embedded Kafka/Zookeeper for unit testing

Hi all,

If anyone is searching for up-to-date examples of embedded
Kafka/ZooKeeper servers, e.g. for unit testing, they may find this
short blog post useful:
http://pannoniancoder.blogspot.com/2014/08/embedded-kafka-and-zookeeper-for-unit.html

Cheers,
Vjeran

ZkClient bug can bring down broker/consumer on zookeeper push in EC2 environment

Recently, we found a serious ZkClient bug (actually an Apache ZooKeeper
client bug) which can bring down brokers and consumers on a ZooKeeper push.

We're running Kafka and ZooKeeper in an AWS EC2 environment. ZooKeeper
instances are bound to EIPs to give each instance a static hostname, which
means that even if an EC2 instance is terminated and replaced with a new
one, it will have the same hostname, but the private IP bound to that
hostname can change.

The scenario is: if we do a rolling push of all ZooKeeper server instances,
terminating each and waiting until the new instance joins the quorum one by
one, ZkClient will eventually try to connect to the old IP addresses, which
no longer exist, due to DNS caching on the Apache ZooKeeper client side;
please refer to https://issues.apache.org/jira/browse/ZOOKEEPER-338

So we need to restart Kafka brokers and consumers to refresh the DNS cache.
To solve this problem, I sent the following pull request to ZkClient:
https://github.com/sgroschupf/zkclient/pull/26

Please review the above PR. If a new version of ZkClient with the fix is
not released on the schedule of the Kafka 0.8.2 release, I'd like Kafka to
ship an internally built ZkClient with the fix. I would really appreciate
it.

Thank you
Best, Jae

Keep getting kafka.common.OffsetOutOfRangeException at random times

Hi Team,

Of late I am facing a strange issue with Kafka. At random times I keep getting these strange errors while consuming a topic:

kafka.common.OffsetOutOfRangeException: Request for offset 19 but we only have log segments in the range 0 to 0.
Sometimes I get like this:

kafka.common.OffsetOutOfRangeException: Request for offset 19 but we only have log segments in the range 19 to 22.

That number keeps changing (with random ranges). I don't know what the problem is here. Both producer and consumer work perfectly, but I keep getting these errors at random. In that situation, if I clear the logs and restart the broker, it starts working fine again.

Can anyone please help me in this regard? This is affecting our application's stability. If any more information is required, I can provide it. Also, we are using only the defaults provided by Kafka; we didn't change any settings.

Thanks,
Pradeep Simha
Technical Lead


kafka contrib/hadoop-consumer,producer

Hi,
I would like to know the status of the contrib/hadoop-consumer and
hadoop-producer code. Is this considered deprecated, or is there still
interest in these projects? Is Camus the currently actively developed
project for writing data from Kafka to HDFS? If there is any interest in
these projects from the community, I am hoping to work on the
documentation and the code, as there hasn't been much activity on them.
Thanks,
Harsha

How to shutdown/restart kafka cluster properly

Hi,

Sorry if this has been answered before; I couldn't find any
information besides "controlled shutdown of broker", which I believe
doesn't fully apply here.

Could anyone suggest the safest strategy for shutting down a Kafka
cluster? Should brokers be brought down one by one or simultaneously?
Does this strategy depend on whether "controlled shutdown" is on?

From my brief observation, ad-hoc broker shutdown (mixing INT, TERM
and KILL signals against the most resilient brokers) results in a
prolonged cluster start-up, with a long chain of "Recovering unflushed
segment 0" messages seen on some brokers. Is it expected to see these
after each shutdown/restart?

Thanks in advance,

Using Kafka in a non-million-user environment

Hello,

I'm managing a study to explore possibilities for migrating a monolithic
IT architecture to a service-oriented one.
That said, the company I'm working for is not a web-based company, so
the number of users (i.e., the load) is not the heart of the issue, but
data/service diversity is.

I'm truly interested in Kafka as the main data pipeline because it will
enable us to abstract storage away from services and, as a consequence,
to manage our data better.
I would like to engineer a proof of concept around Kafka and Samza, but
before digging further I would like your advice on this project, in
order to know whether your software will be of use to a more 'common'
company IT structure.

Thanks in advance for your insights.

Best regards

Justin Maltat

kafka high level consumer - threads guaranteed to read a single partition?

Hi,

For the Kafka high-level consumer, if I create exactly as many threads
as there are partitions, is there a guarantee that each thread will be
the only thread reading from a particular partition? I'm following this
example
<https://github.com/bingoohuang/java-sandbox/blob/92318c6d3f2533bbadb253c59a201e4e70f72ad2/src/main/java/org/n3r/sandbox/kafka/ConsumerGroupExample.java>.
Assume that the number of threads and partitions is fixed.

Thanks,
Josh

How does the partitioner work when compression is enabled?

I am using Kafka 0.8, where message compression can only be enabled on the producer side.

However, one batch of messages (sized by the producer configuration batch.num.messages) is compressed as a single message and stored on the broker side. I think one compressed batch message will be stored on one broker.

I am wondering how the producer-side partitioner class works in this case. Will the batch be assigned to the partition of the first message in the batch, or of the last?

Lex