Java: How to filter messages before passing them on to consumers?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license, link back to the original, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/30915302/
How to filter messages before passing them on to consumers?
Asked by user1079877
I'm building a lead and event management system with Kafka. The problem is that we receive a lot of fake leads (advertisements), and we also have many consumers in our system. Is there any way to filter out the advertisements before they reach the consumers? My idea is to write everything into a first topic, read it with a filtering consumer, and then write the filtered data back into a second topic. But I'm not sure whether that's efficient. Any ideas?
Accepted answer by JongHyok Lee
You can use Kafka Streams (http://kafka.apache.org/documentation.html#streamsapi) with Kafka 0.10.+. I think it fits your use case exactly.
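For illustration, here is a minimal Kafka Streams sketch of such a filter. The topic names raw-leads and clean-leads, the broker address, and the isAdvertisement() check are placeholders you would replace with your own; it assumes String-serialized messages and uses the StreamsBuilder API (the 0.10.x releases mentioned above shipped the older KStreamBuilder instead):

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class LeadFilterApp {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "lead-filter");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> rawLeads = builder.stream("raw-leads");

        // Keep only records that are not classified as advertisements and
        // republish them to the topic the real consumers subscribe to.
        rawLeads.filter((key, value) -> !isAdvertisement(value))
                .to("clean-leads");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }

    // Placeholder ad-detection heuristic; plug in your real rules or model here.
    private static boolean isAdvertisement(String lead) {
        return lead != null && lead.toLowerCase().contains("advertisement");
    }
}
```

Your existing consumers would then subscribe to clean-leads instead of the raw topic.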
Answer by Jeff Gong
Yes -- in fact I'm mostly convinced this is the way you're supposed to handle the problem in your context. Kafka is only concerned with the efficient transmission of data; it can't clean your data for you. So consume everything with an intermediary consumer that runs its own checks, and push whatever passes the filter to a different topic/partition (depending on your needs) so that the other consumers only see the good data.
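If you stay on the plain consumer/producer API, such an intermediary filter can be sketched roughly as below. It assumes String messages, the same hypothetical raw-leads/clean-leads topics, a broker on localhost:9092, and Kafka clients 2.0+ for poll(Duration); the looksLikeAd() check is again just a stand-in:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class FilteringBridge {

    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "lead-filter-bridge");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {

            consumer.subscribe(Collections.singletonList("raw-leads"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Forward only the records that pass the ad filter.
                    if (!looksLikeAd(record.value())) {
                        producer.send(new ProducerRecord<>("clean-leads", record.key(), record.value()));
                    }
                }
            }
        }
    }

    // Placeholder heuristic standing in for real ad detection.
    private static boolean looksLikeAd(String value) {
        return value != null && value.toLowerCase().contains("advertisement");
    }
}
```

With default auto-commit this gives at-least-once behaviour: a crash between sending and the next offset commit can replay a few records, which is usually acceptable for this kind of filtering.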
Answer by Nikita Shamgunov
You can use Spark Streaming: https://spark.apache.org/docs/latest/streaming-kafka-integration.html.
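For completeness, a rough Java sketch of the same filter using Spark Streaming's Kafka 0-10 direct stream; the raw-leads topic, the local master/broker addresses, and the keyword check are again placeholders, and a real job would write the surviving records back to a clean topic instead of just printing them:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class SparkLeadFilter {

    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("lead-filter").setMaster("local[2]");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "localhost:9092");
        kafkaParams.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        kafkaParams.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        kafkaParams.put("group.id", "spark-lead-filter");

        JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(Arrays.asList("raw-leads"), kafkaParams));

        // Drop records that look like advertisements; a real job would forward the
        // remaining records to a clean topic (e.g. with a producer inside foreachRDD).
        stream.map(record -> record.value())
              .filter(value -> value != null && !value.toLowerCase().contains("advertisement"))
              .print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```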
Answer by mancini0
Take a look at Confluent's KSQL (it's free and open source, https://www.confluent.io/product/ksql/). It uses Kafka Streams under the hood: you define your KSQL queries and tables on the server side, and their results are written to Kafka topics, so you can simply consume those topics instead of writing code for an intermediary filtering consumer. You only need to write the KSQL table "DDL" or queries.
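As a hypothetical illustration, a single persistent query along the lines of CREATE STREAM clean_leads AS SELECT * FROM raw_leads WHERE is_ad = false; (the stream names and the is_ad column are made up here) would continuously write the filtered records to a new Kafka topic that your downstream consumers read instead.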