Kafka: Consuming Kafka Data in IDEA with the Receiver Approach

x33g5p2x, reposted 2021-12-11 under Kafka

1. Start the cluster

Start ZooKeeper, Hadoop, and Spark:

  # 1. On all three nodes, start ZooKeeper and check its status
  /usr/zookeeper/zookeeper-3.4.10/bin/zkServer.sh start
  /usr/zookeeper/zookeeper-3.4.10/bin/zkServer.sh status
  # 2. On the master node, start Hadoop
  /usr/hadoop/hadoop-2.7.3/sbin/start-all.sh
  # 3. On the master node, start Spark
  /usr/spark/spark-2.4.0-bin-hadoop2.7/sbin/start-all.sh

2. Start Kafka

  # Start Kafka in the background on all three nodes
  cd $KAFKA_HOME
  ./bin/kafka-server-start.sh config/server.properties &

  cd $KAFKA_HOME
  # Create the topic badou_topic
  ./bin/kafka-topics.sh --create --zookeeper master:2181,slave1:2181,slave2:2181 --replication-factor 3 --partitions 6 --topic badou_topic
  # Start a console producer
  ./bin/kafka-console-producer.sh --broker-list master:9092,slave1:9092,slave2:9092 --topic badou_topic
  # Start a console consumer
  ./bin/kafka-console-consumer.sh --from-beginning --topic badou_topic --bootstrap-server master:9092,slave1:9092,slave2:9092

3. Consume Kafka data from IDEA

Run a word count over the data pulled in each 2-second batch.
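The Receiver-based `KafkaUtils.createStream` used below lives in the separate `spark-streaming-kafka-0-8` artifact, not in Spark core. A build.sbt fragment along these lines would pull it in; the exact version strings are assumptions chosen to match the cluster's Spark 2.4.0 install (which ships Scala 2.11):

```scala
// build.sbt (sketch; versions assumed to match Spark 2.4.0 / Scala 2.11)
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // Core streaming API; "provided" because the cluster supplies it at runtime
  "org.apache.spark" %% "spark-streaming" % "2.4.0" % "provided",
  // Receiver-based Kafka integration (deprecated in 2.x, removed in Spark 3.0)
  "org.apache.spark" %% "spark-streaming-kafka-0-8" % "2.4.0"
)
```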

  import org.apache.log4j.{Level, Logger}
  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.kafka.KafkaUtils
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  object ReceiverTest01 {
    def main(args: Array[String]): Unit = {
      val Array(group_id, topic, exectime) = Array("group_badou_topic", "badou_topic", "2")
      val conf = new SparkConf().setAppName("Receiver Test").setMaster("local[2]")
      Logger.getLogger("org.apache.spark").setLevel(Level.ERROR)
      val ssc = new StreamingContext(conf, Seconds(exectime.toInt))
      // Topics passed in from outside; multiple topics may be given, comma-separated
      val topicSet = topic.split(",").toSet
      // Number of receiver threads per topic
      val numThreads = 1
      val topicMap = topicSet.map((_, numThreads)).toMap
      val zkQuorum = "192.168.142.128:2181" // ZooKeeper address of the master node
      // createStream is the Receiver-based API -- the core call.
      // Each record is a (key, message) pair; the key is null here,
      // so map(_._2) keeps only the message.
      val lines = KafkaUtils.createStream(ssc, zkQuorum, group_id, topicMap).map(_._2)
      lines.map((_, 1L)).reduceByKey(_ + _).print()
      ssc.start()
      ssc.awaitTermination()
    }
  }

Run the program, then type data into the Kafka console producer in Xshell.

  For example, typing the following lines makes the program print their counts:
  word
  word
  word
  set
  set
  set
  srt
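The per-batch counts can be reproduced on a plain Scala collection. This is only a sketch of the `map((_, 1L)).reduceByKey(_ + _)` step, using local collections instead of an RDD and assuming all seven lines above land in the same 2-second batch:

```scala
object WordCountSketch {
  // Same shape as the streaming job: pair each line with 1L, then sum by key.
  // reduceByKey on a DStream is mimicked here with groupBy + sum on a local Seq.
  def countWords(lines: Seq[String]): Map[String, Long] =
    lines.map((_, 1L)).groupBy(_._1).map { case (w, ps) => (w, ps.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    val batch = Seq("word", "word", "word", "set", "set", "set", "srt")
    // Prints (word,3), (set,3), (srt,1) in arbitrary order
    countWords(batch).foreach(println)
  }
}
```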
