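I'm new to big data, and I have a task to transfer CSV files to HDFS using Flume, but it should also convert those CSV files to Avro. I tried the following Flume configuration: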
a1.channels = dataChannel
a1.sources = dataSource
a1.sinks = dataSink
a1.channels.dataChannel.type = memory
a1.channels.dataChannel.capacity = 1000000
a1.channels.dataChannel.transactionCapacity = 10000
a1.sources.dataSource.type = spooldir
a1.sources.dataSource.spoolDir = {spool_dir}
a1.sources.dataSource.fileHeader = true
a1.sources.dataSource.fileHeaderKey = file
a1.sources.dataSource.basenameHeader = true
a1.sources.dataSource.basenameHeaderKey = basename
a1.sources.dataSource.interceptors.attach-schema.type = static
a1.sources.dataSource.interceptors.attach-schema.key = flume.avro.schema.url
a1.sources.dataSource.interceptors.attach-schema.value = {path_to_schema_in_hdfs}
a1.sinks.dataSink.type = hdfs
a1.sinks.dataSink.hdfs.path = {sink_path}
a1.sinks.dataSink.hdfs.format = text
a1.sinks.dataSink.hdfs.inUsePrefix = .
a1.sinks.dataSink.hdfs.filePrefix = drone
a1.sinks.dataSink.hdfs.fileSuffix = .avro
a1.sinks.dataSink.hdfs.rollSize = 180000000
a1.sinks.dataSink.hdfs.rollCount = 100000
a1.sinks.dataSink.hdfs.rollInterval = 120
a1.sinks.dataSink.hdfs.idleTimeout = 3600
a1.sinks.dataSink.hdfs.fileType = DataStream
a1.sinks.dataSink.serializer = avro_event
The output is Avro files with Flume's default schema. I also tried AvroEventSerializer, but I got a lot of different errors. I resolved all of them except this one:
ERROR hdfs.HDFSEventSink: process failed
java.lang.ExceptionInInitializerError
at org.apache.hadoop.hdfs.DFSOutputStream.computePacketChunkSize(DFSOutputStream.java:1305)
at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1243)
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1266)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1101)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1059)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:232)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:75)
Thanks for your help.
1 answer
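Sorry, there were mistakes in the configuration. I fixed them and found a way to convert CSV to Avro. I modified AvroEventSerializer slightly, in this way:
Here is the city definition:
If you know a better way, please post an answer.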
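The corrected configuration is not reproduced in this answer. As a rough sketch only, and not necessarily the changes actually made, these are lines a Flume agent like the one in the question would normally also need: the source and sink bound to the channel, and the interceptor declared by name before its properties are set. The serializer class on the last line is Flume's stock schema-aware serializer; it expects each event body to already be an Avro-encoded record, so raw CSV lines would still need a custom conversion step such as the modified AvroEventSerializer mentioned above.

```properties
# Bind the source and the sink to the channel (absent from the question's config)
a1.sources.dataSource.channels = dataChannel
a1.sinks.dataSink.channel = dataChannel

# Declare the interceptor by name so its type/key/value settings take effect
a1.sources.dataSource.interceptors = attach-schema

# Stock serializer that reads the flume.avro.schema.url header.
# The avro_event alias instead selects Flume's default event schema (headers + body).
a1.sinks.dataSink.serializer = org.apache.flume.sink.hdfs.AvroEventSerializer$Builder
```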