Option 1: collect the data into HDFS.
Option 2: insert into an existing table, using Flume to collect data into Hive. The data in Hive must be stored in ORC format.

These notes cover option 2. The agent is laid out as follows:

source: network logs
channel: local disk + memory. Memory is used first; once it is full, the local disk serves as the overflow buffer.
sink: hive

    a1.sources = s1
    a1.channels = c1
    a1.sinks = k1

    # TCP protocol
    a1.sources.s1.type = syslogtcp
    a1.sources.s1.port = 5140
    a1.sources.s1.host = wangfutai
    a1.sources.s1.channels = c1

    a1.channels.c1.type = SPILLABLEMEMORY
    a1.channels.c1.memoryCapacity = 10000
    a1.channels.c1.overflowCapacity = 1000000
    a1.channels.c1.byteCapacity = 800000
    a1.channels.c1.checkpointDir = /home/wangfutai/a/flume/checkPoint
    a1.channels.c1.dataDirs = /home/wangfutai/a/flume/data

    a1.sinks.k1.type = hive
    a1.sinks.k1.channel = c1
    a1.sinks.k1.hive.metastore = thrift://wangfutai:9083
    a1.sinks.k1.hive.database = hive
    a1.sinks.k1.hive.table = flume
    #a1.sinks.k1.hive.partition = asia,%{country},%y-%m-%d-%H-%M
    #a1.sinks.k1.useLocalTimeStamp = false
    a1.sinks.k1.round = true
    a1.sinks.k1.roundValue = 10
    a1.sinks.k1.roundUnit = minute
    a1.sinks.k1.serializer = DELIMITED
    a1.sinks.k1.serializer.delimiter = ","
    a1.sinks.k1.serializer.serdeSeparator = '\t'
    a1.sinks.k1.serializer.fieldnames = id,name,age

Starting the agent with this configuration fails, because the Hive HCatalog streaming classes are not on Flume's classpath:

    19/01/16 22:24:59 ERROR node.PollingPropertiesFileConfigurationProvider: Failed to start agent because dependencies were not found in classpath. Error follows.
    java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/streaming/RecordWriter
        at org.apache.flume.sink.hive.HiveSink.createSerializer(HiveSink.java:219)
        at org.apache.flume.sink.hive.HiveSink.configure(HiveSink.java:202)

Fix:

1. Copy all the jars under /home/wangfutai/module/hive-1.1.0-cdh5.15.0/hcatalog/share/hcatalog into /home/wangfutai/module/apache-flume-1.6.0-cdh5.15.0-bin/lib.

2. In .bash_profile, add:

    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*

3. In hive-site.xml, set these properties:

    hive.support.concurrency = true
    hive.enforce.bucketing = true

4. The table must be bucketed and stored as ORC. The name given in a1.sinks.k1.hive.table (flume in the configuration above) has to match the table that is actually created (flume2 here).

    create table hive.flume2 (
      id int,
      name string,
      age int
    )
    clustered by (id) into 2 buckets
    stored as orc
    tblproperties("transactional"='true');

5. Put hive.xml (presumably hive-site.xml) and hive-env.sh under apache-flume-1.6.0-cdh5.15.0-bin/conf, with this property set:

    hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
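Once the agent starts cleanly, a quick way to exercise the pipeline is to push a few comma-delimited records at the syslogtcp source. The following is a minimal test client sketched under assumptions, not part of the original setup. It assumes the agent listens on wangfutai:5140 as configured above, that each newline-terminated line becomes one event, and that bare lines without a syslog header are accepted in the same way as a simple `nc` smoke test. The helper names `format_record` and `send_records` are hypothetical.

```python
import socket


def format_record(record_id, name, age):
    """Build one newline-terminated line matching fieldnames = id,name,age."""
    # The DELIMITED serializer in the config splits on ",", so a comma inside
    # a field value would shift the columns; reject it here.
    if "," in name or "\n" in name:
        raise ValueError("name must not contain a comma or newline")
    return f"{int(record_id)},{name},{int(age)}\n"


def send_records(records, host="wangfutai", port=5140):
    """Send (id, name, age) tuples to the Flume syslogtcp source over TCP."""
    payload = "".join(format_record(*r) for r in records).encode("utf-8")
    # Host and port are assumptions taken from the agent configuration above.
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)
    return len(payload)


if __name__ == "__main__":
    sent = send_records([(1, "tom", 20), (2, "jerry", 22)])
    print(f"sent {sent} bytes")
```

If the sink commits the batch, querying the table named in a1.sinks.k1.hive.table from Hive should show the two rows. An empty result usually means the error log of the agent is the next place to look.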
Reposted from: http://mgjxi.baihongyu.com/