
Flume: hbase sink

 

flume.conf

a1.sinks.hbase-sink1.channel = ch1
a1.sinks.hbase-sink1.type = hbase
a1.sinks.hbase-sink1.table = users
a1.sinks.hbase-sink1.columnFamily= info
a1.sinks.hbase-sink1.serializer=org.apache.flume.sink.hbase.RegexHbaseEventSerializer
a1.sinks.hbase-sink1.serializer.regex=^(.+)\t(.+)\t(.+)$
a1.sinks.hbase-sink1.serializer.colNames=ROW_KEY,name,email
a1.sinks.hbase-sink1.serializer.rowKeyIndex=0
a1.sinks.hbase-sink1.serializer.depositHeaders=true

Note

A: To use a field of the event body as the row key, set rowKeyIndex=0 and put ROW_KEY at the same position in colNames (colNames=ROW_KEY,...). The row key must therefore be the first field of the body in the JSON you post.

B: To also store the headers of the posted JSON event in HBase, set depositHeaders=true.
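The two notes above can be illustrated with a small Python sketch. It is not Flume's code; it only mimics, under stated assumptions, how the regex groups line up with colNames and how the headers end up as extra columns. The helper name to_hbase_row is hypothetical, and the order in which header values replace body values here is an assumption made for illustration.

```python
import re

# Values copied from the sink configuration above.
REGEX = r"^(.+)\t(.+)\t(.+)$"
COL_NAMES = ["ROW_KEY", "name", "email"]
ROW_KEY_INDEX = 0


def to_hbase_row(body, headers, deposit_headers=True):
    """Hypothetical helper: return (row_key, {qualifier: value}) for one event."""
    match = re.match(REGEX, body)
    if match is None:
        return None  # the body does not match the regex, so nothing is written
    groups = match.groups()
    row_key = groups[ROW_KEY_INDEX]
    columns = {}
    for index, (name, value) in enumerate(zip(COL_NAMES, groups)):
        if index != ROW_KEY_INDEX:  # the ROW_KEY group is not stored as a column
            columns[name] = value
    if deposit_headers:
        # Assumption: a header with the same name as a column replaces its value.
        columns.update(headers)
    return row_key, columns


headers = {"userId": "9", "name": "ZhangZiYi", "phoneNumber": "1522323222"}
row_key, columns = to_hbase_row("9\tZhangZiYi\tzy@163.com", headers)
print(row_key, columns)
```

With depositHeaders disabled in this sketch, only name and email would remain as columns.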

 

 

a1.sources.http-source1.channels = ch1
a1.sources.http-source1.type = http
a1.sources.http-source1.bind = 0.0.0.0
a1.sources.http-source1.port = 5140
a1.sources.http-source1.handler = org.apache.flume.source.http.JSONHandler

 

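a1.channels = ch1
# Assumption: the original post does not show a channel type; memory is used here only as an example.
a1.channels.ch1.type = memory
a1.sources = http-source1
a1.sinks = hbase-sink1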

 

Hbase

#hbase shell

>create 'users', 'info'

 

Curl post json

curl -i -H 'content-type: application/json' -X POST \
-d '[{"headers":{"userId":"9","name":"ZhangZiYi","phoneNumber":"1522323222"},
"body":"9\tZhangZiYi\tzy@163.com"}]' http://192.168.10.204:5140
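The same request can be sketched in Python, which makes it easier to see that the body is a tab-separated string wrapped in a JSON array of events. This is a minimal sketch using only the standard library; the helper names build_payload and post_events are hypothetical, and the URL is the one from the curl command above.

```python
import json
import urllib.request

FLUME_URL = "http://192.168.10.204:5140"  # HTTP source address from the curl command


def build_payload(row_key, name, email, headers):
    """Hypothetical helper: build the JSON array of events that JSONHandler expects."""
    # The sink regex splits the body on tabs, so the fields are joined with "\t".
    body = "\t".join([row_key, name, email])
    return json.dumps([{"headers": headers, "body": body}])


def post_events(payload, url=FLUME_URL):
    """Hypothetical helper: POST the payload and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


payload = build_payload(
    "9",
    "ZhangZiYi",
    "zy@163.com",
    {"userId": "9", "name": "ZhangZiYi", "phoneNumber": "1522323222"},
)
print(payload)
# Call post_events(payload) once the Flume agent is running.
```

json.dumps escapes the tab characters as \t, which matches what the curl command sends.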

 

Hbase result

>scan 'users'



 

 

Note: the name column receives its content from both the headers and the body of the JSON event. Because both carry the same value here, one write simply overwrites the other. You can use different column names instead to store the two values in separate cells.

 

 

References

http://flume.apache.org/FlumeUserGuide.html#hbasesinks

http://thunderheadxpler.blogspot.jp/2013/09/bigdata-apache-flume-hdfs-and-hbase.html

 

 
