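import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

/**
 * Writes every value to a file under a sub-directory named after its key.
 * The output key type is NullWritable because only the value is emitted.
 */
public class ImportDataFromMongoReducer extends Reducer<Text, Text, NullWritable, Text> {

    private static final Log LOG = LogFactory.getLog(ImportDataFromMongoReducer.class);

    // Typed to match what reduce() actually writes: (NullWritable, Text).
    private MultipleOutputs<NullWritable, Text> out;

    @Override
    protected void setup(Context context) {
        out = new MultipleOutputs<NullWritable, Text>(context);
    }

    // Base output path, relative to the job output directory: "<key>/part".
    private String generateFileName(Text k) {
        return k.toString() + "/part";
    }

    @Override
    public void reduce(final Text pKey, final Iterable<Text> pValues, final Context pContext)
            throws IOException, InterruptedException {
        for (final Text value : pValues) {
            // Write through MultipleOutputs rather than pContext.write(...)
            // so that the record lands in the per-key file.
            out.write(NullWritable.get(), value, generateFileName(pKey));
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // MultipleOutputs must be closed, otherwise buffered records may be lost.
        out.close();
    }
}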
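To see where the reducer's records end up, the path construction can be sketched in plain Java without a cluster. This is an illustration only: the class OutputPathSketch, the method reducerFile, and the sample keys "users" and "orders" are hypothetical and not part of the original code. It assumes the usual reduce-side naming, in which MultipleOutputs appends "-r-" and a zero-padded partition number to the base output path, and it assumes an output format that adds no file extension.

```java
import java.util.Locale;

final class OutputPathSketch {

    // Mirrors generateFileName in the reducer: one sub-directory per key.
    static String generateFileName(String key) {
        return key + "/part";
    }

    // Assumption: reduce-side output files are named
    // "<baseOutputPath>-r-<5-digit partition number>".
    static String reducerFile(String key, int partition) {
        return String.format(Locale.ROOT, "%s-r-%05d", generateFileName(key), partition);
    }

    public static void main(String[] args) {
        // Hypothetical keys, e.g. names of the imported collections.
        for (String key : new String[] {"users", "orders"}) {
            System.out.println(key + " -> " + reducerFile(key, 0));
        }
    }
}
```

Under these assumptions, all values for the key "users" handled by reducer 0 would be written to users/part-r-00000 inside the job's output directory, so each key gets its own directory of part files.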
References
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.html
http://www.infoq.com/articles/HadoopOutputFormat