Configuring a Hadoop job with JavaConfig



I am currently following the Spring Hadoop introduction page: http://blog.springsource.org/2012/02/29/introducing-spring-hadoop/

The sample configuration there is XML-based. The snippet below defines the WordCount example.

<!-- define the job -->
<hdp:job id="word-count"
  input-path="/input/" output-path="/ouput/"
  mapper="org.apache.hadoop.examples.WordCount.TokenizerMapper"
  reducer="org.apache.hadoop.examples.WordCount.IntSumReducer"/>
<!-- execute the job -->
<bean id="runner" class="org.springframework.data.hadoop.mapreduce.JobRunner"
              p:jobs-ref="word-count"/>

Is there a way to configure this example with JavaConfig?

@Configuration
@EnableHadoop
@PropertySource(value={"classpath:config/hadoop.properties"})
public class HadoopConfiguration extends SpringHadoopConfigurerAdapter {

    @Override
    public void configure(HadoopConfigConfigurer config) throws Exception {
        Properties props = new Properties();
        // Placeholder URI; point this at your own NameNode
        config.fileSystemUri("hdfs://");
        // Start from props, then add individual Hadoop properties
        config.withProperties(props).property("propkey", "propvalue").and();
    }
}

You can also set the Hadoop configuration programmatically with the various .set() methods of the Configuration object, like this:

Configuration conf = new Configuration();
conf.set("example.foo", "bar");
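
Neither snippet above declares the job itself, so here is a rough sketch of how the word-count job and its runner might be written as beans. Treat it as a starting point under some assumptions rather than the documented Spring Hadoop approach: the class name WordCountJobConfiguration and the bean method names are made up for illustration, a Hadoop Configuration bean is assumed to already exist in the context (for example from a class like HadoopConfiguration above), Job.getInstance assumes a Hadoop 2.x style API, and the JobRunner setters should be checked against the Spring for Apache Hadoop version you use.

```java
import java.io.IOException;

import org.apache.hadoop.examples.WordCount;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.hadoop.mapreduce.JobRunner;

// Hypothetical class name; an illustrative sketch only.
@Configuration
public class WordCountJobConfiguration {

    // Assumption: a Hadoop Configuration bean is available for injection.
    // The fully qualified name avoids a clash with Spring's @Configuration.
    @Bean
    public Job wordCountJob(org.apache.hadoop.conf.Configuration hadoopConfiguration)
            throws IOException {
        Job job = Job.getInstance(hadoopConfiguration, "word-count");
        job.setJarByClass(WordCount.class);
        // Same mapper and reducer as in the XML job definition
        job.setMapperClass(WordCount.TokenizerMapper.class);
        job.setReducerClass(WordCount.IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Paths copied verbatim from the XML job definition
        FileInputFormat.addInputPath(job, new Path("/input/"));
        FileOutputFormat.setOutputPath(job, new Path("/ouput/"));
        return job;
    }

    // Counterpart of the "runner" bean in the XML example
    @Bean
    public JobRunner runner(Job wordCountJob) {
        JobRunner runner = new JobRunner();
        runner.setJob(wordCountJob);
        // Assumption: start the job once the application context is ready
        runner.setRunAtStartup(true);
        return runner;
    }
}
```

If you would rather trigger the job yourself, leave out setRunAtStartup(true) and invoke the runner from your own code.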
