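Iterative MapReduce jobs fail with a NumberFormatException

My program runs multiple MapReduce jobs, one for each line of parameters in the parameter file I pass to it.

The main function looks like this: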

    public static void main(String[] args) throws Exception {
        // Create configuration
        Configuration conf = new Configuration();
        if (args.length != 3) {
            System.err.println("Usage: KnnPattern <in> <out> <parameter file>");
            System.exit(2);
        }
        // Reading argument using Hadoop API now
        conf.set("params", args[2]);
        String param = conf.get("params");
        StringTokenizer inputLine = new StringTokenizer(param, "|");
        int n = 1;
        while (inputLine.hasMoreTokens()) {
            conf.set("passedVal", inputLine.nextToken());
            // Job Configuration here
            ++n;
        }
    }

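The main function reads the third argument, which is the parameter file stored in HDFS, and passes one parameter string to each MapReduce job it runs. Or at least that is what I want it to do; I'm not 100% sure it is correct.

My mapper's setup method looks like this: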

    protected void setup(Context context) throws IOException, InterruptedException {
        // Read the parameter string that main() stored under "passedVal"
        Configuration conf = context.getConfiguration();
        String knnParams = conf.get("passedVal");
        StringTokenizer st = new StringTokenizer(knnParams, ",");
        // Using the variables declared earlier, values are assigned to K and to the test dataset, S.
        // These values will remain unchanged throughout the mapper
        K = Integer.parseInt(st.nextToken());
        normalisedSAge = normalisedDouble(st.nextToken(), minAge, maxAge);
        normalisedSIncome = normalisedDouble(st.nextToken(), minIncome, maxIncome);
        sStatus = st.nextToken();
        sGender = st.nextToken();
        normalisedSChildren = normalisedDouble(st.nextToken(), minChildren, maxChildren);
    }

My parameter file contains the following:

    67, 16668, Single, Male, 3|40, 25000, Single, Male, 2|67, 16668, Single, Male, 3

That is, 3 sets of input separated by "|".

The runtime error I get is this:

    Error: java.lang.NumberFormatException: For input string: "/KNN/PARAMS/paramFinal.txt"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:569)
        at java.lang.Integer.parseInt(Integer.java:615)
        at KnnPattern$KnnMapper.setup(KnnPattern.java:168)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
    Container killed by the ApplicationMaster.
    Container killed on request. Exit code is 143
    Container exited with a non-zero exit code 143

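As far as I can tell, this looks like a type conversion error (?), and I'm not sure how or why it happens.

Most of this code comes from here: https://github.com/matt-hicks/MapReduce-KNN/blob/master/KnnPattern.java

It runs fine for a single set of parameters, but I need it to run multiple parameter sets, or test cases, in one go for a further application.

Is there a way to fix this, or at least to find out why I am getting this error? Any help is much appreciated. Thanks.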

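I figured out why I was getting the NumberFormatException.

The problem was that the third argument (args[2]) was being read as a plain string rather than as the location of a file in HDFS. The path itself was tokenized and handed to the mapper, which is why the error log shows: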

    For input string: "/KNN/PARAMS/paramFinal.txt"
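The failure can be reproduced without a cluster. The following is a minimal sketch, assuming the same tokenizing as in main() and setup(): split on "|", then take the first ","-separated field as K. The class and method names (PathAsParamsDemo, parseFirstK) are made up for illustration and are not part of the original code.

```java
import java.util.StringTokenizer;

final class PathAsParamsDemo {

    // Mirrors the question's logic: take the first "|"-separated set,
    // then parse its first ","-separated field as the integer K.
    static int parseFirstK(String params) {
        StringTokenizer sets = new StringTokenizer(params, "|");
        String firstSet = sets.nextToken();
        StringTokenizer fields = new StringTokenizer(firstSet, ",");
        return Integer.parseInt(fields.nextToken());
    }

    public static void main(String[] args) {
        try {
            // The path contains neither "|" nor ",", so the whole path
            // becomes the single token passed to Integer.parseInt.
            parseFirstK("/KNN/PARAMS/paramFinal.txt");
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException: " + e.getMessage());
        }
    }
}
```

Because the path string contains no delimiters, it survives both tokenizers intact and reaches Integer.parseInt unchanged, which matches the input string reported in the log.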

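What I have done for now, for testing purposes, is pass the input text directly as the third argument instead of giving a file location. That got rid of this particular error.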

$ hadoop jar poker00.jar KnnPokerhand /Poker/train.txt /PokerOutputs/Output00 1,1,1,13,2,4,2,3,1,12,0/3,12,3,2,3,11,4,5,2,5,1/1,9,4,6,1,4,3,2,3,9,1
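If the parameters should stay in a file, the driver would need to read the file's contents before tokenizing them. Below is a rough sketch of that step, not tested on a cluster. It takes a plain java.io.Reader so it can run anywhere; the name ParamFileReader and its method are hypothetical, and the sample text only mimics the shape of the parameter file from the question. On HDFS, I assume the Reader could be built from Hadoop's FileSystem API, for example new InputStreamReader(FileSystem.get(conf).open(new Path(args[2])), StandardCharsets.UTF_8).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

final class ParamFileReader {

    // Reads all text from the source and returns one trimmed entry
    // per "|"-separated parameter set, skipping empty entries.
    static List<String> readParameterSets(Reader source) {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line.trim());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> sets = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(text.toString(), "|");
        while (tokens.hasMoreTokens()) {
            String set = tokens.nextToken().trim();
            if (!set.isEmpty()) {
                sets.add(set);
            }
        }
        return sets;
    }

    public static void main(String[] args) {
        // Stand-in for the file contents; a real driver would wrap an HDFS stream here.
        Reader source = new StringReader(
            "67, 16668, Single, Male, 3|40, 25000, Single, Male, 2|67, 16668, Single, Male, 3\n");
        for (String set : readParameterSets(source)) {
            // Each entry is what the driver would store under "passedVal" for one job.
            System.out.println(set);
        }
    }
}
```

Each returned entry could then be set as "passedVal" on the configuration of a separate job, so the mapper's setup() receives parameter text rather than a path.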

Hope this helps anyone who runs into this problem in the future. Cheers.
