Hadoop: java.lang.Exception: java.lang.NoClassDefFoundError:



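My previous question was posted here:

Hadoop: java.lang.Exception: java.lang.RuntimeException: Error in configuring object

Following the suggestion there, I packaged all the jar files into a single jar, and that solved the first problem. Please refer to the previous post for the source code. Thanks in advance. But now a new problem has appeared: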

14/04/03 13:47:39 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/04/03 13:47:40 WARN snappy.LoadSnappy: Snappy native library is available
14/04/03 13:47:40 INFO snappy.LoadSnappy: Snappy native library loaded
14/04/03 13:47:40 INFO mapred.FileInputFormat: Total input paths to process : 1
14/04/03 13:47:40 INFO mapred.JobClient: Running job: job_local1748858601_0001
14/04/03 13:47:40 INFO mapred.LocalJobRunner: Waiting for map tasks
14/04/03 13:47:40 INFO mapred.LocalJobRunner: Starting task: attempt_local1748858601_0001_m_000000_0
14/04/03 13:47:40 INFO util.ProcessTree: setsid exited with exit code 0
14/04/03 13:47:40 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@c943d1
14/04/03 13:47:40 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/project/input1/url.txt:0+68
14/04/03 13:47:40 INFO mapred.MapTask: numReduceTasks: 1
14/04/03 13:47:40 INFO mapred.MapTask: io.sort.mb = 100
14/04/03 13:47:40 INFO mapred.MapTask: data buffer = 79691776/99614720
14/04/03 13:47:40 INFO mapred.MapTask: record buffer = 262144/327680
Prepare to get into webpage
14/04/03 13:47:41 INFO mapred.JobClient:  map 0% reduce 0%
14/04/03 13:47:43 INFO mapred.LocalJobRunner: Map task executor complete.
14/04/03 13:47:43 WARN mapred.LocalJobRunner: job_local1748858601_0001
java.lang.Exception: java.lang.NoClassDefFoundError: org/apache/xerces/parsers/AbstractSAXParser
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.NoClassDefFoundError: org/apache/xerces/parsers/AbstractSAXParser
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:643)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
    at de.l3s.boilerpipe.sax.BoilerpipeSAXInput.getTextDocument(BoilerpipeSAXInput.java:51)
    at de.l3s.boilerpipe.extractors.ExtractorBase.getText(ExtractorBase.java:69)
    at de.l3s.boilerpipe.extractors.ExtractorBase.getText(ExtractorBase.java:87)
    at webPageToTxt.WebPageToTxt.webPageString(WebPageToTxt.java:82)
    at webPageToTxt.WebPageToTxt.multiWebPageString(WebPageToTxt.java:126)
    at webPageToTxt.WebPageToTxt.webPageToTxt(WebPageToTxt.java:40)
    at webPageToTxt.WebPageToTxtMapper.map(WebPageToTxtMapper.java:27)
    at webPageToTxt.WebPageToTxtMapper.map(WebPageToTxtMapper.java:1)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:701)
Caused by: java.lang.ClassNotFoundException: org.apache.xerces.parsers.AbstractSAXParser
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
    ... 29 more
14/04/03 13:47:44 INFO mapred.JobClient: Job complete: job_local1748858601_0001
14/04/03 13:47:44 INFO mapred.JobClient: Counters: 0
14/04/03 13:47:44 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
    at webPageToTxt.ConfMain.run(ConfMain.java:33)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at webPageToTxt.ConfMain.main(ConfMain.java:40)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:622)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)

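In addition to the jar that contains your driver and MapReduce code, you need to add every other jar your code uses, so that they are available to the mapper at runtime.

I went through the link you provided. Packaging the other classes as part of the MapReduce jar does work, but it is not always possible. As the stack trace shows, you are using Xerces here (org/apache/xerces/parsers/AbstractSAXParser cannot be found), so you need to include xerces-impl.jar.

A better approach is to add these jars to the distributed cache: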

DistributedCache.addArchiveToClassPath(new Path("HDFS Path"), job);

You can keep the jars in HDFS. So the solution here is to add the Xerces jar.
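After adding the jar, it can help to confirm that the class is actually visible to the JVM before re-running the whole job. The following is a minimal sketch using only the JDK; the class name `ClasspathCheck` and the method `locate` are made up for illustration, and the default class name is simply the one reported in the stack trace above. It has to be run with the same classpath as the task to be meaningful.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    /**
     * Returns the location a class was loaded from, or null if the
     * class cannot be found or linked on the current classpath.
     */
    public static String locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK usually have no code source
            return src == null ? "<JDK runtime>" : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so both failure modes end up here
            return null;
        }
    }

    public static void main(String[] args) {
        // Default to the class that the failing job could not load
        String name = args.length > 0 ? args[0] : "org.apache.xerces.parsers.AbstractSAXParser";
        String where = locate(name);
        if (where == null) {
            System.out.println(name + " is NOT on the classpath");
        } else {
            System.out.println(name + " loaded from " + where);
        }
    }
}
```

Calling `ClasspathCheck.locate(...)` at the start of the mapper and logging the result is one way to see what the task itself can load, as opposed to what the driver can load.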
