Caused by: java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: scala/runtime/LambdaDeserialize

I am trying to build the Spark Scala project below with Maven. The build succeeds, but when I run the resulting jar, it fails with the error below. Please help me fix it.

Spark Scala code:

package com.spark.DataAnalysis

import org.apache.log4j.Level
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.functions._
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf

object TwitterData {
  def main(args: Array[String]) {
    println("Start")
    System.setProperty("hadoop.home.dir", "C://Sankha//Study//spark-2.3.4-bin-hadoop2.7//spark-2.3.4-bin-hadoop2//spark-2.3.4-bin-hadoop2.7")
    val conf = new SparkConf().setAppName("Spark Scala WordCount Example").setMaster("local[1]")
    val spark = SparkSession.builder().appName("CsvExample").master("local").getOrCreate()
    val sc = new SparkContext(conf)
    val csvData = sc.textFile("C:\\Sankha\\Study\\data\\twitter-airline-sentiment\\Tweets.csv", 3)
    val map_data = csvData.map(x => x.split(",")).filter(x => x.length < 13).filter(x => x(5) == "Virgin America")
    println(map_data.count())
  }
}

Maven build command:

mvn package

I run the Spark job from the command line as follows:

spark-submit --class sparkWCExample.spWCExample.Twitter --master local[2] C:\Sankha\Study\spark_ws\spWCExample\target\spWCExample-0.0.1-SNAPSHOT.jar C:\Sankha\Study\spark_ws\spWCExample\target\out

Exception:

20/03/04 02:45:58 INFO Executor: Adding file:/C:/Users/sankh/AppData/Local/Temp/spark-ae5c0e2c-76f7-42d9-bd2a-6b1f5b191bd8/userFiles-ef86ac49-debf-4d19-b2e9-4f0c1cb83325/spWCExample-0.0.1-SNAPSHOT.jar to class loader
20/03/04 02:45:58 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.io.IOException: unexpected exception type
at java.io.ObjectStreamClass.throwMiscException(ObjectStreamClass.java:1736)
at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1266)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2078)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.lang.invoke.SerializedLambda.readResolve(SerializedLambda.java:230)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1260)
... 61 more
Caused by: java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: scala/runtime/LambdaDeserialize
at sparkWCExample.spWCExample.Twitter$.$deserializeLambda$(Twitter.scala)
... 71 more
Caused by: java.lang.NoClassDefFoundError: scala/runtime/LambdaDeserialize
... 72 more
Caused by: java.lang.ClassNotFoundException: scala.runtime.LambdaDeserialize
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 72 more

Please advise.

The pom.xml is as follows:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>sparkWCExample</groupId>
    <artifactId>spWCExample</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>spWCExample</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.12</artifactId>
            <version>2.4.5</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.12</artifactId>
            <version>2.4.5</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.scala-lang/scala-library -->
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>2.12.3</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.3</version>
            </plugin>
        </plugins>
    </build>
</project>

Please check and let me know.

There seem to be a few issues with your code and your POM.

Coming to the code: you create a SparkSession and a SparkContext separately, although the SparkSession object alone is enough, since it already contains a SparkContext (available as spark.sparkContext). You also set Spark properties both in the code and in the spark-submit command. I would suggest putting them in a separate Spark properties file and passing it to your spark-submit command (I will share that file and the command below as well).

So you can write the code like this:

package com.spark.DataAnalysis

import org.apache.log4j.Level
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.functions._

object TwitterData {
  def main(args: Array[String]) {
    println("Start")
    System.setProperty("hadoop.home.dir", "C://Sankha//Study//spark-2.3.4-bin-hadoop2.7//spark-2.3.4-bin-hadoop2//spark-2.3.4-bin-hadoop2.7")
    // A single SparkSession is enough; its SparkContext is available as spark.sparkContext
    val spark = SparkSession.builder().appName("CsvExample").master("local").getOrCreate()
    val csvData = spark.sparkContext.textFile("C:\\Sankha\\Study\\data\\twitter-airline-sentiment\\Tweets.csv", 3)
    val map_data = csvData.map(x => x.split(",")).filter(x => x.length < 13).filter(x => x(5) == "Virgin America")
    println(map_data.count())
    spark.stop()
  }
}
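
As a side note, splitting each line on "," breaks on CSV fields that themselves contain commas (tweet text usually does). Here is a minimal sketch using Spark's built-in CSV reader instead, assuming the file has a header row and a column named "airline" (an assumption on my part, adjust to your actual schema); it can replace the textFile/map/filter lines inside the same main method:

    // Sketch only: read the CSV with the DataFrame reader instead of splitting lines by hand.
    // Assumes a header row and an "airline" column; col() comes from org.apache.spark.sql.functions._
    val tweets = spark.read
      .option("header", "true")
      .csv("C:\\Sankha\\Study\\data\\twitter-airline-sentiment\\Tweets.csv")
    val virginCount = tweets.filter(col("airline") === "Virgin America").count()
    println(virginCount)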

Now, coming to the pom.xml: you have not added the maven-assembly-plugin and have only used the maven-compiler-plugin. This means your code is compiled against the dependencies, but the dependencies are not packaged into the jar. As a result, the Scala runtime (scala-library, which provides scala.runtime.LambdaDeserialize) is neither inside your jar nor found elsewhere on the classpath at runtime, which is exactly the NoClassDefFoundError you are seeing. That is why it is better to also use the maven-assembly-plugin, so the jar is built with all its dependencies included.

So your new pom.xml should look like this:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>sparkWCExample</groupId>
    <artifactId>spWCExample</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>spWCExample</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.12</artifactId>
            <version>2.4.5</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.12</artifactId>
            <version>2.4.5</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.scala-lang/scala-library -->
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>2.12.3</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.3</version>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.4</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>assemble-all</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
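
After mvn package, the assembly plugin produces an additional fat jar with the jar-with-dependencies classifier next to the normal jar. As a quick sanity check (assuming the JDK's jar tool is on your PATH), you can confirm the Scala runtime was actually packaged:

jar tf target\spWCExample-0.0.1-SNAPSHOT-jar-with-dependencies.jar | findstr LambdaDeserialize

If that prints scala/runtime/LambdaDeserialize.class, the NoClassDefFoundError should be gone.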

A sample sparkProperties.properties file:

spark.master    local[2]
spark.submit.deployMode client
spark.driver.memory     2G
spark.executor.memory   2G
spark.executor.instances        2
spark.executor.cores    2
spark.driver.maxResultSize      1g

Now, with the sparkProperties.properties file added, the spark-submit command becomes simpler. Note that spark-submit options such as --properties-file must come before the application jar (anything after the jar is treated as an argument to your application), and that you should now submit the fat jar built by the assembly plugin:

spark-submit --class sparkWCExample.spWCExample.Twitter --properties-file sparkProperties.properties C:\Sankha\Study\spark_ws\spWCExample\target\spWCExample-0.0.1-SNAPSHOT-jar-with-dependencies.jar C:\Sankha\Study\spark_ws\spWCExample\target\out

I hope this answers your question in detail. Feel free to ask if you have any further doubts.
