py4j.protocol.Py4JJavaError: An error occurred while calling o63.save. : java.lang.NoClassDef



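I am new to Spark and to big data components such as HBase. I am trying to write Python code in PySpark that connects to HBase and reads data from it. I am using the following versions:

  • Spark version: spark-3.1.2-bin-hadoop2.7
  • Python version: 3.8.5
  • HBase version: hbase-2.3.5

I have installed standalone HBase and Spark locally on Ubuntu 20.04.

Code: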

from pyspark import SparkContext
from pyspark.sql import SQLContext
sc = SparkContext.getOrCreate()
sqlc = SQLContext(sc)
data_source_format = 'org.apache.spark.sql.execution.datasources.hbase'
# The chained call is wrapped in parentheses so it can span several lines.
df = (sc.parallelize([("1","Abby","Smith","K","3456main","Orlando","FL","45235"),
    ("2","Amaya","Williams","L","123Orange","Newark","NJ","27656"),
    ("3","Alchemy","Davis","P","Warners","Sanjose","CA","34789")])
    .toDF(schema=['key','firstName','lastName','middleName','addressLine','city','state','zipCode']))
df.show()
catalog=''.join('''{
"table":{"namespace":"emp_data","name":"emp_info"},
"rowkey":"key",
"columns":{
"key":{"cf":"rowkey","col":"key","type":"string"},
"fName":{"cf":"person","col":"firstName","type":"string"},
"lName":{"cf":"person","col":"lastName","type":"string"},
"mName":{"cf":"person","col":"middleName","type":"string"},
"addressLine":{"cf":"address","col":"addressLine","type":"string"},
"city":{"cf":"address","col":"city","type":"string"},
"state":{"cf":"address","col":"state","type":"string"},
"zipCode":{"cf":"address","col":"zipCode","type":"string"}
}
}'''.split())
# Writing
print("Writing into HBase")
(df.write
    .options(catalog=catalog)
    .format(data_source_format)
    .save())
# Reading
print("Reading from HBase")
df = (sqlc.read
    .options(catalog=catalog)
    .format(data_source_format)
    .load())
print("Program Ends")
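
The catalog in the code above is a JSON string that gets compacted with `''.join(text.split())`. As a minimal sketch, the same string can be produced from a dictionary with `json.dumps`, which avoids hand-written quoting mistakes. The helper name `build_catalog` is hypothetical and is not part of any connector API; it also assumes every column is typed as "string", as in the question.

```python
import json

def build_catalog(namespace, table, rowkey, columns):
    """Return a compact JSON catalog string (hypothetical helper).

    `columns` maps a catalog column name to a (column family, qualifier)
    pair. Every column is typed as "string", as in the question's catalog.
    """
    catalog = {
        "table": {"namespace": namespace, "name": table},
        "rowkey": rowkey,
        "columns": {
            name: {"cf": cf, "col": col, "type": "string"}
            for name, (cf, col) in columns.items()
        },
    }
    # separators=(",", ":") emits no whitespace between JSON tokens.
    return json.dumps(catalog, separators=(",", ":"))

catalog = build_catalog(
    "emp_data", "emp_info", "key",
    {
        "key": ("rowkey", "key"),
        "fName": ("person", "firstName"),
        "lName": ("person", "lastName"),
        "mName": ("person", "middleName"),
        "addressLine": ("address", "addressLine"),
        "city": ("address", "city"),
        "state": ("address", "state"),
        "zipCode": ("address", "zipCode"),
    },
)
```

Unlike the join/split approach, `json.dumps` leaves whitespace inside values untouched, which matters if a table or column name ever contains a space.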

Error message:

Writing into HBase
Traceback (most recent call last):
  File "/mnt/c/Codefiles/pyspark_test.py", line 36, in <module>
    df.write
  File "/home/aditya/spark-3.1.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 1107, in save
  File "/home/aditya/spark-3.1.2-bin-hadoop2.7/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1304, in __call__
  File "/home/aditya/spark-3.1.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 111, in deco
  File "/home/aditya/spark-3.1.2-bin-hadoop2.7/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o63.save.
: java.lang.NoClassDefFoundError: org/apache/spark/Logging
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
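
The `NoClassDefFoundError` means the JVM could not load `org.apache.spark.Logging` at run time. To my knowledge that class was moved to `org.apache.spark.internal.Logging` in Spark 2.0, so this error often suggests a connector jar built against Spark 1.x, though that is an inference and not something the traceback states. One way to check which jars actually ship a given class is to scan them, since a jar is a zip archive. The sketch below uses only the standard library; the function name `find_class_in_jars` is hypothetical.

```python
import zipfile
from pathlib import Path

def find_class_in_jars(jar_dir, class_name):
    """Return the names of the jars directly under jar_dir that contain class_name."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            continue  # skip files that are not valid archives
    return hits
```

For example, assuming the Spark jars live in `/home/aditya/spark-3.1.2-bin-hadoop2.7/jars`, calling `find_class_in_jars` on that directory with `"org.apache.spark.Logging"` and then with `"org.apache.spark.internal.Logging"` shows which of the two names the installed Spark provides.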

Check that Java is installed and that JAVA_HOME is set in the environment variables.
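
A minimal sketch of that check from Python, using only the standard library. The function name `java_environment` is hypothetical. Note that the traceback already contains a Java stack trace, so a JVM evidently started; this check mainly rules out a JAVA_HOME that points somewhere unexpected.

```python
import os
import shutil

def java_environment(env=None):
    """Summarise where a java executable would be found for the given environment."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    java_bin = None
    if java_home:
        # A JDK or JRE keeps its launcher at $JAVA_HOME/bin/java.
        candidate = os.path.join(java_home, "bin", "java")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            java_bin = candidate
    return {
        "JAVA_HOME": java_home,
        "java_under_java_home": java_bin,
        "java_on_path": shutil.which("java", path=env.get("PATH")),
    }

if __name__ == "__main__":
    for key, value in java_environment().items():
        print(f"{key}: {value}")
```

A value of `None` for `JAVA_HOME` or `java_under_java_home` means the variable is unset or does not point at a usable Java installation.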
