Unable to configure GeoSpark in a Spark session:



I have been trying to configure GeoSpark with a Spark session so I can use spatial applications on PySpark. I followed this link and tried to run the code as described below.

try:
    import pyspark
    from pyspark import SparkContext, SparkConf
    from pyspark.sql import SparkSession, SQLContext
except ImportError as e:
    raise ImportError('PySpark is not configured')

print(f"PySpark Version : {pyspark.__version__}")

# Creating a Spark context
sc = SparkContext.getOrCreate(SparkConf().setMaster('local[*]').set("spark.ui.port", "4050"))

# Spark session builder
spark = SparkSession.builder \
    .appName('GeoSparkDemo') \
    .config('spark.executor.memory', '5GB') \
    .getOrCreate()

from geospark.register import upload_jars
from geospark.register import GeoSparkRegistrator

upload_jars()
GeoSparkRegistrator.registerAll(spark)

When I run this file, it gives me the following error.

Traceback (most recent call last):
  File "c:\sourav\spark\code\geospark_demo.py", line 29, in <module>
    GeoSparkRegistrator.registerAll(spark)
  File "C:\Users\user3\.conda\envs\python37\lib\site-packages\geospark\register\geo_registrator.py", line 26, in registerAll
    cls.register(spark)
  File "C:\Users\user3\.conda\envs\python37\lib\site-packages\geospark\register\geo_registrator.py", line 31, in register
    return spark._jvm.GeoSparkSQLRegistrator.registerAll(spark._jsparkSession)
TypeError: 'JavaPackage' object is not callable

I tried manually adding the following jar files to the Spark jars folder:

  • geospark-1.3.1.jar
  • geospark-sql_2.1-1.3.1.jar
  • geo_wrapper.jar

Now the previous error is gone and a new exception is thrown, as shown below:

Traceback (most recent call last):
  File "c:\sourav\spark\code\geospark_demo.py", line 29, in <module>
    GeoSparkRegistrator.registerAll(spark)
  File "C:\Users\user3\.conda\envs\python37\lib\site-packages\geospark\register\geo_registrator.py", line 26, in registerAll
    cls.register(spark)
  File "C:\Users\user3\.conda\envs\python37\lib\site-packages\geospark\register\geo_registrator.py", line 31, in register
    return spark._jvm.GeoSparkSQLRegistrator.registerAll(spark._jsparkSession)
  File "C:\Users\user3\.conda\envs\python37\lib\site-packages\py4j\java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "C:\sourav\spark\spark-2.4.7-bin-hadoop2.7\python\pyspark\sql\utils.py", line 63, in deco
    return f(*a, **kw)
  File "C:\Users\user3\.conda\envs\python37\lib\site-packages\py4j\protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.datasyslab.geosparksql.utils.GeoSparkSQLRegistrator.registerAll.
: java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.registerFunction(Ljava/lang/String;Lscala/Function1;)V
    at org.datasyslab.geosparksql.UDF.UdfRegistrator$$anonfun$registerAll$1.apply(UdfRegistrator.scala:29)
    at org.datasyslab.geosparksql.UDF.UdfRegistrator$$anonfun$registerAll$1.apply(UdfRegistrator.scala:29)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.datasyslab.geosparksql.UDF.UdfRegistrator$.registerAll(UdfRegistrator.scala:29)
    at org.datasyslab.geosparksql.utils.GeoSparkSQLRegistrator$.registerAll(GeoSparkSQLRegistrator.scala:34)
    at org.datasyslab.geosparksql.utils.GeoSparkSQLRegistrator.registerAll(GeoSparkSQLRegistrator.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Unknown Source)

I found this link describing a similar problem, and I even tried adding the jars in the Spark config file with the line below, but nothing seems to work.

spark.driver.extraClassPath C:\sourav\spark\geosparkjar/*

I am using GeoSpark 1.3.1, Java 8, Python 3.7, and Apache Spark 2.4.7; my JAVA_HOME and SPARK_HOME are set correctly, and I am running on Windows 10.
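My guess (an assumption on my part) is that the NoSuchMethodError comes from a version mismatch: the geospark-sql_2.1 jar appears to be built against Spark 2.1's internal SimpleFunctionRegistry API, while I am running Spark 2.4.7. A quick sanity check of the jar's Spark-version suffix against the running version, as a standalone helper (jar_matches_spark is my own illustrative function, not part of any library):

```python
import re

def jar_matches_spark(jar_name: str, spark_version: str) -> bool:
    """Check whether a geospark-sql jar's version suffix (e.g. 'sql_2.1')
    matches the running Spark's major.minor version.

    Illustrative helper only; assumes jars follow the
    geospark-sql_<spark major.minor>-<geospark version>.jar naming convention.
    """
    m = re.search(r"sql_(\d+\.\d+)", jar_name)
    if not m:
        return False
    major_minor = ".".join(spark_version.split(".")[:2])
    return m.group(1) == major_minor

# The jar I added targets Spark 2.1, but I am running 2.4.7:
print(jar_matches_spark("geospark-sql_2.1-1.3.1.jar", "2.4.7"))  # False
```

If this reasoning is right, a jar built for the matching Spark line would be needed instead.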

How can I resolve this so I can move forward? Any help/suggestions would be appreciated.

GeoSpark is now available as Apache Sedona.

For a similar use case, I followed these instructions:

pip install apache-sedona

from pyspark.sql import SparkSession
from sedona.utils.adapter import Adapter
from sedona.register import SedonaRegistrator
from sedona.utils import KryoSerializer, SedonaKryoRegistrator

spark = SparkSession.builder \
    .master("spark://test:7077") \
    .appName("sedonatest") \
    .config("spark.serializer", KryoSerializer.getName) \
    .config("spark.kryo.registrator", SedonaKryoRegistrator.getName) \
    .config('spark.jars.packages',
            'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating,'
            'org.datasyslab:geotools-wrapper:geotools-24.0') \
    .getOrCreate()
SedonaRegistrator.registerAll(spark)
resultsDF = spark.sql("SELECT ST_PolygonFromText('-74.0428197,40.6867969,-74.0421975,40.6921336,-74.0508020,40.6912794,-74.0428197,40.6867969', ',') AS polygonshape")
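For intuition, the second argument of ST_PolygonFromText is the delimiter used to split the flat coordinate string into (x, y) pairs forming a closed ring. A rough pure-Python sketch of that parsing logic (parse_polygon_text is my own illustration, not Sedona's actual implementation):

```python
def parse_polygon_text(text: str, delimiter: str):
    """Split a flat '<x1>,<y1>,<x2>,<y2>,...' string into (x, y) tuples.

    Illustrative sketch of what ST_PolygonFromText's delimiter argument
    implies; not Sedona's actual parser.
    """
    values = [float(v) for v in text.split(delimiter)]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    ring = list(zip(values[0::2], values[1::2]))
    if ring[0] != ring[-1]:
        raise ValueError("polygon ring must be closed (first point == last)")
    return ring

ring = parse_polygon_text(
    "-74.0428197,40.6867969,-74.0421975,40.6921336,"
    "-74.0508020,40.6912794,-74.0428197,40.6867969", ",")
print(len(ring))  # 4 points; first and last coincide to close the ring
```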

P.S.: Pass the below 2 jars during spark-submit via the --jars option:

  • sedona-python-adapter-3.0_2.12-1.0.1-incubating.jar
  • geotools-wrapper-geotools-24.0-sources.jar (https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/geotools-24.0/)
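The spark-submit invocation would then look roughly like the following (the script name sedona_demo.py and the jar locations are placeholders; adjust paths to where you downloaded the jars):

```shell
spark-submit \
  --master spark://test:7077 \
  --jars sedona-python-adapter-3.0_2.12-1.0.1-incubating.jar,geotools-wrapper-geotools-24.0-sources.jar \
  sedona_demo.py
```

Note that --jars takes a single comma-separated list with no spaces between the entries.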
