I am using pyspark with a Jupyter notebook on Spark 2.1.0 and Python 2.7. I am trying to create a new SparkSession with the following code:
from pyspark import SparkContext
from pyspark import SparkConf
from pyspark.sql import SparkSession
from pyspark.sql import SQLContext
spark = SparkSession \
    .builder \
    .appName("Bank Service Classifier") \
    .config("spark.sql.crossJoin.enabled", "true") \
    .getOrCreate()
sc = SparkContext()
sqlContext = SQLContext(sc)
However, I get the following error:
IllegalArgumentException Traceback (most recent call last)
<ipython-input-40-2683a8d0ffcf> in <module>()
4 from pyspark.sql import SQLContext
5
----> 6 spark = SparkSession .builder .appName("example-spark") .config("spark.sql.crossJoin.enabled","true") .getOrCreate()
7
8 sc = SparkContext()
/srv/spark/python/pyspark/sql/session.py in getOrCreate(self)
177 session = SparkSession(sc)
178 for key, value in self._options.items():
--> 179 session._jsparkSession.sessionState().conf().setConfString(key, value)
180 for key, value in self._options.items():
181 session.sparkContext._conf.set(key, value)
/srv/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py in __call__(self, *args)
1131 answer = self.gateway_client.send_command(command)
1132 return_value = get_return_value(
-> 1133 answer, self.gateway_client, self.target_id, self.name)
1134
1135 for temp_arg in temp_args:
/srv/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
77 raise QueryExecutionException(s.split(': ', 1)[1], stackTrace)
78 if s.startswith('java.lang.IllegalArgumentException: '):
---> 79 raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
80 raise
81 return deco
IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':"
How can I fix this?
I ran into the same error. Downloading Spark pre-built for Hadoop 2.6 instead of 2.7 fixed it for me.
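If you want to confirm which Hadoop version your current Spark distribution was built against before swapping packages, a minimal diagnostic sketch like the one below should work. It goes through PySpark's internal _jvm py4j gateway to call Hadoop's VersionInfo class, so treat it as a quick check rather than a supported API, and run it in a fresh kernel where no SparkContext exists yet.
from pyspark import SparkContext

# Start a bare SparkContext only to ask the JVM which Hadoop build is bundled.
sc = SparkContext(appName="hadoop-version-check")

# VersionInfo reports the Hadoop version this Spark distribution was compiled against.
hadoop_version = sc._jvm.org.apache.hadoop.util.VersionInfo.getVersion()
print("Hadoop version bundled with Spark: %s" % hadoop_version)

sc.stop()
If the printed version is not the one you expected, trying the matching pre-built package as described above is a reasonable next step.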