'JavaPackage' object is not callable error: PySpark



Operations such as dataframe.show() and sqlContext.read.json work fine, but most functions fail with a "'JavaPackage' object is not callable" error. For example, when I run

dataFrame.withColumn(field_name, monotonically_increasing_id())

I get this error:

File "/tmp/spark-cd423f35-9572-45ee-b159-1b2732afa2a6/userFiles-3a6e1729-95f4-468b-914c-c706369bf2a6/Transformations.py", line 64, in add_id_column
    self.dataFrame = self.dataFrame.withColumn(field_name, monotonically_increasing_id())
  File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/sql/functions.py", line 347, in monotonically_increasing_id
    return Column(sc._jvm.functions.monotonically_increasing_id())
TypeError: 'JavaPackage' object is not callable

I am using the Apache Zeppelin interpreter and have added py4j to the Python path.

When I run

import py4j
print(dir(py4j))

the import succeeds:

['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'compat', 'finalizer', 'java_collections', 'java_gateway', 'protocol', 'version']

When I try

print(sc._jvm.functions)

in the pyspark shell, it prints

<py4j.java_gateway.JavaClass object at 0x7fdaf9727ba8>

But when I try the same thing in the Zeppelin interpreter, it prints

<py4j.java_gateway.JavaPackage object at 0x7f07cc3f77f0> 

In Zeppelin, this line of the interpreter code

java_import(gateway.jvm, "org.apache.spark.sql.*")

was not being executed. Adding it to the imports fixed the problem.
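For anyone hitting the same thing, here is a minimal sketch of how the check and the fix could be wrapped together. py4j hands back a JavaPackage placeholder for any name it cannot resolve to a class, which is why calling a method on it fails. The helper names is_unresolved and ensure_sql_imports are hypothetical (not part of PySpark or Zeppelin), and the sketch assumes sc is the active SparkContext whose _jvm attribute is the py4j JVM view.

```python
def is_unresolved(jvm_attr):
    """Return True if py4j gave back a package placeholder instead of a class.

    The check is by type name so this function works without importing py4j.
    """
    return type(jvm_attr).__name__ == "JavaPackage"


def ensure_sql_imports(sc):
    """Hypothetical helper: import org.apache.spark.sql.* into the JVM view if needed.

    Assumes `sc` is the active SparkContext and `sc._jvm` is its py4j JVM view.
    Returns True when `sc._jvm.functions` resolves to a class afterwards.
    """
    if is_unresolved(sc._jvm.functions):
        # Imported lazily; requires py4j on the Python path.
        from py4j.java_gateway import java_import
        java_import(sc._jvm, "org.apache.spark.sql.*")
    return not is_unresolved(sc._jvm.functions)
```

Calling ensure_sql_imports(sc) once at the top of the Zeppelin paragraph, before using anything from pyspark.sql.functions, should have the same effect as adding the java_import line by hand.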
