Zeppelin cannot start the IPython kernel



For quite some time I have been struggling with Zeppelin apparently being unable to start IPython. I followed this guide and this one. The pyspark interpreter is configured correctly, the Python path is right, and IPython is active by default. However, when I try to run any of the examples from the guide, for instance:

%ipyspark
import pandas as pd
df = pd.DataFrame({'name':['a','b','c'], 'count':[12,24,18]})
z.show(df)

I get the following error in the logs, which does not say much:

INFO [2018-11-30 15:17:08,653] ({pool-3-thread-2} IPythonInterpreter.java[setAdditionalPythonPath]:103) - setAdditionalPythonPath: /usr/hdp/current/spark2-client/python/lib/pyspark.zip:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/zeppelin-server/interpreter/lib/python
INFO [2018-11-30 15:17:08,654] ({pool-3-thread-2} IPythonInterpreter.java[open]:135) - Python Exec: python3
INFO [2018-11-30 15:17:09,189] ({pool-3-thread-2} IPythonInterpreter.java[checkIPythonPrerequisite]:195) - IPython prerequisite is meet
INFO [2018-11-30 15:17:09,191] ({pool-3-thread-2} IPythonInterpreter.java[open]:146) - Launching IPython Kernel at port: 39753
INFO [2018-11-30 15:17:09,191] ({pool-3-thread-2} IPythonInterpreter.java[open]:147) - Launching JVM Gateway at port: 36511
INFO [2018-11-30 15:17:09,402] ({pool-3-thread-2} IPythonInterpreter.java[setupIPythonEnv]:315) - PYTHONPATH:/usr/hdp/current/spark2-client/python/lib/pyspark.zip:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/zeppelin-server/interpreter/lib/python:/usr/hdp/current/spark2-client//python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/spark2-client//python/:/usr/hdp/current/spark2-client//python:/usr/hdp/current/spark2-client//python/lib/py4j-0.8.2.1-src.zip
INFO [2018-11-30 15:17:09,743] ({pool-3-thread-2} IPythonInterpreter.java[launchIPythonKernel]:293) - Wait for IPython Kernel to be started
INFO [2018-11-30 15:17:09,844] ({pool-3-thread-2} IPythonInterpreter.java[launchIPythonKernel]:293) - Wait for IPython Kernel to be started
WARN [2018-11-30 15:17:09,926] ({Exec Default Executor} IPythonInterpreter.java[onProcessFailed]:394) - Exception happens in Python Process
org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1)
at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
at org.apache.commons.exec.DefaultExecutor.access$200(DefaultExecutor.java:48)
at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:200)
at java.lang.Thread.run(Thread.java:745)
INFO [2018-11-30 15:17:09,944] ({pool-3-thread-2} IPythonInterpreter.java[launchIPythonKernel]:293) - Wait for IPython Kernel to be started
INFO [2018-11-30 15:17:10,044] ({pool-3-thread-2} IPythonInterpreter.java[launchIPythonKernel]:293) - Wait for IPython Kernel to be started
INFO [2018-11-30 15:17:39,465] ({pool-3-thread-2} IPythonInterpreter.java[launchIPythonKernel]:293) - Wait for IPython Kernel to be started
WARN [2018-11-30 15:17:39,466] ({pool-3-thread-2} PySparkInterpreter.java[open]:134) - Fail to open IPySparkInterpreter
java.lang.RuntimeException: Fail to open IPythonInterpreter
at org.apache.zeppelin.python.IPythonInterpreter.open(IPythonInterpreter.java:157)
at org.apache.zeppelin.spark.IPySparkInterpreter.open(IPySparkInterpreter.java:66)
at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:129)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Fail to launch IPython Kernel in 30 seconds
at org.apache.zeppelin.python.IPythonInterpreter.launchIPythonKernel(IPythonInterpreter.java:297)
at org.apache.zeppelin.python.IPythonInterpreter.open(IPythonInterpreter.java:154)
at org.apache.zeppelin.spark.IPySparkInterpreter.open(IPySparkInterpreter.java:66)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
INFO [2018-11-30 15:17:39,466] ({pool-3-thread-2} PySparkInterpreter.java[open]:140) - IPython is not available, use the native PySparkInterpreter   
INFO [2018-11-30 15:17:39,533] ({pool-3-thread-2} PySparkInterpreter.java[createPythonScript]:118) - File /tmp/zeppelin_pyspark-5362368451576072994.py created
INFO [2018-11-30 15:17:39,534] ({pool-3-thread-2} Py4JUtils.java[createGatewayServer]:44) - Launching GatewayServer at 127.0.0.1:34508
INFO [2018-11-30 15:17:39,565] ({pool-3-thread-2} PySparkInterpreter.java[createGatewayServerAndStartScript]:265) - pythonExec: python3
INFO [2018-11-30 15:17:39,567] ({pool-3-thread-2} PySparkInterpreter.java[setupPySparkEnv]:236) - PYTHONPATH: /usr/hdp/current/spark2-client/python/lib/pyspark.zip:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/zeppelin-server/interpreter/lib/python:/usr/hdp/current/spark2-client//python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/spark2-client//python/:/usr/hdp/current/spark2-client//python:/usr/hdp/current/spark2-client//python/lib/py4j-0.8.2.1-src.zip
INFO [2018-11-30 15:17:41,953] ({pool-3-thread-2} SchedulerFactory.java[jobFinished]:115) - Job 20181129-172919_2135817500 finished by scheduler interpreter_131607019

I am using HDP 3.0.1, which ships with Zeppelin 0.8.0. All nodes have Python 3.7.1 installed, with the latest versions of jupyter and grpcio. From a Zeppelin notebook I checked the IPython and Python versions:

%pyspark
import sys
import IPython
print(IPython.__version__)
print(sys.version)

7.2.0

3.7.1 (default, Nov 29 2018, 17:37:37)

I can start IPython from any of the nodes without a problem, and Zeppelin correctly picks up the IPython version. I looked for any log other than Zeppelin's that might report the error, but found nothing.

Any idea what is preventing Zeppelin from launching the IPython kernel?

pip install --upgrade setuptools pip

or possibly

pip install --upgrade ipython

There are a few other quick things to try over at github.com/jupyter/notebook/issues/270

I also ran into this problem while running Zeppelin 0.9.0-preview2. In my case the cause was that Zeppelin could not recognize conda-installed packages in the pip freeze output. For example, I had installed jupyter-client with conda, so pip freeze looked like this:

➜ pip freeze | grep jupyter-client
jupyter-client @ file:///tmp/build/80754af9/jupyter_client_1616770841739/work

Note that jupyter-client does not follow the package==version format. The fix was to uninstall the conda-installed jupyter and install it with pip (a short sketch of why the conda entry slips past the check follows the commands below):

➜ conda uninstall jupyter
➜ pip install jupyter
➜ pip freeze | grep jupyter-client
jupyter-client==6.1.12
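
For context, here is a minimal sketch of the kind of pip-freeze-based check that the conda entry appears to trip up. It is illustrative only, not Zeppelin's actual code, and the helper name is hypothetical; it just shows why an entry in the "name @ file:///..." form never matches a "name==version" pattern even though the package imports fine:

# Illustrative sketch only -- not Zeppelin's actual code. The helper name is
# hypothetical; it mimics a pip-freeze-style prerequisite check to show why a
# conda-installed package slips through.
import subprocess
import sys

def has_pinned_package(name):
    """Return True if pip freeze lists the package as name==version."""
    freeze = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        capture_output=True, text=True, check=True,
    ).stdout
    return any(line.startswith(name + "==") for line in freeze.splitlines())

# A conda-installed package shows up as
#   jupyter-client @ file:///tmp/build/80754af9/jupyter_client_1616770841739/work
# so this returns False even though "import jupyter_client" works fine.
print(has_pinned_package("jupyter-client"))

After reinstalling with pip, the same kind of check succeeds, since pip freeze now prints the plain jupyter-client==6.1.12 line shown above.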

It seems Zeppelin should support both cases soon (if it does not already).

Also keep in mind that zeppelin-0.9.0 does not support Python 3.8 yet.

I recently ran into a similar issue with Apache Zeppelin 0.10.0 and Apache Zeppelin 0.10.1.

The cause (in my case) was that a fresh pip install jupyter pulls in jupyter_client==7.4.4. Apache Zeppelin 0.10.0, however, does not recognize this, because it looks for the package under its old name (jupyter-client instead of jupyter_client).

See here for more details and a workaround.
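
To check which spelling your own environment reports (the same Name field that pip freeze prints, and apparently the string Zeppelin looks for), here is a minimal sketch, assuming Python 3.8+ where importlib.metadata is in the standard library (the importlib_metadata backport offers the same API on older interpreters):

# Minimal sketch: print the distribution name exactly as pip freeze reports it.
from importlib.metadata import distribution  # Python 3.8+; backport: importlib_metadata

dist = distribution("jupyter_client")  # on recent Pythons the lookup matches either spelling
print(dist.metadata["Name"], dist.version)
# Older releases report "jupyter-client", recent ones "jupyter_client";
# per the answer above, Zeppelin 0.10.0 only recognizes the hyphenated form.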
