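I have been trying to find a solution to a problem running Spark programs in a Zeppelin notebook, and I can't tell what is going wrong.
I am using zeppelin-0.7.3-bin-all. I have created an AWS Glue endpoint and set up port forwarding.
Following these links did not help:
https://gist.github.com/codspire/7b0955b9e67fe73f6118dad9539cbaa2
https://docs.aws.amazon.com/glue/latest/dg/dev-endpoint-tutorial-local-notebook.html
When I run a piece of Spark code in the notebook at http://localhost:8080/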
%pyspark
a=5*4
print("value = %i" % (a))
sc.version
I get the following error:
org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:266)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:250)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:373)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:97)
    at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:406)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
    at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Please help!
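When you create the endpoint, you need to create it with a Glue version that is compatible with your Zeppelin version. In your case (Zeppelin 0.7.3), you need Glue version 0.9.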
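
If you create the endpoint from code rather than the console, something like the following might work. This is only a sketch, assuming boto3 is installed and AWS credentials are configured. The endpoint name, role ARN, and public key are hypothetical placeholders, and the helper function names are mine, not part of any AWS API. The only real point is passing GlueVersion="0.9" to create_dev_endpoint.

```python
def build_dev_endpoint_request(endpoint_name, role_arn, public_key, glue_version="0.9"):
    """Build the keyword arguments for Glue's CreateDevEndpoint call.

    glue_version defaults to "0.9", the version suggested above for
    Zeppelin 0.7.3.
    """
    return {
        "EndpointName": endpoint_name,
        "RoleArn": role_arn,
        "PublicKey": public_key,
        "GlueVersion": glue_version,
    }


def create_dev_endpoint(request):
    """Send the request to AWS Glue (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    glue = boto3.client("glue")
    return glue.create_dev_endpoint(**request)


# All three values below are hypothetical placeholders; use your own.
request = build_dev_endpoint_request(
    endpoint_name="zeppelin-dev-endpoint",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    public_key="ssh-rsa EXAMPLE_PUBLIC_KEY",
)
print(request["GlueVersion"])

# Uncomment to actually create the endpoint (this creates billable resources):
# create_dev_endpoint(request)
```

The request is built separately from the API call so you can inspect it before anything is created. After the new endpoint is ready, point your SSH port forwarding at its address and restart the Spark interpreter in Zeppelin.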