在Kubernetes上的远程Flink集群上运行Apache Beam作业时出现问题



我在远程Kubernetes集群上部署了一个Flink SessionCluster(根据文档(,可在<FLINK_MASTER_URL>:8081上使用,我正在尝试在它上运行Apache Beam的示例wordcount作业。

然而,每次我遇到错误时,似乎都无法成功提交作业以供执行。错误日志和Beam的管道选项如下;我很感激你给我一些关于如何解决这个问题的提示(我不是一个经验丰富的Flink/Beam用户,所以请原谅任何明显的错误(。

管道选项:

PipelineOptions(
"--runner=FlinkRunner",
"--flink_master=<FLINK_MASTER_URL>:8081"
)

错误日志(已截断(:

WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter.
INFO:root:Using Python SDK docker image: apache/beam_python3.7_sdk:2.21.0. If the image is not available at local, we will try to pull from hub.docker.com
INFO:apache_beam.runners.portability.fn_api_runner.translations:==================== <function lift_combiners at 0x7f954007e710> ====================
INFO:apache_beam.runners.portability.flink_runner:Adding HTTP protocol scheme to flink_master parameter: http://<FLINK_MASTER_URL>:8081
INFO:apache_beam.utils.subprocess_server:Using cached job server jar from https://repo.maven.apache.org/maven2/org/apache/beam/beam-runners-flink-1.10-job-server/2.21.0/beam-runners-flink-1.10-job-server-2.21.0.jar
INFO:apache_beam.utils.subprocess_server:Starting service with ['java' '-jar' '/home/rjurczak/.apache_beam/cache/jars/beam-runners-flink-1.10-job-server-2.21.0.jar' '--flink-master' 'http://<FLINK_MASTER_URL>:8081' '--artifacts-dir' '/tmp/beam-tempht7lpipz/artifactsotk2otzl' '--job-port' '48375' '--artifact-port' '0' '--expansion-port' '0']
INFO:apache_beam.utils.subprocess_server:b'[main] INFO org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - LegacyArtifactStagingService started on localhost:37645'
INFO:apache_beam.utils.subprocess_server:b'[main] INFO org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - Java ExpansionService started on localhost:35547'
INFO:apache_beam.utilradars.subprocess_server:b'[main] INFO org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - JobService started on localhost:48375'
INFO:apache_beam.utils.subprocess_server:b'[grpc-default-executor-0] INFO org.apache.beam.runners.flink.FlinkJobInvoker - Invoking job BeamApp-rjurczak-0713164027-2a729669_d00db59c-cda9-46be-9bd8-1b8406d155a5 with pipeline runner org.apache.beam.runners.flink.FlinkPipelineRunner@25fa0e2d'
INFO:apache_beam.utils.subprocess_server:b'[grpc-default-executor-0] INFO org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation - Starting job invocation BeamApp-rjurczak-0713164027-2a729669_d00db59c-cda9-46be-9bd8-1b8406d155a5'
INFO:apache_beam.runners.portability.portable_runner:Job state changed to STOPPED
INFO:apache_beam.runners.portability.portable_runner:Job state changed to STARTING
INFO:apache_beam.runners.portability.portable_runner:Job state changed to RUNNING
INFO:apache_beam.utils.subprocess_server:b'[flink-runner-job-invoker] INFO org.apache.beam.runners.flink.FlinkPipelineRunner - Translating pipeline to Flink program.'
INFO:apache_beam.utils.subprocess_server:b'[flink-runner-job-invoker] INFO org.apache.beam.runners.flink.FlinkExecutionEnvironments - Creating a Batch Execution Environment.'
INFO:apache_beam.utilradars.subprocess_server:b'[flink-runner-job-invoker] INFO org.apache.beam.runners.flink.FlinkExecutionEnvironments - Using Flink Master URL 10.70.227.141:8081.'
INFO:apache_beam.utils.subprocess_server:b'[flink-runner-job-invoker] INFO org.apache.flink.api.java.ExecutionEnvironment - The job has 0 registered types and 0 default Kryo serializers'
INFO:apache_beam.utils.subprocess_server:b'[Flink-RestClusterClient-IO-thread-4] WARN org.apache.flink.util.ExecutorUtils - ExecutorService did not terminate in time. Shutting it down now.'
INFO:apache_beam.utils.subprocess_server:b'[flink-runner-job-invoker] ERROR org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation - Error during job invocation BeamApp-rjurczak-0713164027-2a729669_d00db59c-cda9-46be-9bd8-1b8406d155a5.'
INFO:apache_beam.utils.subprocess_server:b'java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.flink.utilradar.ExceptionUtils.rethrow(ExceptionUtils.java:199)'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:952)'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:860)'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator$BatchTranslationContext.execute(FlinkBatchPortablePipelineTranslator.java:194)'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:116)'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:83)'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:83)'radarradar
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.lang.Thread.run(Thread.java:834)'
INFO:apache_beam.utils.subprocess_server:b'Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:947)'
INFO:apache_beam.utils.subprocess_server:b't... 11 more'
INFO:apache_beam.utils.subprocess_server:b'Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$7(RestClusterClient.java:359)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$8(FutureUtils.java:274)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:610)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1085)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)'
INFO:apache_beam.utils.subprocess_server:b't... 3 more'
INFO:apache_beam.utils.subprocess_server:b'Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Failed to deserialize JobGraph.]'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:390)'
INFO:apache_beam.utils.subprocess_server:b'tat org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:374)'
INFO:apache_beam.utils.subprocess_server:b'tat java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)'
INFO:apache_beam.utils.subprocess_server:b't... 4 more'
ERROR:root:org.apache.flink.runtime.rest.util.RestClientException: [Failed to deserialize JobGraph.]
INFO:apache_beam.utils.subprocess_server:b'[flink-runner-job-invoker] INFO org.apache.beam.runners.fnexecution.artifact.AbstractLegacyArtifactRetrievalService - Manifest at /tmp/beam-tempht7lpipz/artifactsotk2otzl/job_364b1df0-7e66-4759-997f-91f87179932b/MANIFEST has 1 artifact locations'
INFO:apache_beam.runners.portability.portable_runner:Job state changed to FAILED
Traceback (most recent call last):
File "examples/wordcount.py", line 152, in <module>
run()
File "examples/wordcount.py", line 132, in run
result.wait_until_finish()
File "/home/rjurczak/envs/env/lib/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 550, in wait_until_finish
(self._job_id, self._state, self._last_error_message()))
RuntimeError: Pipeline BeamApp-rjurczak-0713164027-2a729669_d00db59c-cda9-46be-9bd8-1b8406d155a5 failed in state FAILED: org.apache.flink.runtime.rest.util.RestClientException: [Failed to deserialize JobGraph.]

这是一个相当简短的答案。它看起来和这里的错误一样。

请确保CLI Flink版本与在Kubernetes上运行的Flink master版本相匹配。

相关内容

  • 没有找到相关文章

最新更新