Spark core is up and running as a service on port 7077 in an AWS EKS cluster.
> kubectl get service spark-core -n fargate-profile-selector
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
spark-core   ClusterIP   10.100.142.199   <none>        7077/TCP,8080/TCP,6066/TCP   172m
> kubectl get pods -n fargate-profile-selector
NAME                                       READY   STATUS    RESTARTS   AGE
spark-master-controller-7f5dd4bf84-twpwb   1/1     Running   0          173m
spark-ui-proxy-c7kd9                       1/1     Running   0          173m
spark-worker-controller-6ccc46994f-5kwkp   1/1     Running   0          173m
spark-worker-controller-6ccc46994f-6gjng   1/1     Running   0          173m
spark-worker-controller-6ccc46994f-8x98q   1/1     Running   0          173m
spark-worker-controller-6ccc46994f-dn2hw   1/1     Running   0          173m
spark-worker-controller-6ccc46994f-xsbkv   1/1     Running   0          173m
> kubectl describe pod spark-master-controller-7f5dd4bf84-twpwb -n fargate-profile-selector
Name:                 spark-master-controller-7f5dd4bf84-twpwb
Namespace:            fargate-profile-selector
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 fargate-ip-172-31-4-37.us-west-2.compute.internal/172.31.4.37
Start Time:           Sun, 30 Oct 2022 21:10:40 +0200
Labels:               app=spark
                      chart=spark-0.0.1-366
                      component=spark-core
                      eks.amazonaws.com/fargate-profile=fargateprofile
                      heritage=Helm
                      pod-template-hash=7f5dd4bf84
                      release=spark
                      sdr.appname=spark
Annotations:          CapacityProvisioned: 0.25vCPU 2GB
                      Logging: LoggingDisabled: LOGGING_CONFIGMAP_NOT_FOUND
                      kubernetes.io/psp: eks.privileged
Status:               Running
IP:                   172.31.4.37
IPs:
  IP:  172.31.4.37
Controlled By:        ReplicaSet/spark-master-controller-7f5dd4bf84
Containers:
  spark-core:
    Container ID:  containerd://6bcb2a37b1fe1cfec0dfa0c23115a87889fc13b078553473d148512516d6ec8e
    Image:         <acc-id>.dkr.ecr.us-west-2.amazonaws.com/spark:3.0.1-dev-18
    Image ID:      <acc-id>.dkr.ecr.us-west-2.amazonaws.com/spark@sha256:7eb77fe90b97ee9da9e369df9a6795bfd32839c343678298dc5b84ee7ea7083d
    Ports:         7077/TCP, 8080/TCP, 6066/TCP, 40000/TCP, 40100/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      /opt/spark/bin/spark-class
    Args:
      org.apache.spark.deploy.master.Master
      --ip
      spark-core
      --port
      7077
      --webui-port
      8080
      --properties-file
      /opt/spark/conf/spark.conf
> kubectl describe ingress spark-core-ingress -n fargate-profile-selector
Name:             spark-core-ingress
Labels:           <none>
Namespace:        fargate-profile-selector
Address:          a17976c980f984f6d998122f02f8109d-eb9d66d6481e7784.elb.us-west-2.amazonaws.com
Ingress Class:    <none>
Default backend:  <default>
Rules:
  Host                 Path  Backends
  ----                 ----  --------
  spark-core.comp.com
                       /     spark-core:7077 (172.31.4.37:7077)
Annotations:      kubernetes.io/ingress.class: nginx
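For reference, here is a minimal Ingress manifest consistent with the output above. This is a reconstruction, not the original file; the `networking.k8s.io/v1beta1` API version is taken from the controller logs further down:

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: spark-core-ingress
  namespace: fargate-profile-selector
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
    - host: spark-core.comp.com
      http:
        paths:
          - path: /
            backend:
              serviceName: spark-core
              servicePort: 7077
```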
I am using the Nginx ingress controller:
> kubectl get all --namespace=ingress-nginx
NAME                                            READY   STATUS    RESTARTS   AGE
pod/nginx-ingress-controller-5c6567c67d-68vk8   1/1     Running   0          40m
pod/nginx-ingress-controller-5c6567c67d-885nc   1/1     Running   0          40m
pod/nginx-ingress-controller-5c6567c67d-mhnxq   1/1     Running   0          40m

NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP                                                                    PORT(S)                                     AGE
service/ingress-nginx   LoadBalancer   10.100.87.199   a17976c980f984f6d998122f02f8109d-eb9d66d6481e7784.elb.us-west-2.amazonaws.com   7077:30807/TCP,80:30801/TCP,443:30195/TCP   177m

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nginx-ingress-controller   3/3     3            3           40m

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-ingress-controller-5c6567c67d   3         3         3       40m
I see the following suspicious event (Readiness probe failed: HTTP probe failed with statuscode: 500):
> kubectl describe pod nginx-ingress-controller-5c6567c67d-885nc -n ingress-nginx
...
Events:
  Type     Reason     Age   From               Message
  ----     ------     ---   ----               -------
  Normal   Scheduled  54m   default-scheduler  Successfully assigned ingress-nginx/nginx-ingress-controller-5c6567c67d-885nc to ip-172-31-8-112.us-west-2.compute.internal
  Normal   Pulled     54m   kubelet            Container image "quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0" already present on machine
  Normal   Created    54m   kubelet            Created container nginx-ingress-controller
  Normal   Started    54m   kubelet            Started container nginx-ingress-controller
  Warning  Unhealthy  54m   kubelet            Readiness probe failed: HTTP probe failed with statuscode: 500
> kubectl logs nginx-ingress-controller-5c6567c67d-885nc -n ingress-nginx
I1030 21:30:36.698026 7 main.go:237] Running in Kubernetes cluster version v1.21+ (v1.21.14-eks-6d3986b) - git (clean) commit 8877a3e28d597e1184c15e4b5d543d5dc36b083b - platform linux/amd64
I1030 21:30:36.914228 7 main.go:102] SSL fake certificate created /etc/ingress-controller/ssl/default-fake-certificate.pem
I1030 21:30:36.946709 7 nginx.go:263] Starting NGINX Ingress controller
I1030 21:30:36.970410 7 event.go:281] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"nginx-configuration", UID:"b01aa3e0-c134-4bc8-8c8d-3496d906c06f", APIVersion:"v1", ResourceVersion:"7120", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap ingress-nginx/nginx-configuration
I1030 21:30:36.971872 7 event.go:281] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"udp-services", UID:"e39e26ba-3e62-4f9f-8625-8926c10bc87c", APIVersion:"v1", ResourceVersion:"7138", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap ingress-nginx/udp-services
I1030 21:30:36.972031 7 event.go:281] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"tcp-services", UID:"667f0d5b-b5e5-4532-b832-24ef2f98793f", APIVersion:"v1", ResourceVersion:"7130", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap ingress-nginx/tcp-services
I1030 21:30:38.147200 7 nginx.go:307] Starting NGINX process
I1030 21:30:38.147316 7 leaderelection.go:242] attempting to acquire leader lease ingress-nginx/ingress-controller-leader-nginx...
I1030 21:30:38.149071 7 controller.go:137] Configuration changes detected, backend reload required.
I1030 21:30:38.153011 7 status.go:86] new leader elected: nginx-ingress-controller-5c6567c67d-vf46c
I1030 21:30:38.248630 7 controller.go:153] Backend successfully reloaded.
I1030 21:30:38.248854 7 controller.go:162] Initial sync, sleeping for 1 second.
I1030 21:31:13.052326 7 leaderelection.go:252] successfully acquired lease ingress-nginx/ingress-controller-leader-nginx
I1030 21:31:13.052474 7 status.go:86] new leader elected: nginx-ingress-controller-5c6567c67d-885nc
I1030 21:51:12.252844 7 event.go:281] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"fargate-profile-selector", Name:"spark-core-ingress", UID:"874c0b49-dc15-44de-a09a-10d95d4699ea", APIVersion:"networking.k8s.io/v1beta1", ResourceVersion:"49234", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress fargate-profile-selector/spark-core-ingress
I1030 21:51:12.252909 7 controller.go:137] Configuration changes detected, backend reload required.
I1030 21:51:12.351617 7 controller.go:153] Backend successfully reloaded.
I1030 21:51:13.065232 7 status.go:274] updating Ingress fargate-profile-selector/spark-core-ingress status from [] to [{ a17976c980f984f6d998122f02f8109d-eb9d66d6481e7784.elb.us-west-2.amazonaws.com}]
I1030 21:51:13.079963 7 event.go:281] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"fargate-profile-selector", Name:"spark-core-ingress", UID:"874c0b49-dc15-44de-a09a-10d95d4699ea", APIVersion:"networking.k8s.io/v1beta1", ResourceVersion:"49241", FieldPath:""}): type: 'Normal' reason: 'UPDATE' Ingress fargate-profile-selector/spark-core-ingress
I have tried both:
curl -I a17976c980f984f6d998122f02f8109d-eb9d66d6481e7784.elb.us-west-2.amazonaws.com
和
curl -i -H "Host:spark-core.comp.com" a17976c980f984f6d998122f02f8109d-eb9d66d6481e7784.elb.us-west-2.amazonaws.com
Both of them just hang, without any error.
UPDATE: adding the Host header here:
curl -i -H "Host:spark-core.comp.com" a17976c980f984f6d998122f02f8109d-eb9d66d6481e7784.elb.us-west-2.amazonaws.com
I think the problem is that you are hitting the external IP of the Nginx ingress controller directly. spark-core.comp.com is the domain in the ingress rule, which is what the Nginx ingress controller checks requests against. I am also not sure why you are trying to pass the hostname in a header, since the ingress controller checks the Host of the request, not an arbitrary header.
Normally, we would map the domain spark-core.comp.com to the external IP with a CNAME record; if you are on a local system, you can add an entry to the /etc/hosts file to test the domain in your browser.
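As a local check, that mapping can be sketched as a hosts-file entry. The IP below is a placeholder; substitute an address that the ELB hostname (the Ingress Address above) currently resolves to, e.g. from `dig +short`:

```
# /etc/hosts — hypothetical entry for local testing
203.0.113.10   spark-core.comp.com
```

With this in place, requests to http://spark-core.comp.com/ carry the Host the ingress rule expects.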
When you hit curl -I a17976c980f984f6d998122f02f8109d-eb9d66d6481e7784.elb.us-west-2.amazonaws.com directly, it should return a 404 from the Nginx controller.