Elastic/Elasticsearch: pods with volumeClaimTemplate stuck in Pending on an AWS cluster



I have a kops cluster on AWS, with Helm installed in the cluster.

I am trying to install the elastic/elasticsearch chart. I need to change the default volume size, which I did in the values.yml file below.

# Allocate smaller chunks of memory per pod
resources:
  requests:
    cpu: "100m"
    memory: "512M"
  limits:
    cpu: "1000m"
    memory: "512M"

# Request smaller persistent volume
volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: default
  resources:
    requests:
      storage: 10Gi

This is how I installed it:

helm install elasticsearch -n logging elastic/elasticsearch -f values.yml

The installation succeeds, but the pods are now stuck in the Pending state:

[ec2-user@ip-my elastic-search]$ kubectl get pods -n logging
NAME                     READY   STATUS    RESTARTS   AGE
elasticsearch-master-0   0/1     Pending   0          6m35s
elasticsearch-master-1   0/1     Pending   0          6m35s
elasticsearch-master-2   0/1     Pending   0          6m35s

Update:

[ec2-user@my-ip elastic-search]$ kubectl describe pods -n logging elasticsearch-master-0
Name:           elasticsearch-master-0
Namespace:      logging
Priority:       0
Node:           <none>
Labels:         app=elasticsearch-master
chart=elasticsearch
controller-revision-hash=elasticsearch-master-697ffb4548
release=elasticsearch
statefulset.kubernetes.io/pod-name=elasticsearch-master-0
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  StatefulSet/elasticsearch-master
Init Containers:
configure-sysctl:
Image:      docker.elastic.co/elasticsearch/elasticsearch:7.12.0
Port:       <none>
Host Port:  <none>
Command:
sysctl
-w
vm.max_map_count=262144
Environment:  <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-tmqp9 (ro)
Containers:
elasticsearch:
Image:       docker.elastic.co/elasticsearch/elasticsearch:7.12.0
Ports:       9200/TCP, 9300/TCP
Host Ports:  0/TCP, 0/TCP
Limits:
cpu:     1
memory:  2Gi
Requests:
cpu:      1
memory:   2Gi
Readiness:  exec [sh -c #!/usr/bin/env bash -e
# If the node is starting up wait for the cluster to be ready (request params: "wait_for_status=green&timeout=1s" )
# Once it has started only check that the node itself is responding
START_FILE=/tmp/.es_start_file
# Disable nss cache to avoid filling dentry cache when calling curl
# This is required with Elasticsearch Docker using nss < 3.52
export NSS_SDB_USE_CACHE=no
http () {
local path="${1}"
local args="${2}"
set -- -XGET -s
if [ "$args" != "" ]; then
set -- "$@" $args
fi
if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
set -- "$@" -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
fi
curl --output /dev/null -k "$@" "http://127.0.0.1:9200${path}"
}
if [ -f "${START_FILE}" ]; then
echo 'Elasticsearch is already running, lets check the node is healthy'
HTTP_CODE=$(http "/" "-w %{http_code}")
RC=$?
if [[ ${RC} -ne 0 ]]; then
echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' ${BASIC_AUTH} http://127.0.0.1:9200/ failed with RC ${RC}"
exit ${RC}
fi
# ready if HTTP code 200, 503 is tolerable if ES version is 6.x
if [[ ${HTTP_CODE} == "200" ]]; then
exit 0
elif [[ ${HTTP_CODE} == "503" && "7" == "6" ]]; then
exit 0
else
echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' ${BASIC_AUTH} http://127.0.0.1:9200/ failed with HTTP code ${HTTP_CODE}"
exit 1
fi
else
echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s" )'
if http "/_cluster/health?wait_for_status=green&timeout=1s" "--fail" ; then
touch ${START_FILE}
exit 0
else
echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
exit 1
fi
fi
] delay=10s timeout=5s period=10s #success=3 #failure=3
Environment:
node.name:                     elasticsearch-master-0 (v1:metadata.name)
cluster.initial_master_nodes:  elasticsearch-master-0,elasticsearch-master-1,elasticsearch-master-2,
discovery.seed_hosts:          elasticsearch-master-headless
cluster.name:                  elasticsearch
network.host:                  0.0.0.0
ES_JAVA_OPTS:                  -Xmx1g -Xms1g
node.data:                     true
node.ingest:                   true
node.master:                   true
node.ml:                       true
node.remote_cluster_client:    true
Mounts:
/usr/share/elasticsearch/data from elasticsearch-master (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-tmqp9 (ro)
Conditions:
Type           Status
PodScheduled   False
Volumes:
elasticsearch-master:
Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName:  elasticsearch-master-elasticsearch-master-0
ReadOnly:   false
default-token-tmqp9:
Type:        Secret (a volume populated by a Secret)
SecretName:  default-token-tmqp9
Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type     Reason            Age                From               Message
----     ------            ----               ----               -------
Warning  FailedScheduling  23s (x2 over 24s)  default-scheduler  0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims.
[ec2-user@my-ip elastic-search]$ kubectl describe statefulset -n logging elasticsearch-master
Name:               elasticsearch-master
Namespace:          logging
CreationTimestamp:  Mon, 19 Apr 2021 03:51:58 +0000
Selector:           app=elasticsearch-master
Labels:             app=elasticsearch-master
app.kubernetes.io/managed-by=Helm
chart=elasticsearch
heritage=Helm
release=elasticsearch
Annotations:        esMajorVersion: 7
meta.helm.sh/release-name: elasticsearch
meta.helm.sh/release-namespace: logging
Replicas:           3 desired | 3 total
Update Strategy:    RollingUpdate
Pods Status:        0 Running / 3 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels:  app=elasticsearch-master
chart=elasticsearch
release=elasticsearch
Init Containers:
configure-sysctl:
Image:      docker.elastic.co/elasticsearch/elasticsearch:7.12.0
Port:       <none>
Host Port:  <none>
Command:
sysctl
-w
vm.max_map_count=262144
Environment:  <none>
Mounts:       <none>
Containers:
elasticsearch:
Image:       docker.elastic.co/elasticsearch/elasticsearch:7.12.0
Ports:       9200/TCP, 9300/TCP
Host Ports:  0/TCP, 0/TCP
Limits:
cpu:     1
memory:  2Gi
Requests:
cpu:      1
memory:   2Gi
Readiness:  exec [sh -c #!/usr/bin/env bash -e
# If the node is starting up wait for the cluster to be ready (request params: "wait_for_status=green&timeout=1s" )
# Once it has started only check that the node itself is responding
START_FILE=/tmp/.es_start_file
# Disable nss cache to avoid filling dentry cache when calling curl
# This is required with Elasticsearch Docker using nss < 3.52
export NSS_SDB_USE_CACHE=no
http () {
local path="${1}"
local args="${2}"
set -- -XGET -s
if [ "$args" != "" ]; then
set -- "$@" $args
fi
if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
set -- "$@" -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
fi
curl --output /dev/null -k "$@" "http://127.0.0.1:9200${path}"
}
if [ -f "${START_FILE}" ]; then
echo 'Elasticsearch is already running, lets check the node is healthy'
HTTP_CODE=$(http "/" "-w %{http_code}")
RC=$?
if [[ ${RC} -ne 0 ]]; then
echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' ${BASIC_AUTH} http://127.0.0.1:9200/ failed with RC ${RC}"
exit ${RC}
fi
# ready if HTTP code 200, 503 is tolerable if ES version is 6.x
if [[ ${HTTP_CODE} == "200" ]]; then
exit 0
elif [[ ${HTTP_CODE} == "503" && "7" == "6" ]]; then
exit 0
else
echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' ${BASIC_AUTH} http://127.0.0.1:9200/ failed with HTTP code ${HTTP_CODE}"
exit 1
fi
else
echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s" )'
if http "/_cluster/health?wait_for_status=green&timeout=1s" "--fail" ; then
touch ${START_FILE}
exit 0
else
echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
exit 1
fi
fi
] delay=10s timeout=5s period=10s #success=3 #failure=3
Environment:
node.name:                      (v1:metadata.name)
cluster.initial_master_nodes:  elasticsearch-master-0,elasticsearch-master-1,elasticsearch-master-2,
discovery.seed_hosts:          elasticsearch-master-headless
cluster.name:                  elasticsearch
network.host:                  0.0.0.0
ES_JAVA_OPTS:                  -Xmx1g -Xms1g
node.data:                     true
node.ingest:                   true
node.master:                   true
node.ml:                       true
node.remote_cluster_client:    true
Mounts:
/usr/share/elasticsearch/data from elasticsearch-master (rw)
Volumes:  <none>
Volume Claims:
Name:          elasticsearch-master
StorageClass:  standard
Labels:        <none>
Annotations:   <none>
Capacity:      10Gi
Access Modes:  [ReadWriteOnce]
Events:
Type    Reason            Age    From                    Message
----    ------            ----   ----                    -------
Normal  SuccessfulCreate  2m50s  statefulset-controller  create Pod elasticsearch-master-0 in StatefulSet elasticsearch-master successful
Normal  SuccessfulCreate  2m50s  statefulset-controller  create Pod elasticsearch-master-1 in StatefulSet elasticsearch-master successful
Normal  SuccessfulCreate  2m50s  statefulset-controller  create Pod elasticsearch-master-2 in StatefulSet elasticsearch-master successful
[ec2-user@~]$ kubectl describe pvc -n logging elasticsearch-master-0
Error from server (NotFound): persistentvolumeclaims "elasticsearch-master-0" not found
[ec2-user@ip~]$ kubectl get storageclass
NAME                      PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
default                   kubernetes.io/aws-ebs   Delete          Immediate              false                  38d
gp2                       kubernetes.io/aws-ebs   Delete          Immediate              false                  38d
kops-ssd-1-17 (default)   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   true                   38d

There is no StorageClass named standard in the cluster (the StatefulSet's volume claim requests StorageClass standard, which the scheduler cannot satisfy). Use one of the StorageClasses from the list:

$ kubectl get storageclass
NAME                      PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
default                   kubernetes.io/aws-ebs   Delete          Immediate              false                  38d
gp2                       kubernetes.io/aws-ebs   Delete          Immediate              false                  38d
kops-ssd-1-17 (default)   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   true                   38d

Sample:

volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: default
  resources:
    requests:
      storage: 10Gi
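For reference, a complete values.yml sketch with consistent indentation, here assuming the cluster-default kops-ssd-1-17 class from the listing above (any of the three listed classes would work):

```yaml
# values.yml — overrides for the elastic/elasticsearch chart (sketch;
# the class name is taken from the `kubectl get storageclass` output above)
resources:
  requests:
    cpu: "100m"
    memory: "512M"
  limits:
    cpu: "1000m"
    memory: "512M"

volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: kops-ssd-1-17   # must match an existing StorageClass
  resources:
    requests:
      storage: 10Gi
```

Note that a StatefulSet's volumeClaimTemplates are immutable, so a plain `helm upgrade` may be rejected; it may be necessary to uninstall the release (and delete any leftover PVCs in the logging namespace) before reinstalling with the corrected file. Afterwards, `kubectl get pvc -n logging` should show the claims as Bound.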

Latest update: