Deploying an erigon mainnet archive node in a GKE cluster



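I am currently trying to deploy a mainnet archive node to a GKE cluster using the erigon Docker image (thorax/erigon). I have already deployed a Geth node successfully with a configuration similar to the one below, but the same approach fails with erigon.

Here is my YAML deployment file: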

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: erigon-mainnet
  namespace: erigon-mainnet
spec:
  selector:
    matchLabels:
      app: erigon-mainnet
  replicas: 2
  serviceName: erigon-mainnet
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: erigon-mainnet
    spec:
      terminationGracePeriodSeconds: 300
      containers:
        - name: erigon
          image: docker.io/thorax/erigon
          ports:
            - containerPort: 8545
            - containerPort: 8546
            - { containerPort: 30303, protocol: TCP }
            - { containerPort: 30303, protocol: UDP }
          args:
            [
              "--datadir=/mainnet",
              "--chain=mainnet",
              "--http",
              "--http.addr=0.0.0.0",
              "--http.api=eth,net,web3",
              "--http.vhosts=*",
              " --http.corsdomain=*",
              "--ws",
              "--ws.addr=0.0.0.0",
              "--ws.api=eth,net,web3",
              "--ws.origins=*",
            ]
          resources:
            requests:
              memory: 2G
              cpu: 1000m
            limits:
              memory: 16G
              cpu: 8000m
          livenessProbe:
            initialDelaySeconds: 10
            timeoutSeconds: 10
            httpGet:
              path: /
              port: 8545
          readinessProbe:
            httpGet:
              path: /
              port: 8545
          volumeMounts:
            - name: mainnet
              mountPath: /mainnet
      nodeSelector:
        chain: mainnet
  volumeClaimTemplates:
    - metadata:
        name: "mainnet"
      spec:
        storageClassName: premium-rwo
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 4Ti
---
apiVersion: v1
kind: Service
metadata:
  name: erigon-mainnet
  namespace: erigon-mainnet
spec:
  ports:
    - protocol: TCP
      targetPort: 8545
      port: 8545
      name: http
    - protocol: TCP
      targetPort: 8546
      port: 8546
      name: websoket
  clusterIP: None
  selector:
    app: erigon-mainnet

Output of kubectl describe pod:

Events:
Type     Reason     Age                From               Message
----     ------     ----               ----               -------
Normal   Scheduled  53s                default-scheduler  Successfully assigned erigon-mainnet/erigon-mainnet-0 to gke-node-cluster-polygon-a017195b-fwhs
Normal   Pulled     49s                kubelet            Successfully pulled image "docker.io/thorax/erigon" in 430.462783ms
Normal   Pulled     48s                kubelet            Successfully pulled image "docker.io/thorax/erigon" in 399.71813ms
Normal   Pulling    30s (x3 over 50s)  kubelet            Pulling image "docker.io/thorax/erigon"
Normal   Created    29s (x3 over 49s)  kubelet            Created container erigon-mainnet
Warning  Failed     29s (x3 over 49s)  kubelet            Error: failed to create containerd task: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "--datadir=/mainnet": stat --datadir=/mainnet: no such file or directory: unknown
Normal   Pulled     29s                kubelet            Successfully pulled image "docker.io/thorax/erigon" in 417.260296ms
Warning  BackOff    10s (x8 over 48s)  kubelet            Back-off restarting failed container

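So my assumption is that I may be mounting the SSD to the wrong directory. I have tried leaving the --datadir flag empty and mounting the volume at the default erigon datadir, but I still get a crash loop. For my Geth node, I mount to /chaindata using exactly the same logic as above, and the node runs fine. If anyone knows what the problem might be, any help is appreciated. I'm new to GKE, so this may be a simple fix that I've overlooked.

The failure: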

exec: "--datadir=/mainnet": stat --datadir=/mainnet: no such file or directory
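
One possible reading of this message, offered as an assumption rather than something the erigon documentation states: the runtime is trying to execute --datadir=/mainnet itself as the program. In Kubernetes, setting only args replaces the image's default arguments, so if the image defines no ENTRYPOINT, the first argument is treated as the executable. A minimal sketch of the container section with an explicit command, assuming the binary inside the image is named erigon and is on the PATH:

```yaml
# Sketch only: the container section from the question, with an explicit command.
# Assumption: the image ships a binary called "erigon" on its PATH.
containers:
  - name: erigon
    image: docker.io/thorax/erigon
    command: ["erigon"]          # run this binary; args are passed to it as flags
    args:
      - "--datadir=/mainnet"
      - "--chain=mainnet"
      - "--http"
      - "--http.addr=0.0.0.0"
    volumeMounts:
      - name: mainnet
        mountPath: /mainnet
```

If the pod then gets past container creation, any remaining errors should appear in the container logs rather than in the pod events.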

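The erigon documentation reads:

Use --datadir to choose where to store data

You can use whatever path you want, but the directory/volume must exist so that the stat /mainnet command does not fail. I assume you have not created a gcePersistentDisk to mount: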
https://cloud.google.com/sdk/gcloud/reference/compute/disks/create

gcloud compute disks create --size=1TB --zone=us-central1-a mainnet-data

Then declare it as a PD (persistent disk):

gcePersistentDisk:
  pdName: mainnet-data
  fsType: ext4

The config key volumeClaimTemplates implies kind: StatefulSet; use kind: Deployment instead. This example also uses a Deployment for everything; it can download the entire blockchain into --datadir.
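
To show where the gcePersistentDisk fragment would sit, here is a minimal sketch of such a Deployment. The names and labels are carried over from the question, the disk mainnet-data is assumed to exist already in the same zone as the node, and the replica count and strategy are illustrative choices rather than part of the original example.

```yaml
# Sketch only: a Deployment that mounts a pre-created persistent disk.
# Assumption: the disk "mainnet-data" was created beforehand with gcloud.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: erigon-mainnet
  namespace: erigon-mainnet
spec:
  replicas: 1                    # a PD can be attached read-write to only one node at a time
  strategy:
    type: Recreate               # release the disk before the replacement pod starts
  selector:
    matchLabels:
      app: erigon-mainnet
  template:
    metadata:
      labels:
        app: erigon-mainnet
    spec:
      containers:
        - name: erigon
          image: docker.io/thorax/erigon
          # flags omitted here; reuse the ones from the question
          volumeMounts:
            - name: mainnet
              mountPath: /mainnet
      volumes:
        - name: mainnet            # must match the volumeMounts name above
          gcePersistentDisk:
            pdName: mainnet-data
            fsType: ext4
```

With a single read-write disk, running more than one replica would leave the extra pods unable to attach the volume, which is why the sketch pins replicas to 1.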
Looking at the Ethereum chain full sync data size, this suggests --size=2TB rather than 1TB:

759.03 GB as of June 18, 2022


By comparison, expedition only uses an Ethereum RPC endpoint.
