I am deploying a stateful Flink application using the YAML file below.
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: operational-reporting-15gb
spec:
  image: .azurecr.io/stateful-app-v2
  flinkVersion: v1_15
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
    state.savepoints.dir: abfs://flinktest@.dfs.core.windows.net/savepoints.v2
    state.checkpoints.dir: abfs://flinktest@.dfs.core.windows.net/checkpoints.v2
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    high-availability.storageDir: abfs://flinktest@.dfs.core.windows.net/ha.v2
  serviceAccount: flink
  jobManager:
    resource:
      memory: "15360m"
      cpu: 2
  taskManager:
    resource:
      memory: "15360m"
      cpu: 3
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          volumeMounts:
            - mountPath: /flink-data
              name: flink-volume
      volumes:
        - name: flink-volume
          emptyDir: {}
  job:
    jarURI: local:///opt/operationalReporting.jar
    parallelism: 1
    upgradeMode: savepoint
    state: running
The Flink job runs fine. For autoscaling, I created an HPA with the following manifest.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: basic-hpa
  namespace: default
spec:
  minReplicas: 1
  maxReplicas: 15
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageValue: 100m
  scaleTargetRef:
    apiVersion: flink.apache.org/v1beta1
    kind: FlinkDeployment
    name: operational-reporting-15gb
When I describe the autoscaler, I get the error below.
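Type         Status  Reason          Message
AbleToScale  False   FailedGetScale  the HPA controller was unable to get the target's current scale: flinkdeployments.flink.apache.org "operational-reporting-15gb" not found
Events:
Type     Reason          Age                   From                       Message
Warning  FailedGetScale  4m4s (x121 over 34m)  horizontal-pod-autoscaler  flinkdeployments.flink.apache.org "operational-reporting-15gb" not found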
The HPA's target is shown as unknown. Please help.
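I assume you are using the HPA example from the Kubernetes Operator. Thanks for trying it out. As noted in the documentation, this is an experimental feature, and we currently have limited experience with it.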
That said, to check the obvious first: is your FlinkDeployment named operational-reporting-15gb running in the default namespace? If not, adjust the namespace of the HPA accordingly.
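One way to run that namespace check is sketched below. The helper names list_flinkdeployments and deployment_in_namespace are made up for illustration and are not part of the operator or of kubectl; the sketch assumes kubectl is configured against the cluster in question and that the FlinkDeployment CRD is installed.

```shell
# Print "<namespace>/<name>" for every FlinkDeployment in the cluster.
# (Hypothetical helper; relies only on standard kubectl jsonpath output.)
list_flinkdeployments() {
  kubectl get flinkdeployments.flink.apache.org --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\n"}{end}'
}

# Succeed only if a FlinkDeployment with the given name exists in the given namespace.
# Usage: deployment_in_namespace <name> <namespace>
deployment_in_namespace() {
  name="$1"
  ns="$2"
  # -F: fixed string, -x: whole-line match, -q: quiet (exit status only)
  list_flinkdeployments | grep -Fqx "${ns}/${name}"
}
```

For the setup in the question, calling deployment_in_namespace operational-reporting-15gb default should succeed if the HPA's namespace is right. If it fails, the output of list_flinkdeployments shows where the deployment actually lives.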
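Also, please make sure you have the latest FlinkDeployment CRD installed. Having v1beta1 only guarantees compatibility; it is not actually a fixed version, and we recently added the scale subresource.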
git clone https://github.com/apache/flink-kubernetes-operator
cd flink-kubernetes-operator
kubectl replace -f helm/flink-kubernetes-operator/crds/flinkdeployments.flink.apache.org-v1.yml
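After replacing the CRD, it may help to confirm that the installed definition really exposes the scale subresource, since the HPA controller needs it to read the current replica count. The following is a minimal sketch; the function name crd_has_scale is hypothetical, and it assumes kubectl points at the same cluster as the HPA.

```shell
# Succeed if any version of the installed FlinkDeployment CRD declares a scale subresource.
# (Hypothetical helper; kubectl prints nothing for a missing jsonpath key.)
crd_has_scale() {
  kubectl get crd flinkdeployments.flink.apache.org \
    -o jsonpath='{.spec.versions[*].subresources.scale}' | grep -q .
}
```

If crd_has_scale succeeds, running kubectl describe hpa basic-hpa -n default again should show whether the FailedGetScale condition has cleared. If it fails, the cluster is most likely still serving the older CRD.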