GKE metrics-server failing (failed to get delegated authentication kubeconfig)



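The metrics server on my Google Kubernetes Engine cluster is failing, which means Horizontal Pod Autoscaling does not work (CPU usage statistics are unavailable). When I inspect the metrics-server-v0.4.5 deployment, the pod is in CrashLoopBackOff with the following logs: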

2022-07-08 15:54:01.592 GMT Error: failed to get delegated authentication kubeconfig: failed to get delegated authentication kubeconfig: open /var/run/secrets/kubernetes.io/serviceaccount/token: permission denied
2022-07-08 15:54:01.593 GMT Usage:
2022-07-08 15:54:01.593 GMT [flags]
2022-07-08 15:54:01.593 GMT
2022-07-08 15:54:01.593 GMT Flags:
2022-07-08 15:54:01.593 GMT --add_dir_header If true, adds the file directory to the header of the log messages
2022-07-08 15:54:01.593 GMT --alsologtostderr log to standard error as well as files
2022-07-08 15:54:01.593 GMT --authentication-kubeconfig string kubeconfig file pointing at the 'core' kubernetes server with enough rights to create tokenreviews.authentication.k8s.io.
2022-07-08 15:54:01.593 GMT --authentication-skip-lookup If false, the authentication-kubeconfig will be used to lookup missing authentication configuration from the cluster.
2022-07-08 15:54:01.593 GMT --authentication-token-webhook-cache-ttl duration The duration to cache responses from the webhook token authenticator. (default 10s)
2022-07-08 15:54:01.593 GMT --authentication-tolerate-lookup-failure If true, failures to look up missing authentication configuration from the cluster are not considered fatal. Note that this can result in authentication that treats all requests as anonymous.
2022-07-08 15:54:01.593 GMT --bind-address ip The IP address on which to listen for the --secure-port port. The associated interface(s) must be reachable by the rest of the cluster, and by CLI/web clients. If blank or an unspecified address (0.0.0.0 or ::), all interfaces will be used. (default 0.0.0.0)
2022-07-08 15:54:01.593 GMT --cert-dir string The directory where the TLS certs are located. If --tls-cert-file and --tls-private-key-file are provided, this flag will be ignored. (default "apiserver.local.config/certificates")
2022-07-08 15:54:01.593 GMT --client-ca-file string If set, any request presenting a client certificate signed by one of the authorities in the client-ca-file is authenticated with an identity corresponding to the CommonName of the client certificate.
2022-07-08 15:54:01.593 GMT --contention-profiling Enable lock contention profiling, if profiling is enabled
2022-07-08 15:54:01.593 GMT -h, --help help for this command
2022-07-08 15:54:01.593 GMT --http2-max-streams-per-connection int The limit that the server gives to clients for the maximum number of streams in an HTTP/2 connection. Zero means to use golang's default.
2022-07-08 15:54:01.593 GMT --kubeconfig string The path to the kubeconfig used to connect to the Kubernetes API server and the Kubelets (defaults to in-cluster config)
2022-07-08 15:54:01.593 GMT --kubelet-certificate-authority string Path to the CA to use to validate the Kubelet's serving certificates.
2022-07-08 15:54:01.593 GMT --kubelet-client-certificate string Path to a client cert file for TLS.
2022-07-08 15:54:01.593 GMT --kubelet-client-key string Path to a client key file for TLS.
2022-07-08 15:54:01.593 GMT --kubelet-insecure-tls Do not verify CA of serving certificates presented by Kubelets. For testing purposes only.
2022-07-08 15:54:01.593 GMT --kubelet-port int The port to use to connect to Kubelets. (default 10250)
2022-07-08 15:54:01.593 GMT --kubelet-preferred-address-types strings The priority of node address types to use when determining which address to use to connect to a particular node (default [Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP])
2022-07-08 15:54:01.593 GMT --kubelet-use-node-status-port Use the port in the node status. Takes precedence over --kubelet-port flag.
2022-07-08 15:54:01.593 GMT --log-flush-frequency duration Maximum number of seconds between log flushes (default 5s)
2022-07-08 15:54:01.593 GMT --log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
2022-07-08 15:54:01.593 GMT --log_dir string If non-empty, write log files in this directory
2022-07-08 15:54:01.593 GMT --log_file string If non-empty, use this log file
2022-07-08 15:54:01.593 GMT --log_file_max_size uint Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
2022-07-08 15:54:01.593 GMT --logtostderr log to standard error instead of files (default true)
2022-07-08 15:54:01.593 GMT --metric-resolution duration The resolution at which metrics-server will retain metrics. (default 1m0s)
2022-07-08 15:54:01.593 GMT --permit-port-sharing If true, SO_REUSEPORT will be used when binding the port, which allows more than one instance to bind on the same address and port. [default=false]
2022-07-08 15:54:01.593 GMT --profiling Enable profiling via web interface host:port/debug/pprof/ (default true)
2022-07-08 15:54:01.593 GMT --requestheader-allowed-names strings List of client certificate common names to allow to provide usernames in headers specified by --requestheader-username-headers. If empty, any client certificate validated by the authorities in --requestheader-client-ca-file is allowed.
2022-07-08 15:54:01.593 GMT --requestheader-client-ca-file string Root certificate bundle to use to verify client certificates on incoming requests before trusting usernames in headers specified by --requestheader-username-headers. WARNING: generally do not depend on authorization being already done for incoming requests.
2022-07-08 15:54:01.593 GMT --requestheader-extra-headers-prefix strings List of request header prefixes to inspect. X-Remote-Extra- is suggested. (default [x-remote-extra-])
2022-07-08 15:54:01.593 GMT --requestheader-group-headers strings List of request headers to inspect for groups. X-Remote-Group is suggested. (default [x-remote-group])
2022-07-08 15:54:01.593 GMT --requestheader-username-headers strings List of request headers to inspect for usernames. X-Remote-User is common. (default [x-remote-user])
2022-07-08 15:54:01.593 GMT --secure-port int The port on which to serve HTTPS with authentication and authorization. If 0, don't serve HTTPS at all. (default 443)
2022-07-08 15:54:01.593 GMT --skip_headers If true, avoid header prefixes in the log messages
2022-07-08 15:54:01.593 GMT --skip_log_headers If true, avoid headers when opening log files
2022-07-08 15:54:01.593 GMT --stderrthreshold severity logs at or above this threshold go to stderr (default 2)
2022-07-08 15:54:01.593 GMT --tls-cert-file string File containing the default x509 Certificate for HTTPS. (CA cert, if any, concatenated after server cert). If HTTPS serving is enabled, and --tls-cert-file and --tls-private-key-file are not provided, a self-signed certificate and key are generated for the public address and saved to the directory specified by --cert-dir.
2022-07-08 15:54:01.593 GMT --tls-cipher-suites strings Comma-separated list of cipher suites for the server. If omitted, the default Go cipher suites will be used.
2022-07-08 15:54:01.593 GMT Preferred values: TLS_AES_128_GCM_SHA256, TLS_AES_256_GCM_SHA384, TLS_CHACHA20_POLY1305_SHA256, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305, TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256, TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305, TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256, TLS_RSA_WITH_3DES_EDE_CBC_SHA, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_128_GCM_SHA256, TLS_RSA_WITH_AES_256_CBC_SHA, TLS_RSA_WITH_AES_256_GCM_SHA384.
2022-07-08 15:54:01.593 GMT Insecure values: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_RC4_128_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_RSA_WITH_RC4_128_SHA, TLS_RSA_WITH_AES_128_CBC_SHA256, TLS_RSA_WITH_RC4_128_SHA.
2022-07-08 15:54:01.593 GMT --tls-min-version string Minimum TLS version supported. Possible values: VersionTLS10, VersionTLS11, VersionTLS12, VersionTLS13
2022-07-08 15:54:01.593 GMT --tls-private-key-file string File containing the default x509 private key matching --tls-cert-file.
2022-07-08 15:54:01.593 GMT --tls-sni-cert-key namedCertKey A pair of x509 certificate and private key file paths, optionally suffixed with a list of domain patterns which are fully qualified domain names, possibly with prefixed wildcard segments. The domain patterns also allow IP addresses, but IPs should only be used if the apiserver has visibility to the IP address requested by a client. If no domain patterns are provided, the names of the certificate are extracted. Non-wildcard matches trump over wildcard matches, explicit domain patterns trump over extracted names. For multiple key/certificate pairs, use the --tls-sni-cert-key multiple times. Examples: "example.crt,example.key" or "foo.crt,foo.key:*.foo.com,foo.com". (default [])
2022-07-08 15:54:01.593 GMT -v, --v Level number for the log level verbosity
2022-07-08 15:54:01.593 GMT --version Show version
2022-07-08 15:54:01.593 GMT --vmodule moduleSpec comma-separated list of pattern=N settings for file-filtered logging
2022-07-08 15:54:01.593 GMT
2022-07-08 15:54:01.688 GMT panic: failed to get delegated authentication kubeconfig: failed to get delegated authentication kubeconfig: open /var/run/secrets/kubernetes.io/serviceaccount/token: permission denied
goroutine 1 [running]:
main.main()
/go/src/sigs.k8s.io/metrics-server/cmd/metrics-server/metrics-server.go:39 +0xfc

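From the logs it looks like some kind of permissions problem, but metrics-server is a default GKE service rather than something I configured, so I'm confused about why it lacks permission and I'm not sure how to fix it.

Any help would be appreciated.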

This problem may be caused by the version of the node pool(s) running in your GKE cluster.

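Note that node pool(s) "nodepool_name" are running a version more than two minor versions older than the control plane. The Kubernetes version and version skew support policy only guarantees that the control plane is compatible with nodes up to two minor versions older; clusters where the control plane and any nodes are further apart may run into problems.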

This version skew can arise because the control plane is upgraded regularly while auto-upgrade is disabled at the node pool level.
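To make the two-minor-version rule concrete, here is a minimal Python sketch of the skew check. The helper names and the version strings in the usage line are hypothetical; substitute whatever versions your own control plane and node pool report. It only covers nodes that are older than the control plane, which is the case described here.

```python
import re
from typing import Tuple

# GKE version strings look like "1.22.8-gke.202"; only major.minor matter for skew.
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)")


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a Kubernetes/GKE version string."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def exceeds_supported_skew(control_plane: str, node: str, max_skew: int = 2) -> bool:
    """Return True if the node is more than `max_skew` minor versions behind."""
    cp_major, cp_minor = major_minor(control_plane)
    node_major, node_minor = major_minor(node)
    if cp_major != node_major:
        # A differing major version is treated as out of policy in this sketch.
        return True
    return cp_minor - node_minor > max_skew


if __name__ == "__main__":
    # Hypothetical versions, for illustration only.
    print(exceeds_supported_skew("1.22.8-gke.202", "1.19.16-gke.100"))
```

If this returns True for your cluster, the node pool is outside the supported skew and is a likely suspect.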

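Fixing a node pool whose Kubernetes version is incompatible requires upgrading it. To do that, you can follow the GKE documentation on upgrading node pools.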
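As a rough illustration of what the upgrade step involves, the sketch below assembles a `gcloud container clusters upgrade` invocation for a single node pool. The function name, cluster name, and zone are hypothetical, and it assumes a zonal cluster (a regional cluster would use `--region` instead of `--zone`). As far as I know, omitting `--cluster-version` upgrades the node pool to the version the control plane is running, but check the gcloud reference before relying on that.

```python
import shlex
from typing import List, Optional


def node_pool_upgrade_command(
    cluster: str,
    node_pool: str,
    zone: str,
    cluster_version: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) the gcloud command that upgrades one node pool."""
    command = [
        "gcloud", "container", "clusters", "upgrade", cluster,
        "--node-pool", node_pool,
        "--zone", zone,  # assumption: zonal cluster
    ]
    if cluster_version is not None:
        # Optional explicit target version for the nodes.
        command += ["--cluster-version", cluster_version]
    return command


if __name__ == "__main__":
    # Hypothetical names; print the command so it can be reviewed before running.
    print(shlex.join(node_pool_upgrade_command("my-cluster", "nodepool_name", "us-central1-a")))
```

Printing the command rather than executing it keeps the sketch side-effect free; you can paste the output into a shell once you have checked it.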

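If you are worried about disrupting the workloads running on the node pool, take a look at the "Migrating workloads to different machine types" tutorial. That approach should let you create a node pool with a suitable version and migrate the workloads to it, minimizing downtime for your application.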
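The migration in that tutorial boils down to cordoning every node in the old pool and then draining them one by one, once the new pool exists. A minimal sketch of that ordering follows; the function and node names are hypothetical, and the drain flags shown are an assumption about what your workloads need rather than a required set. You can list the old pool's nodes with `kubectl get nodes -l cloud.google.com/gke-nodepool=nodepool_name`.

```python
from typing import Iterable, List


def migration_commands(old_pool_nodes: Iterable[str]) -> List[List[str]]:
    """Return kubectl commands: cordon all old nodes first, then drain each."""
    nodes = list(old_pool_nodes)
    # Cordon everything first so evicted pods cannot land on another old node.
    cordon = [["kubectl", "cordon", node] for node in nodes]
    # Then evict the pods; DaemonSet pods are skipped and emptyDir data is discarded.
    drain = [
        ["kubectl", "drain", node, "--ignore-daemonsets", "--delete-emptydir-data"]
        for node in nodes
    ]
    return cordon + drain


if __name__ == "__main__":
    # Hypothetical node names, for illustration only.
    for command in migration_commands(["old-node-a", "old-node-b"]):
        print(" ".join(command))
```

Cordoning all nodes before draining any of them is what keeps rescheduled pods off the old pool.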
