Stackdriver Profiler fails to create profiles due to permissions when running Java on GKE



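We are trying to connect our application to Stackdriver Profiler, but it appears to fail because of a permissions problem.

We are running a Java application on GKE.

Here is the Dockerfile: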

FROM gcr.io/google-appengine/jetty
RUN mkdir -p /opt/cprof && \
  wget -q -O- https://storage.googleapis.com/cloud-profiler/java/latest/profiler_java_agent.tar.gz \
  | tar xzv -C /opt/cprof
RUN java \
  -agentpath:/opt/cprof/profiler_java_agent.so=-cprof_service=gke,-logtostderr,-minloglevel=0,-cprof_service_version=1.0.0,-cprof_gce_metadata_server_retry_sleep_sec=10,-cprof_gce_metadata_server_retry_count=12 \
  -jar "$JETTY_HOME/start.jar" --create-startd --add-to-start=gcloud,http2c --approve-all-licenses
ENV JETTY_ARGS -Djava.util.logging.config.file=WEB-INF/flex.logging.properties
ENV DBG_ENABLE true
ADD . $APP_DESTINATION_EXPLODED_WAR
ENV JAVA_USER_OPTS -XX:-OmitStackTraceInFastThrow -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseStringDeduplication -XX:+PrintStringDeduplicationStatistics -Xloggc:/tmp/logs.gc

The cluster was created with the following command:

gcloud beta container clusters create $cluster_name --machine-type=n1-highmem-8 --project=$project_id --zone=us-central1-c --scopes="cloud-platform" --num-nodes=2

We followed the steps in profiling-java to configure the Dockerfile, but the profiler fails with the following message:

Failed to create profile, will retry: 7 (The caller does not have permission)

See the full deployment log below:

12:53:42 starting build "18a3a03c-6dc8-4683-8fae-92ec22aa84a8"
12:53:42 
12:53:42 FETCHSOURCE
12:53:42 Fetching storage object: gs://project_id_cloudbuild/source/1562838709.77-78c301e261714cd2a46391b235d5edc5.tgz#1562838738548166
12:53:42 Copying gs://project_id_cloudbuild/source/1562838709.77-78c301e261714cd2a46391b235d5edc5.tgz#1562838738548166...
12:53:42 / [1 files][239.8 MiB/239.8 MiB]
12:53:42 Operation completed over 1 objects/239.8 MiB.                                    
12:53:42 BUILD
12:53:42 Already have image (with digest): gcr.io/cloud-builders/docker
12:53:42 Sending build context to Docker daemon  396.6MB
12:53:42 Step 1/8 : FROM gcr.io/google-appengine/jetty
12:53:42 latest: Pulling from google-appengine/jetty
12:53:42 Digest: sha256:7e37b8561b2f25660d1aa492dea4f09a6121fe7f8b7f6b2e9f8c65e1cf33328e
12:53:42 Status: Downloaded newer image for gcr.io/google-appengine/jetty:latest
12:53:42  ---> dac4353b3a0c
12:53:42 Step 2/8 : RUN mkdir -p /opt/cprof &&   wget -q -O- https://storage.googleapis.com/cloud-profiler/java/latest/profiler_java_agent.tar.gz   | tar xzv -C /opt/cprof
12:53:42  ---> Running in 3799f9af9c33
12:53:42 NOTICES
12:53:42 profiler_java_agent.so
12:53:42 Removing intermediate container 3799f9af9c33
12:53:42  ---> 8cd480829757
12:53:42 Step 3/8 : RUN java    -agentpath:/opt/cprof/profiler_java_agent.so=-cprof_service=gke,-logtostderr,-minloglevel=0,-cprof_service_version=1.0.0,-cprof_gce_metadata_server_retry_sleep_sec=10,-cprof_gce_metadata_server_retry_count=12    -jar "$JETTY_HOME/start.jar" --create-startd --add-to-start=gcloud,http2c --approve-all-licenses
12:53:42  ---> Running in bad2457cd65c
12:53:42 I0711 09:52:57.100730     7 entry.cc:268] Profiler agent loaded
12:53:42 I0711 09:52:57.105134     7 entry.cc:154] Prepare JVMTI
12:53:42 I0711 09:52:57.301863     7 entry.cc:108] On VM init
12:53:42 I0711 09:52:57.304172     7 cloud_env.cc:136] Project ID is not set via flag or environment, will get from the metadata server
12:53:42 I0711 09:52:57.304981     7 throttler_api.cc:269] Will use profiler service cloudprofiler.googleapis.com to create and upload profiles
12:53:42 I0711 09:52:57.315691    15 throttler_api.cc:202] Initialized deployment: project_id=project_id, service=gke, service_version=1.0.0, zone_name=us-central1-f
12:53:42 I0711 09:52:57.320428    15 throttler_api.cc:302] Creating a new profile via profiler service
12:53:42 W0711 09:52:57.539041    15 throttler_api.cc:382] Failed to create profile, will retry: 7 (The caller does not have permission)
12:53:42 INFO  : All Licenses Approved via Command Line Option
12:53:42 INFO  : gcloud          initialized in ${jetty.base}/start.d/gcloud.ini
12:53:42 INFO  : http2c          initialized in ${jetty.base}/start.d/http2c.ini
12:53:42 INFO  : Base directory was modified
12:53:42 I0711 09:52:58.506006     7 entry.cc:143] On VM death
12:53:42 I0711 09:53:25.504830    15 throttler_api.cc:302] Creating a new profile via profiler service
12:53:42 I0711 09:53:25.505025    15 worker.cc:177] Exiting the profiling loop
12:53:42 Removing intermediate container bad2457cd65c
12:53:42  ---> e9a969345ed1
12:53:42 Step 4/8 : ENV JETTY_ARGS -Djava.util.logging.config.file=WEB-INF/flex.logging.properties
12:53:42  ---> Running in d3e4b6456694
12:53:42 Removing intermediate container d3e4b6456694
12:53:42  ---> 52deb5efe19d
12:53:42 Step 5/8 : ENV DBG_ENABLE true
12:53:42  ---> Running in 6bd2b94ae0fc
12:53:42 Removing intermediate container 6bd2b94ae0fc
12:53:42  ---> 9800d1e2d074
12:53:42 Step 6/8 : ADD . $APP_DESTINATION_EXPLODED_WAR
12:53:42  ---> e1d7881f4558
12:53:42 Step 7/8 : ENV JAVA_USER_OPTS -XX:-OmitStackTraceInFastThrow -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseStringDeduplication -XX:+PrintStringDeduplicationStatistics -Xloggc:/tmp/logs.gc
12:53:42  ---> Running in 81247ccde74f
12:53:42 Removing intermediate container 81247ccde74f
12:53:42  ---> d66dbce64869
12:53:42 Step 8/8 : ENV GCLOUD_PROJECT project_id
12:54:07  ---> Running in af00f02783b7
12:54:07 Removing intermediate container af00f02783b7
12:54:07  ---> 77826e33ef3c
12:54:07 Successfully built 77826e33ef3c
12:54:07 Successfully tagged gcr.io/project_id/ram-image:1
12:54:07 PUSH
12:54:07 Pushing gcr.io/project_id/ram-image:1
12:54:07 The push refers to repository [gcr.io/project_id/ram-image]

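The service account we use for deployment already has roles/cloudprofiler.agent, but it still fails. Any idea which permission we are missing?

Update: the GKE nodes use the default Compute Engine service account. I added roles/cloudprofiler.agent to it, but I still get the same error.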

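I ran into the same problem. In my case, this particular GKE service has its own service account with access to BigQuery datasets. The key is mounted inside the container at /var/secrets/google/key.json, and the GOOGLE_APPLICATION_CREDENTIALS environment variable points to it so that the application's BigQuery library can find it at startup.

The problem here may be that the profiler agent also appears to check the path set in this variable when it starts. If it finds a key there, it uses that key for all cloudprofiler.googleapis.com requests. Some information about this is available here: https://cloud.google.com/profiler/docs/profiling-external#using_service_accounts

One solution is to grant this service account the additional permissions it needs to access the Profiler service.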
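To see which identity the agent is likely to pick up in that situation, something like the following could help. It is only a sketch: it assumes the key file is a standard service account JSON key with a "client_email" field, the function names are my own, and the second function only prints the gcloud command for review instead of running it.

```shell
#!/bin/sh
# Print the service account email stored in a JSON key file.
# Assumption: the file is a standard service account key that contains
# a "client_email" field.
sa_email_from_key() {
  sed -n 's/.*"client_email"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Print (do not run) the command that would grant the Profiler agent role.
# $1 = project ID, $2 = service account email; both are caller-supplied
# placeholders, not values taken from the question.
profiler_binding_cmd() {
  printf 'gcloud projects add-iam-policy-binding %s --member=serviceAccount:%s --role=roles/cloudprofiler.agent\n' "$1" "$2"
}

# Example, run inside the container:
#   email=$(sa_email_from_key "$GOOGLE_APPLICATION_CREDENTIALS")
#   profiler_binding_cmd my-project-id "$email"
```

If the printed email is not the account you granted roles/cloudprofiler.agent to, that would explain the "caller does not have permission" error even though the node service account looks correct.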

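Also, if that is not the case, you should check the permissions of the default service account attached to the GKE nodes. More information about this: https://cloud.google.com/kubernetes-engine/docs/how-to/hardening-your-cluster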

Each GKE node has an IAM Service Account associated with it. By default, nodes are given the Compute Engine default service account, which you can find by navigating to the IAM section of the Cloud Console. 
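A rough way to run that check from the command line is sketched below. The function names are made up for illustration, the cluster name, zone and project ID are whatever you pass in, and it only looks at project-level IAM bindings, so treat a negative result as a hint and not as proof.

```shell
#!/bin/sh
# Succeeds if the role list on stdin (one role per line) contains the
# Profiler agent role mentioned in the question.
has_profiler_role() {
  grep -qxF 'roles/cloudprofiler.agent'
}

# Look up the node service account and test its project-level roles.
# $1 = cluster name, $2 = zone, $3 = project ID (all caller-supplied).
check_node_sa() {
  sa=$(gcloud container clusters describe "$1" --zone="$2" --project="$3" \
        --format='value(nodeConfig.serviceAccount)')
  echo "node service account: $sa"
  # Assumption: the value "default" stands for the Compute Engine default
  # service account, whose email has to be looked up in the IAM page.
  if [ "$sa" = "default" ]; then
    echo "check the Compute Engine default service account in IAM"
    return 0
  fi
  gcloud projects get-iam-policy "$3" --flatten='bindings[].members' \
      --filter="bindings.members:serviceAccount:$sa" \
      --format='value(bindings.role)' | has_profiler_role
}

# Example:
#   check_node_sa "$cluster_name" us-central1-c "$project_id" && echo "role present"
```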

Hope this helps someone.
