AWS X-Ray SDK cannot read environment variables in a Docker container



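AWS X-Ray is a tracing service that lets you trace requests through a distributed system and even profile your services. X-Ray essentially instruments your service and sends data for each request over UDP to a daemon, which collects that data and forwards it to AWS.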

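When this daemon runs locally or on EC2, it is local to the machine running the service and is available on port 2000. This is the default configuration for the daemon's host address.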

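When running in Kubernetes, you need to set up a daemon that runs on each node. According to the documentation for setting up X-Ray with Kubernetes, you can override the default either by setting the environment variable AWS_XRAY_DAEMON_ADDRESS to the desired host, or by setting the JVM system property com.amazonaws.xray.emitters.daemonAddress. The SDK documentation mentions this as well.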
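For reference, the system-property route can also be taken from code rather than with a -D flag. The following is a minimal sketch, not something taken from the X-Ray documentation: the class name DaemonAddressProperty and the helper setIfAbsent are hypothetical, and it assumes the call runs before the SDK first reads its configuration.

```java
public final class DaemonAddressProperty {
    // Property key as named in the documentation referenced above.
    static final String KEY = "com.amazonaws.xray.emitters.daemonAddress";

    /** Sets the property only if nothing (for example a -D flag) has set it already. */
    static String setIfAbsent(String address) {
        if (System.getProperty(KEY) == null) {
            System.setProperty(KEY, address);
        }
        return System.getProperty(KEY);
    }

    public static void main(String[] args) {
        // 127.0.0.1:2000 is the daemon's default address; substitute your own value.
        System.out.println(KEY + " = " + setIfAbsent("127.0.0.1:2000"));
    }
}
```

The equivalent at launch time is passing -Dcom.amazonaws.xray.emitters.daemonAddress=... on the java command line.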

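Because of my use case and the way we share configuration across our organisation, I want to use the environment variable approach.

Following the documentation, we deploy this through our Helm chart: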

env:
  - name: AWS_XRAY_DAEMON_ADDRESS
    value: aws-xray-daemon.default

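By exec-ing into the pod where the service runs and running printenv, we can see that the value was set successfully at deployment time.


Problem:

When X-Ray tries to instrument and send data to the daemon, an SdkClientException is thrown: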

com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to 127.0.0.1:2000 [/127.0.0.1] failed: Connection refused (Connection refused)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1201) ~[aws-java-sdk-core-1.11.739.jar!/:na]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1147) ~[aws-java-sdk-core-1.11.739.jar!/:na]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796) ~[aws-java-sdk-core-1.11.739.jar!/:na]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764) ~[aws-java-sdk-core-1.11.739.jar!/:na]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738) ~[aws-java-sdk-core-1.11.739.jar!/:na]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698) ~[aws-java-sdk-core-1.11.739.jar!/:na]
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680) ~[aws-java-sdk-core-1.11.739.jar!/:na]
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544) ~[aws-java-sdk-core-1.11.739.jar!/:na]
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524) ~[aws-java-sdk-core-1.11.739.jar!/:na]
at com.amazonaws.services.xray.AWSXRayClient.doInvoke(AWSXRayClient.java:1607) ~[aws-java-sdk-xray-1.11.739.jar!/:na]
at com.amazonaws.services.xray.AWSXRayClient.invoke(AWSXRayClient.java:1574) ~[aws-java-sdk-xray-1.11.739.jar!/:na]
at com.amazonaws.services.xray.AWSXRayClient.invoke(AWSXRayClient.java:1563) ~[aws-java-sdk-xray-1.11.739.jar!/:na]
at com.amazonaws.services.xray.AWSXRayClient.executeGetSamplingRules(AWSXRayClient.java:800) ~[aws-java-sdk-xray-1.11.739.jar!/:na]
at com.amazonaws.services.xray.AWSXRayClient.getSamplingRules(AWSXRayClient.java:771) ~[aws-java-sdk-xray-1.11.739.jar!/:na]
at com.amazonaws.xray.strategy.sampling.pollers.RulePoller.pollRule(RulePoller.java:65) ~[aws-xray-recorder-sdk-core-2.4.0.jar!/:na]
at com.amazonaws.xray.strategy.sampling.pollers.RulePoller.lambda$start$0(RulePoller.java:46) ~[aws-xray-recorder-sdk-core-2.4.0.jar!/:na]
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[na:na]
at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source) ~[na:na]
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) ~[na:na]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:na]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[na:na]
at java.base/java.lang.Thread.run(Unknown Source) ~[na:na]
...

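This means the AWS SDK is not picking up the environment variable as the documentation suggests, and is instead using the default, 127.0.0.1:2000.

I then dug into the SDK code to see how it retrieves this variable, and found that it uses System.getenv("AWS_XRAY_DAEMON_ADDRESS"), as shown here: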

/**
 * Environment variable key used to override the address to which UDP packets will be emitted. Valid values are of the form `ip_address:port`. Takes precedence over any system property,
 * constructor value, or setter value used.
 */
public static final String DAEMON_ADDRESS_ENVIRONMENT_VARIABLE_KEY = "AWS_XRAY_DAEMON_ADDRESS";

/**
 * System property key used to override the address to which UDP packets will be emitted. Valid values are of the form `ip_address:port`. Takes precedence over any constructor or setter value
 * used.
 */
public static final String DAEMON_ADDRESS_SYSTEM_PROPERTY_KEY = "com.amazonaws.xray.emitters.daemonAddress";

public DaemonConfiguration() {
    String environmentAddress = System.getenv(DAEMON_ADDRESS_ENVIRONMENT_VARIABLE_KEY);
    String systemAddress = System.getProperty(DAEMON_ADDRESS_SYSTEM_PROPERTY_KEY);
    if (setUDPAndTCPAddress(environmentAddress)) {
        logger.info(String.format("Environment variable %s is set. Emitting to daemon on address %s.", DAEMON_ADDRESS_ENVIRONMENT_VARIABLE_KEY, getUDPAddress()));
    } else if (setUDPAndTCPAddress(systemAddress)) {
        logger.info(String.format("System property %s is set. Emitting to daemon on address %s.", DAEMON_ADDRESS_SYSTEM_PROPERTY_KEY, getUDPAddress()));
    }
}

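So I wondered whether I had set the environment variable incorrectly. I added a log statement that reads the environment variable at service startup, and found that the JVM can indeed see the value:

Code:

System.out.println("System.getenv(\"AWS_XRAY_DAEMON_ADDRESS\")" + " = " + System.getenv("AWS_XRAY_DAEMON_ADDRESS"));

Output: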

System.getenv("AWS_XRAY_DAEMON_ADDRESS") = aws-xray-daemon.default

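As far as I can tell, this code matches exactly what the AWS SDK should be running, but either the SDK code is never executed or, if it is, its result differs from what my log shows.

I cannot reproduce the problem when running locally, because there the SDK picks up the host I provide through my local environment variable. I also confirmed, using a breakpoint while running locally, that the AWS SDK code pasted above is reached.

Any ideas?


Gradle snippet: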

ext {
    ...
    springCloudVersion = "Greenwich.RELEASE"
    awsCoreVersion = '1.11.739'
    awsXrayVersion = '2.4.0'
    ...
}

dependencyManagement {
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:${springCloudVersion}"
        mavenBom "com.amazonaws:aws-java-sdk-bom:${awsCoreVersion}"
        mavenBom "com.amazonaws:aws-xray-recorder-sdk-bom:${awsXrayVersion}"
    }
}

dependencies {
    ...
    implementation "com.amazonaws:aws-java-sdk-core"
    implementation "com.amazonaws:aws-xray-recorder-sdk-core"
    implementation "com.amazonaws:aws-xray-recorder-sdk-aws-sdk"
    implementation "com.amazonaws:aws-xray-recorder-sdk-spring"
    implementation "com.amazonaws:aws-xray-recorder-sdk-apache-http"
    implementation "com.amazonaws:aws-xray-recorder-sdk-sql-postgres"
    implementation 'org.springframework.boot:spring-boot-starter-web'
    implementation 'org.springframework.boot:spring-boot-starter'
    implementation 'org.springframework.boot:spring-boot-starter-data-jpa'
    implementation 'org.springframework.boot:spring-boot-starter-security'
    ...
}

Additional information:

  • Running on Spring Boot v2.2.1
  • OpenJDK v11.0.4
  • Gradle v6.0.1

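Other attempts: I tried setting the environment variable through the Dockerfile. The result was the same.

It turns out that the blog post I linked was not a good one. In its example, they did not specify a port for the host: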

env:
  - name: AWS_XRAY_DAEMON_ADDRESS
    value: xray-service.default

Changing the environment variable to include the port fixed the problem:

env:
  - name: AWS_XRAY_DAEMON_ADDRESS
    value: xray-service.default:2000
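A startup check along the following lines might have surfaced the missing port sooner. This is only a sketch and does not reproduce the SDK's own parsing: the class DaemonAddressCheck and the method hasHostAndPort are hypothetical names, and the check assumes the simple host:port form described in the SDK's Javadoc.

```java
public final class DaemonAddressCheck {
    /** Returns true when the value looks like host:port with a port in 1..65535. */
    static boolean hasHostAndPort(String address) {
        if (address == null) {
            return false;
        }
        int colon = address.lastIndexOf(':');
        // Reject a missing colon, an empty host, or an empty port.
        if (colon <= 0 || colon == address.length() - 1) {
            return false;
        }
        try {
            int port = Integer.parseInt(address.substring(colon + 1));
            return port >= 1 && port <= 65535;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String value = System.getenv("AWS_XRAY_DAEMON_ADDRESS");
        if (value != null && !hasHostAndPort(value)) {
            // Warn loudly instead of letting the default address be used unnoticed.
            System.err.println("AWS_XRAY_DAEMON_ADDRESS=" + value + " has no valid port; expected host:port");
        }
    }
}
```

Running this at service startup with the original value, xray-service.default, prints the warning; with xray-service.default:2000 it stays silent.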
