podTemplate应该是什么样子才能在Kubernetes中实现Jenkins代码驱动的构建代理



我有一个node.js服务,我想用Jenkins在Kubernetes中构建它,使用node.js项目指定的Jenkins代理pod。我正在尝试消除对Jenkins UI的手动触摸。所有的东西都在一个kubernetes集群中运行。

我正在关注这个博客并对其进行一些调整,但遇到了问题:

  1. 我得到一个错误‘Jenkins’ doesn’t have label ‘test-pod’
  2. 作业无限循环

构建代理已在Kubernetes中成功创建。test-pod标签是由Jenkinsfile指定的,所以我不知道为什么会出现这个错误。它是如何无限循环的?

podTemplate(
name: 'test-pod',
label: 'test-pod',
containers: [
containerTemplate(name: 'node14', image: 'node:14-alpine'),
containerTemplate(name: 'docker', image:'trion/jenkins-docker-client'),
],
{
node('test-pod') {
stage('Build'){
container('node14') {
// do nothing just yet
}
}
}
}
)

以下是Jenkins控制台输出的一部分:

Started by user admin
Obtained Jenkinsfile from git ssh://git@kube-master.cluster.dev/git/hello.git
Running in Durability level: MAX_SURVIVABILITY
[Pipeline] Start of Pipeline
[Pipeline] podTemplate
[Pipeline] {
[Pipeline] node
Created Pod: kubernetes jenkins/test-pod-2hdfp-9kcjj
[Normal][jenkins/test-pod-2hdfp-9kcjj][Scheduled] Successfully assigned jenkins/test-pod-2hdfp-9kcjj to kube-worker2.cluster.dev
[Normal][jenkins/test-pod-2hdfp-9kcjj][Pulled] Container image "node:14-alpine" already present on machine
[Normal][jenkins/test-pod-2hdfp-9kcjj][Created] Created container node14
[Normal][jenkins/test-pod-2hdfp-9kcjj][Started] Started container node14
[Normal][jenkins/test-pod-2hdfp-9kcjj][Pulled] Container image "trion/jenkins-docker-client" already present on machine
[Normal][jenkins/test-pod-2hdfp-9kcjj][Created] Created container docker
[Normal][jenkins/test-pod-2hdfp-9kcjj][Started] Started container docker
[Normal][jenkins/test-pod-2hdfp-9kcjj][Pulled] Container image "jenkins/inbound-agent:4.3-4" already present on machine
[Normal][jenkins/test-pod-2hdfp-9kcjj][Created] Created container jnlp
[Normal][jenkins/test-pod-2hdfp-9kcjj][Started] Started container jnlp
jenkins/test-pod-2hdfp-9kcjj Container node14 was terminated (Exit Code: 0, Reason: Completed)
[Normal][jenkins/test-pod-2hdfp-9kcjj][Killing] Stopping container docker
Created Pod: kubernetes jenkins/test-pod-2hdfp-gc2qb
[Normal][jenkins/test-pod-2hdfp-gc2qb][Scheduled] Successfully assigned jenkins/test-pod-2hdfp-gc2qb to kube-worker2.cluster.dev
[Normal][jenkins/test-pod-2hdfp-gc2qb][Pulled] Container image "node:14-alpine" already present on machine
[Normal][jenkins/test-pod-2hdfp-gc2qb][Created] Created container node14
[Normal][jenkins/test-pod-2hdfp-gc2qb][Started] Started container node14
[Normal][jenkins/test-pod-2hdfp-gc2qb][Pulled] Container image "trion/jenkins-docker-client" already present on machine
[Normal][jenkins/test-pod-2hdfp-gc2qb][Created] Created container docker
[Normal][jenkins/test-pod-2hdfp-gc2qb][Started] Started container docker
[Normal][jenkins/test-pod-2hdfp-gc2qb][Pulled] Container image "jenkins/inbound-agent:4.3-4" already present on machine
[Normal][jenkins/test-pod-2hdfp-gc2qb][Created] Created container jnlp
[Normal][jenkins/test-pod-2hdfp-gc2qb][Started] Started container jnlp
jenkins/test-pod-2hdfp-gc2qb Container node14 was terminated (Exit Code: 0, Reason: Completed)
Still waiting to schedule task
‘Jenkins’ doesn’t have label test-pod’
[Normal][jenkins/test-pod-2hdfp-gc2qb][Killing] Stopping container docker
Created Pod: kubernetes jenkins/test-pod-2hdfp-xwkm2
[Normal][jenkins/test-pod-2hdfp-xwkm2][Scheduled] Successfully assigned jenkins/test-pod-2hdfp-xwkm2 to kube-worker2.cluster.dev
[Normal][jenkins/test-pod-2hdfp-xwkm2][Pulled] Container image "node:14-alpine" already present on machine
[Normal][jenkins/test-pod-2hdfp-xwkm2][Created] Created container node14
[Normal][jenkins/test-pod-2hdfp-xwkm2][Started] Started container node14
[Normal][jenkins/test-pod-2hdfp-xwkm2][Pulled] Container image "trion/jenkins-docker-client" already present on machine
[Normal][jenkins/test-pod-2hdfp-xwkm2][Created] Created container docker
[Normal][jenkins/test-pod-2hdfp-xwkm2][Started] Started container docker
[Normal][jenkins/test-pod-2hdfp-xwkm2][Pulled] Container image "jenkins/inbound-agent:4.3-4" already present on machine
[Normal][jenkins/test-pod-2hdfp-xwkm2][Created] Created container jnlp
[Normal][jenkins/test-pod-2hdfp-xwkm2][Started] Started container jnlp
jenkins/test-pod-2hdfp-xwkm2 Container node14 was terminated (Exit Code: 0, Reason: Completed)
[Normal][jenkins/test-pod-2hdfp-xwkm2][Killing] Stopping container docker
Created Pod: kubernetes jenkins/test-pod-2hdfp-4ltq3
[Normal][jenkins/test-pod-2hdfp-4ltq3][Scheduled] Successfully assigned jenkins/test-pod-2hdfp-4ltq3 to kube-worker2.cluster.dev
[Normal][jenkins/test-pod-2hdfp-4ltq3][Pulled] Container image "node:14-alpine" already present on machine
[Normal][jenkins/test-pod-2hdfp-4ltq3][Created] Created container node14
[Normal][jenkins/test-pod-2hdfp-4ltq3][Started] Started container node14
[Normal][jenkins/test-pod-2hdfp-4ltq3][Pulled] Container image "trion/jenkins-docker-client" already present on machine
[Normal][jenkins/test-pod-2hdfp-4ltq3][Created] Created container docker
[Normal][jenkins/test-pod-2hdfp-4ltq3][Started] Started container docker
[Normal][jenkins/test-pod-2hdfp-4ltq3][Pulled] Container image "jenkins/inbound-agent:4.3-4" already present on machine
[Normal][jenkins/test-pod-2hdfp-4ltq3][Created] Created container jnlp
[Normal][jenkins/test-pod-2hdfp-4ltq3][Started] Started container jnlp
jenkins/test-pod-2hdfp-4ltq3 Container node14 was terminated (Exit Code: 0, Reason: Completed)
[Normal][jenkins/test-pod-2hdfp-4ltq3][Killing] Stopping container docker
Created Pod: kubernetes jenkins/test-pod-2hdfp-0216w
...

更新最新发现

主日志(请参阅调试(没有提供太多其他内容:

...
2021-04-30 11:52:42.715+0000 [id=4660]  INFO    hudson.slaves.NodeProvisioner#lambda$update$6: test-pod-gb4vq-hf3d4 provisioning successfully completed. We have now 2 computer(s)
2021-04-30 11:52:42.741+0000 [id=4659]  INFO    o.c.j.p.k.KubernetesLauncher#launch: Created Pod: kubernetes jenkins/test-pod-gb4vq-hf3d4
2021-04-30 11:52:42.847+0000 [id=4680]  WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: test-pod-gb4vq-pdd69, template=PodTemplate{id='f29ecbdd-9c1d-468f-86ff-dd46ff40f306', name='test-pod-gb4vq', namespace='jenkins', label='test-pod', containers=[ContainerTemplate{name='node14', image='node:14-alpine'}, ContainerTemplate{name='docker', image='trion/jenkins-docker-client'}], annotations=[PodAnnotation{key='buildUrl', value='http://172.16.1.12/job/hello/14/'}, PodAnnotation{key='runUrl', value='job/hello/14/'}]}
java.lang.IllegalStateException: Pod is no longer available: jenkins/test-pod-gb4vq-pdd69
...

只是它表明容器正在启动,然后出现故障。这个循环似乎是因为Kubernetes插件中的错误处理没有正确捕获它并使作业失败。

通过观察构建pod(使用k9s(,我能够捕获pod的日志,Unknown client name听起来也像是由快速容器终止引起的:

jnlp INFO: [JNLP4-connect connection to 172.16.1.12/172.16.1.12:50000] Local headers refused by remote: Unknown client name: test-pod-34sd7-5xhs2                                                                                              
jnlp Apr 29, 2021 10:42:15 PM hudson.remoting.jnlp.Main$CuiListener status                                                                                                                                                                      
jnlp INFO: Protocol JNLP4-connect encountered an unexpected exception                                                                                                                                                                           
jnlp java.util.concurrent.ExecutionException: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Unknown client name: test-pod-34sd7-5xhs2
jnlp     at org.jenkinsci.remoting.util.SettableFuture.get(SettableFuture.java:223)                                                                                                                                                             
jnlp     at hudson.remoting.Engine.innerRun(Engine.java:743)                                                                                                                                                                                    
jnlp     at hudson.remoting.Engine.run(Engine.java:518)                                                                                                                                                                                         
jnlp Caused by: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Unknown client name: test-pod-34sd7-5xhs2                                                                                                                     
jnlp     at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.newAbortCause(ConnectionHeadersFilterLayer.java:378)

刚刚发现类似的问题

这很有用:我在label之后将podRetention: always(),添加到podTemplate(),这样代理pod就不会终止,并且它们显示Error

发现良好

由于上面保留了错误的pod,我现在可以找到/var/log/containers/<failed pod>.log,它让我找到了根本原因。

2021-04-30T08:59:36.047989534-04:00 stderr F java.net.UnknownHostException: updates.jenkins.io

这是因为dnsPolicy将DNS限制为仅用于集群查找。解决方法是将hostNetwork: true添加到label旁边的podTemplate()

接下来,博客推荐的图片trion/jenkins-docker-client是客户端和服务器,所以它是错误的图片。

切换到jenkins/agent会产生一个新问题。吊舱现在上上下下什么都不做,甚至不记录。我怀疑这是一个启动参数问题。

现在很明显,我甚至不应该在Jenkinsfile中有Jenkins容器,因为Kubernetes插件将自动启动JNLP容器。

这意味着问题最终出现在node14容器上——它要么立即出错,要么立即找不到要做的事情并终止。

错误处理很难理解和解决,博客也是错误的。

从最低限度的工作代理开始Jenkinsfile:

podTemplate(
name: 'build-pod',
namespace: 'jenkins',
podRetention: always(),  // for debugging
{
node(POD_LABEL) {
stage('Build') {
sh "echo hello"
}
}
}
)

从那里开始,使用containersvolumescontainer构建部分等对其进行扩展,一步一个脚印。

使用日志进行故障排除:

kubectl get pods -n jenkins列出吊舱名称,然后kubectl logs -f <jenkins-pod> -n jenkins

(假设jenkins是您的Kubernetes命名空间(

最新更新