Detecting Jinja2 variables inside if statements using Python



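From the file below, I want to extract only the if statement blocks and iterate over them. I also want to keep only those blocks that contain image: as a key.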

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: {{ template "fullname" . }}
labels:
app: {{ template "fullname" . }}
chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
release: "{{ .Release.Name }}"
heritage: "{{ .Release.Service }}"
spec:
replicas: {{ .Values.replicas }}
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 1
minReadySeconds: 5
template:
metadata:
labels:
name: {{ template "fullname" . }}
app: {{ template "fullname" . }}
spec:
{{- if .Values.pvc.enabled }}
volumes:
- name: {{ template "fullname" . }}
persistentVolumeClaim:
claimName: {{ template "claimname" . }}
{{- end }}
{{- if .Values.k8swait.enabled }}
serviceAccountName: {{ template "fullname" . }}-admin
initContainers:
- env:
- name: CLUSTER
value: "{{ .Values.k8swait.parameters.cluster}}"
- name: NAMESPACE
value: "{{ .Release.Namespace }}"
- name: RESOURCE
value: "{{ .Values.k8swait.parameters.resource}}"
- name: RNAME
value: "{{ .Values.k8swait.job.jobname }}"
- name: TIMEOUT
value: "{{ .Values.k8swait.parameters.timeout}}"
- name: FREQUENCE
value: "{{ .Values.k8swait.parameters.frequence}}"
name: {{ .Values.k8swait.parameters.name}}
image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
resources:
limits:
cpu: "{{ .Values.resources.limits.cpu }}"
memory: "{{ .Values.resources.limits.memory }}"
requests:
cpu: "{{ .Values.resources.requests.cpu }}"
memory: "{{ .Values.resources.requests.memory }}"
{{- end }}
securityContext:
runAsUser: 1000
fsGroup: 1000
containers:
- name: {{ template "fullname" . }}
image: "{{ .Values.global.registry }}/{{ .Values.image.repository }}:{{ .Values.image.tag }}"
imagePullPolicy: {{ default "" .Values.imagePullPolicy | quote }}
ports:
- name: http
containerPort: 9000
{{- if .Values.pvc.enabled }}
image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
volumeMounts:
- mountPath: /BACKUP
name: "{{ template "fullname" . }}"
{{- end }}

Expected output:

{{- if .Values.k8swait.enabled }}
serviceAccountName: {{ template "fullname" . }}-admin
initContainers:
- env:
- name: CLUSTER
value: "{{ .Values.k8swait.parameters.cluster}}"
- name: NAMESPACE
value: "{{ .Release.Namespace }}"
- name: RESOURCE
value: "{{ .Values.k8swait.parameters.resource}}"
- name: RNAME
value: "{{ .Values.k8swait.job.jobname }}"
- name: TIMEOUT
value: "{{ .Values.k8swait.parameters.timeout}}"
- name: FREQUENCE
value: "{{ .Values.k8swait.parameters.frequence}}"
name: {{ .Values.k8swait.parameters.name}}
image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
resources:
limits:
cpu: "{{ .Values.resources.limits.cpu }}"
memory: "{{ .Values.resources.limits.memory }}"
requests:
cpu: "{{ .Values.resources.requests.cpu }}"
memory: "{{ .Values.resources.requests.memory }}"
{{- end }}
{{- if .Values.pvc.enabled }}
image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
volumeMounts:
- mountPath: /BACKUP
name: "{{ template "fullname" . }}"
{{- end }}

I tried the following code, but it does not work:

with open(args.dataFileName) as fd:
    data = fd.read()
match = re.findall(r'{{-?\s?if .+ end\s?}}', data, re.DOTALL)

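As you can see, the desired output contains only the if blocks that have image: as a key inside them. Any hints on how to achieve this with a regular expression?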
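One likely reason the attempt above does not behave as hoped is that `.+` is greedy: combined with `re.DOTALL`, it runs from the first `if` to the last `end` in the file. The sketch below illustrates this on a small made-up sample (the `sample` string and the names `greedy` and `lazy` are assumptions for illustration, not part of the original file).

```python
import re

# Made-up sample with two independent if blocks.
sample = (
    "{{- if .Values.a }}\n"
    "image: one\n"
    "{{- end }}\n"
    "plain: text\n"
    "{{- if .Values.b }}\n"
    "name: two\n"
    "{{- end }}\n"
)

# Greedy ".+" swallows everything up to the last "end".
greedy = re.findall(r"\{\{-?\s?if .+ end\s?\}\}", sample, re.DOTALL)
# Lazy ".+?" stops at the first "end" it can reach.
lazy = re.findall(r"\{\{-?\s?if .+? end\s?\}\}", sample, re.DOTALL)

print(len(greedy))  # a single match spanning both blocks
print(len(lazy))    # one match per block
```

The lazy form still pairs each `if` with the nearest `end`, so it is only reliable when blocks are not nested.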

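The limitation of this regular expression is that it only works when the if blocks are not nested.

Also, I am only familiar with Jinja2, where if blocks are written as {% if %}{% endif %}. So I am following your lead and looking for {{-?\s*if\s*}} and {{-?\s*end\s*}}. If that is not correct, it is easy to remedy.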

import re
text = """apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: {{ template "fullname" . }}
labels:
app: {{ template "fullname" . }}
chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
release: "{{ .Release.Name }}"
heritage: "{{ .Release.Service }}"
spec:
replicas: {{ .Values.replicas }}
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 1
minReadySeconds: 5
template:
metadata:
labels:
name: {{ template "fullname" . }}
app: {{ template "fullname" . }}
spec:
{{- if .Values.pvc.enabled }}
volumes:
- name: {{ template "fullname" . }}
persistentVolumeClaim:
claimName: {{ template "claimname" . }}
{{- end }}
{{- if .Values.k8swait.enabled }}
serviceAccountName: {{ template "fullname" . }}-admin
initContainers:
- env:
- name: CLUSTER
value: "{{ .Values.k8swait.parameters.cluster}}"
- name: NAMESPACE
value: "{{ .Release.Namespace }}"
- name: RESOURCE
value: "{{ .Values.k8swait.parameters.resource}}"
- name: RNAME
value: "{{ .Values.k8swait.job.jobname }}"
- name: TIMEOUT
value: "{{ .Values.k8swait.parameters.timeout}}"
- name: FREQUENCE
value: "{{ .Values.k8swait.parameters.frequence}}"
name: {{ .Values.k8swait.parameters.name}}
image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
resources:
limits:
cpu: "{{ .Values.resources.limits.cpu }}"
memory: "{{ .Values.resources.limits.memory }}"
requests:
cpu: "{{ .Values.resources.requests.cpu }}"
memory: "{{ .Values.resources.requests.memory }}"
{{- end }}
securityContext:
runAsUser: 1000
fsGroup: 1000
containers:
- name: {{ template "fullname" . }}
image: "{{ .Values.global.registry }}/{{ .Values.image.repository }}:{{ .Values.image.tag }}"
imagePullPolicy: {{ default "" .Values.imagePullPolicy | quote }}
ports:
- name: http
containerPort: 9000
{{- if .Values.pvc.enabled }}
image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
volumeMounts:
- mountPath: /BACKUP
name: "{{ template "fullname" . }}"
{{- end }}"""
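start_if = r'{{-?\s*if\s*[^}]+}}'  # opening tag, e.g. {{- if ... }}
end_if = r'{{-?\s*end\s*}}'  # closing tag, e.g. {{- end }}
# Lazily capture everything between an opening tag and the nearest closing tag.
regex = re.compile(f'{start_if}(.*?){end_if}', flags=re.DOTALL)
# Keep only the blocks whose body contains an image: key.
matches = [m.group(0) for m in regex.finditer(text) if 'image: ' in m.group(1)]
for match in matches:
    print(match)
    print()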

Prints:

{{- if .Values.k8swait.enabled }}
serviceAccountName: {{ template "fullname" . }}-admin
initContainers:
- env:
- name: CLUSTER
value: "{{ .Values.k8swait.parameters.cluster}}"
- name: NAMESPACE
value: "{{ .Release.Namespace }}"
- name: RESOURCE
value: "{{ .Values.k8swait.parameters.resource}}"
- name: RNAME
value: "{{ .Values.k8swait.job.jobname }}"
- name: TIMEOUT
value: "{{ .Values.k8swait.parameters.timeout}}"
- name: FREQUENCE
value: "{{ .Values.k8swait.parameters.frequence}}"
name: {{ .Values.k8swait.parameters.name}}
image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
resources:
limits:
cpu: "{{ .Values.resources.limits.cpu }}"
memory: "{{ .Values.resources.limits.memory }}"
requests:
cpu: "{{ .Values.resources.requests.cpu }}"
memory: "{{ .Values.resources.requests.memory }}"
{{- end }}
{{- if .Values.pvc.enabled }}
image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
volumeMounts:
- mountPath: /BACKUP
name: "{{ template "fullname" . }}"
{{- end }}

See demo

Even if you have nested if statements, you can still do this with regular expressions, and parsing the text file this way will be fast:
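If the file used Jinja2 syntax instead, the same approach could be adapted by swapping the delimiters. The following is a minimal sketch under that assumption; the `jinja_text` sample and the variable names in it are made up for illustration, and like the pattern above it does not handle nested blocks.

```python
import re

# Hypothetical Jinja2-style input, made up for illustration.
jinja_text = """\
{% if pvc_enabled %}
volumes: []
{% endif %}
{%- if wait_enabled -%}
image: "registry/repo:tag"
{%- endif %}
"""

# Opening tag: {% if ... %}, optionally with whitespace-control dashes.
start_if = r"\{%-?\s*if\s[^%]*%\}"
# Closing tag: {% endif %}, optionally with whitespace-control dashes.
end_if = r"\{%-?\s*endif\s*-?%\}"
jinja_regex = re.compile(f"{start_if}(.*?){end_if}", flags=re.DOTALL)

# Keep only the blocks whose body mentions image:
jinja_matches = [
    m.group(0) for m in jinja_regex.finditer(jinja_text) if "image:" in m.group(1)
]
for block in jinja_matches:
    print(block)
```

Note that `[^%]*` assumes the condition itself contains no `%` character, for example a modulo operator.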

import re
code = """
some text ....some text ....some text ....
some text ....some text ....some text ....
{{- if .Values.pvc.enabled [don't extract this 0]}}
some text ....
{{- end }}
{{- if .Values.k8swait.enabled  [extract this 1}}
some text ....
image:
{{- end }}
{{- if .Values.k8swait.enabled [extract this 2]}}
some text ....
image: 000
{{- if [extract this 3]}}
image: 000 
{{- end }}
{{- end }}
{{- if .Values.k8swait.enabled  [extract this 4}}
some text ....
image:
{{- end }}
{{- if .Values.k8swait.enabled [extract this 5]}}
some text ....
image: 000
{{- if [don't extract this sub if 6 ]}}
{{- end }}
{{- end }}
"""

def extract_image_if_statement(text):
    # Matches a chain of "if" openers (group 1) followed by the last
    # remaining if ... end block in the text (group 2).
    sub_if = re.compile(r"((?:{{-\s*if.+?)+)({{-\s*if.+?end\s*}})", re.DOTALL)
    # Matches the if blocks left over after the first pattern has run.
    outer_if = re.compile(r"{{-\s*if.+?end\s*}}", re.DOTALL)
    # Matches a placeholder of the form #index# pointing into the expression list.
    get_if = re.compile(r"#(\d+)#")
    # Extracted blocks; nested blocks appear as placeholders inside their parents.
    expression = []
    index = 0

    def extract_if(pattern, repl, index_group):
        """
        Move each if block into `expression` and replace it in the text
        with the placeholder #index_in_expression_list#.
        index_group is the group holding the target if block, because the
        two patterns differ. repl contains {} for the current index.
        """
        nonlocal text, index
        m = pattern.search(text)
        while m:
            expression.append(m.group(index_group))
            # Replace only the block just recorded so each block gets its own index.
            text = pattern.sub(repl.format(index), text, count=1)
            index += 1
            m = pattern.search(text)
        return index

    def build_if_statement(exp):
        """Expand placeholders back into full text; only needed for nested blocks."""
        while get_if.search(exp):
            exp = get_if.sub(lambda m: expression[int(m.group(1))], exp)
        return exp

    # Extract all if blocks, last block first.
    extract_if(sub_if, r'\1#{}#', 2)
    extract_if(outer_if, r'#{}#', 0)
    # For debugging:
    # print('\n\n\n'.join(expression))
    # print(text)  # the placeholders left in the text show the original order
    return [build_if_statement(exp) for exp in expression if 'image:' in exp]

# Note: this returns both the sub if and the outer if when both contain image:, as in [2, 3].
print(('\n' + '-' * 100 + '\n').join(extract_image_if_statement(code)))

Output:

{{- if .Values.k8swait.enabled [extract this 5]}}
some text ....
image: 000
{{- if [don't extract this sub if 6 ]}}
{{- end }}
{{- end }}
----------------------------------------------------------------------------------------------------
{{- if .Values.k8swait.enabled  [extract this 4}}
some text ....
image:
{{- end }}
----------------------------------------------------------------------------------------------------
{{- if [extract this 3]}}
image: 000 
{{- end }}
----------------------------------------------------------------------------------------------------
{{- if .Values.k8swait.enabled [extract this 2]}}
some text ....
image: 000
{{- if [extract this 3]}}
image: 000 
{{- end }}
{{- end }}
----------------------------------------------------------------------------------------------------
{{- if .Values.k8swait.enabled  [extract this 1}}
some text ....
image:
{{- end }}

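If the order of the if statements matters to you, that can be fixed as well; just add a comment describing how you want nested statements to be extracted. Also note that with nested if statements, when both the outer expression and the sub-expression contain the word image:, the result contains two elements, one for each. If you do not want that, add a comment and I will fix it too.

I hope this helps. Good luck.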
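As an alternative to the placeholder substitution above, the ordering and duplicate concerns could be handled with a depth-tracking scan that returns blocks in document order. This is only a sketch under stated assumptions: `find_if_blocks`, `needle`, and `outermost_only` are hypothetical names, the sample is made up, and the list of block-opening keywords assumes Go template syntax, where `end` also closes `range`, `with`, `define`, and `block`.

```python
import re

# Actions that open a block closed by {{ end }}, plus "end" itself.
TOKEN = re.compile(r"\{\{-?\s*(if|range|with|define|block|end)\b[^}]*\}\}")

def find_if_blocks(text, needle="image:", outermost_only=False):
    stack = []  # (keyword, start offset) for each block still open
    found = []  # (start, end) spans of if blocks containing the needle
    for tok in TOKEN.finditer(text):
        keyword = tok.group(1)
        if keyword != "end":
            stack.append((keyword, tok.start()))
        elif stack:
            opener, start = stack.pop()
            if opener == "if" and needle in text[start:tok.end()]:
                found.append((start, tok.end()))
    found.sort()  # document order
    if outermost_only:
        kept = []
        for start, end in found:
            # Blocks are either disjoint or nested, so a block that starts
            # before the previous kept block ends must be nested inside it.
            if not kept or start >= kept[-1][1]:
                kept.append((start, end))
        found = kept
    return [text[s:e] for s, e in found]

sample = """\
{{- if .Values.a }}
name: a
{{- end }}
{{- if .Values.b }}
image: outer
{{- if .Values.c }}
image: inner
{{- end }}
{{- end }}
"""

blocks = find_if_blocks(sample)
outer = find_if_blocks(sample, outermost_only=True)
print(len(blocks), len(outer))
```

With `outermost_only=True`, a nested block that also contains image: is reported only as part of its parent rather than as a separate element.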
