For a Glue job in a Glue workflow: given the Glue job run ID, how do I get the Glue workflow run ID?



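An AWS Glue workflow has multiple linked AWS Glue jobs.

How can I get the workflow run ID for a given AWS Glue job run ID?

I could not find an API for this in the aws-cli.

Note that I am trying to analyze job run metrics from external Python code.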

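You can use this code to get the workflow run ID. Note that it runs inside the Glue job itself, and only works when the job was started by a workflow, because it reads WORKFLOW_RUN_ID from the job arguments: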

import sys

import boto3
from awsglue.utils import getResolvedOptions

glue_client = boto3.client("glue")
# WORKFLOW_NAME and WORKFLOW_RUN_ID are job arguments set when a workflow starts the job
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'WORKFLOW_NAME', 'WORKFLOW_RUN_ID'])
runID = args['WORKFLOW_RUN_ID']
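
If the same script may also be started outside a workflow, the WORKFLOW_RUN_ID argument will be absent, and as far as I know getResolvedOptions fails when a requested argument is missing. Here is a minimal sketch of a guard, assuming the arguments arrive as `--KEY value` (or `--KEY=value`) tokens in sys.argv. The helper name `optional_arg` is hypothetical and is not part of awsglue:

```python
import sys
from typing import Optional, Sequence


def optional_arg(name: str, argv: Optional[Sequence[str]] = None) -> Optional[str]:
    """Return the value of --name from argv, or None if it is not present.

    Hypothetical helper (not part of awsglue); assumes '--KEY value'
    or '--KEY=value' style arguments.
    """
    argv = sys.argv if argv is None else argv
    flag = f"--{name}"
    for i, token in enumerate(argv):
        # Form 1: '--KEY value' as two separate tokens
        if token == flag and i + 1 < len(argv):
            return argv[i + 1]
        # Form 2: '--KEY=value' as a single token
        if token.startswith(flag + "="):
            return token[len(flag) + 1:]
    return None
```

You could then call `optional_arg('WORKFLOW_RUN_ID')` and fall back to a lookup by job run ID only when it returns None.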

From external code, you can use a custom function built on top of the boto3 library, something like this:

import boto3


def get_wid_from_jid(jid: str) -> str:
    client = boto3.client('glue', region_name='us-east-1')
    # 'Workflows' is a list of workflow names (only the first page is read here)
    response = client.list_workflows()
    jid_to_wid = {}
    for workflow in response.get('Workflows', []):
        # IncludeGraph=True is needed to get the nodes (and their job runs) of each run
        response2 = client.get_workflow_runs(Name=workflow, IncludeGraph=True)
        for run in response2.get('Runs', []):
            wid = run.get('WorkflowRunId')
            for node in run.get('Graph', {'Nodes': []}).get('Nodes', []):
                # Only job nodes have 'JobDetails'; other nodes fall back to an empty list
                for job in node.get('JobDetails', {'JobRuns': []}).get('JobRuns', []):
                    jid2 = job.get('Id')
                    if wid and jid2:
                        jid_to_wid[jid2] = wid
    return jid_to_wid.get(jid, 'Error: Glue Job Run ID unknown.')

You can run the function as in the following example:

print(get_wid_from_jid('TestingFailScenarioHere'))
# Output:
# Error: Glue Job Run ID unknown.
print(get_wid_from_jid('jr_xxxxxxxxxxxxxxxxxxxx'))
# Output:
# wr_xxxxxxxxxxxxxxxxx
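
With many workflows or many runs, a single call to list_workflows or get_workflow_runs may not return everything, since both responses can carry a NextToken. Below is a rough sketch of a variant that follows NextToken and stops at the first match. It is not the function above, just one way it could be extended. The names `_pages` and `find_workflow_run_id` are made up for this example, and the client is passed in so the function can be exercised without AWS access:

```python
from typing import Any, Callable, Iterator, Optional


def _pages(call: Callable[..., dict], **kwargs: Any) -> Iterator[dict]:
    """Yield each response page of a Glue call that paginates via NextToken."""
    while True:
        page = call(**kwargs)
        yield page
        token = page.get("NextToken")
        if not token:
            return
        kwargs["NextToken"] = token


def find_workflow_run_id(client: Any, job_run_id: str) -> Optional[str]:
    """Return the workflow run ID containing job_run_id, or None if not found.

    `client` is expected to behave like boto3.client("glue").
    """
    for wf_page in _pages(client.list_workflows):
        for name in wf_page.get("Workflows", []):
            run_pages = _pages(client.get_workflow_runs, Name=name, IncludeGraph=True)
            for run_page in run_pages:
                for run in run_page.get("Runs", []):
                    for node in run.get("Graph", {}).get("Nodes", []):
                        # Nodes without 'JobDetails' (e.g. triggers) are skipped
                        for job_run in node.get("JobDetails", {}).get("JobRuns", []):
                            if job_run.get("Id") == job_run_id:
                                return run.get("WorkflowRunId")
    return None
```

Usage would look like `find_workflow_run_id(boto3.client('glue'), 'jr_xxxxxxxxxxxxxxxxxxxx')`. Returning None instead of an error string lets the caller tell "not found" apart from a real ID, which is a design choice on my part, not something Glue requires.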
