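An AWS Glue workflow has multiple chained AWS Glue jobs.
How can I get the workflow run ID for a given AWS Glue job run ID?
I could not find an API for this in aws-cli.
Note that I am trying to analyze job run metrics from external Python code.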
From inside the Glue job script itself, you can use this code to get the workflow run ID from the job arguments:
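import sys
import boto3
from awsglue.utils import getResolvedOptions  # only available in the Glue job environment

glue_client = boto3.client("glue")
# WORKFLOW_NAME and WORKFLOW_RUN_ID are read from the job's arguments
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'WORKFLOW_NAME', 'WORKFLOW_RUN_ID'])
runID = args['WORKFLOW_RUN_ID']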
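If the same script may also run without a `--WORKFLOW_RUN_ID` argument, requesting it as a required option will fail. As a minimal sketch, assuming job arguments reach the script as `--KEY value` pairs in `sys.argv`, the standard library alone can read the argument optionally. The helper name `optional_workflow_run_id` is hypothetical and not part of `awsglue`.

```python
import argparse
from typing import Optional, Sequence


def optional_workflow_run_id(argv: Sequence[str]) -> Optional[str]:
    """Return the value of --WORKFLOW_RUN_ID, or None if it was not passed."""
    # allow_abbrev=False prevents a shorter prefix from matching by accident.
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--WORKFLOW_RUN_ID", default=None)
    # parse_known_args leaves the other job arguments alone instead of failing.
    known, _unknown = parser.parse_known_args(list(argv)[1:])
    return known.WORKFLOW_RUN_ID


# Typical call inside the job script (sys must be imported there):
#   run_id = optional_workflow_run_id(sys.argv)
```

A `None` result then simply means the argument was absent for this run.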
From external code, you can use a custom function built on top of the boto3 library, along these lines:
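import boto3

def get_wid_from_jid(jid: str) -> str:
    """Return the workflow run ID whose graph contains the given job run ID."""
    client = boto3.client('glue', region_name='us-east-1')
    # list_workflows returns workflow names; only the first page is read here.
    response = client.list_workflows()
    jid_to_wid = {}
    for workflow in response.get('Workflows', []):
        # IncludeGraph=True makes each run carry its nodes and their job runs.
        response2 = client.get_workflow_runs(Name=workflow, IncludeGraph=True)
        for run in response2.get('Runs', []):
            wid = run.get('WorkflowRunId')
            for node in run.get('Graph', {'Nodes': []}).get('Nodes', []):
                # Only job nodes have JobDetails; other nodes yield no job runs.
                for job in node.get('JobDetails', {'JobRuns': []}).get('JobRuns', []):
                    jid2 = job.get('Id')
                    if wid and jid2:
                        jid_to_wid[jid2] = wid
    return jid_to_wid.get(jid, 'Error: Glue Job Run ID unknown.')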
You can run this function as in the following example:
print(get_wid_from_jid('TestingFailScenarioHere'))
# Output:
# Error: Glue Job Run ID unknown.
print(get_wid_from_jid('jr_xxxxxxxxxxxxxxxxxxxx'))
# Output:
# wr_xxxxxxxxxxxxxxxxx
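The function above reads only the first page of each API response and builds a full map before answering. As a hedged sketch of one possible variation (not the original answer's method), the version below follows `NextToken` across pages and stops at the first match. The helper names are hypothetical, and the client is passed in as a parameter, so any object exposing `list_workflows` and `get_workflow_runs` with the boto3 Glue response shape will work.

```python
from typing import Any, Dict, Iterator, Optional


def iter_workflow_names(client: Any) -> Iterator[str]:
    """Yield every workflow name, following NextToken across pages."""
    kwargs: Dict[str, Any] = {}
    while True:
        page = client.list_workflows(**kwargs)
        yield from page.get("Workflows", [])
        token = page.get("NextToken")
        if not token:
            return
        kwargs = {"NextToken": token}


def iter_workflow_runs(client: Any, name: str) -> Iterator[dict]:
    """Yield every run of one workflow, with its graph, across pages."""
    kwargs: Dict[str, Any] = {"Name": name, "IncludeGraph": True}
    while True:
        page = client.get_workflow_runs(**kwargs)
        yield from page.get("Runs", [])
        token = page.get("NextToken")
        if not token:
            return
        kwargs["NextToken"] = token


def find_workflow_run_id(client: Any, job_run_id: str) -> Optional[str]:
    """Return the workflow run ID containing job_run_id, or None if not found."""
    for name in iter_workflow_names(client):
        for run in iter_workflow_runs(client, name):
            nodes = (run.get("Graph") or {}).get("Nodes", [])
            for node in nodes:
                # Non-job nodes have no JobDetails, so they contribute nothing.
                job_runs = (node.get("JobDetails") or {}).get("JobRuns", [])
                if any(jr.get("Id") == job_run_id for jr in job_runs):
                    return run.get("WorkflowRunId")
    return None


# Typical call with a real client (requires boto3 and AWS credentials):
#   find_workflow_run_id(boto3.client("glue"), "jr_xxxxxxxxxxxxxxxxxxxx")
```

Returning `None` instead of an error string lets the caller distinguish "not found" from a real ID. The scan still visits workflows one by one, so it can be slow in accounts with many workflows and runs.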