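I want to get distinct values for specific columns in my query, but I also want to return the other columns. So I would like to combine distinct and project, applying distinct only to the columns whose values should be unique. Put another way, I want the query to return only one copy of each pipeline run, even if the pipeline ran multiple times.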
ADFActivityRun
| where ActivityType == "Copy" or ActivityType == "ExecuteDataFlow"
| where Status == "Succeeded" or Status == "Failed"
| project TimeGenerated, DataFactory=substring(tostring(split(ResourceId, "/", 8)), 2, strlen(tostring(split(ResourceId, "/", 8)))-4), PipelineRunId, PipelineName, ActivityName, Status, ActivityType, Start, End, ErrorMessage, FailureType, RowsRead = parse_json(Output).rowsRead, RowsCopied = parse_json(Output).rowsCopied, rowsWritten = parse_json(Output).runStatus.metrics.sink1.rowsWritten
| order by TimeGenerated desc
| distinct PipelineName, PipelineRunId, ActivityName, Status, ActivityType, DataFactory
You can use take_any() to keep an arbitrary row per group, or arg_max() to keep the latest one. For example:
ADFActivityRun
| where ActivityType in ("Copy", "ExecuteDataFlow")
| where Status in ("Succeeded", "Failed")
| extend Output = parse_json(Output)
| project TimeGenerated,
DataFactory = substring(tostring(split(ResourceId, "/", 8)), 2, strlen(tostring(split(ResourceId, "/", 8)))-4),
PipelineRunId,
PipelineName,
ActivityName,
Status,
ActivityType,
Start,
End,
ErrorMessage,
FailureType,
RowsRead = Output.rowsRead,
RowsCopied = Output.rowsCopied,
rowsWritten = Output.runStatus.metrics.sink1.rowsWritten
| summarize arg_max(TimeGenerated, *) by PipelineRunId
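If it does not matter which row is kept for each run, the take_any() variant might look like the sketch below. The SampleRuns table and its rows are made up for illustration (they stand in for ADFActivityRun), so treat this as a minimal example rather than a drop-in query.

```kusto
// Hypothetical sample data standing in for ADFActivityRun; the names and values are invented.
let SampleRuns = datatable(TimeGenerated: datetime, PipelineRunId: string, PipelineName: string, Status: string)
[
    datetime(2023-01-01 10:00:00), "run-1", "LoadSales", "Failed",
    datetime(2023-01-01 10:05:00), "run-1", "LoadSales", "Succeeded",
    datetime(2023-01-01 11:00:00), "run-2", "LoadStock", "Succeeded"
];
SampleRuns
// Keep one arbitrary row per PipelineRunId, with all other columns returned alongside it.
// Which of the two "run-1" rows is returned is not guaranteed.
| summarize take_any(*) by PipelineRunId
// To always keep the most recent row instead, replace the line above with:
// | summarize arg_max(TimeGenerated, *) by PipelineRunId
```

Grouping by PipelineRunId alone yields one row per pipeline run. If one row per activity within each run is wanted instead, adding ActivityName to the by clause should do it.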