Kusto:根据条件查找特定列不明显的地方



URL获取数据集:-

datatable(Id:string, Datetime:datetime, URL:string, country:string)
[
'1', datetime(2018-10-16 00:00), 'URL1',
'1', datetime(2018-10-16 00:01), 'internal-url-predefined',
'2', datetime(2018-10-16 00:02), 'URL2',
'3', datetime(2018-10-16 00:03), 'URL3',
'4', datetime(2018-10-16 00:03), 'URL4',
'4', datetime(2018-10-16 00:03), 'internal-url-predefined',
]

对于一些请求ID,如14,我们有两行。

如何找到per-unit-time,当有another row with same Id时,我们有多少次url == internal-url-predefined的情况

我想实现的伪代码:-

  • 查找所有没有distinct Id的行。因此,答案是14

    | where <<my_other_custom_conditions>>
    | summarize count() by Id
    | where count_ > 1 
    
  • 在上面我们将得到的Ids的匹配中,找出每单位时间哪些具有URL == "internal-url-predefined"

    ?? How do I reuse the result of previous dataset to match 
    rows that have URL == "internal-url-predefined" ??
    

我应该了解Kusto的哪些功能?

看起来我问得太早了,我应该知道materialize运算符,这是一种存储一个查询结果并在下一个查询中使用它的方法。

let dataset=datatable(Id:string, Datetime:datetime, URL:string)
[
'1', datetime(2018-10-16 00:00), 'URL1',
'1', datetime(2018-10-16 00:01), 'internal-url-predefined',
'2', datetime(2018-10-16 00:02), 'URL2',
'3', datetime(2018-10-16 00:03), 'URL3',
'4', datetime(2018-10-16 00:03), 'URL4',
'4', datetime(2018-10-16 00:03), 'internal-url-predefined',
];
let _duplicate_ids = materialize(dataset | summarize count() by Id | where count_ > 1 | project Id);
_duplicate_ids
| join kind=inner (dataset) on Id
| where URL == "internal-url-predefined"
| summarize count() by bin(Datetime, 1s)

结果:-

2018-10-16T00:01:00Z    1
2018-10-16T00:03:00Z    1

最新更新