SQL: summarize by Doc Number and detect differing values within a column

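I would like to know how to summarize/group my dataset based on the field Purch Org, putting the word '#multiple#' in any column that holds different values within the same group.

My final result should be one unique row per Doc Number.

I am working on Databricks, so the solution can be in SQL or Python.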

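Sample data:

Material_Group  29130000 (Household washing supplies)  Value Contract WK  DL08XYW  Value Contract WK  DE (Consumables)  DL08XYW  Value Contract WK  D8 (Equipment)
Doc Number  Purch Org  purchasing_document 1  Procurement_Category
465503991  DL08XYW
465503991  32031600 (Filters (Laboratory))  L072 (Laboratory consumables)
465503991  32169000 (Laboratory supplies)  L072 (Laboratory supplies)
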
-- Identifiers are quoted with backticks, as Databricks (Spark SQL) expects; [brackets] are T-SQL syntax.
-- A column collapses to '#multiple#' when its smallest and largest values in the group differ.
select   `Doc Number`
,case when min(`Purch Org`)            <> max(`Purch Org`)            then '#multiple#' else min(`Purch Org`)            end as `Purch Org`
,case when min(`Material_Group`)       <> max(`Material_Group`)       then '#multiple#' else min(`Material_Group`)       end as `Material_Group`
,case when min(`purchasing_document`)  <> max(`purchasing_document`)  then '#multiple#' else min(`purchasing_document`)  end as `purchasing_document`
,case when min(`Procurement_Class_2`)  <> max(`Procurement_Class_2`)  then '#multiple#' else min(`Procurement_Class_2`)  end as `Procurement_Class_2`
,case when min(`Procurement_Category`) <> max(`Procurement_Category`) then '#multiple#' else min(`Procurement_Category`) end as `Procurement_Category`
from     t
group by `Doc Number`
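
To see how the min <> max test behaves, here is a minimal sketch that runs the same pattern locally. It uses Python's built-in sqlite3 module as a stand-in for Databricks, standard double-quoted identifiers, only two of the five columns, and illustrative rows (the second document, 465503992 with Purch Org DL09, is hypothetical).

```python
import sqlite3

# Illustrative rows only: document 465503991 has one Purch Org but three
# Material_Group values; document 465503992 is a hypothetical single-row case.
rows = [
    ("465503991", "DL08XYW", "29130000"),
    ("465503991", "DL08XYW", "32031600"),
    ("465503991", "DL08XYW", "32169000"),
    ("465503992", "DL09", "29130000"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'create table t ("Doc Number" text, "Purch Org" text, "Material_Group" text)'
)
conn.executemany("insert into t values (?, ?, ?)", rows)

# min <> max is true exactly when the group holds at least two distinct
# non-null values, so the column collapses to '#multiple#' in that case.
query = """
select "Doc Number"
      ,case when min("Purch Org") <> max("Purch Org")
            then '#multiple#' else min("Purch Org") end as "Purch Org"
      ,case when min("Material_Group") <> max("Material_Group")
            then '#multiple#' else min("Material_Group") end as "Material_Group"
from t
group by "Doc Number"
order by "Doc Number"
"""

summary = conn.execute(query).fetchall()
for row in summary:
    print(row)
```

Each document comes back as a single row. Note that min and max skip NULLs, so a group containing one value plus NULLs is not flagged as '#multiple#'.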

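Another option (more concise syntax, but the min <> max solution should perform better, since count(distinct ...) has to deduplicate the values of every column):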

-- Identifiers are quoted with backticks, as Databricks (Spark SQL) expects; [brackets] are T-SQL syntax.
select   `Doc Number`
,case when count(distinct `Purch Org`)            > 1   then '#multiple#' else min(`Purch Org`)            end as `Purch Org`
,case when count(distinct `Material_Group`)       > 1   then '#multiple#' else min(`Material_Group`)       end as `Material_Group`
,case when count(distinct `purchasing_document`)  > 1   then '#multiple#' else min(`purchasing_document`)  end as `purchasing_document`
,case when count(distinct `Procurement_Class_2`)  > 1   then '#multiple#' else min(`Procurement_Class_2`)  end as `Procurement_Class_2`
,case when count(distinct `Procurement_Category`) > 1   then '#multiple#' else min(`Procurement_Category`) end as `Procurement_Category`
from     t
group by `Doc Number`

Result:

Material_Group  #multiple#  Value Contract WK
Doc Number  Purch Org  purchasing_document 1  Procurement_Category
465503991  DL08XYW
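
Because the same CASE expression is repeated for every column, the query text can also be generated rather than typed out. The following is a rough sketch, not part of the original answer: the helper names quote_ident and build_summary_query are hypothetical, and it assumes backtick quoting as used by Databricks (Spark SQL).

```python
def quote_ident(name, quote="`"):
    """Quote an identifier, doubling any embedded quote character."""
    return quote + name.replace(quote, quote * 2) + quote


def build_summary_query(table, key, columns, quote="`"):
    """Build a GROUP BY query that reduces each column to its single value,
    or to '#multiple#' when the group holds more than one distinct value.

    `table` is inserted as-is, so it must be a trusted name.
    """
    key_sql = quote_ident(key, quote)
    select_items = [key_sql]
    for col in columns:
        c = quote_ident(col, quote)
        select_items.append(
            f"case when min({c}) <> max({c}) "
            f"then '#multiple#' else min({c}) end as {c}"
        )
    select_list = "\n      ,".join(select_items)
    return f"select {select_list}\nfrom {table}\ngroup by {key_sql}"


query = build_summary_query("t", "Doc Number", ["Purch Org", "Material_Group"])
print(query)
```

In a Databricks notebook the resulting string could then be passed to spark.sql(query), assuming a table or view named t is available.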
