选择具有最大字符长度(以 sas 为单位)的行



我有以下数据集:

dataseta:
No. Name1 Name2                       Sales Inv Comp
1   TC    Tribal Council Inc          100   100  0
2.  TC    Tribal Council Limited INC  20    25   65
desired output:
datasetb:
No. Name1 Name2                       Sales Inv Comp
1   TC    Tribal Council Limited Inc  120   125  0

基本上,我需要为列名2选择最大字符长度的行。 我尝试了以下方法,但没有用

proc sql;
create table datasetb as select no,name1,name2,sum(sales),sum(inv),min(comp) from dataseta group by 1,2,3 having length(name2)=max(length(name2));quit;

如果我执行以下代码,它只会部分解析它,并且我得到重复的行

proc sql;
create table datasetb as select no,name1,max(length(name2)),sum(sales),sum(inv),min(comp) from dataseta group by 1,2 having length(name2)=max(length(name2));quit;

您似乎正在联接两个单独的聚合计算的结果。

假定:

no是唯一的,以便允许决胜标准,第一个(每no(最长的name2将与costinvcomp总计超过name1连接。

查询将发生很多事情...

  • name1中第一个最长的name2,嵌套子查询需要:

    • 确定最长name2,然后
    • 如果有多个,则根据no选择第一个。
  • 总计超过name1

    • 总计将是联接到的子查询,用于提供所需的结果集。

示例 (SQL(

data have;
length no 8 name1 $6 name2 $35 sales inv comp 8;
input 
no name1& name2&                      sales inv  comp; datalines;
1   TC    Tribal Council Inc          100   100  0    * name1=TC group
2   TC    Tribal Council Limited INC  20    25   65
3   TC    Tribal council co           0     0    0 
4   TC    The Tribal council Assoctn  10    10   10
7   LS    Longshore association       10    10   0    * name=LS group
8   LS    The Longshore Group, LLC    2     4    8
9   LS    The Longshore Group, llc    15    15   6
run;
proc sql;
create table want as
select 
first_longest_name2.no,
first_longest_name2.name1,
first_longest_name2.name2,
name1_totals.sales,
name1_totals.inv,
name1_totals.comp
FROM
(
select 
no, name1, name2 
from 
( select 
no, name1, name2 
from have
group by name1
having length(name2) = max(length(name2))
) longest_name2s
group by name1
having no = min(no)
) as
first_longest_name2
LEFT JOIN 
( 
select  
name1,
sum(sales) as sales,
sum(inv) as inv,
sum(comp) as comp
from 
have
group by name1
) as
name1_totals
ON
first_longest_name2.name1 = name1_totals.name1
;
quit;

示例(数据步骤(

name1组是连续行时,以串行方式处理数据可以使用 DOW 循环技术完成 - 即包含SET语句的循环。

data want2;
do until (last.name1);
set have;
by name1 notsorted;
if length(name2) > longest then do;
longest = length(name2);
no_at_longest = no;
name2_at_longest = name2;
end;
sales_sum = sum(sales_sum,sales);
inv_sum   = sum(inv_sum,inv);
comp_sum  = sum(comp_sum,comp);
end;
drop name2 no sales inv comp longest;
rename 
no_at_longest = no
name2_at_longest = name2
sales_sum = sales
inv_sum = inv
comp_sum = comp
;
run;

最新更新