SAS:根据第三列将一列转换为两列



>我有以下内容:

data have;
input id  day $ col1;
datalines;
84 day_1_x 302
84 day_1_y 443
18 day_2_x 774
18 day_2_y 281
61 day_5_x 488
61 day_5_y 564
; 
run;

我想根据day创建两个新列xy

| id | day |   x |   y |
|----+-----+-----+-----|
| 84 |   1 | 302 | 443 |
| 18 |   2 | 774 | 281 |
| 61 |   5 | 488 | 564 |
|----+-----+-----+-----|

我幼稚的尝试并没有让我完全到达那里。

data dont_quite_want;
set have;
length x 8 y 8;
if scan(day, 3, '_') = 'x' then x = col1;
else y = col1; 
run;
| Obs | id | day     | col1 | x   | y   |
|-----+----+---------+------+-----+-----|
|   1 | 84 | day_1_x |  302 | 302 | .   |
|   2 | 84 | day_1_y |  443 | .   | 443 |
|   3 | 18 | day_3_x |  774 | 774 | .   |
|   4 | 18 | day_3_y |  281 | .   | 281 |
|   5 | 61 | day_5_x |  488 | 488 | .   |
|   6 | 61 | day_5_y |  564 | .   | 564 |
|-----+----+---------+------+-----+-----|

如果有一次性的方法,我不会感到惊讶,但我想知道,如何将转换分解为单独的步骤?

我可能这样做的方法是proc转置 - 这乞求转置,如果这对您很重要,您可以在数据上一次性完成。

data for_transpose/view=for_transpose;
set have(rename=day=long_day);
day = scan(long_day,2,'_');
varname = scan(long_day,3,'_');
run;
proc transpose data=for_transpose out=want;
by id day notsorted;
var col1;
id varname;
run;

但是,在一个数据步骤中也很容易做到 - 只是更硬编码。

data want;
set have(rename=day=long_day);
by id notsorted; *your two ID rows still need to be adjacent;
length x 8 y 8;
retain x y day;
if first.id then call missing(x,y);  *reset them;
day = scan(long_day,2,'_');
if scan(long_day, 3, '_') = 'x' then x = col1;
else y = col1; 
if last.id then output;
run;

这是转置版本。 @Joe为您提供了完整的数据步骤。

/*Pull out the day and variable name from the day column
Use and index so you don't have to sort*/
data have(index=(id));
set have;
d = input(scan(day,2,"_"),best.);
_name_ = scan(day,3,"_");
run;
/*Transpose the data keeping the variables you want and renaming
as needed */
proc transpose data=have out=want(drop=_name_ rename=(d=day));
by id d;
var col1;
id _name_;
run;

最新更新