Explanation of nested DoW loops in SAS

How does the pseudo-code below work? I would appreciate a working example.

data want;
  do until (last.var1);
    do until (last.var2);
      set have;
      * other sas statements;
    end;
  end;
run;

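Basically, a single DoW loop lets you perform an action at each BY-group boundary, with slightly different timing from the normal data step (which may or may not be helpful). So, given this data set: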

data have;
input x y z;
datalines;
1 1 1
1 1 2
1 2 1
1 2 2
2 1 1
2 1 2
2 2 1
2 2 2
;;;;
run;

Here is the normal data step:

data want;
  set have;
  by x;
  if first.x then do;
    put "First value of " x=;
  end;
  put _all_;
  if last.x then do;
    put "Last value of " x=;
  end;
run;

Here is the DoW loop:

data want_dow;
  put "First value of " x=;  * runs before any row of the group is read;
  do _n_ = 1 by 1 until (last.x);
    set have;
    by x;
    put _all_;
  end;
  put "Last value of " x=;   * runs after the last row of the group;
run;

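Notice that the results differ slightly on the first and last iterations, and that it outputs different rows. That is because SAS does all of this for us automatically in the first method, whereas with the DoW loop you have to do it yourself (for example, if you want all 8 rows, you have to put an OUTPUT statement inside the loop, and you have to test for EOF and STOP if it is true).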
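As a minimal sketch of the point about OUTPUT and STOP, the step below keeps every input row and stops explicitly at end of file. The data set name want_dow_all and the END= variable eof are names chosen here for illustration; they are not part of the original answer.

```sas
data want_dow_all;
  do _n_ = 1 by 1 until (last.x);
    set have end=eof;   /* eof becomes 1 once the final row has been read */
    by x;
    output;             /* explicit OUTPUT inside the loop writes every row */
  end;
  put "Last value of " x=;
  if eof then stop;     /* do not start another iteration after the last group */
run;
```

With the sample data, this should write all 8 rows. The STOP matters mainly when there are statements ahead of the loop (such as the "First value of" PUT), since they would otherwise run once more before SET reaches end of file and ends the step.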

But maybe this is what you want: you want to have no value at first, and then do something. That is when the DoW loop is useful.

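The nested DoW loop is the same thing, except that you have two different points at which to take action. Note that it does not actually change how rows are read: every time the set statement is encountered, the next row is read from the data set (whatever that row is). The order is the same; you just have more stopping points where you can write code.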

data want;
  set have;
  by x y;
  if first.x then do;
    put "First value of " x=;
  end;
  if first.y then do;
    put "First value of " y=;
  end;
  put _all_;
  if last.y then do;
    put "Last value of " y=;
  end;
  if last.x then do;
    put "Last value of " x=;
  end;
run;

data want_dow;
  put "First value of " x=;
  do _n_ = 1 by 1 until (last.x);
    put "First value of " y=;
    do _n_ = 1 by 1 until (last.y);
      set have;
      by x y;
      put _all_;
    end;
    put "Last value of " y=;
  end;
  put "Last value of " x=;
run;

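Again, you get differences here because the DoW loop's "first" code runs before the first row is read, which again may or may not be helpful depending on your use case. I don't think I have ever had a use case for this, but it is certainly not impossible.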

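Here is one case where it is useful: you are basically doing a PROC MEANS by hand. Of course, it can be done either way; some people will prefer one, some the other.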

data want_dow;
  do _n_ = 1 by 1 until (last.x);
    do _n_ = 1 by 1 until (last.y);
      set have;
      by x y;
      z_sum_y = sum(z_sum_y,z);
      z_sum_x = sum(z_sum_x,z);
    end;
    * one row per x/y group;
    z_sum = z_sum_y;
    output;
    call missing(z_sum_y);
  end;
  * one total row per x group, with y set to missing;
  call missing(y);
  z_sum = z_sum_x;
  output;
  drop z_sum_y z_sum_x;
run;

data want;
  set have;
  by x y;
  z_sum_y+z;
  z_sum_x+z;
  if last.y then do;
    z_sum = z_sum_y;
    output;
    z_sum_y=0;
  end;
  if last.x then do;
    z_sum = z_sum_x;
    call missing(y);
    output;
    z_sum_x=0;
  end;
  drop z_sum_y z_sum_x;
run;

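Mostly, though, the DoW loop is useful in the form of the double DoW loop, which summarizes and then reads the summarized value within the same data step iteration. This is the same summary, but it lets you see the value on the current row. If you want to see the difference, change the Z values in have to something else (I intentionally put them in a 1/2 pattern so that it was easier to check).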

data want_ddow;
  array z_sum_ys[2] _temporary_;
  * first pass over the x group: accumulate the totals;
  do _n_ = 1 by 1 until (last.x);
    do _n_ = 1 by 1 until (last.y);
      set have;
      by x y;
      z_sum_ys[y] = sum(z_sum_ys[y],z);
      z_sum_x = sum(z_sum_x,z);
    end;
  end;
  * second pass over the same x group: output each row with its totals;
  do _n_ = 1 by 1 until (last.x);  *do not need nesting here;
    set have;
    by x y;
    z_sum_y = z_sum_ys[y];
    output;
  end;
  call missing(of z_sum_ys[*] z_sum_x);
run;

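To do this without the double DoW loop, you would have to merge the results of the first want back onto have. That is not necessarily a big deal, but it is a second pass through the data; the double DoW loop takes advantage of buffering to avoid actually repeating the I/O for the second read.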
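For comparison, here is a rough sketch of the summarize-then-merge alternative. It uses PROC MEANS for the summary rather than the first want step, and the data set names sums_y, sums_x, and want_merge are assumptions made for illustration. It also assumes have is already sorted by x and y, as the sample data is.

```sas
/* Pass 1a: one row per x/y group holding that group's total of z. */
proc means data=have noprint;
  by x y;
  var z;
  output out=sums_y (drop=_type_ _freq_) sum=z_sum_y;
run;

/* Pass 1b: one row per x group holding that group's total of z. */
proc means data=have noprint;
  by x;
  var z;
  output out=sums_x (drop=_type_ _freq_) sum=z_sum_x;
run;

/* Pass 2: attach the x/y totals to every detail row. */
data want_merge;
  merge have sums_y;
  by x y;
run;

/* Pass 3: attach the x totals as well. */
data want_merge;
  merge want_merge sums_x;
  by x;
run;
```

This takes several passes and intermediate data sets to produce what the double DoW loop does in one step, which is the trade-off described above.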
