I have a list of 10,000 observations, 1,000 per code number (ID):
proc surveyselect data=code_nums out=bootstrapped_code_names (keep=code_num quantity replicate) sampsize=1825 method=urs outhits noprint seed=0 rep=1000;
strata code_num;
run;
proc sql noprint;
create table test as
select code_num
,replicate
,sum(quantity) as total_qty
from bootstrapped_code_names
group by code_num, replicate
order by total_qty;
quit;
This gives:
data work.code;
input code_num $9. replicate total_qty;
datalines;
123456780 87 0
123456780 34 0
123456780 837 2
123456780 475 4
123456780 74 5
123456780 507 9
123456780 28 9
123456788 76 3
;
run;
I would like to get a data set that captures every 100th observation for each ID, giving something like this:
Quantity 769810210611312315616878109117133145160
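I'm not sure I follow your full problem description, but here is simple code that selects the 1st, 101st, 201st, etc. observation for each value of ID. The source data must already be sorted by the ID variable.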
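data want;
/* Loop over one BY group per data step iteration; _n_ counts rows within the group */
do _n_=1 by 1 until(last.id);
set have;
by id;
/* Keep rows 1, 101, 201, ... of each ID */
if 1=mod(_n_,100) then output;
end;
run;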
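The sorting requirement can be met with a PROC SORT step before the loop. The following is only a sketch applied to the question's data: it assumes the table `test` created by the PROC SQL step, with columns `code_num`, `replicate` and `total_qty`, and the names `test_sorted` and `want` are hypothetical.

```sas
/* Assumption: TEST is the table from the question's PROC SQL step. */
/* Sort by ID, then by total_qty so rows are ascending within each ID. */
proc sort data=test out=test_sorted;
  by code_num total_qty;
run;

data want;
  /* _n_ restarts at 1 for every code_num */
  do _n_ = 1 by 1 until (last.code_num);
    set test_sorted;
    by code_num;
    /* Keep rows 1, 101, 201, ... within the current code_num */
    if mod(_n_, 100) = 1 then output;
  end;
run;
```

Changing the test to `mod(_n_, 100) = 0` would select the 100th, 200th, etc. row instead.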
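This modifies yesterday's code to use the MOD function to check whether the counter is divisible by 100. Change the value in the MOD function to select at other intervals.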
https://documentation.sas.com/doc/en/vdmmlcdc/8.1/ds2ref/n0t9j8b09x4uphn1kl1i70x63z19.htm
data work.code;
input code_num $9. replicate total_qty;
datalines;
123456780 87 0
123456780 34 0
123456780 837 2
123456780 475 4
123456780 74 5
123456780 507 9
123456780 28 9
123456780 87 0
123456780 34 0
123456780 837 2
123456780 475 4
123456780 74 5
123456780 507 9
123456780 28 9
123456788 76 3
123456788 76 3
123456788 76 3
123456788 76 3
123456788 76 3
;;;;
data code_counter;
set code;
by code_num;
if first.code_num then count=1;
else count+1;
run;
data code100;
set code_counter;
if mod(count, 100)=0 then output;
run;
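If the intermediate `code_counter` data set is not needed, the counter and the filter could probably be combined into one step. This is a sketch under the same assumption as above, namely that `code` is already sorted by `code_num`; it reuses the output name `code100`.

```sas
data code100;
  set code;
  by code_num;
  /* Reset the running counter at the start of each code_num */
  if first.code_num then count = 0;
  /* Sum statement: count is retained across rows automatically */
  count + 1;
  /* Subsetting IF: keep only every 100th row within each code_num */
  if mod(count, 100) = 0;
run;
```

With the small sample data above, no ID reaches 100 rows, so the result would be empty; lowering the divisor (for example to 5) makes the effect visible.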