Reducing a class-level variable that repeats across student observations to a smaller dataset where each observation is a class

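In Stata, I created the average grade in course I for each class in each school at time t (using bysort and egen). I now have a class-level variable that is repeated across the student observations. How do I change to a smaller dataset in which each observation is a class in a given school at a given t, rather than a student in a class in a given school at a given t? In other words, I want no repeated information: only the average grade for each course mapped to each class.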

More concretely, my current output looks like this (I have left out the time variable to keep it simple):

studentid classid avegrade
1 1 14.4
2 1 14.4
3 1 14.4
4 2 16
5 2 16
6 2 16
7 3 13
8 3 13

I need the following output structure:

classid avegrade
1 14.4
2 16
3 13

Some code I have tried:

sort classid
//This command creates a new variable newid that is 1 for the first observation for each class and missing otherwise.
by classid: gen newid = 1 if _n==1
//replace newid = sum(newid) could be an option in this particular case but under the dynamic timeframe t it won't work
keep if newid == 1

The problem is that every class now has the value 1 under newid.
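For reference, a minimal sketch of the tagging idea on the example data above. Since newid is only a marker for the first row of each class, it is expected to be 1 everywhere after the keep; classid remains the identifier, so the helper can simply be skipped. The sketch assumes avegrade is constant within classid, as in the data shown.

```stata
* Recreate the example data from the question
clear
input studentid classid avegrade
1 1 14.4
2 1 14.4
3 1 14.4
4 2 16
5 2 16
6 2 16
7 3 13
8 3 13
end

* Keep the first observation within each class; no marker variable is needed
bysort classid: keep if _n == 1

* studentid is no longer meaningful at the class level
keep classid avegrade
list, noobs
```

With a time dimension, the by-group would need to include the time and school identifiers as well, so that one row is kept per class per school per period.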

Following Nick's suggestion, because I know the Stata documentation is not always intuitive for inexperienced users.

This should work:

collapse (mean) avegrade, by(classid)
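A small reproducible sketch of the collapse call on the example data from the question. The variable names schoolid and t in the final comment are hypothetical, since the question does not show how the school and time variables are named.

```stata
* Recreate the example data from the question
clear
input studentid classid avegrade
1 1 14.4
2 1 14.4
3 1 14.4
4 2 16
5 2 16
6 2 16
7 3 13
8 3 13
end

* One row per class; the mean of a within-class constant is that constant
collapse (mean) avegrade, by(classid)
list, noobs

* With school and time in the data (hypothetical names), the grouping
* would presumably become:
*   collapse (mean) avegrade, by(schoolid classid t)
```

Note that collapse replaces the data in memory, so it may be worth wrapping it in preserve and restore, or saving the student-level file first, if that data is still needed.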
