Constraining each observation with the same ID



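I have a dataset that I have been working on for a while, cleaning it up and reshaping it from wide format to long format.

We are following approximately 1,000 patients with 1-5 aneurysms each, some or all of which are treated with the different treatments available. A patient may have two aneurysms, for example, one treated with treatment A and the other with treatment B.

Here is an example of the data: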

* Example generated by -dataex-. To install: ssc install dataex
clear
input str32 record_id float treatmentChoice_ byte(treatment_1 treatment_3) float aneurysm_id
"007128de18ce5cb1635b8f27c5435ff3" . . . 1
"007128de18ce5cb1635b8f27c5435ff3" . . . 2
"00abd7bdb6283dd0ac6b97271608a122" . 2 . 1
"00abd7bdb6283dd0ac6b97271608a122" . . . 2
"0142103f84693c6eda416dfc55f65de1" . 1 . 1
"0142103f84693c6eda416dfc55f65de1" . . . 2
"0153826d93a58d7e1837bb98a3c21ba8" . . . 1
"0153826d93a58d7e1837bb98a3c21ba8" . . . 2
"01c729ac4601e36f245fd817d8977917" . 1 . 1
"01c729ac4601e36f245fd817d8977917" . . . 2
"01dd90093fbf201a1f357e22eaff6b6a" . . . 1
"01dd90093fbf201a1f357e22eaff6b6a" . 1 . 2
"0208e14dcabc43dd2b57e2e8b117de4d" . . . 1
"0208e14dcabc43dd2b57e2e8b117de4d" . 1 . 2
"0210f575075e5def7ffa77530ce17ef0" . . . 1
"0210f575075e5def7ffa77530ce17ef0" . . . 2
"022cc7a9397e81cf58cd9111f9d1db0d" . . . 1
"022cc7a9397e81cf58cd9111f9d1db0d" . . . 2
"02afd543116a22fc7430620727b20bb5" . 2 . 1
"02afd543116a22fc7430620727b20bb5" . . . 2
"0303ef0bd5d256cca1c836e2b70415ac" . . . 1
"0303ef0bd5d256cca1c836e2b70415ac" . 1 . 2
"041b2b0cac589d6e3b65bb924803cf1a" . . . 1
"041b2b0cac589d6e3b65bb924803cf1a" . . . 2
"0536317a2bbb936e85c3eb8294b076da" . . . 1
"0536317a2bbb936e85c3eb8294b076da" . 1 . 2
"06161d4668f217937cac0ac033d8d199" . . . 1
"06161d4668f217937cac0ac033d8d199" . . . 2
"065e151f8bcebb27fabf8b052fd70566" . . 1 1
"065e151f8bcebb27fabf8b052fd70566" . . . 2
"065e151f8bcebb27fabf8b052fd70566" . . . 3
"065e151f8bcebb27fabf8b052fd70566" . . . 4
"07196414cd6bf89d94a33e149983d102" . . . 1
"07196414cd6bf89d94a33e149983d102" . . . 2
"0721c38f8275dab504fc53aebcc005ce" . . . 1
"0721c38f8275dab504fc53aebcc005ce" . . . 2
"0721c38f8275dab504fc53aebcc005ce" . . . 3
"0721c38f8275dab504fc53aebcc005ce" 1 . . 4
"07bef516d53279a3f5e477d56d552a2b" . . . 1
"07bef516d53279a3f5e477d56d552a2b" . 2 . 2
"08678829b7e0ee6a01b17974b4d19cfa" . . . 1
"08678829b7e0ee6a01b17974b4d19cfa" . . . 2
"08bb6c65e63c499ea19ac24d5113dd94" . . . 1
"08bb6c65e63c499ea19ac24d5113dd94" . . . 2
"08f036417500c332efd555c76c4654a0" . . . 1
"08f036417500c332efd555c76c4654a0" . . . 2
"090c54d021b4b21c7243cec01efbeb91" . . . 1
"090c54d021b4b21c7243cec01efbeb91" . . . 2
"09166bb44e4c5cdb8f40d402f706816e" . . . 1
"09166bb44e4c5cdb8f40d402f706816e" . 1 . 2
"0930159addcdc35e7dc18812522d4377" . . . 1
"0930159addcdc35e7dc18812522d4377" . . . 2
"096844af91d2e266767775b0bee9105e" . . . 1
"096844af91d2e266767775b0bee9105e" . 2 . 2
"09884af1bb9d59803de0c74d6df57c23" . . . 1
"09884af1bb9d59803de0c74d6df57c23" . 2 . 2
"09e03748da35e9d799dc5d8ddf1909b5" . . . 1
"09e03748da35e9d799dc5d8ddf1909b5" . . . 2
"0a4ce4a7941ff6d1f5c217bf5a9a3bf9" . . . 1
"0a4ce4a7941ff6d1f5c217bf5a9a3bf9" . . . 2
"0a5db40dc58e97927b407c9210aab7ba" 4 . . 1
"0a5db40dc58e97927b407c9210aab7ba" . . . 2
"0a73c992955231650965ed87e3bd52f6" . . . 1
"0a73c992955231650965ed87e3bd52f6" . 2 . 2
"0a84ab77fff74c247a525dfde8ce988c" 1 . 2 1
"0a84ab77fff74c247a525dfde8ce988c" . . . 2
"0a84ab77fff74c247a525dfde8ce988c" . . . 3
"0af333ae400f75930125bb0585f0dcf5" . . . 1
"0af333ae400f75930125bb0585f0dcf5" . . . 2
"0af73334d9d2166191f3385de48f15d2" . 1 . 1
"0af73334d9d2166191f3385de48f15d2" . . . 2
"0b341ac8f396a8cdb88b7c658f66f653" . . . 1
"0b341ac8f396a8cdb88b7c658f66f653" . . . 2
"0b35cf4beb830b361d7c164371f25149" . 1 . 1
"0b35cf4beb830b361d7c164371f25149" . . . 2
"0b3e110c9765e14a5c41fadcc3cfc300" . . . 1
"0b6681f0f441e69c26106ab344ac0733" . . . 1
"0b6681f0f441e69c26106ab344ac0733" . . . 2
"0b8d8253a8415275dbc2619e039985bb" 4 . . 1
"0b8d8253a8415275dbc2619e039985bb" . . . 2
"0b8d8253a8415275dbc2619e039985bb" . . . 3
"0b92c26375117bf42945c04d8d6573d4" . 2 . 1
"0b92c26375117bf42945c04d8d6573d4" . . . 2
"0ba961f437f43105c357403c920bdef1" . . . 1
"0ba961f437f43105c357403c920bdef1" . . . 2
"0bb601fabe1fdfa794a5272408997a2f" . . . 1
"0bb601fabe1fdfa794a5272408997a2f" . . . 2
"0c75b36e91363d596dc46bd563c3f5ef" . 1 . 1
"0c75b36e91363d596dc46bd563c3f5ef" . . . 2
"0d461328a3bae7164ce7d3a10f366812" . . . 1
"0d461328a3bae7164ce7d3a10f366812" . 2 . 2
"0d4cc4eb459301a804cbef22914f44a3" . 1 . 1
"0d4cc4eb459301a804cbef22914f44a3" . . . 2
"0d4e29e11bb94e922112089f3fec61ef" . . . 1
"0d4e29e11bb94e922112089f3fec61ef" . 1 . 2
"0d513c74d667f55c8f4a9836c304149c" . 1 . 1
"0d513c74d667f55c8f4a9836c304149c" . . . 2
"0da25de126bb3b3ee565eff8888004c2" . . . 1
"0da25de126bb3b3ee565eff8888004c2" . 1 . 2
"0db9ae1f2201577f431b7603d0819fa6" . . . 1
end
label values treatment_1 treatment_1_
label def treatment_1_ 1 "Observation", modify
label def treatment_1_ 2 "Afsluttet", modify
label values treatment_3 treatment_3_
label def treatment_3_ 1 "Observation", modify
label def treatment_3_ 2 "Afsluttet", modify

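As you can see, there are three different treatments in this example, and I have sorted the observations by record_id (patient). Note that each patient (record_id) can appear multiple times. In fact, I have expanded the dataset so that a patient with 4 aneurysms has 4 observations, because the statistics are based on aneurysms, not patients.

My problem is that it seems to be random which of these observations holds the treatment each aneurysm received, and I want to add a variable treatment that lists the treatment for the corresponding aneurysm ID. Also note that treatmentChoice_ means "which treatment did aneurysm 1 get?" and treatmentChoice_1 means "which treatment did aneurysm 2 get?"

Is there a way to say:

"For each identical record_id, look at treatmentChoice_ and, if the aneurysm ID is 1, set treatment to that value. Then do the same for treatmentChoice_1 and treatmentChoice_3, setting treatment to their values if the aneurysm ID is 2 or 3 respectively."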

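If I follow this correctly, you want to pick the non-missing value from among several variables in each observation. For that you can use the max() or min() functions, or the rowmax() or rowmin() functions of egen. All of them ignore missing values unless every argument is missing.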
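As a small illustration of that behaviour, here is a minimal sketch on toy data. The variables a, b, c, pick and pick2 are made up for the illustration and are not part of the poster's dataset.

```stata
* Toy data (hypothetical): at most one non-missing value per row
clear
input float(a b c)
. 2 .
1 . .
. . .
end

* max() ignores missing arguments unless every argument is missing
generate pick = max(a, b, c)

* egen's rowmax() does the same over a varlist
egen pick2 = rowmax(a b c)

* The two approaches should agree row by row (missing equals missing here)
assert pick == pick2

list
```

When each row holds at most one non-missing value, min() and rowmin() would return the same result, so the choice between them only matters if a row can hold several values.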

With your example data (thanks), I get this. Note the two unlabeled values of 4.

. generate treatment = max(treatmentChoice_, treatment_1, treatment_3)
(73 missing values generated)

. label val treatment treatment_1_

. tab treatment

  treatment |      Freq.     Percent        Cum.
------------+-----------------------------------
Observation |         16       59.26       59.26
  Afsluttet |          9       33.33       92.59
          4 |          2        7.41      100.00
------------+-----------------------------------
      Total |         27      100.00
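
One caveat worth checking: max() silently picks the larger value whenever an observation has more than one non-missing treatment variable. In the example data this happens once (aneurysm 1 of the record beginning 0a84ab77 has treatmentChoice_ = 1 and treatment_3 = 2, so treatment becomes 2). A minimal sketch of a check, assuming the example data above are in memory; n_nonmiss is a hypothetical helper variable:

```stata
* Count the non-missing treatment variables in each observation
egen n_nonmiss = rownonmiss(treatmentChoice_ treatment_1 treatment_3)

* Distribution of the counts: 0 means no treatment recorded on that row
tabulate n_nonmiss

* Inspect rows where max() had to choose between competing values
list record_id aneurysm_id treatmentChoice_ treatment_1 treatment_3 ///
    if n_nonmiss > 1
```

Whether the larger code is the right one to keep in such rows is a question about the data, not something the code can decide.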
