I am working in the R programming language. Suppose I have the following data frame:
a = rnorm(100,10,1)
b = rnorm(100,10,5)
c = rnorm(100,10,10)
my_data_2 = data.frame(a,b,c)
my_data_2$group = as.factor(C)
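My question: suppose I want to add an ID column to this data frame that labels the first observation as "100" and increases the ID by 1 for each new row. I tried this: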
my_data_2$id = seq(101, 200, by = 1)
However, this produced a "corrupted" data frame:
head(my_data_2)
a b c
1 10.381397 9.534634 12.8330946
2 10.326785 6.397006 8.1217063
3 8.333354 11.474064 11.6035562
4 9.583789 12.096404 18.2764387
5 9.581740 12.302016 4.0601871
6 11.772943 9.151642 -0.3686874
group
1 c(9.98552413605153, 9.53807731118048, 6.92589246998173, 8.97095368638206, 9.70249918748529, 10.6161773148626, 9.2514231659343, 10.6566757899233, 10.2351848084123, 9.45970725813352, 9.15347719257448, 9.30428244749624, 8.43075784609759, 11.1200169905262, 11.3493313166827, 8.86895968334901, 9.13208319045466, 9.70062759133717)
2 c(8.90358954387628, 13.8756093430144, 12.9970566311467, 10.4227745183785, 21.3259516051226, 4.88590162247496, 10.260282181, 14.092109840631, 7.37839577680487, 9.09764173775965, 15.1636139760987, 9.9773055885761, 8.29361737323061, 8.61361852648607, 12.6807897406641, 0.00863359720839085, 10.7660528147358, 9.79616528370632)
3 c(25.8063583646201, -11.5722310383483, 8.56096791164312, 12.2858029391835, -0.312392781809937, 0.946343715084028, 2.45881422753051, 7.26197515743391, 0.333766891336273, 14.9149659649045, -4.55483090530928, -19.8075232688082, 16.59106194569, 18.7377329188129, 1.1771203751127, -6.19019973790205, -5.02277721344565, 23.3363430334739)
4 c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3)
5 c("B", "B", "B", "A", "B", "B", "B", "B", "B", "B", "B", "A", "B", "B", "B", "B", "B", "B")
6 c("B", "B", "B", "B", "B", "A", "B", "B", "A", "B", "B", "B", "B", "B", "B", "B", "B", "B")
id
1 101
2 102
3 103
4 104
5 105
6 106
Can someone tell me how to fix this?
Thanks!
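The problem is not the ID column; it is where you define the group variable. You call as.factor(C) (note the uppercase C), but the column in your data frame is a lowercase c, and R is case-sensitive. So my guess is that you defined another object named C somewhere outside the code for this data frame, and that object is now "corrupting" your data frame.
You probably want: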
my_data_2$group <- as.factor(my_data_2$c)
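To make the suggestion above concrete, here is a minimal end-to-end sketch. It assumes the same three columns as in the question; the set.seed() call, the check for a stray C in the global environment, and the use of nrow() to build the IDs are my own additions for illustration, not something the question used.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

a <- rnorm(100, 10, 1)
b <- rnorm(100, 10, 5)
c <- rnorm(100, 10, 10)
my_data_2 <- data.frame(a, b, c)

# Is there a user-defined object called C in the workspace?
# inherits = FALSE ignores objects of that name provided by packages.
print(exists("C", envir = globalenv(), inherits = FALSE))

# Refer to the column through the data frame, so an unrelated
# object named C cannot be picked up by mistake.
my_data_2$group <- as.factor(my_data_2$c)

# Derive the IDs from the number of rows instead of hard-coding 200.
my_data_2$id <- 100 + seq_len(nrow(my_data_2))

# Each column should be an atomic vector with one value per row.
str(my_data_2)
```

Note that turning a continuous column such as c into a factor gives one level per distinct value, so this mainly shows that the column reference is now correct; whether that is a useful grouping depends on what group is meant to represent.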
I found the answer!
a = rnorm(100,10,1)
b = rnorm(100,10,5)
c = rnorm(100,10,10)
my_data_2 = data.frame(a,b,c)
my_data_2$group = as.factor("C")
my_data_2$id = seq(101, 200, by = 1)
head(my_data_2)
a b c group id
1 9.436773 10.712568 3.7699748 C 101
2 10.265810 3.408589 11.9230024 C 102
3 10.503245 12.197000 8.3620889 C 103
4 9.279878 7.007812 16.8268852 C 104
5 10.683518 8.039032 5.2287997 C 105
6 11.097258 10.313103 0.4988398 C 106