I have the following dataset:
import pandas as pd
data = {"C1":[[(3, 5), (6, 8), (9, 10)], [(0, 2), (5, 7), (9, 10)], [], [(1, 11)], [(0, 7), (8, 10)], [(5, 6)], [(0, 1)]]}
dt = pd.DataFrame(data)
print(dt)
It looks like this:
0 [(3, 5), (6, 8), (9, 10)]
1 [(0, 2), (5, 7), (9, 10)]
2 []
3 [(1, 11)]
4 [(0, 7), (8, 10)]
5 [(5, 6)]
6 [(0, 1)]
I want to get the length of each tuple (the second element of the tuple minus the first).
My preferred output would be something like this:
0 [(3, 5), (6, 8), (9,10)] [2,2,1]
1 [(0, 2), (5, 7), (9,10)] [2,2,1]
2 [] []
3 [(1, 11)] [10]
4 [(0, 7), (8, 10)] [7,2]
5 [(5, 6)] [1]
6 [(0, 1)] [1]
I am currently using this code:
dt['C2'] = dt['C1'].apply(list(map(lambda x: x[1]-x[0])))
It gives the following error:
map() must have at least two arguments
Since I am using the apply method, I expected the second argument of map to be supplied automatically by apply. Why doesn't that happen?
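.apply() expects a callable, but list(map(...)) is evaluated immediately, before .apply() is ever called, so map never receives a second argument. The lambda passed to .apply() is applied to each row of the column individually, so you can put a list comprehension inside it to do what you want: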
data = {"C1":[[(3, 5), (6, 8), (9, 10)], [(0, 2), (5, 7), (9, 10)], [], [(1, 11)], [(0, 7), (8, 10)], [(5, 6)], [(0, 1)]]}
dt = pd.DataFrame(data)
print(dt)
>>> dt['C2'] = dt['C1'].apply(lambda lst: [tup[1] - tup[0] for tup in lst])
>>> dt
C1 C2
0 [(3, 5), (6, 8), (9, 10)] [2, 2, 1]
1 [(0, 2), (5, 7), (9, 10)] [2, 2, 1]
2 [] []
3 [(1, 11)] [10]
4 [(0, 7), (8, 10)] [7, 2]
5 [(5, 6)] [1]
6 [(0, 1)] [1]
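If you would rather keep map, here is a minimal sketch of the same idea, assuming the same dt as above. The name failed is only a flag introduced for this illustration. It shows that a one-argument map call fails on its own, with no involvement from apply, and that wrapping the call in a lambda lets apply pass each cell in as the second argument:

```python
import pandas as pd

data = {"C1": [[(3, 5), (6, 8), (9, 10)], [(0, 2), (5, 7), (9, 10)], [],
               [(1, 11)], [(0, 7), (8, 10)], [(5, 6)], [(0, 1)]]}
dt = pd.DataFrame(data)

# Python evaluates an argument expression before calling the function it is
# passed to, so this one-argument map() raises TypeError by itself.
try:
    list(map(lambda x: x[1] - x[0]))
    failed = False  # hypothetical flag, only used for this illustration
except TypeError:
    failed = True

# Wrapping map() in a lambda defers the call: apply() hands each cell
# (a list of tuples) to the lambda as lst, which becomes map()'s second argument.
dt["C2"] = dt["C1"].apply(lambda lst: list(map(lambda x: x[1] - x[0], lst)))
print(dt)
```

The list comprehension and the map version should produce the same C2 column; which one to use is mostly a matter of readability.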