Get rows by index, then create another separate DataFrame



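I wrote some code that extracts indexes from a DataFrame, but I don't know how to use those indexes to create another DataFrame from the original one.

Is it also possible to shorten my current code? It is quite long.

Edited: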

import pandas as pd
a = pd.DataFrame({"a":["I have something", "I have nothing", "she has something", "she is nice", "she is not nice","Me", "He"],
                  "b":[["man"], ["man", "eating"], ["cat"], ["man"], ["cat"], ["man"], ["cat"]]})
a = a[a.b.apply(lambda x:len(x)) == 1] # is it possible to shorten the code from here
c = a.explode("b").groupby("b")
k = ["man", "cat"]
bb = a
for x in k:
    bb = c.get_group(x).head(2).index # to here?.... this part is supposed to take the first 2 indexes of each element in k

Current result:

                 a      b
4  she is not nice  [cat]

Expected results:

                   a      b
0   I have something  [man]
2  she has something  [cat]
3        she is nice  [man]
4    she is not nice  [cat]

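First filter by Series.str.len to keep the rows whose list has exactly one element. Then take that single element with .str[0], which turns the one-element lists into plain strings, so duplicates can be tested with Series.duplicated. Invert the boolean mask with ~ and filter by boolean indexing: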

a = a[a.b.str.len() == 1]
b = a[~a['b'].str[0].duplicated()]
print(b)
                   a      b
0   I have something  [man]
2  she has something  [cat]
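
To make the masking step easier to follow, here is a minimal sketch that splits it into its intermediate pieces. It rebuilds the sample data from the question; the names single, first, is_repeat and firsts are only illustrative and do not appear in the original code.

```python
import pandas as pd

a = pd.DataFrame({
    "a": ["I have something", "I have nothing", "she has something",
          "she is nice", "she is not nice", "Me", "He"],
    "b": [["man"], ["man", "eating"], ["cat"], ["man"], ["cat"], ["man"], ["cat"]],
})

# Keep only the rows whose list in column b holds exactly one element.
single = a[a["b"].str.len() == 1]

# Pull the single element out of each list, giving a Series of plain strings.
first = single["b"].str[0]

# True for every repeat of a value after its first occurrence (keep='first' is the default).
is_repeat = first.duplicated()

# Invert the mask so only the first row seen for each value survives.
firsts = single[~is_repeat]
print(firsts)
```

Since Series.duplicated marks repeats rather than first occurrences, the ~ inversion is what selects one row per value.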

Edit: to keep more than one row per value, use GroupBy.head:

b1 = a.groupby(a['b'].str[0]).head(2)
print(b1)
                   a      b
0   I have something  [man]
2  she has something  [cat]
3        she is nice  [man]
4    she is not nice  [cat]
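
If only the values listed in k should be considered, and the index itself is needed to build a new DataFrame from the original, one possible sketch is shown below. It assumes the sample data from the question; orig, single, wanted, idx and result are hypothetical names introduced here for illustration.

```python
import pandas as pd

orig = pd.DataFrame({
    "a": ["I have something", "I have nothing", "she has something",
          "she is nice", "she is not nice", "Me", "He"],
    "b": [["man"], ["man", "eating"], ["cat"], ["man"], ["cat"], ["man"], ["cat"]],
})
k = ["man", "cat"]

# Keep rows with one-element lists, then only those whose element is listed in k.
single = orig[orig["b"].str.len() == 1]
wanted = single[single["b"].str[0].isin(k)]

# First two rows per value; GroupBy.head keeps the original row order.
idx = wanted.groupby(wanted["b"].str[0]).head(2).index

# Use the collected index labels to build a separate DataFrame from the original.
result = orig.loc[idx]
print(result)
```

Selecting with .loc on the collected index works because the filtering steps never reset the index, so the labels still refer to rows of the original DataFrame.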
