其中col等于x,返回列表中的下一个有序值-Pandas



我希望通过df进行解析,并根据条件返回特定的行。具体来说,对于等于A的ColString,我希望返回等于任一'B','C' or 'D'的下一行。如果在下一个A之前这些值都不存在,那么我希望返回一条消息,声明No item found for:IDString或类似的内容。

注:此处不适用isin

Item = ['A','B','C','D']
df = df[df['String'].isin(Item)]

因为我只想返回出现在A之后的'B','C' or 'D'第一个字符串。我不希望所有这些字符串出现在任何位置。我还想确定A是否存在,但在下一个A之前没有'B','C' or 'D'

import pandas as pd
df1 = pd.DataFrame({
'String' : ['A','F','B','C','D','A','X','C','B','D','A','Y','A','C','A','D','C','B'],                  
'ID' : [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18],                         
})
mask = df1.loc[df1['String'] == 'A', 'ID']
# Next value required
# Something like. where mask == True find next row equal to val in item list.
# If None: print(No item found for: ID)
item = ['B','C','D']

预期输出:

String  ID
0       A   1
2       B   3
5       A   6
7       C   8
10      A  11
12      A  13
13      C  14
14      A  15
15      D  16
No item found for: ID 11

这里有一个解决方案,需要几个步骤来澄清:

df1["is_a"] = df1.String == "A"
df1["a_group"] = df1.is_a.cumsum()
def foo(s):
matches = s[s.String.isin(["B", "C", "D"]) ]
if len(matches):
return s.loc[[s.index[0], matches.index[0]]]
else: 
print(f"No match for A in location {s.index[0]}")


df1.groupby("a_group").apply(foo)

输出为:

No match for A in location 10
String  ID   is_a  a_group
a_group                              
1       0       A   1   True        1
2       B   3  False        1
2       5       A   6   True        2
7       C   8  False        2
4       12      A  13   True        4
13      C  14  False        4
5       14      A  15   True        5
15      D  16  False        5

使用shift的解决方案

item = ['A', 'B','C','D']

# filter df
df1 = df1.loc[df1.String.isin(item)]
# find missing ID's using shift
for line in df1[(df1.String == 'A') & (df1.String.shift(-1) =='A')].iterrows():
print(f'No item found for: ID {line[1].ID}')
# find final df using shift
print(df1.loc[(df1.String == 'A') | ((df1.String.shift(1) == 'A') & (df1.String.isin(item)))])
#prints
No item found for: ID 11
String  ID
0       A   1
2       B   3
5       A   6
7       C   8
10      A  11
12      A  13
13      C  14
14      A  15
15      D  16

最新更新