I have the following DataFrame.
df
ID List_values1 List_values2 A_value B_value C_code
1 [[('A_code', 2), ('B_code', 2)]] (C_code, 4) 1 0 0
2 (B_code, 3) [[('A_code', 2), ('B_code', 2), ('C_code', 4)]] 0 1 1
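I want to check the elements of the lists in List_values1 and List_values2 against the columns A_value, B_value, and C_code, matching them by name (for example, A_code goes with A_value). For example, in List_values1, if A_code has A_value == 0 in the corresponding column, I want to remove that element from the list. The same applies to the other codes.
My desired output is as follows.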
ID List_values1 List_values2 A_value B_value C_code
1 ('A_code', 2) 1 0 0
2 (B_code, 3) [[('B_code', 2),('C_code', 4)]] 0 1 1
Can anyone help?
Use a custom function that tests the first element of each tuple, treating 1 in a flag column as a match:
print (df)
ID List_values1 List_values2
0 1 [(A_code, 2), (B_code, 2)] (C_code, 4)
1 2 (B_code, 3) [(A_code, 2), (B_code, 2), (C_code, 4)]
   A_value  B_value  C_value
0        1        0        0
1        0        1        1
v = ['A_value','B_value','C_value']
L = ['List_values1','List_values2']
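def f(x):
    # codes whose flag column is truthy in this row, e.g. A_value == 1 -> 'A_code'
    need = x[v].index[x[v].astype(bool)].str.replace('value','code').tolist()
    # print (need)
    for c, val in x[L].items():
        # print (val)
        # a single tuple: keep it only if its code is needed
        if isinstance(val, tuple):
            x[c] = val if val[0] in need else ''
        # a list of tuples: keep the needed ones, unwrapping a single survivor
        if isinstance(val, list):
            out = [z for z in val if z[0] in need]
            x[c] = out[0] if len(out) == 1 else out
    return x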
df = df.apply(f, axis=1)
print (df)
ID List_values1 List_values2 A_value B_value C_value
0 1 (A_code, 2) 1 0 0
1 2 (B_code, 3) [(B_code, 2), (C_code, 4)] 0 1 1
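For anyone who wants to reproduce this end to end, here is a minimal sketch. It assumes the cells hold real Python tuples and lists (not their string representations) and that the third flag column is named C_value, as in the printed frame. The names `flag_cols`, `list_cols` and `filter_row` are made up for this example, and the function works on a copy of each row rather than modifying it in place.

```python
import pandas as pd

# Sample data rebuilt from the printed frame (assumption: real tuples/lists).
df = pd.DataFrame({
    "ID": [1, 2],
    "List_values1": [[("A_code", 2), ("B_code", 2)], ("B_code", 3)],
    "List_values2": [("C_code", 4), [("A_code", 2), ("B_code", 2), ("C_code", 4)]],
    "A_value": [1, 0],
    "B_value": [0, 1],
    "C_value": [0, 1],
})

flag_cols = ["A_value", "B_value", "C_value"]
list_cols = ["List_values1", "List_values2"]

def filter_row(row):
    row = row.copy()
    # Codes whose flag is truthy in this row, e.g. A_value == 1 -> "A_code".
    keep = {name.replace("value", "code") for name in flag_cols if row[name]}
    for col in list_cols:
        cell = row[col]
        if isinstance(cell, tuple):
            # Single tuple: keep it or blank the cell.
            row[col] = cell if cell[0] in keep else ""
        elif isinstance(cell, list):
            kept = [item for item in cell if item[0] in keep]
            # Same convention as above: a single survivor becomes a bare tuple.
            row[col] = kept[0] if len(kept) == 1 else kept
    return row

result = df.apply(filter_row, axis=1)
print(result[list_cols])
```

Note that a list in which nothing survives ends up as an empty list, while a rejected tuple becomes an empty string; normalise these if you need one consistent "empty" marker.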
Edit: to compare only the prefix before the underscore (A, B, C) instead of the full name:
v = ['A_value','B_value','C_value']
L = ['List_values1','List_values2']
def f(x):
    need = x[v].index[x[v].astype(bool)].str.split('_').str[0].tolist()
    print (need)
    for c, val in x[L].items():
        # print (val)
        if isinstance(val, tuple):
            x[c] = val if val[0].split('_')[0] in need else ''
        if isinstance(val, list):
            out = [z for z in val if z[0].split('_')[0] in need]
            x[c] = out[0] if len(out) == 1 else out
    return x
df = df.apply(f, axis=1)
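The prefix comparison can also be tried on its own, outside pandas. This is only a sketch: it assumes a row's flags are given as a plain dict, it uses the question's column name C_code for the third flag, and `prefix` and `keep_items` are hypothetical helper names.

```python
def prefix(name):
    # "A_code" -> "A", "A_value" -> "A"
    return name.split("_")[0]

def keep_items(items, flags):
    # flags maps a flag column name to 0/1; an item is kept when the
    # prefix of its code matches the prefix of a truthy flag column.
    need = {prefix(col) for col, flag in flags.items() if flag}
    return [item for item in items if prefix(item[0]) in need]

flags = {"A_value": 0, "B_value": 1, "C_code": 1}
items = [("A_code", 2), ("B_code", 2), ("C_code", 4)]
print(keep_items(items, flags))  # [('B_code', 2), ('C_code', 4)]
```

Because only the part before the underscore is compared, it does not matter whether the flag column is called C_value or C_code.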