I've come here looking for help. I'm working with the following data:
df1:
   name  name1  name2
A    13     13     13
B    13     27     57
C    12     12     12
D    26     23      2
I'm trying to use code like this:
def val(df):
    ret = []
    for idx, row in df.iterrows():
        if row.nunique()==1:
            ret.append(f'The values of {idx} in name, name1, name2 are corrects')
        else:
            ret(["".join(f'*The values in {idx} are:',
                         ', '.join(f'{c} in {v}' for v,c in row.iteritems()),
                         'Check your data before compare.']))
    return ret
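The problem is that it doesn't run properly. First, I need the result as a string rather than a list. I know "".join() can do that, but when I try it I only get the last result instead of all the answers I want. How can I get the complete output? I'd like to see several options, not just one.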
Example:
- The values of A in name, name1, name2 are corrects.
- The values in B are:
13 in name, 27 in name1 and 57 in name2.
Check your data before compare.
- The values of C in name, name1, name2 are corrects.
- The values in D are:
26 in name, 23 in name1 and 2 in name2.
Check your data before compare.
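def val(df):
    ret = []
    for idx, row in df.iterrows():
        # A row whose values are all equal has exactly one unique value.
        if row.nunique() == 1:
            ret.append(f'- The values of {idx} in name, name1, name2 are corrects')
        else:
            # .iloc gives positional access; plain row[0] is deprecated on a labeled Series.
            ret.append(
                f"- The values in {idx} are:\n"
                f" {row.iloc[0]} in name, {row.iloc[1]} in name1, {row.iloc[2]} in name2.\n"
                " Check your data before compare."
            )
    return ret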
ans = val(df)
for i in ans:
    print(i)
Output:
- The values of A in name, name1, name2 are corrects
- The values in B are:
13 in name, 27 in name1, 57 in name2.
Check your data before compare.
- The values of C in name, name1, name2 are corrects
- The values in D are:
26 in name, 23 in name1, 2 in name2.
Check your data before compare.
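The question asks for one string rather than a list. A minimal sketch of one way to get that is below: collect one message per row and join them once after the loop, so no row's message is lost. The helper name `val_as_text` is made up for this illustration, and the wording of the messages is only an approximation of the desired output. It uses `row.items()` because `Series.iteritems()` was removed in pandas 2.0.

```python
import pandas as pd

df = pd.DataFrame({'name': {'A': 13, 'B': 13, 'C': 12, 'D': 26},
                   'name1': {'A': 13, 'B': 27, 'C': 12, 'D': 23},
                   'name2': {'A': 13, 'B': 57, 'C': 12, 'D': 2}})


def val_as_text(df):
    """Return a single report string with one message per row (hypothetical helper)."""
    cols = ', '.join(df.columns)
    messages = []
    for idx, row in df.iterrows():
        if row.nunique() == 1:
            messages.append(f'- The values of {idx} in {cols} are correct.')
        else:
            # Pair each value with its column label; items() yields (label, value).
            pairs = ', '.join(f'{value} in {col}' for col, value in row.items())
            messages.append(
                f'- The values in {idx} are:\n'
                f'  {pairs}.\n'
                '  Check your data before comparing.'
            )
    # Join once, after the loop, so every row's message is kept.
    return '\n'.join(messages)


report = val_as_text(df)
print(report)
```

Joining inside the loop, or reassigning the result on each iteration, is a common reason for ending up with only the last row's message.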
import pandas as pd
df = pd.DataFrame({'name': {'A': 13, 'B': 13, 'C': 12, 'D': 26},
                   'name1': {'A': 13, 'B': 27, 'C': 12, 'D': 23},
                   'name2': {'A': 13, 'B': 57, 'C': 12, 'D': 2}})
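It's hard to know how to correct your function because it has quite a few errors.
You can use a pandas Series like a dictionary with string formatting.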
In [25]: s = '{name:} in name, {name1:} in name1, {name2:} in name2'
In [26]: row = df.loc['A',:]
In [27]: print(s.format(**row))
13 in name, 13 in name1, 13 in name2
In [28]: for idx,row in df.iterrows():
    ...:     print(idx, s.format(**row))
    ...:
A 13 in name, 13 in name1, 13 in name2
B 13 in name, 27 in name1, 57 in name2
C 12 in name, 12 in name1, 12 in name2
D 26 in name, 23 in name1, 2 in name2
The same works with a formatted string literal (f-string).
In [29]: for idx,row in df.iterrows():
    ...:     print(idx, f'''{row['name']} in name, {row['name1']} in name1, {row['name2']} in name2''')
    ...:
A 13 in name, 13 in name1, 13 in name2
B 13 in name, 27 in name1, 57 in name2
C 12 in name, 12 in name1, 12 in name2
D 26 in name, 23 in name1, 2 in name2
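If the column names are not known in advance, the format string itself can be built from `df.columns`. The following is only a sketch under that assumption; the variable name `template` is made up here, and the approach relies on every column label being a string that is valid as a format field name.

```python
import pandas as pd

df = pd.DataFrame({'name': {'A': 13, 'B': 13, 'C': 12, 'D': 26},
                   'name1': {'A': 13, 'B': 27, 'C': 12, 'D': 23},
                   'name2': {'A': 13, 'B': 57, 'C': 12, 'D': 2}})

# Doubled braces produce literal braces, so each column becomes "{col} in col".
template = ', '.join(f'{{{col}}} in {col}' for col in df.columns)
print(template)

# A Series unpacks like a dictionary, so its labels fill the named fields.
for idx, row in df.iterrows():
    print(idx, template.format(**row))
```

This keeps the row formatting in one place, so adding a column to the DataFrame does not require editing the string by hand.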
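How about building the strings as part of an np.where clause, like this?
All the other answers reproduce the original row-iteration approach, which is inefficient beyond toy datasets. np.where is a vectorized operation, so it will be faster than a custom function, and the logic is simpler. The only caveat is that string interpolation doesn't work here, hence the slightly awkward multi-line syntax.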
import pandas as pd
import numpy as np
from io import StringIO
data = StringIO("""
index name name1 name2
A 13 13 13
B 13 27 57
C 12 12 12
D 26 23 2
""")
# sep=r"\s+" splits on any whitespace; it replaces the deprecated delim_whitespace=True.
df = pd.read_csv(data, sep=r"\s+", index_col="index")
# Pick one of two prebuilt messages per row, depending on whether all values match.
results = np.where(
    df.nunique(axis=1) == 1,
    'The values in ' + df.index + ' in name, name1, name2 are the same\n',
    'The values in ' + df.index + ' are:\n' +
    df["name"].astype(str) + ' in name, ' +
    df["name1"].astype(str) + ' in name1, ' +
    df["name2"].astype(str) + ' in name2.\nCheck your data.\n'
)
print(*results, sep='\n')
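The condition passed to np.where is an ordinary boolean Series, so it can also be inspected or reused on its own. A small sketch, assuming the same data; the names `consistent` and `mismatched` are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'name': {'A': 13, 'B': 13, 'C': 12, 'D': 26},
                   'name1': {'A': 13, 'B': 27, 'C': 12, 'D': 23},
                   'name2': {'A': 13, 'B': 57, 'C': 12, 'D': 2}})

# True where all values in the row are identical, computed without a Python loop.
consistent = df.nunique(axis=1) == 1
print(consistent)

# Invert the mask to keep only the rows that need checking.
mismatched = df[~consistent]
print(mismatched)
```

Filtering with the mask first can be handy when only the problem rows need to be reported.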