Data type problem when populating new columns with a pandas DataFrame function



Dears,

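I have a requirement: read a CSV file, populate 3 new columns by simple division on 3 existing columns (the elements are numbers with two decimal places), and then generate a new CSV file. To do this I tried two approaches, each as a single snippet, but both failed.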

Method 1:

    from numpy import *
    import pandas as pd

    def func_process(file_in, file_out):
        df = pd.read_csv(file_in)
        df.eval("""col1_new = col1 / 2
        col2_new = col2 / 4
        col3_new = col3 / 8
        """, inplace=True)
        df.to_csv(file_out, encoding='utf-8', index=False)

    if __name__ == '__main__':
        input_file = 'C:/temp/csv/in/test.csv'
        output_file = 'C:temp/csv/out/result.csv'
        func_process(input_file, output_file)

==============================================

After running the function, I get the error: TypeError: unsupported operand type(s) for /: 'object' and '<class 'int'>'

Method 2:

    from numpy import *
    import pandas as pd

    def my_test(a, b):
        return a / b

    def func_process(file_in, file_out):
        df = pd.read_csv(file_in)
        df['col1_new'] = df.apply(lambda x: my_test(x['col1'], 2), axis=1)
        df['col2_new'] = df.apply(lambda x: my_test(x['col2'], 4), axis=1)
        df['col3_new'] = df.apply(lambda x: my_test(x['col3'], 8), axis=1)
        df.to_csv(file_out, encoding='utf-8', index=False)

    if __name__ == '__main__':
        input_file = 'C:/temp/csv/in/test.csv'
        output_file = 'C:temp/csv/out/result.csv'
        func_process(input_file, output_file)

==============================================

After running the function, I get the error: TypeError: ("unsupported operand type(s) for /: 'str' and 'int'", 'occurred at index 0')

I think I am struggling with these type errors. Could you help me?

Thanks, cea

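You have strings in your columns. You can convert them to numeric data with pd.to_numeric, passing errors='coerce' so that any string that cannot be parsed becomes NaN. You could also convert with .astype(float).astype(int), although .astype(int) raises an error if NaN values remain in the column: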

    import pandas as pd

    def my_test(a, b):
        return a / b

    def func_process(file_in, file_out):
        df = pd.read_csv(file_in)
        # Convert the source columns to numbers; unparseable strings become NaN.
        df['col1'] = pd.to_numeric(df['col1'], errors='coerce').astype(float)
        df['col2'] = pd.to_numeric(df['col2'], errors='coerce').astype(float)
        df['col3'] = pd.to_numeric(df['col3'], errors='coerce').astype(float)
        # The division now operates on floats instead of str/object values.
        df['col1_new'] = df.apply(lambda x: my_test(x['col1'], 2), axis=1)
        df['col2_new'] = df.apply(lambda x: my_test(x['col2'], 4), axis=1)
        df['col3_new'] = df.apply(lambda x: my_test(x['col3'], 8), axis=1)
        df.to_csv(file_out, encoding='utf-8', index=False)
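
As a variation on the same idea, here is a minimal sketch that skips apply and divides whole columns at once after coercing them with pd.to_numeric. The helper name add_scaled_columns and the in-memory sample data are assumptions made up for illustration; the actual contents of test.csv are not shown in the question, so the stray strings 'abc' and 'x' only stand in for whatever non-numeric values the real file holds.

```python
import io

import pandas as pd


def add_scaled_columns(df, divisors):
    """Return a copy of df with a '<col>_new' column for each key in divisors.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.copy()
    for col, divisor in divisors.items():
        # Strings that cannot be parsed become NaN instead of raising TypeError.
        numeric = pd.to_numeric(out[col], errors='coerce')
        # Column-wise division, rounded to two decimal places.
        out[col + '_new'] = (numeric / divisor).round(2)
    return out


# Made-up sample data: col2 and col3 each contain one non-numeric string.
sample = io.StringIO("col1,col2,col3\n10,20,abc\n3,x,16\n")
df = pd.read_csv(sample)
print(df.dtypes)  # shows which columns were not read as numeric

result = add_scaled_columns(df, {'col1': 2, 'col2': 4, 'col3': 8})
print(result)
```

Printing df.dtypes first is a quick way to see which columns were read as text rather than numbers. When writing the result, to_csv also accepts float_format='%.2f' if the output file should always show two decimal places.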
