Error when trying to take the mean on a pandas DataFrame



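I have a pandas DataFrame loaded from a csv file, and I need to take the mean of 3 of its columns and put the result in a new column. This is what the data looks like: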

0      week     12    exp    exp    exp
1   Subject  Group      1      2      3
2       255   HD 0  117.4  104.8   87.0
3       418   WT 0   61.2   56.1   97.9
4       300   HD 0  111.7  126.9  118.4
5       299   HD 0   50.7   37.8   30.6
6       258   WT 0   56.0   67.9   58.5
7       173   HD 0   76.2  131.7  119.5

My code is:

with open('final results.csv', 'r') as frame:
    date_again = csv.reader(frame)
    frame = []
    for line in date_again:
        frame = frame + [line]
panda_file = pd.DataFrame(frame)

panda_file['average'] = frame[3:].mean(axis=1)

The error I get is AttributeError: 'list' object has no attribute 'mean'

How can I fix this?

Thanks

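The AttributeError comes from calling .mean on frame, which is a plain Python list and not the DataFrame. To create the DataFrame, first use read_csv with the parameter header=[0,1], because the csv has 2 header rows. This gives a DataFrame with a MultiIndex in the columns: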

import pandas as pd
from io import StringIO  # pd.compat.StringIO was removed in newer pandas versions

temp=u"""week,12,exp,exp,exp
Subject,Group,1,2,3
255,HD,0,117.4,104.8,87.0
418,WT,0,61.2,56.1,97.9
300,HD,0,111.7,126.9,118.4
299,HD,0,50.7,37.8,30.6
258,WT,0,56.0,67.9,58.5
173,HD,0,76.2,131.7,119.5"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), header=[0,1])
print (df)
week    12    exp              
Subject Group      1      2      3
255      HD     0  117.4  104.8   87.0
418      WT     0   61.2   56.1   97.9
300      HD     0  111.7  126.9  118.4
299      HD     0   50.7   37.8   30.6
258      WT     0   56.0   67.9   58.5
173      HD     0   76.2  131.7  119.5

Then select the last 3 columns and take the mean:

df1 = df.iloc[:, -3:].mean(axis=1)
print (df1)
255    103.066667
418     71.733333
300    119.000000
299     39.700000
258     60.800000
173    109.133333
dtype: float64
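
As an alternative to selecting by position, the columns can probably also be selected by their top-level label. The sketch below rebuilds a two-row sample by hand (the MultiIndex tuples and the variable name exp_mean are my own, chosen to mirror the frame above) and assumes the three measurement columns all sit under the top-level label 'exp':

```python
import pandas as pd

# Hand-built stand-in for the frame read above (two rows only).
columns = pd.MultiIndex.from_tuples(
    [("week", "Subject"), ("12", "Group"), ("exp", "1"), ("exp", "2"), ("exp", "3")]
)
df = pd.DataFrame(
    [["HD", 0, 117.4, 104.8, 87.0],
     ["WT", 0, 61.2, 56.1, 97.9]],
    index=[255, 418],
    columns=columns,
)

# df["exp"] returns the sub-frame of all columns under the 'exp' label,
# so the row-wise mean does not depend on column order.
exp_mean = df["exp"].mean(axis=1)
print(exp_mean)
```

This should give the same values as the iloc version for those two rows, and it keeps working if other columns are added after the 'exp' ones.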

To add a new column, assign to a column name defined as a tuple, because the columns are a MultiIndex:

df[('exp', 'mean')] = df.iloc[:, -3:].mean(axis=1)
print (df)
week    12    exp                          
Subject Group      1      2      3        mean
255      HD     0  117.4  104.8   87.0  103.066667
418      WT     0   61.2   56.1   97.9   71.733333
300      HD     0  111.7  126.9  118.4  119.000000
299      HD     0   50.7   37.8   30.6   39.700000
258      WT     0   56.0   67.9   58.5   60.800000
173      HD     0   76.2  131.7  119.5  109.133333

But to simplify, you can flatten the columns:

df.columns = df.columns.map('_'.join)
print (df)
week_Subject  12_Group  exp_1  exp_2  exp_3
255           HD         0  117.4  104.8   87.0
418           WT         0   61.2   56.1   97.9
300           HD         0  111.7  126.9  118.4
299           HD         0   50.7   37.8   30.6
258           WT         0   56.0   67.9   58.5
173           HD         0   76.2  131.7  119.5
df['exp_mean'] = df.iloc[:, -3:].mean(axis=1)
print (df)
week_Subject  12_Group  exp_1  exp_2  exp_3    exp_mean
255           HD         0  117.4  104.8   87.0  103.066667
418           WT         0   61.2   56.1   97.9   71.733333
300           HD         0  111.7  126.9  118.4  119.000000
299           HD         0   50.7   37.8   30.6   39.700000
258           WT         0   56.0   67.9   58.5   60.800000
173           HD         0   76.2  131.7  119.5  109.133333
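
One thing to watch with iloc[:, -3:] is that once exp_mean has been added, the last 3 columns are no longer the same 3 columns, so re-running the line would include the mean itself. A minimal sketch of selecting by name instead, assuming the flattened names exp_1, exp_2, exp_3 shown above (the list exp_cols and the hand-built two-row frame are only for illustration):

```python
import pandas as pd

# Hand-built stand-in for the flattened frame (two rows only).
df = pd.DataFrame(
    {
        "week_Subject": ["HD", "WT"],
        "12_Group": [0, 0],
        "exp_1": [117.4, 61.2],
        "exp_2": [104.8, 56.1],
        "exp_3": [87.0, 97.9],
    },
    index=[255, 418],
)

# Name the measurement columns explicitly instead of relying on position.
exp_cols = ["exp_1", "exp_2", "exp_3"]
df["exp_mean"] = df[exp_cols].mean(axis=1)

# Running the assignment again gives the same result,
# because exp_mean is not part of exp_cols.
df["exp_mean"] = df[exp_cols].mean(axis=1)
print(df)
```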
