Pandas DataFrame row-wise processing takes too much time

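I have a sample file with about 1000 rows and 260 columns. I want to compute new values by subtracting each row's mean from every cell in that row. The code below works, but it takes a long time even for just 400 rows. Is there a better solution that does this in less time?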

import numpy as np
import xlrd
import pandas as pd
Source = xlrd.open_workbook('Sample.xlsx')
Destination = 'Destination.xlsx'
writer = pd.ExcelWriter(Destination, engine='openpyxl')
ws1 = Source.sheet_by_index(0)
nrows = ws1.nrows
ncols = ws1.ncols
Rows = pd.DataFrame(index=range(nrows), columns=range(ncols))
for i in range(nrows):
    Avg = np.mean(ws1.row_values(i))
    for j in range(ncols):
        Rows.iloc[i:,j:] = ((ws1.cell_value(i,j)-Avg))
Rows.to_excel(writer, sheet_name='Sheet1', startcol=0, startrow=0, index=False, header=False)
writer.save()
writer.close()

Why not also open and save the Excel file through pandas?

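import pandas as pd
# read_excel (not read_csv) parses .xlsx files; the file name must be a quoted string
source = pd.read_excel('Sample.xlsx')
source.to_csv('output.csv', index=False)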

Use DataFrame.sub to subtract the row means from all values of the DataFrame:

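import pandas as pd
df = pd.DataFrame({
    'A':[4,5,4],
    'B':[7,8,9],
    'C':[1,3,5],
    'D':[5,3,6]
})
# mean(axis=1) gives one mean per row; axis=0 in sub aligns those means with the row index
df = df.sub(df.mean(axis=1), axis=0)
print (df)
      A     B     C     D
0 -0.25  2.75 -3.25  0.75
1  0.25  3.25 -1.75 -1.75
2 -2.00  3.00 -1.00  0.00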
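
To make the axis arguments above more concrete, here is a minimal sketch that runs the same subtraction twice, once with DataFrame.sub and once with plain NumPy broadcasting, and checks that the two agree. The variable names (row_means, centered, centered_np) are only illustrative and are not from the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [4, 5, 4],
    'B': [7, 8, 9],
    'C': [1, 3, 5],
    'D': [5, 3, 6],
})

# One mean per row; the result is a Series indexed like the rows of df.
row_means = df.mean(axis=1)

# axis=0 aligns that Series with the row index, so each row loses its own mean.
centered = df.sub(row_means, axis=0)

# Same computation on the underlying array: keepdims=True keeps the means
# as a column vector of shape (3, 1), which broadcasts across the columns.
values = df.to_numpy()
centered_np = values - values.mean(axis=1, keepdims=True)

print(np.allclose(centered.to_numpy(), centered_np))
```

Both forms operate on whole arrays at once instead of assigning cell by cell, which is likely where most of the time saving over the nested loops comes from.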

Your code should be changed to read the file into a DataFrame with read_excel and to write the result to a new Excel file with DataFrame.to_excel:

df = pd.read_excel('Sample.xlsx')
df1 = df.sub(df.mean(axis=1), axis=0)
df1.to_excel('Destination.xlsx', index=False)
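
One caveat, offered as an assumption rather than something stated in the thread: the question's loop treats every row of the sheet as data and writes the result with header=False, whereas read_excel by default takes the first row as column names. If the sheet really has no header row, a sketch along these lines may match the original behaviour more closely. The function names center_rows and center_excel_rows are hypothetical helpers, not part of pandas.

```python
import pandas as pd


def center_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Subtract each row's mean from every value in that row."""
    return df.sub(df.mean(axis=1), axis=0)


def center_excel_rows(src: str, dst: str) -> None:
    """Read src, subtract row means, and write the result to dst."""
    # header=None: treat the first row as data (assumes the sheet has no header row).
    df = pd.read_excel(src, header=None)
    # index=False, header=False: write only the values, as the question's code does.
    center_rows(df).to_excel(dst, index=False, header=False)
```

Usage would then be a single call such as center_excel_rows('Sample.xlsx', 'Destination.xlsx'), assuming an Excel engine such as openpyxl is installed.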
