Run a scipy.stats ANOVA test on all unique values in a DataFrame column

I have a DataFrame consisting of many cities and their corresponding temperatures:

               CurrentThermostatTemp
City                                
Cradley Heath                   20.0
Cradley Heath                   20.0
Cradley Heath                   18.0
Cradley Heath                   15.0
Cradley Heath                   19.0
...                              ...
Walsall                         16.0
Walsall                         22.0
Walsall                         20.0
Walsall                         20.0
Walsall                         20.0
[6249 rows x 1 columns]

The unique values are:

Index(['Cradley Heath', 'ROWLEY REGIS', 'Smethwick', 'Oldbury',
       'West Bromwich', 'Bradford', 'Bournemouth', 'Poole', 'Wareham',
       'Wimborne',
       ...
       'St. Helens', 'Altrincham', 'Runcorn', 'Widnes', 'St Helens',
       'Wakefield', 'Castleford', 'Pontefract', 'Walsall', 'Wednesbury'],
      dtype='object', name='City', length=137)

My goal is to run a one-way ANOVA test, i.e.

from scipy.stats import f_oneway

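on all unique values in the DataFrame. In other words, something like

scipy.stats.f_oneway("all unique values")

and receive the output: "One-way ANOVA test for all variables gives {} with a p-value of {}".

This is what I have tried many times, without success: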

all = Tempvs.index.unique()
Tempvs.sort_index(inplace=True)
for n in range(len(all)):
    truncated = Tempvs.truncate(all[n], all[n])
    print(f_oneway(truncated))

IIUC, you want an ANOVA test in which each sample contains the Temp values for one unique City. If that is the case, you can do

import numpy as np
import pandas as pd
import scipy.stats as sps

# I create a sample dataset
index = ['Cradley Heath', 'ROWLEY REGIS',
         'Smethwick', 'Oldbury',
         'West Bromwich', 'Bradford',
         'Bournemouth', 'Poole', 'Wareham',
         'Wimborne', 'St. Helens', 'Altrincham',
         'Runcorn', 'Widnes', 'St Helens',
         'Wakefield', 'Castleford', 'Pontefract',
         'Walsall', 'Wednesbury']
np.random.seed(1)
df = pd.DataFrame({
    'City': np.random.choice(index, 500),
    'Temp': np.random.uniform(15, 25, 500)
})

# populate a list with the Temp values
# of each unique City
values = []
for city in df.City.unique():
    _df = df[df.City == city]
    values.append(_df.Temp.values)

# compute the ANOVA, passing the
# starred *list as arguments
sps.f_oneway(*values)

which, in this case, gives

F_onewayResult(statistic=0.4513685152123563, pvalue=0.9788508507035195)
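The same idea can be applied to a frame shaped like the one in the question, where City is the index rather than a column. The following is only a minimal sketch: the three-city `Tempvs` frame is made-up stand-in data, and the wording of the printed message is an assumption based on the output format asked for. It groups on the index level, collects one array per city, and unpacks the list into `f_oneway`.

```python
import numpy as np
import pandas as pd
from scipy.stats import f_oneway

# Hypothetical stand-in for the question's frame: City is the index and
# CurrentThermostatTemp is the only column (values are random, not real data).
rng = np.random.default_rng(0)
cities = ['Cradley Heath', 'Smethwick', 'Walsall']
Tempvs = pd.DataFrame(
    {'CurrentThermostatTemp': rng.uniform(15, 25, 60)},
    index=pd.Index(rng.choice(cities, 60), name='City'),
)

# Build one array of temperatures per city by grouping on the index level.
# Missing readings are dropped so they do not turn the result into NaN.
samples = [
    group.dropna().to_numpy()
    for _, group in Tempvs.groupby(level='City')['CurrentThermostatTemp']
]

# f_oneway expects each sample as a separate positional argument.
result = f_oneway(*samples)
print(f"One-way ANOVA test for all cities gives F = {result.statistic:.4f} "
      f"with a p-value of {result.pvalue:.4f}")
```

Grouping with `groupby` avoids filtering the frame once per city, which may matter with 137 cities and several thousand rows.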

PS: don't use all as a variable name, because it is a built-in Python function; see https://docs.python.org/3/library/functions.html#all
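To illustrate why shadowing matters, here is a small sketch. The function names `shadowed_call` and `safe_call` and the name `unique_cities` are made up for this example. Once `all` is rebound to a list, calling it as a function fails.

```python
def shadowed_call(values):
    # The local name `all` now refers to a list, hiding the built-in.
    all = ['Cradley Heath', 'Walsall']
    try:
        return all(values)
    except TypeError as exc:
        # A list is not callable, so the call above raises TypeError.
        return f"TypeError: {exc}"


def safe_call(values):
    # A descriptive name leaves the built-in all() untouched.
    unique_cities = ['Cradley Heath', 'Walsall']
    return all(values)


print(shadowed_call([True, True]))
print(safe_call([True, True]))
```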
