This is how I want to load ALOI. With pandas in Python this is easy when the data looks like this:
outlier,att1,att2,att3,att4,att5,att6,att7,att8,att9,att10,att11,att12,att13,att14,att15,att16,att17,att18,att19,att20,att21,att22,att23,att24,att25,att26,att27,id
'yes',0.8728117766203703,4.521122685185185E-6,0.0,3.616898148148148E-5,0.0,0.0,0.0,0.0,0.0,0.05032687717013889,4.521122685185185E-6,0.0,0.005631058304398148,0.004163953993055556,0.0,2.2605613425925925E-6,2.0345052083333332E-5,0.0,0.01421214916087963,1.0398582175925926E-4,0.0,0.025490089699074073,0.004937065972222222,1.1302806712962962E-5,5.425347222222222E-5,0.006804289641203704,0.015385380497685185,1.0
'yes',0.9752061631944444,0.0,0.0,6.510416666666666E-4,0.0,0.0,0.0,0.0,0.0,0.007039388020833333,0.0,0.0,0.009996202256944444,4.7019675925925923E-4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004853425202546296,0.001582392939814815,0.0,0.0,2.0118995949074074E-4,0.0,2.0
'yes',0.9637767650462963,0.0,0.0,0.00200511791087963,0.0,0.0,0.0,0.0,0.0,0.006641529224537037,2.2605613425925925E-6,0.0,0.012351707175925927,6.7138671875E-4,0.0,0.0,6.781684027777777E-6,0.0,0.0,0.0,0.0,0.007828323929398149,0.0025227864583333335,0.0,3.933376736111111E-4,0.003800003616898148,0.0,3.0
I tried to load it like this:
data = pd.read_csv('ALOI.arff')
I want to load it in exactly the same way, but my ALOI.arff file is not in that format. This is how the data appears in the file:
@RELATION 'ALOI'
@ATTRIBUTE 'outlier' {'yes','no'}
@ATTRIBUTE 'att1' real
@ATTRIBUTE 'att2' real
@ATTRIBUTE 'att3' real
@ATTRIBUTE 'att4' real
@ATTRIBUTE 'att5' real
@ATTRIBUTE 'att6' real
@ATTRIBUTE 'att7' real
@ATTRIBUTE 'att8' real
@ATTRIBUTE 'att9' real
@ATTRIBUTE 'att10' real
@ATTRIBUTE 'att11' real
@ATTRIBUTE 'att12' real
@ATTRIBUTE 'att13' real
@ATTRIBUTE 'att14' real
@ATTRIBUTE 'att15' real
@ATTRIBUTE 'att16' real
@ATTRIBUTE 'att17' real
@ATTRIBUTE 'att18' real
@ATTRIBUTE 'att19' real
@ATTRIBUTE 'att20' real
@ATTRIBUTE 'att21' real
@ATTRIBUTE 'att22' real
@ATTRIBUTE 'att23' real
@ATTRIBUTE 'att24' real
@ATTRIBUTE 'att25' real
@ATTRIBUTE 'att26' real
@ATTRIBUTE 'att27' real
@ATTRIBUTE 'id' real
@DATA
'yes',0.8728117766203703,4.521122685185185E-6,0.0,3.616898148148148E-5,0.0,0.0,0.0,0.0,0.0,0.05032687717013889,4.521122685185185E-6,0.0,0.005631058304398148,0.004163953993055556,0.0,2.2605613425925925E-6,2.0345052083333332E-5,0.0,0.01421214916087963,1.0398582175925926E-4,0.0,0.025490089699074073,0.004937065972222222,1.1302806712962962E-5,5.425347222222222E-5,0.006804289641203704,0.015385380497685185,1.0
'yes',0.9752061631944444,0.0,0.0,6.510416666666666E-4,0.0,0.0,0.0,0.0,0.0,0.007039388020833333,0.0,0.0,0.009996202256944444,4.7019675925925923E-4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004853425202546296,0.001582392939814815,0.0,0.0,2.0118995949074074E-4,0.0,2.0
'yes',0.9637767650462963,0.0,0.0,0.00200511791087963,0.0,0.0,0.0,0.0,0.0,0.006641529224537037,2.2605613425925925E-6,0.0,0.012351707175925927,6.7138671875E-4,0.0,0.0,6.781684027777777E-6,0.0,0.0,0.0,0.0,0.007828323929398149,0.0025227864583333335,0.0,3.933376736111111E-4,0.003800003616898148,0.0,3.0
'yes',0.9732462565104166,0.0,0.0,5.560980902777778E-4,0.0,0.0,0.0,0.0,0.0,0.008978949652777778,2.2605613425925925E-6,0.0,0.012433087384259259,2.147533275462963E-4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004392270688657407,1.6954210069444444E-4,0.0,0.0,6.781684027777777E-6,0.0,4.0
'yes',0.9607204861111112,0.0,0.0,6.555627893518518E-4,0.0,0.0,0.0,0.0,0.0,0.013319227430555556,0.0,0.0,0.01389114945023148,2.0571108217592592E-4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010299117476851851,5.60619212962963E-4,0.0,8.364076967592593E-5,2.644856770833333E-4,0.0,5.0
How can I do this in Python? I am a beginner. Can you spot any problem? I tried to load it, but it did not work and I got this error:
File "D:projectsworkmachineLearningmain.py", line 8, in <module>
df = pd.DataFrame(data['data'], columns=[attr[0] for attr in data['attributes']])
~~~~^^^^^^^^
TypeError: 'generator' object is not subscriptable
My code:
import arff
import pandas as pd

with open('ALOI.arff') as f:
    data = arff.load(f)
data = pd.DataFrame(data['data'], columns=[attr[0] for attr in data['attributes']])
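A note on the traceback: the `TypeError` suggests that `arff.load` returned a generator, while the `data['data']` / `data['attributes']` access pattern expects a dict. More than one package installs a module named `arff`, so the installed one may differ from the one this code was written for. That is an assumption, since the question does not say which package is installed. Separately, the part of the file after `@DATA` is ordinary CSV, so a dependency-free sketch can collect the column names from the `@ATTRIBUTE` lines and parse the rest with the standard `csv` module. The function name `read_arff` is hypothetical, the sample is trimmed to three of the columns from the question, and the sketch assumes a dense ARFF file whose attribute names contain no spaces.

```python
import csv


def _convert(value):
    """Turn numeric fields into floats; leave nominal values such as 'yes' as strings."""
    try:
        return float(value)
    except ValueError:
        return value


def read_arff(text):
    """Return (column_names, rows) parsed from the text of a simple dense ARFF file."""
    columns, rows = [], []
    lines = iter(text.splitlines())

    # Header: collect attribute names until the @DATA marker is reached.
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith('%'):
            continue  # skip blank lines and ARFF comments
        keyword = stripped.split(None, 1)[0].lower()
        if keyword == '@attribute':
            # The second token is the attribute name, possibly quoted.
            columns.append(stripped.split(None, 2)[1].strip("'\""))
        elif keyword == '@data':
            break

    # Body: the remaining lines are CSV with single-quoted nominal values.
    for record in csv.reader(lines, quotechar="'"):
        if not record or record[0].startswith('%'):
            continue
        rows.append([_convert(value) for value in record])
    return columns, rows


# Trimmed sample: only 'outlier', 'att1' and 'id' from the file in the question.
sample = """@RELATION 'ALOI'
@ATTRIBUTE 'outlier' {'yes','no'}
@ATTRIBUTE 'att1' real
@ATTRIBUTE 'id' real
@DATA
'yes',0.8728117766203703,1.0
'yes',0.9752061631944444,2.0
"""

columns, rows = read_arff(sample)
print(columns)
print(rows[0])
```

For the real file, the text could be read with `open('ALOI.arff').read()` and the result passed to `pd.DataFrame(rows, columns=columns)`. Missing values written as `?` would stay as strings in this sketch.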
Use scipy.io.arff.loadarff to read the ARFF file:
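import pandas as pd
from scipy.io import arff

# loadarff returns the records as a NumPy structured array plus the header metadata
data, meta = arff.loadarff('ALOI.arff')
df = pd.DataFrame(data)
print(df)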
outlier att1 att2 att3 att4 att5 att6 att7 att8 att9
0 b'yes' 0.872812 0.000005 0.0 0.000036 0.0 0.0 0.0 0.0 0.0
1 b'yes' 0.975206 0.000000 0.0 0.000651 0.0 0.0 0.0 0.0 0.0
2 b'yes' 0.963777 0.000000 0.0 0.002005 0.0 0.0 0.0 0.0 0.0
3 b'yes' 0.973246 0.000000 0.0 0.000556 0.0 0.0 0.0 0.0 0.0
4 b'yes' 0.960720 0.000000 0.0 0.000656 0.0 0.0 0.0 0.0 0.0
att10 att11 att12 att13 att14 att15 att16 att17
0 0.050327 0.000005 0.0 0.005631 0.004164 0.0 0.000002 0.000020
1 0.007039 0.000000 0.0 0.009996 0.000470 0.0 0.000000 0.000000
2 0.006642 0.000002 0.0 0.012352 0.000671 0.0 0.000000 0.000007
3 0.008979 0.000002 0.0 0.012433 0.000215 0.0 0.000000 0.000000
4 0.013319 0.000000 0.0 0.013891 0.000206 0.0 0.000000 0.000000
att18 att19 att20 att21 att22 att23 att24 att25
0 0.0 0.014212 0.000104 0.0 0.025490 0.004937 0.000011 0.000054
1 0.0 0.000000 0.000000 0.0 0.004853 0.001582 0.000000 0.000000
2 0.0 0.000000 0.000000 0.0 0.007828 0.002523 0.000000 0.000393
3 0.0 0.000000 0.000000 0.0 0.004392 0.000170 0.000000 0.000000
4 0.0 0.000000 0.000000 0.0 0.010299 0.000561 0.000000 0.000084
att26 att27 id
0 0.006804 0.015385 1.0
1 0.000201 0.000000 2.0
2 0.003800 0.000000 3.0
3 0.000007 0.000000 4.0
4 0.000264 0.000000 5.0
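Note that the `outlier` column in the output above holds bytes (`b'yes'`) rather than strings, so a comparison such as `df['outlier'] == 'yes'` would not match. A minimal sketch of one way to clean this up follows. The small frame is a stand-in for `pd.DataFrame(data)` built from two rows of the question, not the real file, and casting `id` to int is an optional extra that assumes the ids are always whole numbers.

```python
import pandas as pd

# Stand-in for pd.DataFrame(data) from loadarff: nominal values arrive as bytes.
df = pd.DataFrame({
    'outlier': [b'yes', b'yes'],
    'att1': [0.8728117766203703, 0.9752061631944444],
    'id': [1.0, 2.0],
})

# Decode the bytes into ordinary Python strings.
df['outlier'] = df['outlier'].str.decode('utf-8')

# Optional: the ids are whole numbers stored as floats, so cast them to int.
df['id'] = df['id'].astype(int)

print(df)
```

After this step the frame should match what `pd.read_csv` would give for the CSV version of the data, apart from the `id` column being integer.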