Reading MATLAB data structures into NumPy arrays



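I have a set of .mat files that contain MATLAB structs. Each struct holds many arrays. I want to open the files and convert all of them to NumPy arrays. So far I have written the following code: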

>>> import h5py
>>> fs = h5py.File('statistics_VAD.mat','r')
>>> list(fs.keys())
['#refs#', 'data']
>>> 
>>> fs['data'].visititems(lambda n,o:print(n, o))
C <HDF5 dataset "C": shape (100, 1), type "|O">
P <HDF5 dataset "P": shape (100, 1), type "|O">
V <HDF5 dataset "V": shape (100, 1), type "|O">
Wn <HDF5 dataset "Wn": shape (100, 1), type "|O">
X <HDF5 dataset "X": shape (100, 1), type "|O">
a <HDF5 dataset "a": shape (100, 1), type "|O">
dn <HDF5 dataset "dn": shape (100, 1), type "|O">
>>> struArray = fs['data']
>>> print(struArray['P'])
<HDF5 dataset "P": shape (100, 1), type "|O">

I don't know how to convert HDF5 dataset "P" into a NumPy array. Any suggestions would be appreciated.

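The code below is the example mentioned in my comment (dated 2021-03-01). It creates 2 datasets from NumPy arrays, then creates a dataset holding 2 object references, one for each of those datasets. It then shows how to use the object references to access the data. For completeness, it also creates a second dataset that contains region references.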

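Note how h5f[] is used twice: the inner call reads the object reference from the reference dataset, and the outer call uses that reference to get the referenced object, which can then be sliced to get the data. This is a subtle point that often confuses new users of references.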

import numpy as np
import h5py

with h5py.File('SO_66410592.h5', 'w') as h5f:
    # Create 2 datasets using numpy arrays
    arr = np.arange(100).reshape(20, 5)
    h5f.create_dataset('array1', data=arr)
    arr = np.arange(100, 0, -1).reshape(20, 5)
    h5f.create_dataset('array2', data=arr)

    # Create a dataset of OBJECT references:
    h5f.create_dataset('O_refs', (10,), dtype=h5py.ref_dtype)
    h5f['O_refs'][0] = h5f['array1'].ref
    print(h5f['O_refs'][0])                # the reference itself
    print(h5f[h5f['O_refs'][0]])           # the dataset it points to
    print(h5f[h5f['O_refs'][0]][0, :])     # first row of that dataset
    h5f['O_refs'][1] = h5f['array2'].ref
    print(h5f['O_refs'][1])
    print(h5f[h5f['O_refs'][1]])
    print(h5f[h5f['O_refs'][1]][-1, :])    # last row of that dataset

    # Create a dataset of REGION references:
    h5f.create_dataset('R_refs', (10,), dtype=h5py.regionref_dtype)
    h5f['R_refs'][0] = h5f['array1'].regionref[0, :]
    print(h5f['R_refs'][0])                # the region reference itself
    print(h5f[h5f['R_refs'][0]])           # the dataset it points to
    print(h5f[h5f['R_refs'][0]][h5f['R_refs'][0]])   # the data in the region
    h5f['R_refs'][1] = h5f['array2'].regionref[-1, :]
    print(h5f['R_refs'][1])
    print(h5f[h5f['R_refs'][1]])
    print(h5f[h5f['R_refs'][1]][h5f['R_refs'][1]])
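Applied to the question's file, the same double-indexing pattern could be wrapped in a small helper. The sketch below is only an illustration, not a tested solution for statistics_VAD.mat: deref_dataset and make_demo_file are hypothetical names, and the demo file is a synthetic stand-in that assumes the MATLAB layout shown in the question (a 'data' group whose members are (N, 1) datasets of object references pointing into '#refs#').

```python
import os
import tempfile

import h5py
import numpy as np


def deref_dataset(h5f, ref_ds):
    """Return a list with one NumPy array per object reference in ref_ds."""
    arrays = []
    # ref_ds[()] reads every reference at once; .flat walks them in
    # row-major order regardless of the dataset's shape, e.g. (N, 1).
    for ref in ref_ds[()].flat:
        if not ref:                       # null (unset) reference
            arrays.append(None)
        else:
            arrays.append(h5f[ref][()])   # h5f[ref] -> dataset, [()] -> ndarray
    return arrays


def make_demo_file(path, n=3):
    """Write a small synthetic file (assumed layout, not real MATLAB output)."""
    with h5py.File(path, 'w') as h5f:
        refs_grp = h5f.create_group('#refs#')
        data_grp = h5f.create_group('data')
        ref_ds = data_grp.create_dataset('P', (n, 1), dtype=h5py.ref_dtype)
        for i in range(n):
            target = refs_grp.create_dataset(
                'p%d' % i, data=np.full((2, 2), i, dtype='f8'))
            ref_ds[i, 0] = target.ref


demo_path = os.path.join(tempfile.mkdtemp(), 'demo_refs.h5')
make_demo_file(demo_path)
with h5py.File(demo_path, 'r') as h5f:
    p_arrays = deref_dataset(h5f, h5f['data']['P'])
print(len(p_arrays), p_arrays[0].shape)
```

For the real file, the equivalent call would presumably be deref_dataset(fs, fs['data']['P']), repeated for each name under fs['data']. Arrays saved by MATLAB in v7.3 format usually come back with their dimensions reversed, so a transpose (.T) may be needed to match the MATLAB orientation.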
