Nested MATLAB structs are hard to parse when fetched in DataJoint Python



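From both DataJoint Python and DataJoint MATLAB, I inserted the same value into a longblob attribute. From DataJoint Python it was inserted as a dictionary; from DataJoint MATLAB, as a struct. The entry inserted with DataJoint MATLAB comes back as a recarray when fetched in Python, which is expected. However, because the values are nested, this recarray is hard to parse.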

Inserted with DataJoint Python, fetched with DataJoint Python:

{'cat_gt': {'use_cat_gt': 1,
            'cat_gt_params': {'apfilter': ['biquad', 2, 300, 0],
                              'gfix': [0.4, 0.1, 0.02],
                              'extras': ['prb_fld', 't_miss_ok', 'ap', 'gblcar', 'out_prb_fld']}},
 'process_cluster': 'tiger',
 'clustering_method': 'Kilosort2'}

Inserted with DataJoint MATLAB, fetched with DataJoint Python:

rec.array([[(rec.array([[(array([[1.]]), rec.array([[(MatCell([['biquad'],
[2.0],
[300.0],
[0.0]], dtype=object), array([[0.4 ],
[0.1 ],
[0.02]]), MatCell([['prb_fld'],
['t_miss_ok'],
['ap'],
['gblcar'],
['out_prb_fld']], dtype='<U11'))     ]],
dtype=[('apfilter', 'O'), ('gfix', 'O'), ('extras', 'O')]))]],
dtype=[('use_cat_gt', 'O'), ('cat_gt_params', 'O')]), array(['tiger'], dtype='<U5'), array(['Kilosort2'], dtype='<U9'))]],
dtype=[('cat_gt', 'O'), ('process_cluster', 'O'), ('clustering_method', 'O')])

Using query.fetch(as_dict=True) does not seem to solve the problem:

[{'preprocess_paramset': rec.array([[(rec.array([[(array([[1.]]), rec.array([[(MatCell([['biquad'],
[2.0],
[300.0],
[0.0]], dtype=object), array([[0.4 ], ...

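I could write a recursive function to convert a recarray into a dictionary, but I was wondering whether DataJoint has a native way to fetch this entry and convert it to a dictionary?

Thanks!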

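This is expected behavior. MATLAB structs are not equivalent to lists of dictionaries in Python. They are more like numpy.recarray. The fetch flag as_dict applies to the structure of the fetch result, not to the inside of the blob.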
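To make the recarray comparison concrete, here is a small sketch that builds a stand-in for a fetched blob using NumPy alone. The two-field layout is a cut-down, hypothetical version of the value in the question, not what DataJoint produces byte for byte; it only illustrates why every nesting level needs its own index.

```python
import numpy as np

# Hypothetical stand-in for a fetched blob: a 1x1 struct array whose
# fields hold further arrays, as MATLAB structs do after a round trip.
inner = np.empty((1, 1), dtype=[("use_cat_gt", "O")])
inner["use_cat_gt"][0, 0] = np.array([[1.0]])

blob = np.empty((1, 1), dtype=[("cat_gt", "O"), ("process_cluster", "O")])
blob["cat_gt"][0, 0] = inner
blob["process_cluster"][0, 0] = np.array(["tiger"])
blob = blob.view(np.recarray)

# Each level needs its own [0, 0] to step out of the 1x1 wrapper:
# struct array -> field -> nested struct array -> field -> 1x1 matrix.
use_cat_gt = blob.cat_gt[0, 0]["use_cat_gt"][0, 0][0, 0]
process_cluster = blob.process_cluster[0, 0][0]
print(use_cat_gt, process_cluster)
```

Fields are reachable by attribute on the recarray, but the 1x1 wrappers stay in place at every depth, and as_dict does nothing to remove them.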

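You can write a function to convert the nested recarray into a dictionary. It is hard to make this work in general, because MATLAB struct arrays and cell arrays do not map easily onto native Python types.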
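As a rough starting point, such a function might look like the sketch below. The name blob_to_python is made up for illustration and is not part of DataJoint. It assumes that structs arrive as NumPy structured arrays, that cell arrays arrive as object-dtype arrays, and that collapsing 1x1 wrappers is acceptable for your data.

```python
import numpy as np


def blob_to_python(value):
    """Recursively turn a MATLAB-style struct array into dicts and lists.

    Hypothetical helper, not a DataJoint API.
    """
    # A single struct element (np.void / np.record with named fields).
    if isinstance(value, np.void) and value.dtype.names:
        return {name: blob_to_python(value[name]) for name in value.dtype.names}
    if isinstance(value, np.ndarray):
        if value.dtype.names:
            # Struct array: 1x1 becomes a dict, anything else a list of dicts.
            records = [blob_to_python(rec) for rec in value.ravel()]
            return records[0] if len(records) == 1 else records
        if value.dtype.kind == "O":
            # Cell array: always a flat list, elements converted in turn.
            return [blob_to_python(item) for item in value.ravel()]
        # Numeric or string array: drop singleton axes, use plain Python types.
        return np.squeeze(value).tolist()
    if isinstance(value, np.generic):
        return value.item()
    return value
```

Something like blob_to_python(query.fetch1('preprocess_paramset')) would then give nested dicts and lists. The mapping loses information: MATLAB doubles come back as floats (1.0 rather than 1), cell arrays are flattened to one dimension, and a 1x1 struct array cannot be told apart from a scalar struct. That is the "hard to make general" part.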

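I'm not sure whether what DataJoint returns resembles what scipy's MATLAB loading functions return, but here is some code, borrowed from this post, that should at least get you started on cleaning up MATLAB structs/recarrays. It would be nice to have a clean round trip from DataJoint (since the model definition is the same in both languages, it seems like what comes back from the storage format should be the same too, but I haven't looked at the internals), but in the meantime…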

import numpy as np
# mat_struct is what scipy.io.loadmat yields with struct_as_record=False;
# older SciPy versions expose it under scipy.io.matlab.mio5_params instead.
from scipy.io.matlab import mat_struct


def clean_recarray(data: np.recarray) -> dict:
    '''
    Clean up a recarray into python lists, dictionaries, and
    numpy arrays rather than the sort-of hard to work with numpy record arrays.

    Credit to https://stackoverflow.com/a/29126361/13113166
    Args:
        data (:class:`numpy.recarray`): Array to be cleaned!
    Returns:
        dict
    '''
    def _check_keys(d):
        '''
        checks if entries in dictionary are mat-objects. If yes
        todict is called to change them to nested dictionaries
        '''
        for key in d:
            if isinstance(d[key], mat_struct):
                d[key] = _todict(d[key])
            elif _has_struct(d[key]):
                d[key] = _tolist(d[key])
        return d

    def _has_struct(elem):
        """Determine if elem is an array and if any array item is a struct"""
        return isinstance(elem, np.ndarray) and any(isinstance(
            e, mat_struct) for e in elem)

    def _todict(matobj):
        '''
        A recursive function which constructs from matobjects nested dictionaries
        '''
        d = {}
        for strg in matobj._fieldnames:
            elem = matobj.__dict__[strg]
            if isinstance(elem, mat_struct):
                d[strg] = _todict(elem)
            elif _has_struct(elem):
                d[strg] = _tolist(elem)
            else:
                d[strg] = elem
        return d

    def _tolist(ndarray):
        '''
        A recursive function which constructs lists from cellarrays
        (which are loaded as numpy ndarrays), recursing into the elements
        if they contain matobjects.
        '''
        elem_list = []
        for sub_elem in ndarray:
            if isinstance(sub_elem, mat_struct):
                elem_list.append(_todict(sub_elem))
            elif _has_struct(sub_elem):
                elem_list.append(_tolist(sub_elem))
            else:
                elem_list.append(sub_elem)
        return elem_list

    return _check_keys(data)
