Pandas - How do I convert a multi-dimensional (3D) array into a readable DataFrame?



I get the data in this format, but I cannot call df.head() on it.

data = pd.read_pickle(r'C:\Users\xxxxx\Documents\GitHub\555967_1012435_bundle_archive\data.txt')
data
[array([[[108, 108,  74],
[113, 113,  79],
[108, 108,  74],
...,
[176, 179, 188],
[175, 178, 187],
[172, 175, 184]],

[[104, 104,  70],
[111, 111,  77],
[107, 107,  73],
...,
[176, 179, 188],
[175, 178, 187],
[172, 175, 184]],

[[ 91,  91,  57],
[104, 104,  70],
[103, 103,  69],
df = pd.json_normalize(data, ['x', 'y', 'z'])
df

Error message

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-4-5354fc045452> in <module>
----> 1 df = pd.json_normalize(data, ['x', 'y', 'z'])
2 df
c:\users\xxxxx\appdata\local\programs\python\python37\lib\site-packages\pandas\io\json\_normalize.py in _json_normalize(data, record_path, meta, meta_prefix, record_prefix, errors, sep, max_level)
339                 records.extend(recs)
340 
--> 341     _recursive_extract(data, record_path, {}, level=0)
342 
343     result = DataFrame(records)
c:\users\xxxxx\appdata\local\programs\python\python37\lib\site-packages\pandas\io\json\_normalize.py in _recursive_extract(data, path, seen_meta, level)
308                         seen_meta[key] = _pull_field(obj, val[-1])
309 
--> 310                 _recursive_extract(obj[path[0]], path[1:], seen_meta, level=level + 1)
311         else:
312             for obj in data:
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

I am using this dataset: https://www.kaggle.com/imrul273/realdataset40k

The problem here is that you are calling pd.read_pickle() on data that was not saved as a "pickled" Python object. In this case you most likely want pd.read_csv().

Just make sure you set the delimiter parameter correctly.
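
As a rough illustration, here is a minimal sketch of two routes. The first reads delimited text with an explicit separator; the sample text and the ";" separator are invented for the example, so check the real file for its actual delimiter. The second applies if the file does unpickle to a list of 3D arrays, as the output in the question suggests: each (height, width, channels) array is flattened to one row per pixel. The helper image_to_frame and the column names row, col, c0, c1, c2 are hypothetical names chosen here, not part of pandas or of the dataset.

```python
import io

import numpy as np
import pandas as pd

# Route 1: delimited text. The text and the ";" separator are made up for
# illustration; `sep` is the usual spelling and `delimiter` is an alias for it.
csv_text = "r;g;b\n108;108;74\n113;113;79\n"
df_csv = pd.read_csv(io.StringIO(csv_text), sep=";")


# Route 2: a list of 3D arrays, assumed to be (height, width, channels).
def image_to_frame(img: np.ndarray) -> pd.DataFrame:
    """Flatten one 3D array into a DataFrame with one row per pixel."""
    height, width, channels = img.shape
    # Row and column position of every pixel, in the same order as reshape().
    rows, cols = np.indices((height, width))
    frame = pd.DataFrame(
        img.reshape(-1, channels),
        columns=[f"c{i}" for i in range(channels)],
    )
    frame.insert(0, "col", cols.ravel())
    frame.insert(0, "row", rows.ravel())
    return frame


# Tiny stand-in for the loaded list, using values from the question's output.
data = [
    np.array([[[108, 108, 74], [113, 113, 79]],
              [[104, 104, 70], [111, 111, 77]]]),
    np.array([[[176, 179, 188], [175, 178, 187]],
              [[172, 175, 184], [91, 91, 57]]]),
]

# One frame per array, stacked with an outer index level for the array number.
df = pd.concat(
    [image_to_frame(img) for img in data],
    keys=range(len(data)),
    names=["image", "pixel"],
)
print(df.head())
```

After this, df.head() works as on any other DataFrame. Note that one row per pixel grows quickly for full-size images, so it may be worth converting only a few arrays at first.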
