I got the data in this format, but I can't run df.head() on it.
data = pd.read_pickle(r'C:\Users\xxxxx\Documents\GitHub\555967_1012435_bundle_archive\data.txt')
data
[array([[[108, 108, 74],
[113, 113, 79],
[108, 108, 74],
...,
[176, 179, 188],
[175, 178, 187],
[172, 175, 184]],
[[104, 104, 70],
[111, 111, 77],
[107, 107, 73],
...,
[176, 179, 188],
[175, 178, 187],
[172, 175, 184]],
[[ 91, 91, 57],
[104, 104, 70],
[103, 103, 69],
df = pd.json_normalize(data, ['x', 'y', 'z'])
df
Error message:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-4-5354fc045452> in <module>
----> 1 df = pd.json_normalize(data, ['x', 'y', 'z'])
2 df
c:\users\xxxxx\appdata\local\programs\python\python37\lib\site-packages\pandas\io\json\_normalize.py in _json_normalize(data, record_path, meta, meta_prefix, record_prefix, errors, sep, max_level)
339 records.extend(recs)
340
--> 341 _recursive_extract(data, record_path, {}, level=0)
342
343 result = DataFrame(records)
c:\users\xxxxx\appdata\local\programs\python\python37\lib\site-packages\pandas\io\json\_normalize.py in _recursive_extract(data, path, seen_meta, level)
308 seen_meta[key] = _pull_field(obj, val[-1])
309
--> 310 _recursive_extract(obj[path[0]], path[1:], seen_meta, level=level + 1)
311 else:
312 for obj in data:
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
I am using this dataset: https://www.kaggle.com/imrul273/realdataset40k
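The problem here is that you are calling pd.read_pickle() on data that was not saved as a "pickled" Python object. In that case you most likely want pd.read_csv(); just make sure you set its delimiter parameter correctly.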
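As a rough sketch of what that could look like: the layout of the real file isn't shown in the question, so the sample text, the column names x, y, z and the ";" separator below are all assumptions made up for illustration. Swap in your own path and whatever delimiter the file actually uses.

```python
import io

import pandas as pd

# Hypothetical stand-in for a delimited text file; the real file's
# layout is not shown in the question, so this content is an assumption.
sample_text = "x;y;z\n108;108;74\n113;113;79\n108;108;74\n"

# `delimiter` is an alias for `sep` in pd.read_csv; here fields are split on ";".
# With a real file, pass its path instead of the StringIO object.
df = pd.read_csv(io.StringIO(sample_text), delimiter=";")

print(df.head())
```

If read_csv() raises a parsing error or returns a single wide column, the delimiter guess is probably wrong, so it is worth opening the file in a text editor to see how the fields are separated.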