Pandas and JSON ValueError: arrays must all be same length



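I'm trying to build a simple application that fetches song lyrics and saves them. I'm using lyricsgenius to create a JSON file containing the lyrics of the songs I request, but I don't know how to parse the data in that JSON file. I tried following this tutorial, but I get an error as soon as I start using Pandas.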

Code used to create the JSON file

import lyricsgenius as genius
import os
os.getcwd()
geniusCreds = "qlDFcHWqCRpSfq0pVTctt1ZhDc4wHF6lpP5WGODh4iVQB7yTPn7Hw6SjWAFiCdxa"
artist_name = "Steely Dan"
api = genius.Genius(geniusCreds)
artist = api.search_artist(artist_name, max_songs=3)
artist.save_lyrics()

Code that reads the data from the JSON file

import pandas as pd
import os

Artist = pd.read_json("Lyrics_SteelyDan.json")
df = pd.DataFrame.from_dict(Artist['songs'])
df.head()

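Whenever I run the code above I get the error below. Any help with fixing it, or with a better way to parse the data, would be greatly appreciated. Thanks.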

"c:/Users/Admin/Desktop/Steely Dan/Data.py"
Traceback (most recent call last):
  File "c:/Users/Admin/Desktop/Steely Dan/Data.py", line 5, in <module>
    Artist = pd.read_json("Lyrics_SteelyDan.json")
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 592, in read_json
    result = json_reader.read()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 717, in read
    obj = self._get_object_parser(self.data)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 739, in _get_object_parser
    obj = FrameParser(json, **kwargs).parse()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 849, in parse
    self._parse_no_numpy()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 1093, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\frame.py", line 411, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\construction.py", line 257, in init_dict
    return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\construction.py", line 77, in arrays_to_mgr
    index = extract_index(arrays)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\construction.py", line 368, in extract_index
    raise ValueError("arrays must all be same length")
ValueError: arrays must all be same length

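The original code fails because pd.read_json tries to build a DataFrame from the whole top-level JSON object, and the values in that object do not all have the same length.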
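
To see the failure in isolation, here is a minimal sketch. The dictionary is a made-up stand-in for a JSON object whose values are lists of different lengths; the keys and values are assumptions for illustration, not the real contents of Lyrics_SteelyDan.json. The exact wording of the message varies between pandas versions.

```python
import pandas as pd

# Hypothetical top-level object: two list values with different lengths
data = {"names": ["a", "b"], "songs": ["s1", "s2", "s3"]}

try:
    # pandas treats each key as a column, so the lists must line up
    pd.DataFrame(data)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Each key becomes a column, so pandas has no way to align a two-element column with a three-element one and raises ValueError.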

Try this:

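import json
import pandas as pd

# Load the file with the standard json module instead of pd.read_json
with open('Lyrics_SteelyDan.json') as json_data:
    data = json.load(json_data)

# Build the DataFrame from the list of songs only
df = pd.DataFrame(data['songs'])
df['lyrics']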
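
The same idea can be tried without the real file. This is only a sketch: the JSON string below is a made-up stand-in, and the keys other than 'songs' and 'lyrics' (such as 'name' and 'title') are assumptions, so the file written by lyricsgenius may be structured differently.

```python
import io
import json

import pandas as pd

# Made-up stand-in for the saved file; the real structure may differ
raw = (
    '{"name": "Example Artist", "alternate_names": ["EA"], '
    '"songs": [{"title": "Song A", "lyrics": "la la"}, '
    '{"title": "Song B", "lyrics": "do re mi"}]}'
)

# json.load reads the whole object into a plain dict
data = json.load(io.StringIO(raw))

# Only the list under "songs" is tabular: one dict per song becomes one row
df = pd.DataFrame(data["songs"])
print(df[["title", "lyrics"]])
```

Passing only the list of song dictionaries avoids the length mismatch, because every element of that list describes one row.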

See also: https://hackersandslackers.com/json-into-pandas-dataframes/
