Creating a DataFrame from a JSON file



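I'm trying to load a JSON file into a DataFrame. I know this question has been answered many times, but I have tried every solution I could find without success. Here is what my JSON file looks like: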

{
"event": {
"origin": "devicename",
"module": "",
"interface": "",
"component": "",
"payload": "{\"typeofsensor\" : \"US_distance\",\"distance\": 2}"
}}
{
"event": {
"origin": "devicename",
"module": "",
"interface": "",
"component": "",
"payload": "{\"typeofsensor\" : \"mpu6050\",\"accelX\": 0.06, \"accelY\": 0.50, \"accelZ\": -0.88, \"temp\": 25.45}"
}}

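What I want to do is extract the information in "payload" and build a DataFrame from it, with typeofsensor and the sensor values as columns, since I have several different types of sensors.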


I tried:

import json

data = []
for line in open('data.JSON', 'r'):
    data.append(json.loads(line))

I get this error:

JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 1 (char 2)

I also tried:

df = pd.read_json('data.JSON', lines=True)

I get this error:

ValueError                                Traceback (most recent call last)
C:\Users\DEVELO~1\AppData\Local\Temp/ipykernel_204/911564313.py in <module>
----> 1 df = pd.read_json('data.JSON', lines=True)
~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
205                 else:
206                     kwargs[new_arg_name] = new_arg_value
--> 207             return func(*args, **kwargs)
208 
209         return cast(F, wrapper)
~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
309                     stacklevel=stacklevel,
310                 )
--> 311             return func(*args, **kwargs)
312 
313         return wrapper
~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pandas\io\json\_json.py in read_json(path_or_buf, orient, typ, dtype, convert_axes, convert_dates, keep_default_dates, numpy, precise_float, date_unit, encoding, encoding_errors, lines, chunksize, compression, nrows, storage_options)
612 
613     with json_reader:
--> 614         return json_reader.read()
615 
616 
~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pandas\io\json\_json.py in read(self)
744                 data = ensure_str(self.data)
745                 data_lines = data.split("\n")
--> 746                 obj = self._get_object_parser(self._combine_lines(data_lines))
747         else:
748             obj = self._get_object_parser(self.data)
~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pandas\io\json\_json.py in _get_object_parser(self, json)
768         obj = None
769         if typ == "frame":
--> 770             obj = FrameParser(json, **kwargs).parse()
771 
772         if typ == "series" or obj is None:
~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pandas\io\json\_json.py in parse(self)
883 
884         else:
--> 885             self._parse_no_numpy()
886 
887         if self.obj is None:
~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\pandas\io\json\_json.py in _parse_no_numpy(self)
1138         if orient == "columns":
1139             self.obj = DataFrame(
-> 1140                 loads(json, precise_float=self.precise_float), dtype=None
1141             )
1142         elif orient == "split":
ValueError: Expected object or value

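Change your file to a valid format like the following (remove the braces between the objects and rename the keys so that each one is unique):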

{
"event1": {
"origin": "devicename",
"module": "",
"interface": "",
"component": "",
"payload": "{\"typeofsensor\" : \"US_distance\",\"distance\": 2}"
},
"event2": {
"origin": "devicename",
"module": "",
"interface": "",
"component": "",
"payload": "{\"typeofsensor\" : \"mpu6050\",\"accelX\": 0.06, \"accelY\": 0.50, \"accelZ\": -0.88, \"temp\": 25.45}"
}
}
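
For a long log, doing this restructuring by hand may not be practical. The following is a minimal sketch of automating it with the standard library's json.JSONDecoder.raw_decode, which decodes one object at a time from a string holding several objects back to back. The helper names split_concatenated_json and merge_events are made up for illustration, the sample string is a shortened stand-in for the real file, and the sketch assumes the inner quotes of "payload" are backslash-escaped in the file (otherwise the objects are not valid JSON and cannot be decoded this way).

```python
import json


def split_concatenated_json(text):
    """Decode JSON objects that are written back to back in one string."""
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        objects.append(obj)
    return objects


def merge_events(objects):
    """Combine {"event": ...} objects under unique keys event1, event2, ..."""
    return {f"event{i}": obj["event"] for i, obj in enumerate(objects, start=1)}


# Shortened stand-in for the contents of the original file (assumption)
raw = r'''{
"event": {
"origin": "devicename",
"payload": "{\"typeofsensor\" : \"US_distance\",\"distance\": 2}"
}}
{
"event": {
"origin": "devicename",
"payload": "{\"typeofsensor\" : \"mpu6050\",\"accelX\": 0.06}"
}}
'''

merged = merge_events(split_concatenated_json(raw))
print(json.dumps(merged, indent=2))
```

The merged dict could then be written back to disk with json.dump, or passed directly to pd.DataFrame without going through a file at all.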

The file can then be read easily:

pd.read_json('path to/your_file.json')
#output
                                                   event1                                             event2
origin                                         devicename                                         devicename
module                                                                                                      
interface                                                                                                   
component                                                                                                   
payload    {"typeofsensor" : "US_distance","distance": 2}  {"typeofsensor" : "mpu6050","accelX": 0.06, "a...
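
In this output the payload row still holds JSON text rather than separate values. One possible way to get typeofsensor and the sensor readings as columns, as the question asks, is to decode each payload cell with json.loads and build a second DataFrame from the resulting dicts. This is a sketch under assumptions: the events dict below is an in-memory copy of the restructured file, so the example runs without a file on disk, and the variable names are illustrative.

```python
import json

import pandas as pd

# In-memory copy of the restructured file (assumption: same content as above)
events = {
    "event1": {
        "origin": "devicename", "module": "", "interface": "", "component": "",
        "payload": '{"typeofsensor" : "US_distance","distance": 2}',
    },
    "event2": {
        "origin": "devicename", "module": "", "interface": "", "component": "",
        "payload": '{"typeofsensor" : "mpu6050","accelX": 0.06, "accelY": 0.50, '
                   '"accelZ": -0.88, "temp": 25.45}',
    },
}

# Events as columns and fields as rows, the same layout as the output above
df = pd.DataFrame(events)

# The payload row holds JSON text; decode each cell into a dict
payloads = df.loc["payload"].apply(json.loads)

# One row per event, one column per payload key;
# keys that a given sensor does not report are filled with NaN
sensors = pd.DataFrame(payloads.tolist(), index=payloads.index)
print(sensors)
```

Because the sensor types report different fields, the resulting table is sparse: each row has values only in the columns for its own sensor type.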
