I have a text file named sample.txt that contains data like the following:
0: 480x640 2 persons, 1 tv, 1: 480x640 5 persons, 2 tvs, 1 oven, Done. (0.759s) Mon, 04 April 11:39:48 status : Low
0: 480x640 2 persons, 1 tv, 1: 480x640 4 persons, 3 chairs, 1 oven, Done. (0.763s) Mon, 04 April 11:39:50 status : High
The sample file contains many lines of this kind.
I tried the following code to convert the text file to JSON:
import pandas as pd

# Split on each camera prefix (e.g. "0: 480x640") and on the "Done. (0.759s)" marker
cam_details = pd.read_csv('sample.txt', sep=r'(?:,\s*|^)(?:\d+: \d+x\d+|Done[^)]+\)\s*)',
                          header=None, engine='python', names=(None, 'a', 'b', 'date', 'status')).iloc[:, 1:]
cam_details.to_json('output.json', orient="records", date_format="epoch", double_precision=10,
                    force_ascii=True, date_unit="ms", default_handler=None)
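This code runs, but the JSON it produces is not in the format I want. How can I use a pandas DataFrame separator to convert the text into the JSON format shown below?
This is the output I get: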
{
"a": " 2 persons, 1 tv, 1 laptop, 1 clock",
"b": " 4 persons, 1 car, 1 bottle, 3 chairs, 2 tvs, 1 oven",
"date": "Mon, 04 April 11:39:51 status : Low"
}
I would like the JSON file to look like this instead:
[
{
"a": " 2 persons, 1 tv, 1 laptop, 1 clock",
"b": " 5 persons, 1 bottle, 3 chairs, 2 tvs, 1 cell phone, 1 oven",
"date": "Mon, 04 April 11:39:48" ,
"status": "Low"
},
{
"a": " 2 persons, 1 tv, 1 laptop, 2 clocks",
"b": " 4 persons, 1 car, 3 chairs, 2 tvs, 1 laptop, 1 oven",
"date": "Mon, 04 April 11:39:50",
"status": "Low"
} ]
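The problem seems to be that the status part is not split off by the separator, so it stays attached to the date column. You can fix this by adding a little processing in pandas before writing the JSON: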
# Splits the date part and the status part into two columns (your status is being dragged into the date column)
cam_details[['date', 'status']] = cam_details['date'].map(lambda x: x.split('status')).tolist()
# Clean up the status column which still has the colons and extra whitespaces
cam_details['status'] = cam_details['status'].map(lambda x: x.replace(':', '').strip())
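
For comparison, here is a minimal sketch that does the same split without pandas, using only the standard library. It is not the approach from the answer above. It assumes every line has exactly two camera sections followed by "Done. (...)" and a trailing "status : value", as in the two sample lines. The names SEP, parse_line and convert are made up for this illustration.

```python
import json
import re

# The separator from the question: a camera prefix such as "0: 480x640",
# or the "Done. (0.759s)" marker, each optionally preceded by a comma.
SEP = re.compile(r'(?:,\s*|^)(?:\d+: \d+x\d+|Done[^)]+\)\s*)')

def parse_line(line):
    # Splitting gives: '', camera 0 detections, camera 1 detections, 'date status : value'.
    # Assumption: exactly two cameras per line; anything else raises ValueError here.
    _, a, b, tail = SEP.split(line.strip())
    # The status is still attached to the date, so split it off separately.
    date, status = tail.split('status')
    return {
        'a': a,  # leading space kept, as in the desired output in the question
        'b': b,
        'date': date.strip(),
        'status': status.replace(':', '').strip(),
    }

def convert(lines):
    # Skip blank lines, parse the rest into a list of records.
    return [parse_line(line) for line in lines if line.strip()]

sample = [
    "0: 480x640 2 persons, 1 tv, 1: 480x640 5 persons, 2 tvs, 1 oven, "
    "Done. (0.759s) Mon, 04 April 11:39:48 status : Low",
    "0: 480x640 2 persons, 1 tv, 1: 480x640 4 persons, 3 chairs, 1 oven, "
    "Done. (0.763s) Mon, 04 April 11:39:50 status : High",
]

records = convert(sample)
print(json.dumps(records, indent=2))
```

To process the real file, something like `convert(open('sample.txt'))` followed by `json.dump(records, f, indent=2)` on an open output file should work under the same assumptions.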