I have a Python dictionary in this format:
test_scr = {
"visited_pages" : [ {
"visited_page_id" : {
"$oid" : "57d01dd3f1a475f7307b23d9"
}, "url" : "google.com",
"page_height" : "3986",
"visited_on" : {
"$date" : 1473256915000
}, "visited_page_clicks" : [ {
"x" : "887",
"y" : "35",
"page_height" : "3986",
"created" : {
"$date" : 1473256920000
}
} ],
"total_clicks" : 1,
"total_time_spent_in_minutes" : "0.10",
"total_mouse_moves" : 0
}, {
"visited_page_id" : {
"$oid" : "57d01dddf1a475a6377b23d4"
}, "url" : "google.com",
"page_height" : "3088",
"visited_on" : {
"$date" : 1473256925000
}, "visited_page_clicks" : [ {
"x" : "888",
"y" : "381",
"page_height" : "3088",
"created" : {
"$date" : 1473256934000
}
},{
"x" : "888",
"y" : "381",
"page_height" : "3088",
"created" : {
"$date" : 1473256935000
}
},{
"x" : "875",
"y" : "364",
"page_height" : "3088",
"created" : {
"$date" : 1473256936000
}
},{
"x" : "875",
"y" : "364",
"page_height" : "3088",
"created" : {
"$date" : 1473256936000
}
}, {
"x" : "875",
"y" : "364",
"page_height" : "3088",
"created" : {
"$date" : 1473256937000
}
},{
"x" : "1347",
"y" : "445",
"page_height" : "3088",
"created" : {
"$date" : 1473256942000
}
},{
"x" : "259",
"y" : "798",
"page_height" : "3018",
"created" : {
"$date" : 1473257244000
}
},{
"x" : "400",
"y" : "98",
"page_height" : "3088",
"created" : {
"$date" : 1473257785000
}
}],"total_clicks" : 8,
"total_time_spent_in_minutes" : "14.26",
"total_mouse_moves" : 0
}, {
"visited_page_id" : {
"$oid" : "57d0213ff1a475a6377b23d5"
},"url" : "google.com",
"page_height" : "3088",
"visited_on" : {
"$date" : 1473257791000
},"visited_page_clicks" : [ {
"x" : "805",
"y" : "425",
"page_height" : "3088",
"created" : {
"$date" : 1473257826000
}
}, {
"x" : "523",
"y" : "100",
"page_height" : "3088",
"created" : {
"$date" : 1473257833000
}
} ], "total_clicks" : 2,
"total_time_spent_in_minutes" : "0.47",
"total_mouse_moves" : 0
} ]
}
I only need to extract the X and Y values from the dictionary and store them as a matrix in a data frame. The output should look like this:
X Y
887 35
888 381
888 381
875 364
. .
. .
. .
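How can I do this?
The dictionary in your post is badly formatted, but I wrote a quick little script that loops through it and collects the x and y values.
You can access dictionary values with the dictionary["key"] syntax. It returns the value or object stored under that key.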
# Two lists to store the x and y values in
x = []
y = []
# Get the list stored under the visited_pages key
visited_pages = test_scr["visited_pages"]
# Loop through all the pages
for page in visited_pages:
    page_clicks = page["visited_page_clicks"]
    # Loop through all the clicks for the page
    for click in page_clicks:
        # Add the x and y values to the lists
        x.append(click["x"])
        y.append(click["y"])
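The loop above stops at two plain lists, while the question asks for a data frame. A minimal sketch of that last step, assuming pandas is installed and using a trimmed stand-in for test_scr (the real dictionary has more pages and more keys per click). Converting to int is my own addition, since the source stores the coordinates as strings:

```python
import pandas as pd

# Trimmed stand-in for the test_scr dictionary in the question (assumption:
# only the keys needed here are kept).
test_scr = {
    "visited_pages": [
        {"visited_page_clicks": [{"x": "887", "y": "35"}]},
        {"visited_page_clicks": [{"x": "888", "y": "381"},
                                 {"x": "875", "y": "364"}]},
    ]
}

x = []
y = []
for page in test_scr["visited_pages"]:
    for click in page["visited_page_clicks"]:
        # The coordinates are stored as strings, so convert them to int
        x.append(int(click["x"]))
        y.append(int(click["y"]))

# Build a two-column data frame from the two lists
df = pd.DataFrame({"X": x, "Y": y})
print(df)
```

If you need a plain matrix rather than a data frame, df.to_numpy() returns the same values as a 2-D NumPy array.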
You can do this with a list comprehension:
coords = [[click['x'],click['y']] for page in test_scr['visited_pages'] for click in page['visited_page_clicks']]
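You can then convert the result to a data frame in any of several ways and reshape it into the format you need.
Also, please format your code properly.
Output: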
[['887', '35'],
['888', '381'],
['888', '381'],
['875', '364'],
['875', '364'],
['875', '364'],
['1347', '445'],
['259', '798'],
['400', '98'],
['805', '425'],
['523', '100']]
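One possible way to turn a list of [x, y] pairs like the one above into a data frame is sketched here. It assumes pandas is installed. The column names "X" and "Y" are taken from the question, and the coords literal is just the first four rows of the output, copied in so the snippet runs on its own:

```python
import pandas as pd

# First four pairs from the output above; in practice, use the full coords list
coords = [['887', '35'],
          ['888', '381'],
          ['888', '381'],
          ['875', '364']]

# Each inner list becomes one row; astype(int) converts the string values
df = pd.DataFrame(coords, columns=["X", "Y"]).astype(int)
print(df)
```

The astype(int) call is optional. Without it the columns keep the original string values, which is fine for display but not for arithmetic or plotting.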