How to access nested data inside a Python dictionary



I have a Python dictionary in this format:

test_scr = { 
    "visited_pages" : [ { 
          "visited_page_id" : { 
              "$oid" : "57d01dd3f1a475f7307b23d9" 
          }, "url" : "google.com", 
         "page_height" : "3986", 
         "visited_on" : { 
             "$date" : 1473256915000 
          }, "visited_page_clicks" : [ { 
                "x" : "887", 
                "y" : "35", 
                "page_height" : "3986", 
                "created" : { 
                    "$date" : 1473256920000 
                 } 
            } ], 
         "total_clicks" : 1, 
         "total_time_spent_in_minutes" : "0.10", 
         "total_mouse_moves" : 0 
      }, { 
          "visited_page_id" : { 
              "$oid" : "57d01dddf1a475a6377b23d4" 
          }, "url" : "google.com", 
         "page_height" : "3088", 
         "visited_on" : { 
             "$date" : 1473256925000 
          }, "visited_page_clicks" : [ {
                "x" : "888", 
                "y" : "381", 
                "page_height" : "3088", 
                "created" : { 
                    "$date" : 1473256934000 
                 } 
             },{
                "x" : "888", 
                "y" : "381", 
                "page_height" : "3088",
                "created" : { 
                    "$date" : 1473256935000 
                 } 
             },{ 
                 "x" : "875", 
                 "y" : "364",
                 "page_height" : "3088",
                  "created" : { 
                     "$date" : 1473256936000 
                 } 
             },{ 
                 "x" : "875",
                 "y" : "364",
                 "page_height" : "3088",
                 "created" : { 
                      "$date" : 1473256936000 
                  } 
             }, {
                 "x" : "875", 
                 "y" : "364",
                 "page_height" : "3088",
                 "created" : {
                      "$date" : 1473256937000 
                  } 
             },{ 
                 "x" : "1347",
                 "y" : "445", 
                 "page_height" : "3088", 
                 "created" : { 
                      "$date" : 1473256942000 
                  } 
             },{ 
                  "x" : "259", 
                  "y" : "798", 
                  "page_height" : "3018", 
                  "created" : { 
                       "$date" : 1473257244000 
                  } 
             },{ 
                  "x" : "400", 
                  "y" : "98", 
                  "page_height" : "3088",
                  "created" : { 
                       "$date" : 1473257785000 
                  } 
             }],"total_clicks" : 8, 
                "total_time_spent_in_minutes" : "14.26", 
                "total_mouse_moves" : 0 
         }, { 
            "visited_page_id" : { 
                    "$oid" : "57d0213ff1a475a6377b23d5" 
            },"url" : "google.com",
            "page_height" : "3088",
            "visited_on" : { 
                    "$date" : 1473257791000 
            },"visited_page_clicks" : [ { 
                  "x" : "805", 
                  "y" : "425", 
                  "page_height" : "3088", 
                  "created" : { 
                        "$date" : 1473257826000 
                  } 
              }, {
                  "x" : "523", 
                  "y" : "100", 
                  "page_height" : "3088", 
                  "created" : { 
                        "$date" : 1473257833000 
                  } 
            } ], "total_clicks" : 2, 
            "total_time_spent_in_minutes" : "0.47", 
            "total_mouse_moves" : 0 
        } ]
    }

I only need to extract the X and Y values from the dictionary and store them in a data frame in matrix form. The output should look like this:

X       Y
887     35
888     381
888     381
875     364
.        .
.        .
.        .

How can I do this?

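The dictionary in your post is badly formatted, but I wrote a quick little script that loops through it and collects the x and y values.
You can access dictionary values with the dictionary["key"] syntax. It returns the value or object stored under that key.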

# Two lists to store the x and y values in    
x = []
y = []
# Get the list of visited pages
visited_pages = test_scr["visited_pages"]
# Loop through all the pages
for page in visited_pages:
    page_clicks = page["visited_page_clicks"]
    # Loop through all the clicks for the page
    for click in page_clicks:
        # Add the x and y values to the lists
        x.append(click["x"])
        y.append(click["y"])
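
To check a single value, the same dictionary["key"] lookups can be chained with list indexes. This is a minimal sketch on a trimmed-down copy of the structure; the `sample` name and its contents are illustrative, not the full data from the question:

```python
# Trimmed-down, illustrative copy of the structure from the question.
sample = {
    "visited_pages": [
        {
            "url": "google.com",
            "visited_page_clicks": [
                {"x": "887", "y": "35"},
            ],
        },
        {
            "url": "google.com",
            "visited_page_clicks": [
                {"x": "888", "y": "381"},
                {"x": "875", "y": "364"},
            ],
        },
    ]
}

# dict key -> list index -> dict key -> list index -> dict key
first_x = sample["visited_pages"][0]["visited_page_clicks"][0]["x"]
last_y = sample["visited_pages"][-1]["visited_page_clicks"][-1]["y"]

print(first_x, last_y)  # 887 364
```

Note that the values are stored as strings in this data, so they need converting with int() if numbers are required.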

You can do this with a list comprehension:

coords = [[click['x'],click['y']] for page in test_scr['visited_pages'] for click in page['visited_page_clicks']]

You can convert this to a data frame using various techniques, and reshape it into whatever format you need.

Also, please format your code properly.

Output:

[['887', '35'],
['888', '381'],
['888', '381'],
['875', '364'],
['875', '364'],
['875', '364'],
['1347', '445'],
['259', '798'],
['400', '98'],
['805', '425'],
['523', '100']]
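
One way to get from this list to the data frame the question asks for is to hand it to pandas. This is only a sketch, assuming pandas is installed. The `coords` literal below just repeats the first few rows of the output above, and the column names X and Y are taken from the desired output in the question. Since the values are strings in the source data, `astype(int)` is used here to convert them to integers.

```python
import pandas as pd

# First few rows of the output above; in practice use the full coords list.
coords = [["887", "35"], ["888", "381"], ["888", "381"], ["875", "364"]]

# Build a two-column data frame and convert the string values to integers.
df = pd.DataFrame(coords, columns=["X", "Y"]).astype(int)

print(df)

# If a plain matrix is preferred, to_numpy() returns a 2-D NumPy array.
matrix = df.to_numpy()
```

Drop the `astype(int)` call if the coordinates should stay as strings.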
