From a source-target-weight DataFrame to a JSON file



I have this DataFrame with source, target, and weight columns:

  source target  weight
0      A      B       3
1      A      C       2
2      B      C       0
3      C      D       1
4      D      A       1
5      D      B       1
...

How can I get a JSON file that looks like this:

{
  "nodes": [
    {"id": "A"},
    {"id": "B"},
    {"id": "C"},
    {"id": "D"}
  ],
  "links": [
    {"source": "A", "target": "B", "weight": 3},
    {"source": "A", "target": "C", "weight": 2},
    {"source": "B", "target": "C", "weight": 0},
    {"source": "C", "target": "D", "weight": 1},
    {"source": "D", "target": "A", "weight": 1},
    {"source": "D", "target": "B", "weight": 1}
  ]
}

I can rebuild it with loops and lists, but is there a simpler way?

The nodes can be built from the unique values in the source and target columns (via np.unique), and the links can be built with DataFrame.to_dict:

import numpy as np
import pandas as pd
df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})
data = {
    # np.unique flattens both columns and returns the sorted unique ids
    'nodes': [{'id': v} for v in np.unique(df[['source', 'target']])],
    # one dict per row, keyed by column name
    'links': df.to_dict(orient='records')
}

data:

{
'nodes': [{'id': 'A'}, {'id': 'B'}, {'id': 'C'}, {'id': 'D'}],
'links': [{'source': 'A', 'target': 'B', 'weight': 3},
{'source': 'A', 'target': 'C', 'weight': 2},
{'source': 'B', 'target': 'C', 'weight': 0},
{'source': 'C', 'target': 'D', 'weight': 1},
{'source': 'D', 'target': 'A', 'weight': 1},
{'source': 'D', 'target': 'B', 'weight': 1}]
}
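
Since the question asks for a JSON file rather than a dict, the dict still has to be serialized. A minimal sketch using the standard json module; the output location is a hypothetical choice, and the str(...) and default=int conversions are a precaution in case NumPy scalar types end up in the dict:

```python
import json
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})

data = {
    # str(...) turns NumPy strings into plain Python strings
    'nodes': [{'id': str(v)} for v in np.unique(df[['source', 'target']])],
    'links': df.to_dict(orient='records'),
}

# Hypothetical output location; replace with any path you like.
out_path = Path(tempfile.gettempdir()) / 'graph.json'

# default=int is only a fallback, used if a NumPy integer slips through.
with open(out_path, 'w', encoding='utf-8') as f:
    json.dump(data, f, indent=2, default=int)
```

json.dumps(data, default=int) would return the same content as a string instead of writing it to disk.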

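Depending on your needs, networkx also offers json_graph.node_link_data, although this is overkill unless you need further graph operations. Note that from_pandas_edgelist builds an undirected graph by default, so edge directions are not preserved in the output below (the D to A edge appears as A to D); pass create_using=nx.DiGraph to keep them: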

import networkx as nx
import pandas as pd
from networkx.readwrite import json_graph
df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})
G = nx.from_pandas_edgelist(df, source='source',
                            target='target',
                            edge_attr='weight')
data = json_graph.node_link_data(G)

data:

{'directed': False,
'graph': {},
'links': [{'source': 'A', 'target': 'B', 'weight': 3},
{'source': 'A', 'target': 'C', 'weight': 2},
{'source': 'A', 'target': 'D', 'weight': 1},
{'source': 'B', 'target': 'C', 'weight': 0},
{'source': 'B', 'target': 'D', 'weight': 1},
{'source': 'C', 'target': 'D', 'weight': 1}],
'multigraph': False,
'nodes': [{'id': 'A'}, {'id': 'B'}, {'id': 'C'}, {'id': 'D'}]}
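
The output above reports 'directed': False, and the D to A edge is listed as A to D, which differs from the JSON requested in the question. A minimal sketch of a directed variant, assuming a networkx version in which create_using accepts the nx.DiGraph class. Depending on the networkx version, the edge list may be stored under 'links' or 'edges', so the sketch checks for both rather than assuming one:

```python
import networkx as nx
import pandas as pd
from networkx.readwrite import json_graph

df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})

# A DiGraph keeps each row as a source -> target edge.
G = nx.from_pandas_edgelist(df, source='source',
                            target='target',
                            edge_attr='weight',
                            create_using=nx.DiGraph)
data = json_graph.node_link_data(G)

# The key holding the edge list depends on the networkx version.
edge_key = 'links' if 'links' in data else 'edges'
pairs = [(e['source'], e['target']) for e in data[edge_key]]
```

The result still carries the extra 'directed', 'multigraph', and 'graph' keys, so drop them if the consumer expects only the nodes and the edge list.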

You can also use df.to_json(), but you may need to do some extra work to get the result into the desired form.

Example:

# Collect ids from both columns, in order of first appearance
nodes = pd.DataFrame({'id': pd.unique(df[['source', 'target']].to_numpy().ravel())})
# Join the two JSON fragments into one object (note the comma between them)
'{"nodes": ' + nodes.to_json(orient="records") + ', "links": ' + df.to_json(orient="records") + '}'

Output:

'{"nodes": [{"id":"A"},{"id":"B"},{"id":"C"},{"id":"D"}], "links": [{"source":"A","target":"B","weight":3},{"source":"A","target":"C","weight":2},{"source":"B","target":"C","weight":0},{"source":"C","target":"D","weight":1},{"source":"D","target":"A","weight":1},{"source":"D","target":"B","weight":1}]}'
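
Gluing JSON fragments together by hand makes it easy to drop a comma or repeat a key. One way to sidestep that, sketched here on the assumption that parsing the to_json output back with json.loads is acceptable for data of this size, is to let json.dumps assemble the final string (the variable names are illustrative):

```python
import json

import pandas as pd

df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})

# Unique ids across both columns, in order of first appearance.
ids = pd.unique(df[['source', 'target']].to_numpy().ravel())

# Let pandas serialize each part, then parse it back into Python objects.
nodes_json = pd.DataFrame({'id': ids}).to_json(orient='records')
links_json = df.to_json(orient='records')

# json.dumps takes care of commas, quoting, and nesting.
payload = json.dumps({
    'nodes': json.loads(nodes_json),
    'links': json.loads(links_json),
})
```

Because payload is produced by json.dumps, it is valid JSON by construction and can be written to a file as-is.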
