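I have created a graph using GraphFrame:
g = GraphFrame(vertices, edges)
Besides analysing the graph with the queries and properties that GraphFrame offers, I would also like to visualize it for use in a presentation.
Do you know of any tool, library, API, or code that makes this kind of visualization simple?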
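There is no simple way, but you can use the python-igraph library, https://igraph.org/. I have used it from R, but Python should be similar. See the simple example below. The main issue with all of these tools is that you have to choose carefully a small subgraph to plot.
Install it:
#>pip install python-igraph
The simplest visualization: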
from igraph import Graph, plot
g = GraphFrame(vertices, edges)
# collect() pulls every edge to the driver, so keep the graph small
ig = Graph.TupleList(g.edges.collect(), directed=True)
plot(ig)
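On the point about picking a small subgraph: one possible way to do that, once a manageable number of edge rows is on the driver (for example after something like g.edges.limit(...).collect()), is to keep only the neighbourhood of one vertex of interest. The helper below is a minimal sketch in plain Python, not part of GraphFrames or igraph; the name k_hop_edges, its parameters, and the sample edges are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict, deque


def k_hop_edges(edges, seed, hops=1):
    """Return the edges whose endpoints both lie within `hops` steps of `seed`.

    `edges` is an iterable of (src, dst, ...) tuples. Direction is ignored
    when measuring distance, but the returned edges keep their direction.
    """
    edges = [tuple(e) for e in edges]
    neighbours = defaultdict(set)
    for e in edges:
        src, dst = e[0], e[1]
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    # Breadth-first search outward from the seed, stopping after `hops` levels.
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == hops:
            continue
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))

    return [e for e in edges if e[0] in seen and e[1] in seen]


# Hypothetical sample rows in (src, dst, relationship) form.
sample = [("a", "b", "friend"), ("b", "a", "follow"), ("c", "a", "follow"),
          ("c", "f", "follow"), ("g", "h", "follow"), ("h", "i", "friend"),
          ("h", "j", "friend"), ("j", "h", "friend"), ("e", "e1", "friend")]
small = k_hop_edges(sample, seed="a", hops=1)
print(small)
```

The resulting list could then be passed to Graph.TupleList(small, directed=True) in place of g.edges.collect(). For a graph that does not fit on the driver, the filtering would have to happen in Spark before collecting.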
Another way is to use the drawing functions of the graph module networkx:
import matplotlib.pyplot as plt
import networkx as nx
from graphframes import GraphFrame
from pyspark.sql import SparkSession

def PlotGraph(edge_list):
    # Build an undirected networkx graph from at most 1000 edges
    Gplot = nx.Graph()
    for row in edge_list.select('src', 'dst').take(1000):
        Gplot.add_edge(row['src'], row['dst'])
    plt.subplot(121)
    nx.draw(Gplot)
    plt.show()  # needed outside notebooks to display the figure

spark = (SparkSession
         .builder
         .appName("PlotAPp")
         .getOrCreate())
# Create the DataFrames directly from the SparkSession
vertices = spark.createDataFrame([
    ("a", "Alice", 34),
    ("b", "Bob", 36),
    ("c", "Charlie", 30),
    ("d", "David", 29),
    ("e", "Esther", 32),
    ("e1", "Esther2", 32),
    ("f", "Fanny", 36),
    ("g", "Gabby", 60),
    ("h", "Mark", 61),
    ("i", "Gunter", 62),
    ("j", "Marit", 63)], ["id", "name", "age"])
edges = spark.createDataFrame([
    ("a", "b", "friend"),
    ("b", "a", "follow"),
    ("c", "a", "follow"),
    ("c", "f", "follow"),
    ("g", "h", "follow"),
    ("h", "i", "friend"),
    ("h", "j", "friend"),
    ("j", "h", "friend"),
    ("e", "e1", "friend")
], ["src", "dst", "relationship"])
g = GraphFrame(vertices, edges)
PlotGraph(g.edges)
I tried the igraph solution from "PYSPARK: How to visualize a GraphFrame?", but plot(ig) raised an error:
AttributeError: Plotting not available; please install pycairo or cairocffi
So I just saved the graph as SVG instead, which works fine:
ig.write_svg('/tmp/ig.svg')
Cheers