I want to compute PageRank from a CSV file of edges formatted as follows:
12,13,1.0
12,14,1.0
12,15,1.0
12,16,1.0
12,17,1.0
...
My code:
var filename = "<filename>.csv"
val graph = Graph.fromCsvReader[Long,Double,Double](
env = env,
pathEdges = filename,
readVertices = false,
hasEdgeValues = true,
vertexValueInitializer = new MapFunction[Long, Double] {
def map(id: Long): Double = 0.0 } )
val ranks = new PageRank[Long](0.85, 20).run(graph)
I get the following error from the Flink Scala Shell:
error: type mismatch;
found : org.apache.flink.graph.scala.Graph[Long,_23,_24] where type _24 >: Double with _22, type _23 >: Double with _21
required: org.apache.flink.graph.Graph[Long,Double,Double]
val ranks = new PageRank[Long](0.85, 20).run(graph)
^
What am I doing wrong?
(Are the initial values of 0.0 for each vertex and 1.0 for each edge correct?)
The problem is that you are passing the Scala org.apache.flink.graph.scala.Graph to PageRank.run, which expects the Java org.apache.flink.graph.Graph. To run a GraphAlgorithm on a Scala Graph object, you have to call the Scala Graph's run method and pass it the GraphAlgorithm:
graph.run(new PageRank[Long](0.85, 20))
更新
For the PageRank algorithm, it is important to note that it requires an instance of type Graph[K, java.lang.Double, java.lang.Double]. Since Java's Double type is distinct from Scala's Double type as far as type checking is concerned, this has to be taken into account. For the example code, this means:
val graph = Graph.fromCsvReader[Long,java.lang.Double,java.lang.Double](
env = env,
pathEdges = filename,
readVertices = false,
hasEdgeValues = true,
vertexValueInitializer = new MapFunction[Long, java.lang.Double] {
def map(id: Long): java.lang.Double = 0.0 } )
.asInstanceOf[Graph[Long, java.lang.Double, java.lang.Double]]
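To see why the cast is needed without involving Flink at all, here is a minimal, self-contained sketch (the `Box` class is a stand-in I made up for `Graph`'s type parameters, not part of any Flink API): `scala.Double` erases to the JVM primitive `double` while `java.lang.Double` is a boxed reference class, and because generic type parameters are invariant, a `Graph[Long, Double, Double]` does not conform to `Graph[Long, java.lang.Double, java.lang.Double]` even though plain method arguments are boxed implicitly.

```scala
// Minimal illustration (no Flink required) of the Scala vs. Java Double
// mismatch. Box stands in for Graph's generic type parameters.
object DoubleMismatch {
  class Box[T](val value: T)

  def main(args: Array[String]): Unit = {
    // At runtime, scala.Double erases to the JVM primitive `double`,
    // while java.lang.Double is the boxed reference class:
    assert(classOf[Double] != classOf[java.lang.Double])

    // For plain method arguments, Scala inserts an implicit boxing
    // conversion from Predef, so this call compiles and runs:
    def takesBoxed(d: java.lang.Double): Double = d
    takesBoxed(1.0)

    // But generic type parameters are invariant, so a Box[Double] is
    // not a Box[java.lang.Double]; the following line would not compile:
    //   val b: Box[java.lang.Double] = new Box[Double](1.0)
    // That is exactly why the Graph must be created with (or cast to)
    // java.lang.Double type parameters before running PageRank.
    println("Double and java.lang.Double are distinct types")
  }
}
```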