Spark error: missing parameter type in map()



I am trying to learn Spark GraphX on Windows 10 by copying the code from here. The code was written for an older version of Spark, and I cannot find a working way to create the vertices. Here is the code:

import scala.util.MurmurHash
import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
val path = "F:/Soft/spark/2008.csv"
val df_1 = spark.read.option("header", true).csv(path)
val flightsFromTo = df_1.select($"Origin",$"Dest")
val airportCodes = df_1.select($"Origin", $"Dest").flatMap(x => Iterable(x(0).toString, x(1).toString))
// error caused by the following line
val airportVertices: RDD[(VertexId, String)] = airportCodes.distinct().map(x => (MurmurHash.stringHash(x), x))

Here is the error message:

<console>:57: error: missing parameter type
       val airportVertices: RDD[(VertexId, String)] = airportCodes.distinct().map(x => (MurmurHash.stringHash(x), x))
                                                                                  ^

I think the syntax is outdated; I tried to find the current syntax in the official documentation, but that did not help. The dataset can be downloaded from here.
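For context, "missing parameter type" is a general Scala inference error: the parameter type of a function literal must be inferable from an expected type, and here the expected type was ambiguous because `map` was called on a Dataset while the declared result type was an RDD. A minimal pure-Scala illustration of the same error and its fixes (no Spark involved):

```scala
// Without an expected type, Scala cannot infer x's type:
// val inc = x => x + 1            // error: missing parameter type
// Annotating the parameter, or supplying an expected type, fixes it:
val inc = (x: Int) => x + 1
val dec: Int => Int = x => x - 1   // expected type lets x be inferred
println(inc(41))  // 42
println(dec(43))  // 42
```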

Update:

Basically, I am trying to create vertices and edges, and eventually a graph, as shown in the tutorial. I am also new to the Map-Reduce paradigm.

The following lines of code work for me:

// imported latest library - works without this too, just gives a warning
import scala.util.hashing.MurmurHash3
// datasets are set to rdd - this is the cause of the error
val flightsFromTo = df_1.select($"Origin",$"Dest").rdd
val airportCodes = df_1.select($"Origin", $"Dest").flatMap(x => Iterable(x(0).toString, x(1).toString)).rdd
val airportVertices: RDD[(VertexId, String)] = airportCodes.distinct().map(x => (MurmurHash3.stringHash(x), x))
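The working code hashes each airport code into a numeric `VertexId`. A small pure-Scala sketch (no Spark needed) of why `MurmurHash3.stringHash` is suitable here: it is deterministic, so duplicate codes collapse to the same vertex id after `distinct()`:

```scala
import scala.util.hashing.MurmurHash3

// MurmurHash3.stringHash returns an Int; GraphX's VertexId is a Long,
// so the hash is widened with toLong when used as a vertex id.
val codes = Seq("SFO", "JFK", "SFO")
val ids = codes.map(c => MurmurHash3.stringHash(c).toLong)
println(ids(0) == ids(2))  // true: the same string always hashes to the same id
```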

You can try: val airportVertices: RDD[(VertexId, String)] = airportCodes.distinct().map(x => (MurmurHash.stringHash(x(0)), x(1)))

// To apply map(), just try converting the variable to an RDD.

val airportVertices: RDD[(VertexId, String)] = airportCodes.rdd.distinct().map(x => (MurmurHash3.stringHash(x), x))
