How to write a Spark DataFrame to a Postgres database



I am using Spark 1.3.0. Suppose I have a DataFrame in Spark that I need to store into a Postgres DB (postgresql-9.2.18-1-linux-x64) on a 64-bit Ubuntu machine. I am also using postgresql9.2jdbc41.jar as the driver to connect to Postgres.

I can read data from the Postgres DB with the following code:

import org.postgresql.Driver

val url = "jdbc:postgresql://localhost/postgres?user=user&password=pwd"
val driver = "org.postgresql.Driver"

// Partitioned JDBC read: Spark splits the scan of cdimemployee into
// numPartitions parallel queries over ranges of intempdimkey.
val users = sqlContext.load("jdbc", Map(
  "url" -> url,
  "driver" -> driver,
  "dbtable" -> "cdimemployee",
  "partitionColumn" -> "intempdimkey",
  "lowerBound" -> "0",
  "upperBound" -> "500",
  "numPartitions" -> "50"
))

// users is already a DataFrame, so select("*") and toDF are no-ops here.
val empDF = users.select("*")

// foreach(println) runs on the executors; collect first (or call show())
// so the rows are printed on the driver.
empDF.collect().foreach(println)
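
For reference, on Spark 1.4 and later the same partitioned read is usually expressed through the DataFrameReader API instead of sqlContext.load. A minimal sketch, assuming the same url, driver, table, and bounds as above (empDF14 is just an illustrative name):

// Spark 1.4+ equivalent of the sqlContext.load("jdbc", ...) call above.
val empDF14 = sqlContext.read
  .format("jdbc")
  .option("url", url)
  .option("driver", driver)
  .option("dbtable", "cdimemployee")
  .option("partitionColumn", "intempdimkey")
  .option("lowerBound", "0")
  .option("upperBound", "500")
  .option("numPartitions", "50")
  .load()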

After some processing, I want to write this DF back to Postgres. Is the code below correct?

empDF.write.jdbc("jdbc:postgresql://localhost/postgres", "test", Map("user" -> "user", "password" -> "pwd"))

You should follow the code below.

import java.util.Properties
import org.apache.spark.sql.SaveMode

// Connection settings come from an external job config (e.g. Typesafe Config).
val database = jobConfig.getString("database")
val url: String = s"jdbc:postgresql://localhost/$database"
val tableName: String = jobConfig.getString("tableName")
val user: String = jobConfig.getString("user")
val password: String = jobConfig.getString("password")
val sql = jobConfig.getString("sql")

// Run the query through the SQLContext (sc, the SparkContext, has no sql method).
val df = sqlContext.sql(sql)

// DataFrameWriter.jdbc expects a java.util.Properties, not a Scala Map.
val properties = new Properties()
properties.setProperty("user", user)
properties.setProperty("password", password)
properties.put("driver", "org.postgresql.Driver")

df.write.mode(SaveMode.Overwrite).jdbc(url, tableName, properties)
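
Note that df.write (and DataFrameWriter.jdbc) only exists from Spark 1.4 onward, so neither the snippet in the question nor the one above compiles on the Spark 1.3.0 build the question mentions. On 1.3 the DataFrame itself carries the JDBC write methods. A minimal sketch, assuming credentials are passed in the URL the same way as in the read example:

// Spark 1.3 sketch: before DataFrameWriter existed, DataFrame exposed
// createJDBCTable / insertIntoJDBC for JDBC writes.
val writeUrl = "jdbc:postgresql://localhost/postgres?user=user&password=pwd"

// Create the target table from the DataFrame schema (allowExisting = false
// fails if the table already exists) ...
empDF.createJDBCTable(writeUrl, "test", false)

// ... or insert into an existing table, with overwrite = false to append.
empDF.insertIntoJDBC(writeUrl, "test", false)

Also note that SaveMode.Overwrite in the Spark 1.4+ snippet by default drops and recreates the target table, so indexes and permissions defined on it are lost.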
