I need to store a JSON file in HDFS using a REST API in Spark Scala



>I have an API at the following link:

https://api.themoviedb.org/3/search/movie?api_key=29216073ebe788cab8978c4fcbbbad23&query=Kesari

I want to store this result as a JSON file.

import org.apache.spark.sql.SparkSession
import org.json4s.jackson.JsonMethods.parse
import scala.io.Source.fromURL

implicit val formats = org.json4s.DefaultFormats

case class Markets(
  vote_count: String,
  id: String,
  video: String,
  vote_average: String,
  title: String,
  popularity: String,
  poster_path: String,
  original_language: String,
  original_title: String,
  genre_ids: String,
  backdrop_path: String,
  adult: String,
  overview: String,
  release_date: String)
case class Result(
  success: Boolean,
  message: String,
  result: List[Markets])
val parsedData = parse(fromURL("https://api.themoviedb.org/3/search/movie?api_key=29216073ebe788cab8978c4fcbbbad23&query=Kesari").mkString).extract[Array[Result]]

Add the dependencies to your project:

libraryDependencies ++= Seq(
  "com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-core" % "0.45.2" % Compile,
  "com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-macros" % "0.45.2" % Provided // required only in compile-time
)

Then define the model, request the movies, parse the response, and store it to a file:

import java.nio.file._
import java.time.LocalDate
import com.github.plokhotnyuk.jsoniter_scala.macros._
import com.github.plokhotnyuk.jsoniter_scala.core._
import scala.io.BufferedSource
import scala.io.Source.fromURL
case class Movie(
  vote_count: Double,
  id: Double,
  video: Boolean,
  vote_average: Double,
  title: String,
  popularity: Double,
  poster_path: String,
  original_language: String,
  original_title: String,
  genre_ids: List[Double],
  backdrop_path: Option[String],
  adult: Boolean,
  overview: String,
  release_date: LocalDate)
case class Response(
  page: Double,
  total_results: Double,
  total_pages: Double,
  results: List[Movie])
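// Generate the JSON codec for Response (and the nested Movie) at compile time.
implicit val codec: JsonValueCodec[Response] = JsonCodecMaker.make(CodecMakerConfig())
// Download the response body and parse it; the source is closed even if parsing fails.
val parsedData = {
  val source: BufferedSource = fromURL("https://api.themoviedb.org/3/search/movie?api_key=29216073ebe788cab8978c4fcbbbad23&query=Kesari")
  try readFromString[Response](source.mkString)
  finally source.close()
}
// Serialize the parsed model back to JSON bytes and write them to a local file.
Files.write(Paths.get("/tmp/movies.json"), writeToArray(parsedData))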
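
The snippet above writes to the local file system, while the question asks for HDFS. One possible way to cover that is the Hadoop FileSystem API, sketched below. This is a minimal sketch, not a definitive implementation: it assumes the Hadoop client libraries are on the classpath (they normally are in a Spark application), and the names HdfsJsonWriter and writeBytes are hypothetical helpers invented for illustration.

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helper (not part of Hadoop or jsoniter-scala).
object HdfsJsonWriter {
  /** Writes `bytes` to `target` on the given file system, replacing any existing file. */
  def writeBytes(fs: FileSystem, target: Path, bytes: Array[Byte]): Unit = {
    val out = fs.create(target, true) // second argument: overwrite if the file exists
    try out.write(bytes)
    finally out.close()
  }
}
```

With the answer's code in scope, a call might look like HdfsJsonWriter.writeBytes(FileSystem.get(spark.sparkContext.hadoopConfiguration), new Path("/tmp/movies.json"), writeToArray(parsedData)), where spark is assumed to be an existing SparkSession and the target path is only an example. FileSystem.get resolves the default file system from the Hadoop configuration, so the same call writes to HDFS on a cluster and to the local disk when no cluster is configured.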
