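I'm trying to read from a Mongo collection and need to apply a filter on a date field as part of the read.
For example, the equivalent mongo query would look like:
db.mongo_coll.find({ dataDate: { $gte: ISODate("2021-01-01") } })
How do I apply this to a Spark Mongo read?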
val df = spark.read.format("com.mongodb.spark.sql.DefaultSource").
option("spark.sql.caseSensitive", "true").
option("sampleSize", 3000).
option("uri", hostConn).load() // hostConn holds the MongoDB connection string
Thanks in advance.
Try it this way, passing the date filter as a $match stage through the pipeline option:
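// $gte mirrors the filter in the question; the date is written in extended JSON form
val pipeline = "{'$match':{'dataDate':{'$gte':{'$date':'2021-01-01T00:00:00Z'}}}}"
val df = spark.read.format("com.mongodb.spark.sql.DefaultSource")
  .option("spark.sql.caseSensitive", "true")
  .option("sampleSize", 3000)
  .option("pipeline", pipeline)
  .option("uri", hostConn) // hostConn holds the MongoDB connection string
  .load()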
// DATE is a placeholder; replace it with an extended JSON date such as {'$date':'2021-01-01T00:00:00Z'}
String match_query = "[{'$match':{'dataDate':{'$gt':DATE}}}]";
spark
.read()
.format("mongo")
.option("uri", "{URI}")
.option("collection", "{Collection}")
.option("database","{DATABASE}")
.option("pipeline", match_query)
.load();
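If the cutoff date is only known at runtime, the DATE placeholder above has to be filled in somehow. The following is a minimal sketch of one way to do that with the standard library only. The class name DatePipeline and the method matchOnOrAfter are hypothetical names made up for this illustration, and it assumes the connector accepts the same extended JSON $date form used in the Scala answer.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical helper (not part of the MongoDB Spark connector): builds the
// string passed to the "pipeline" option from a java.time.LocalDate.
class DatePipeline {

    // Returns a one-stage pipeline that keeps documents whose `field`
    // is on or after midnight UTC of `day`.
    static String matchOnOrAfter(String field, LocalDate day) {
        // Instant.toString() yields an ISO-8601 UTC timestamp such as 2021-01-01T00:00:00Z
        String iso = day.atStartOfDay(ZoneOffset.UTC).toInstant().toString();
        return "[{'$match':{'" + field + "':{'$gte':{'$date':'" + iso + "'}}}}]";
    }

    public static void main(String[] args) {
        // Prints the pipeline for the date used in the question
        System.out.println(matchOnOrAfter("dataDate", LocalDate.of(2021, 1, 1)));
    }
}
```

The returned string could then be passed as .option("pipeline", DatePipeline.matchOnOrAfter("dataDate", LocalDate.of(2021, 1, 1))) in place of match_query. The field name is concatenated without escaping, so this sketch is only suitable for trusted, fixed field names.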