I am currently using Spark to connect to Presto. Our queries time out after 60m, so to raise the query execution time I set the query.max-execution-time parameter in getDBProperties(), as shown below:
import java.net.URLEncoder
import java.util.Properties

import com.typesafe.config.Config
import org.apache.spark.sql.{DataFrame, SparkSession}

private def constructPrestoDataFrame(sparkSession: SparkSession, jobConfig: Config, query: String): DataFrame = {
  sparkSession
    .read
    .jdbc(getPrestoConnectionUrl(jobConfig), query, getDBProperties(jobConfig))
}

private def getDBProperties(jobConfig: Config): Properties = {
  val dbProperties = new Properties
  dbProperties.put("user", jobConfig.getString("presto.user"))
  dbProperties.put("password", jobConfig.getString("presto.password"))
  dbProperties.put("Driver", jobConfig.getString("presto.driver.name"))
  dbProperties.put("query.max-execution-time", "2d")
  dbProperties
}

private def getPrestoConnectionUrl(jobConfig: Config): String = {
  s"jdbc:presto://${jobConfig.getString("presto.host")}:8443/${jobConfig.getString("presto.catalogue.name")}?SSL=true&SSLTrustStorePath=${jobConfig.getString("sslTrustStorePath")}" +
    "&SSLTrustStorePassword=" + URLEncoder.encode(jobConfig.getString("sslTrustStorePassword"), "UTF-8")
}
When I run the job I get an exception: exception caught: Cause = null Message = Unrecognized connection property 'query.max-execution-time'.
We are using apache-spark-2.3.x with presto-jdbc-driver-300.
Adding MAX_EXECUTION_TIME (a value in milliseconds) to the sessionVariables parameter in the JDBC URL worked for me:
jdbc:mysql://{host}:{port}/{database}?sessionVariables=MAX_EXECUTION_TIME=123456666
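The same URL can be assembled programmatically in a Spark job. A minimal sketch, assuming a hypothetical helper and placeholder host/port/database values (the timeout is in milliseconds, matching the URL above):

```scala
// Hypothetical helper that builds a MySQL JDBC URL with the
// MAX_EXECUTION_TIME session variable (in milliseconds) baked in.
object MySqlUrl {
  def withMaxExecutionTime(host: String, port: Int, database: String, timeoutMs: Long): String =
    s"jdbc:mysql://$host:$port/$database?sessionVariables=MAX_EXECUTION_TIME=$timeoutMs"
}
```

Example: MySqlUrl.withMaxExecutionTime("db.example.com", 3306, "analytics", 123456666L) produces the URL shown above, which can then be passed straight to sparkSession.read.jdbc.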
Verify with the query:
SELECT @@max_execution_time
Expected output:
+--------------------+
|@@max_execution_time|
+--------------------+
| 123456666 |
+--------------------+
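To run this verification from Spark 2.3 itself, note that DataFrameReader.jdbc expects a table expression as its second argument, so the check has to be wrapped in an aliased subquery. A sketch, where asSubquery is a hypothetical helper:

```scala
// Hypothetical helper: wraps a raw SQL statement so it is accepted
// where Spark's DataFrameReader.jdbc expects a table expression.
object JdbcCheck {
  def asSubquery(sql: String): String = s"($sql) AS t"
}

// Usage inside a Spark job (not executed here):
//   sparkSession.read
//     .jdbc(mysqlUrl, JdbcCheck.asSubquery("SELECT @@max_execution_time"), dbProperties)
//     .show()
```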