How to increase the Presto query execution time in Spark



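I am currently using Spark to connect to Presto. Our queries time out after 60 minutes. To increase the query execution time, I set the query.max-execution-time parameter in getDBProperties(), as shown below: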

private def constructPrestoDataFrame(sparkSession: SparkSession, jobConfig: Config, query: String): DataFrame = {
  sparkSession
    .read
    .jdbc(getPrestoConnectionUrl(jobConfig), query, getDBProperties(jobConfig))
}

private def getDBProperties(jobConfig: Config): Properties = {
  val dbProperties = new Properties
  dbProperties.put("user", jobConfig.getString("presto.user"))
  dbProperties.put("password", jobConfig.getString("presto.password"))
  dbProperties.put("Driver", jobConfig.getString("presto.driver.name"))
  dbProperties.put("query.max-execution-time", "2d")
  dbProperties
}

private def getPrestoConnectionUrl(jobConfig: Config): String = {
  s"jdbc:presto://${jobConfig.getString("presto.host")}:8443/${jobConfig.getString("presto.catalogue.name")}?SSL=true&SSLTrustStorePath=${jobConfig.getString("sslTrustStorePath")}" +
    "&SSLTrustStorePassword=" + URLEncoder.encode(jobConfig.getString("sslTrustStorePassword"))
}

When I run the job, I get the following exception: exception caught: Cause = null Message = Unrecognized connection property 'query.max-execution-time'

We are using apache-spark-2.3.x and presto-jdbc-driver-300.

Adding MAX_EXECUTION_TIME to sessionVariables in the URL worked for me (note that this is a MySQL JDBC URL, and the value is in milliseconds):

jdbc:mysql://{host}:{port}/{database}?sessionVariables=MAX_EXECUTION_TIME=123456666

Query to verify the setting:

SELECT @@max_execution_time

Expected output:

+--------------------+ 
|@@max_execution_time| 
+--------------------+ 
| 123456666          | 
+--------------------+
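
The URL above can also be assembled in Scala, the language used in the question. This is only a minimal sketch: the object name SessionVariableUrl, the method mysqlUrl, and the sample host, port, and database are hypothetical and not part of the original post. It covers the MySQL sessionVariables approach from this answer, not a Presto-specific setting.

```scala
object SessionVariableUrl {
  // Builds a MySQL JDBC URL that sets MAX_EXECUTION_TIME (in milliseconds)
  // for the session through the sessionVariables URL parameter.
  // All parameter names here are illustrative assumptions.
  def mysqlUrl(host: String, port: Int, database: String, maxExecutionTimeMs: Long): String =
    s"jdbc:mysql://$host:$port/$database?sessionVariables=MAX_EXECUTION_TIME=$maxExecutionTimeMs"
}
```

The resulting string could then be passed as the first argument to sparkSession.read.jdbc(...), in the same way as getPrestoConnectionUrl(jobConfig) in the question. Since the Presto driver in the question rejected query.max-execution-time as a connection property, this URL syntax should not be assumed to carry over to jdbc:presto URLs. Newer Presto/Trino JDBC documentation describes a sessionProperties URL parameter, but whether presto-jdbc 300 accepts it is not confirmed here.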
