I need the value returned by the first UDF (GetOtherTriggers) as the argument to the second UDF (GetTriggerType).
The following code does not work:
val df = sql.sql(
"select GetOtherTriggers(categories) as other_triggers, GetTriggerType(other_triggers) from my_table")
It fails with the following exception: org.apache.spark.sql.AnalysisException: cannot resolve 'other_triggers' given input columns: [my_table columns];
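The error occurs because the alias other_triggers is not visible to other expressions in the same SELECT list; it only becomes a resolvable column once the query that defines it is complete. You can use a subquery: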
val df = sql.sql("""select GetTriggerType(other_triggers), other_triggers
from (
select GetOtherTriggers(categories) as other_triggers, *
from my_table
) withOther """)
Test:
val df = sc.parallelize(1 to 10).map(x => (x, x*2, x*4)).toDF("nr1", "nr2", "nr3")
df.createOrReplaceTempView("nr");
spark.udf.register("x3UDF", (x: Integer) => x*3);
spark.sql("""select x3UDF(nr1x3), nr1x3, nr3
from (
select x3UDF(nr1) as nr1x3, *
from nr
) a """)
.show()
This gives:
+----------+-----+---+
|UDF(nr1x3)|nr1x3|nr3|
+----------+-----+---+
|         9|    3|  4|
|        18|    6|  8|
|        27|    9| 12|
|        36|   12| 16|
|        45|   15| 20|
|        54|   18| 24|
|        63|   21| 28|
|        72|   24| 32|
|        81|   27| 36|
|        90|   30| 40|
+----------+-----+---+
+----------+-----+---+
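If you prefer the DataFrame API over SQL, the same idea can be sketched with chained withColumn calls: each call adds a named column that the next call can reference, which plays the role of the subquery above. This is only a minimal sketch under assumptions. The object name ChainedUdfSketch, the method chain, the column name nr1x9 and the x3 UDF are hypothetical stand-ins for illustration, not the UDFs from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical helper object, for illustration only.
object ChainedUdfSketch {

  // Builds a small test DataFrame and applies the same UDF twice,
  // feeding the output column of the first call into the second call.
  def chain(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Single-column test data: nr1 = 1..10.
    val nr = (1 to 10).toDF("nr1")

    // Stand-in UDF (assumption): multiplies its input by 3.
    val x3 = udf((x: Int) => x * 3)

    nr
      .withColumn("nr1x3", x3(col("nr1")))    // first UDF, result named nr1x3
      .withColumn("nr1x9", x3(col("nr1x3")))  // second UDF reads the first result
      .select("nr1x9", "nr1x3")
  }

  def main(args: Array[String]): Unit = {
    // Local session for a standalone run; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("chained-udf-sketch")
      .getOrCreate()

    chain(spark).show()
    spark.stop()
  }
}
```

With your own UDFs, the two withColumn calls would wrap GetOtherTriggers and GetTriggerType respectively, assuming both are available as column functions (for example, created with udf(...) rather than only registered by name for SQL).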