Combining the results of 2 DataFrames generated in a for loop for 2 input values



I want to combine the results of the 2 DataFrames generated in a for loop for 2 input values. Here are the DataFrames:

First DF for the first value in the loop:

+--------+-------------------------------+---+
|order_id|Diff                           |id |
+--------+-------------------------------+---+
|12      |order_status                   |1  |
|1       |order_customer_id order_status |1  |
|68885   |New row in DataFrame 2         |1  |
|68886   |New row in DataFrame 2         |1  |
|2       |order_customer_id              |1  |
+--------+-------------------------------+---+

Second DF for the first value in the loop:

+--------+-------------------------------+---+
|order_id|Diff                           |id |
+--------+-------------------------------+---+
|12      |order_status                   |2  |
|1       |order_customer_id order_status |2  |
|68885   |New row in DataFrame 2         |2  |
|68886   |New row in DataFrame 2         |2  |
|2       |order_customer_id              |2  |
+--------+-------------------------------+---+

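I want to combine the two DataFrames above at the end of the loop. There may also be more than two, so the final result should be a single combined DF. Does anyone have logic for this?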

Suppose you have the following loop that generates a sequence of DataFrames:

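import org.apache.spark.sql.DataFrame
import spark.implicits._  // assumes an active SparkSession named `spark`, as in spark-shell
val dfs: Seq[DataFrame] = List(List((1,1)), List((2,2)), List((3,3))).map(l => l.toDF("a","b"))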

You can combine them with the union function:

val combinedDf = dfs.reduce(_ union _)
combinedDf.show()
+---+---+
|  a|  b|
+---+---+
|  1|  1|
|  2|  2|
|  3|  3|
+---+---+
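
Applied to the loop in the question, the same idea might look like the sketch below. It is only an illustration: `diffFor` is a hypothetical stand-in for whatever comparison produces each per-value DataFrame, the sample rows are made up, and the local SparkSession is an assumption (in spark-shell, `spark` already exists). It collects one DataFrame per input value instead of overwriting a variable, then folds them together. `unionByName` is used here because it matches columns by name, whereas `union` matches them by position.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit

// Assumption: a local session for illustration; spark-shell already provides `spark`.
val spark = SparkSession.builder().master("local[*]").appName("combine-dfs").getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the per-value comparison that yields one DataFrame.
def diffFor(id: Int): DataFrame =
  Seq((12, "order_status"), (2, "order_customer_id"))
    .toDF("order_id", "Diff")
    .withColumn("id", lit(id)) // tag every row with the input value it came from

// Build one DataFrame per input value; this works for any number of inputs.
val inputs = Seq(1, 2)
val perValue: Seq[DataFrame] = for (id <- inputs) yield diffFor(id)

// reduceOption returns None for an empty input list instead of throwing.
val combined: Option[DataFrame] = perValue.reduceOption((a, b) => a.unionByName(b))

combined.foreach(_.show(false))
```

Both `union` and `unionByName` keep duplicate rows, so call `distinct()` on the result if the per-value DataFrames can overlap and duplicates are unwanted.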
