How to create new rows in a dataset based on multiple values in an array column




I have a dataset with the following data:

+----+---------+-------------------+------------------+
|name|productId|              total|            scores|
+----+---------+-------------------+------------------+
| aaa|      200|               0.29|            [0.29]|
| bbb|      200| 1.3900000000000001|      [0.53, 0.33]|
| aaa|      100|0.22999999999999998|      [0.12, 0.11]|
+----+---------+-------------------+------------------+

I want to convert it to the following format in Scala, with one row per element of the scores array:

+----+---------+-------------------+------------------+
|name|productId|              total|            scores|
+----+---------+-------------------+------------------+
| aaa|      200|               0.29|              0.29|
| bbb|      200| 1.3900000000000001|              0.53|
| bbb|      200| 1.3900000000000001|              0.33|
| aaa|      100|0.22999999999999998|              0.12|
| aaa|      100|0.22999999999999998|              0.11|
+----+---------+-------------------+------------------+

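This is exactly what the explode function is for. It emits one output row per array element and copies the other columns onto each row: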

import org.apache.spark.sql.functions.{col, explode}
df.withColumn("score", explode(col("scores")))  // one output row per element of scores
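
For completeness, here is a minimal, self-contained sketch that rebuilds the sample data from the question and applies explode. The object name ExplodeScores, the helpers sampleData and explodeScores, and the local SparkSession settings are assumptions made for illustration; they are not part of the original question or answer. The sketch writes the exploded values back to the scores column, so the array is replaced and the result has the four columns shown in the desired table. Writing to a different column name would keep the original array column next to the new one.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

// Hypothetical wrapper object, used only to keep the example self-contained.
object ExplodeScores {

  // Rebuilds the three sample rows from the question.
  def sampleData(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      ("aaa", 200, 0.29, Seq(0.29)),
      ("bbb", 200, 1.3900000000000001, Seq(0.53, 0.33)),
      ("aaa", 100, 0.22999999999999998, Seq(0.12, 0.11))
    ).toDF("name", "productId", "total", "scores")
  }

  // Replaces the array column "scores" with one row per array element.
  // Reusing the column name overwrites the array in place.
  def explodeScores(df: DataFrame): DataFrame =
    df.withColumn("scores", explode(col("scores")))

  def main(args: Array[String]): Unit = {
    // Local session for experimentation; adjust master/appName as needed.
    val spark = SparkSession.builder()
      .appName("explode-scores")
      .master("local[*]")
      .getOrCreate()

    val exploded = explodeScores(sampleData(spark))
    exploded.show() // five rows: one per score value

    spark.stop()
  }
}
```

One thing to watch for: explode drops rows whose array is null or empty. If those rows must be kept (with a null score), explode_outer from the same functions object is the variant to reach for.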
