How to select a column by its position and use it in a formula with other columns in Databricks Spark Scala



I am using Scala on Databricks. Suppose I have a DataFrame like this:

val df = Seq(
  ("Alex", 4.0, 3.2, 3.0),
  ("John", 2.0, 4.2, 1.2),
  ("Alice", 1.0, 5.0, 3.5),
  ("Mark", 3.0, 3.5, 0.5)
).toDF("Name", "Test A", "Test B", "Test C")

This gives me:

Name   Test A  Test B  Test C
Alex   4.0     3.2     3.0
John   2.0     4.2     1.2
Alice  1.0     5.0     3.5
Mark   3.0     3.5     0.5

You can call map on the DataFrame and then access the elements of each Row by their position:

import org.apache.spark.sql._
import spark.implicits._
val columnNames = Seq("Name", "Test A", "Test B", "Test C")
val df = Seq(
  ("Alex", 4.0, 3.2, 3.0),
  ("John", 2.0, 4.2, 1.2),
  ("Alice", 1.0, 5.0, 3.5),
  ("Mark", 3.0, 3.5, 0.5)
).toDF(columnNames: _*)
val output = df.map { row =>
  // Dividing the numbers by position (positions are zero-based)
  val division = row.getDouble(3) / row.getDouble(2)
  // Creating a new row with an extra element: division
  (row.getString(0), row.getDouble(1), row.getDouble(2), row.getDouble(3), division)
}.toDF(columnNames :+ "division": _*)
output.show
+-----+------+------+------+-------------------+
| Name|Test A|Test B|Test C|           division|
+-----+------+------+------+-------------------+
| Alex|   4.0|   3.2|   3.0|             0.9375|
| John|   2.0|   4.2|   1.2| 0.2857142857142857|
|Alice|   1.0|   5.0|   3.5|                0.7|
| Mark|   3.0|   3.5|   0.5|0.14285714285714285|
+-----+------+------+------+-------------------+
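
As an alternative to map, the same positional lookup can probably be done with column expressions. This is only a sketch, not part of the approach above: it assumes df.columns returns the column names in order, so an index can be turned into a name and passed to col. The SparkSession line is there only to make the snippet self-contained; in a Databricks notebook the existing spark session would be returned.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Assumption: reuse the running session (on Databricks one already exists)
val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

val df = Seq(
  ("Alex", 4.0, 3.2, 3.0),
  ("John", 2.0, 4.2, 1.2),
  ("Alice", 1.0, 5.0, 3.5),
  ("Mark", 3.0, 3.5, 0.5)
).toDF("Name", "Test A", "Test B", "Test C")

// df.columns is an Array[String] in column order, so a zero-based
// position can be mapped to a column name and then to a Column
val numerator = col(df.columns(3))   // "Test C"
val denominator = col(df.columns(2)) // "Test B"

// Append the result as a new column; the existing columns are kept as-is
val output = df.withColumn("division", numerator / denominator)
output.show()
```

This should produce the same division values as the map version, without having to rebuild every row by hand or re-list the column names.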

Hope this helps!