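Below is my code for adding a prefix to column names. I want to exclude one or more primary key columns from the prefixing. My primary keys are an array of strings that may contain one or more primary key fields.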
val primaryKeys = args(2).split("-")
val prefix = "w1."
val renamedColumns = df.columns.map(c=> df(c).as(s"$prefix$c"))
val dfNew = df.select(renamedColumns: _*)
val prefix2 = "w2."
val renamedColumns2 = df2.columns.map(c2=> df2(c2).as(s"$prefix2$c2"))
val df2New = df2.select(renamedColumns2: _*)
If there is just one primary key column, I am able to rename it back using withColumnRenamed, but I can't get it to work when there are multiple primary key columns.
I can't get something like this to work:
for (primaryKey <- primaryKeys) {
dfNew.withColumnRenamed("$PREFIX1"+s"${primaryKey}",s"$primaryKey").toDF()
}
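Can anyone help?
If I understand your question correctly, you can build renamedColumns conditionally so that only the non-primary-key columns get the prefix, as shown below: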
// Assumes a spark-shell session, where spark.implicits._ is already imported (needed for toDF)
val df = Seq(
("1", "a", "c1", "d1"),
("2", "b", "c2", "d2"),
("3", "c", "c3", "d3")
).toDF("pk1", "pk2", "col1", "col2")
val primaryKeys = Array("pk1", "pk2")
val prefix = "w1."
// Keep primary key columns under their own names; prefix every other column
val renamedColumns = df.columns.map(
  c => if (primaryKeys.contains(c)) df(c).as(c) else df(c).as(s"$prefix$c")
)
val dfNew = df.select(renamedColumns: _*)
dfNew.show
+---+---+-------+-------+
|pk1|pk2|w1.col1|w1.col2|
+---+---+-------+-------+
|  1|  a|     c1|     d1|
|  2|  b|     c2|     d2|
|  3|  c|     c3|     d3|
+---+---+-------+-------+
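As for why the for loop in the question has no effect: a DataFrame is immutable, so withColumnRenamed returns a new DataFrame and the loop throws that result away on every iteration. In addition, "$PREFIX1" has no s interpolator in front of it, so it is the literal text $PREFIX1 rather than the prefix value. If you would rather stay with withColumnRenamed, one option is to thread the DataFrame through a foldLeft. The following is only a minimal sketch under a few assumptions: the object name PrefixExceptKeys, the helper prefixNonKeys, and the local SparkSession setup are mine for illustration and are not part of your code.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object PrefixExceptKeys {

  // Hypothetical helper: prefixes every column whose name is not in `keys`.
  // foldLeft carries the renamed DataFrame from one step to the next,
  // which is what the for loop in the question was missing.
  def prefixNonKeys(df: DataFrame, prefix: String, keys: Seq[String]): DataFrame =
    df.columns
      .filterNot(c => keys.contains(c))
      .foldLeft(df)((acc, c) => acc.withColumnRenamed(c, s"$prefix$c"))

  def main(args: Array[String]): Unit = {
    // Assumed local session, only so that the sketch runs on its own
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("prefix-except-keys")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      ("1", "a", "c1", "d1"),
      ("2", "b", "c2", "d2")
    ).toDF("pk1", "pk2", "col1", "col2")

    val dfNew = prefixNonKeys(df, "w1.", Seq("pk1", "pk2"))
    println(dfNew.columns.mkString(", ")) // pk1, pk2, w1.col1, w1.col2

    spark.stop()
  }
}
```

For wide tables, the single select shown above is probably the better choice, since it renames everything in one projection instead of one withColumnRenamed call per column. Also keep in mind that column names containing a dot, such as w1.col1, generally have to be wrapped in backticks when you refer to them later, for example dfNew.select("`w1.col1`").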