Spark: rename DataFrame schema columns, replacing dots with underscores



I have a DataFrame whose column names contain dots. For example, df.printSchema shows:

user.id_number
user.name.last
user.phone.mobile

and so on. I want to rename the columns in the schema by replacing each dot with _:

user_id_number
user_name_last
user_phone_mobile

Note: the input data for this DF is in JSON format (non-relational, NoSQL).

Use either .map or .withColumnRenamed to replace . with _ in every column name.

1. Using .toDF with the mapped column names:

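// spark-shell already imports spark.implicits._; elsewhere, import it to get toDF on a Seq
val df = Seq(("1", "2", "3")).toDF("user.id_number", "user.name.last", "user.phone.mobile")
df.toDF(df.columns.map(x => x.replace(".", "_")): _*).show()
// using replaceAll (the first argument is a regex, so the dot must be escaped)
df.toDF(df.columns.map(x => x.replaceAll("\\.", "_")): _*).show()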
//+--------------+--------------+-----------------+
//|user_id_number|user_name_last|user_phone_mobile|
//+--------------+--------------+-----------------+
//|             1|             2|                3|
//+--------------+--------------+-----------------+
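The difference between replace and replaceAll matters here, so a small plain-Scala sketch may help. The name value is only an illustrative sample; no Spark is needed to run it.

```scala
val name = "user.name.last"

// replace: both arguments are treated as literal text.
val literal = name.replace(".", "_")       // "user_name_last"

// replaceAll: the first argument is a regex, so the dot is escaped.
val escaped = name.replaceAll("\\.", "_")  // "user_name_last"

// Unescaped, "." matches every character, so all 14 characters
// of the name are replaced by underscores.
val unescaped = name.replaceAll(".", "_")

println(literal)
println(escaped)
println(unescaped)
```

For a fixed character like the dot, replace is the simpler choice because nothing needs escaping.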

2. Using selectExpr:

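import org.apache.spark.sql.functions.col
// Backticks keep the dots from being parsed as struct field access; toString yields a SQL string for selectExpr
val expr = df.columns.map(x => col(s"`${x}`").alias(x.replace(".", "_")).toString)
df.selectExpr(expr: _*).show()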
//+--------------+--------------+-----------------+
//|user_id_number|user_name_last|user_phone_mobile|
//+--------------+--------------+-----------------+
//|             1|             2|                3|
//+--------------+--------------+-----------------+
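A variant of the same idea passes the aliased Column objects straight to select, which avoids converting them to SQL strings and parsing them again. This is a minimal sketch, assuming a local SparkSession; the appName and the renamed value are illustrative names only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Assumption: a local session for demonstration; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("rename-columns").getOrCreate()
import spark.implicits._

val df = Seq(("1", "2", "3")).toDF("user.id_number", "user.name.last", "user.phone.mobile")

// Backticks make Spark treat the dotted name as one column
// rather than as a path into a struct.
val renamed = df.select(df.columns.map(c => col(s"`$c`").alias(c.replace(".", "_"))): _*)

renamed.printSchema()
```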

3. Using .withColumnRenamed:

// foldLeft threads the DataFrame through one rename per column
df.columns.foldLeft(df) { (tmpdf, name) => tmpdf.withColumnRenamed(name, name.replace(".", "_")) }.show()
//+--------------+--------------+-----------------+
//|user_id_number|user_name_last|user_phone_mobile|
//+--------------+--------------+-----------------+
//|             1|             2|                3|
//+--------------+--------------+-----------------+
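One caveat, since the question says the input is JSON: if the dots come from nested struct fields (a user struct containing a name struct, and so on) rather than from literal dots in top-level column names, the renames above will not find any dotted names to change. In that case the nested fields have to be selected out into top-level columns. The following is only a sketch under that assumption; leafPaths, flattenWithUnderscores and the sample JSON record are hypothetical names and data made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.StructType

// Hypothetical helper: collect the path to every leaf field, descending into structs.
def leafPaths(schema: StructType, prefix: Seq[String] = Nil): Seq[Seq[String]] =
  schema.fields.toSeq.flatMap { field =>
    val path = prefix :+ field.name
    field.dataType match {
      case nested: StructType => leafPaths(nested, path)
      case _                  => Seq(path)
    }
  }

// Hypothetical helper: select each leaf as a top-level column whose name
// joins the path with underscores (literal dots in a field name are replaced too).
def flattenWithUnderscores(df: DataFrame): DataFrame = {
  val cols = leafPaths(df.schema).map { path =>
    col(path.map(p => s"`$p`").mkString("."))
      .alias(path.mkString("_").replace(".", "_"))
  }
  df.select(cols: _*)
}

// Assumption: a local session for demonstration; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("flatten-json").getOrCreate()
import spark.implicits._

// Made-up sample record shaped like the columns in the question.
val json = Seq("""{"user":{"id_number":"1","name":{"last":"2"},"phone":{"mobile":"3"}}}""")
val nestedDf = spark.read.json(json.toDS())

val flat = flattenWithUnderscores(nestedDf)
flat.printSchema()
```

Arrays of structs are left as they are in this sketch; they would need an explode step first.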
