Input DataFrame (original):
class      id  name
System     0   System
Generator  1   Coal_Gen
Expected output: a new column 'Index' whose value is "ST " + class value + "(" + id value + ")":
class      id  name      Index
System     0   System    ST System(0)
Generator  1   Coal_Gen  ST Generator(1)
Try the concat function in Spark.
Example:
df.show()
#+---------+---+--------+
#|    class| id|    name|
#+---------+---+--------+
#|   System|  0|  System|
#|Generator|  1|Coal_Gen|
#+---------+---+--------+
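from pyspark.sql.functions import col, concat, lit
# Build "ST <class>(<id>)" by concatenating string literals with the column values
df.withColumn("index", concat(lit("ST"), lit(" "), col("class"), lit("("), col("id"), lit(")"))).show()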
#+---------+---+--------+---------------+
#|    class| id|    name|          index|
#+---------+---+--------+---------------+
#|   System|  0|  System|   ST System(0)|
#|Generator|  1|Coal_Gen|ST Generator(1)|
#+---------+---+--------+---------------+