How to concatenate a string and a column in a Spark DataFrame



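I have today's date as a string. I need to concatenate it with a time value that is stored in a column of a DataFrame.

When I try this, I get a String Index out of bounds exception.

My code: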

val todaydate = LocalDate.now().toString()
println(todaydate)  // o/p: 2016-12-10
val todayrec_cutoff = todaydate + (" ") + df.col("colname")

Expected output:

2016-12-10 05:00 
2016-12-10 22:30
**The Scala code below shows how to concatenate a string to a column, either as a prefix or as a suffix.**

import org.apache.spark.sql.functions._
import com.mongodb.spark.MongoSpark
// empDF is loaded from MongoDB using MongoSpark; spark and readConfig are defined elsewhere
val empDF = MongoSpark.load(spark, readConfig)
val prefixVal = "PrefixArkay "   // string to prepend
val postfixVal = " PostfixArkay" // string to append
// Prefix: lit() turns the plain string into a Column so concat can use it
val finalPreDF = empDF.withColumn("EMP", concat(lit(prefixVal), empDF.col("EMP")))
finalPreDF.show() // show() prints the table itself and returns Unit
// The output looks like this:
+-------------------+
|                EMP|
+-------------------+
|PrefixArkay DineshS|
+-------------------+

// Postfix: same idea, with the literal placed after the column
val finalPostDF = empDF.withColumn("EMP", concat(empDF.col("EMP"), lit(postfixVal)))
finalPostDF.show()
// The output looks like this:
+--------------------+
|                 EMP|
+--------------------+
|DineshS PostfixArkay|
+--------------------+

You can do it as shown below.

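import java.time.LocalDate
import org.apache.spark.sql.functions.{concat, lit}
import spark.implicits._ // needed for toDF; already in scope in spark-shell

val df = Seq("05:00", "22:30").toDF("time")
df.show()
val todaydate = LocalDate.now().toString
// Wrap the string in lit() so it becomes a Column, then concatenate it with the time column
val df2 = df.select(concat(lit(todaydate + " "), df.col("time"))).toDF("datetime")
df2.show()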

This gives you

+----------------+
|        datetime|
+----------------+
|2016-12-10 05:00|
|2016-12-10 22:30|
+----------------+
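
As a variation on the same idea, here is a minimal sketch of a standalone program that builds the datetime column with `concat_ws` instead of `concat`. The object name `ConcatDateAndTime`, the helper `withDatetime`, the `local[1]` master, and the fixed date string are assumptions made for illustration; they do not come from the question or the answers above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat_ws, lit}

object ConcatDateAndTime {
  // Hypothetical helper: adds a "datetime" column made of the given date,
  // a single space, and the value of the existing "time" column.
  def withDatetime(df: DataFrame, date: String): DataFrame =
    df.withColumn("datetime", concat_ws(" ", lit(date), col("time")))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("concat-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq("05:00", "22:30").toDF("time")

    // A fixed date keeps the result reproducible (assumption);
    // LocalDate.now().toString could be passed in the same way.
    val result = withDatetime(df, "2016-12-10")
    result.show()

    spark.stop()
  }
}
```

The choice between the two functions mostly matters for nulls: `concat` returns null if any input is null, while `concat_ws` skips null inputs. Also note that `todaydate + " " + df.col("colname")` in the question is ordinary Scala string concatenation. It calls `toString` on the Column and produces a plain String rather than a Column expression, which is why the literal has to be wrapped in `lit()`.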
