我有一个数据帧,其列名称包含"."我想过滤列以获取包含"."的列名称,然后对其进行选择。任何帮助将不胜感激。这是数据集
//dataset
time.1,col.1,col.2,col.3
2015-12-06 12:40:00,2,2,3
2015-12-07 12:41:35,3,3,4
val spark = SparkSession.builder.master("local").appName("my-spark-app").getOrCreate()
val df1 = spark.read.option("inferSchema", "true").option("header", "true").csv("C:/Users/mhattabi/Desktop/dataTestCsvFile/dataTest2.txt")
val columnContainingdots=df1.schema.fieldNames.filter(p=>p.contains('.'))
df1.select(columnContainingdots)
在列名中使用点将要求您用"'"字符将名称括起来。请参阅下面的代码,这应该可以满足您的目的。
val columnContainingDots = df1.schema.fieldNames.collect({
// since the column names has "." character, we must enclose the column names with "`", otherwise dataframe select will cause exception
case column if column.contains('.') => s"`${column}`"
})
df1.select(columnContainingDots.head, columnContainingDots.tail:_*)