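How can I remove empty DataFrames from a sequence of DataFrames? In the snippet below, twoColDF ends up containing many empty DataFrames. I also have a question about the for loop below: is there a way to make it more efficient? I tried rewriting it as the following line, but that did not work: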
//finalDF2 = (1 until colCount).flatMap(j => groupCount(j).map( y=> finalDF.map(a=>a.filter(df(cols(j)) === y)))).toSeq.flatten
var twoColDF: Seq[Seq[DataFrame]] = null
if (colCount == 2) {
  val i = 0
  for (j <- i + 1 until colCount) {
    twoColDF = groupCount(j).map(y => {
      finalDF.map(x => x.filter(df(cols(j)) === y))
    })
  }
}
finalDF = twoColDF.flatten
Given a sequence of DataFrames, you can access each DataFrame's underlying RDD and use isEmpty to filter out the empty ones:
val input: Seq[DataFrame] = ???
val result = input.filter(!_.rdd.isEmpty())
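As a possible alternative to going through the RDD, the same filter can be written with head(1), which asks Spark for at most one row per DataFrame. The following is only a minimal sketch: the object name DropEmptyFrames, the helper nonEmptyFrames, the column names and the local SparkSession are all made up for illustration and are not part of the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DropEmptyFrames {
  // Hypothetical helper: keeps only the frames that contain at least one row.
  // head(1) returns an Array[Row] with at most one element.
  def nonEmptyFrames(frames: Seq[DataFrame]): Seq[DataFrame] =
    frames.filter(_.head(1).nonEmpty)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("drop-empty-frames")
      .getOrCreate()
    import spark.implicits._

    val full = Seq(("a", 1), ("b", 2)).toDF("key", "value")
    val empty = full.filter($"value" > 100) // no row satisfies this predicate

    val kept = nonEmptyFrames(Seq(full, empty))
    println(s"kept ${kept.size} of 2 frames")

    spark.stop()
  }
}
```

Either way, every emptiness check triggers a Spark job per DataFrame, so with many frames this filter can dominate the runtime.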
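As for your other question - I can't quite work out what your code is trying to do, but I would start by making it more functional (removing the vars and the imperative conditionals). If I'm guessing the meaning of your inputs correctly, this might be equivalent to what you're trying to do: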
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

val input: Seq[DataFrame] = ???
// map of column index to column values -
// for each combination we'd want a new DF where that column has that value
// I'm assuming values are Strings, can be anything else
val groupCount: Map[Int, Seq[String]] = ???
// for each combination of DF + column + value - produce the filtered DF where this column has this value
val perValue: Seq[DataFrame] = for {
  df <- input
  index <- groupCount.keySet
  value <- groupCount(index)
} yield df.filter(col(df.columns(index)) === value)
// remove empty results:
val result: Seq[DataFrame] = perValue.filter(!_.rdd.isEmpty())
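Since the question attempted a flatMap one-liner, it may help to see the same for-comprehension written out with explicit flatMap and map calls. This is a sketch under the same guess about the inputs as above: the object name FilterPerValue, the column names c0 and c1 and the sample values are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object FilterPerValue {
  // One filtered DataFrame per (input frame, column index, value) combination.
  def perValue(input: Seq[DataFrame], groupCount: Map[Int, Seq[String]]): Seq[DataFrame] =
    input.flatMap { df =>
      groupCount.toSeq.flatMap { case (index, values) =>
        values.map(value => df.filter(col(df.columns(index)) === value))
      }
    }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("filter-per-value")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("a", "x"), ("b", "y")).toDF("c0", "c1")
    // Column index 1 is compared against "x" and "z"; only "x" occurs in the data.
    val groupCount = Map(1 -> Seq("x", "z"))

    val all = perValue(Seq(df), groupCount)
    val nonEmpty = all.filter(!_.rdd.isEmpty())
    println(s"${all.size} filtered frames, ${nonEmpty.size} non-empty")

    spark.stop()
  }
}
```

The nested flatMap/map form and the for-comprehension produce the same combinations; which one reads better is mostly a matter of taste.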