
Find the count of non-null values in a Spark DataFrame

Keywords: null values, Spark DataFrame, count, scala, apache-spark, apache-spark-sql
Updated: 2023-09-17

I tried:

df.describe().filter($"summary" === "count").show

But this only works for integer columns. I also tried a for loop, as follows:

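import scala.collection.mutable.ListBuffer
val count_val = ListBuffer[Long]()
// "until" stops before column_names.length; "to" would index one past the end
for (i <- 0 until column_names.length) {
  // each iteration runs a separate count job over the DataFrame
  count_val += df.select(column_names(i)).where(column_names(i) + " is not null").count
}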

Is there a faster way to do this? The DataFrame is of type org.apache.spark.sql.DataFrame.

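Use def count(e: org.apache.spark.sql.Column): org.apache.spark.sql.Column from org.apache.spark.sql.functions. This function returns the number of non-null values in a column.

It gives you the same result as the count row of df.describe().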

import org.apache.spark.sql.functions.{col, count}
df.select(df.columns.map(c => count(col(c)).as(c)): _*).show(false)
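
For reference, here is a minimal, self-contained sketch of the same single-pass idea. The object name NonNullCounts, the helper nonNullCounts, and the sample columns id, name and score are made up for illustration and are not part of the original answer. It assumes a local SparkSession and column names without dots or other special characters.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, count}

object NonNullCounts {
  // Hypothetical helper: computes the non-null count of every column
  // in one aggregation instead of one job per column.
  def nonNullCounts(df: DataFrame): Map[String, Long] = {
    val row = df.select(df.columns.map(c => count(col(c)).as(c)): _*).head()
    df.columns.zipWithIndex.map { case (c, i) => c -> row.getLong(i) }.toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("non-null-counts")
      .getOrCreate()
    import spark.implicits._

    // Sample data (assumed): id has 2 non-null values, name has 2, score has 1.
    val df = Seq[(Option[Int], Option[String], Option[Double])](
      (Some(1), Some("a"), Some(1.5)),
      (None, Some("b"), None),
      (Some(3), None, None)
    ).toDF("id", "name", "score")

    println(nonNullCounts(df))
    spark.stop()
  }
}
```

Because all the count expressions sit in one select, Spark can evaluate them in a single pass over the data, whereas the for loop in the question triggers one job per column.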

Spark version 2.4.2

df.describe().filter($"summary" === "count").show also works for columns of type string.
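
A small sketch of that behaviour, assuming a local SparkSession. The object name DescribeCounts, the helper countRow, and the sample columns id, name and flag are illustrative only. Note that describe() computes statistics for numeric and string columns only, so a column of another type, such as the boolean flag here, is left out of its result.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object DescribeCounts {
  // Hypothetical helper: keeps only the "count" row of describe().
  def countRow(df: DataFrame): DataFrame =
    df.describe().filter(col("summary") === "count")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("describe-counts")
      .getOrCreate()
    import spark.implicits._

    // Sample data (assumed): an integer, a string and a boolean column.
    val df = Seq[(Option[Int], Option[String], Option[Boolean])](
      (Some(1), Some("a"), Some(true)),
      (None, Some("b"), None)
    ).toDF("id", "name", "flag")

    // The result has the columns summary, id and name; flag is skipped
    // because describe() ignores non-numeric, non-string columns.
    countRow(df).show()

    spark.stop()
  }
}
```

If counts are needed for every column regardless of type, the count(col(c)) approach is the safer choice.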
