Filtering a DataFrame to rows with a numeric value



Hi, I have a DataFrame with a column CODEARTICLE:

+-----------+-------------+--------------------+--------+---+------+------+-----+---+
|CODEARTICLE|    STRUCTURE|                 DES|TYPEMARK|TYP|IMPLOC|MARQUE|GAMME|TAR|
+-----------+-------------+--------------------+--------+---+------+------+-----+---+
| GENCFFRIST|9999999999998|xxxxxxxxxxxxxxxxx...|       0|  0| Local|      |     |   |
| GENCFFMARC|9999999999998|xxxxxxxxxxxxxxxxx...|       0|  0| Local|      |     |   |
| GENCFFESCO|9999999999998|xxxxxxxxxxxxxxxxx...|       0|  0| Local|      |     |   |
|  GENCFFTNA|9999999999998|xxxxxxxxxxxxxxxxx...|       0|  0| Local|      |     |   |
| GENCFFEMBA|9999999999998|xxxxxxxxxxxxxxxxx...|       0|  0| Local|      |     |   |
|  789600010|9999999999998|xxxxxxxxxxxxxxxxx...|       7|  1| Local|      |     |   |
|  799700040|9999999999998|xxxxxxxxxxxxxxxxx...|       0|  1| Local|      |     |   |
|  799701000|9999999999998|xxxxxxxxxxxxxxxxx...|       0|  1| Local|      |     |   |
|  899980490|9999999999998|xxxxxxxxxxxxxxxxx...|       0|  9| Local|      |     |   |
|  429600010|9999999999998|xxxxxxxxxxxxxxxxx...|       0|  1| Local|      |     |   |
|  559970040|9999999999998|xxxxxxxxxxxxxxxxx...|       0|  0| Local|      |     |   |
|  679500010|9999999999998|xxxxxxxxxxxxxxxxx...|       0|  1| Local|      |     |   |
|  679500040|9999999999998|xxxxxxxxxxxxxxxxx...|       0|  1| Local|      |     |   |
|  679500060|9999999999998|xxxxxxxxxxxxxxxxx...|       0|  1| Local|      |     |   |
+-----------+-------------+--------------------+--------+---+------+------+-----+---+

I only want to keep the rows where CODEARTICLE is numeric.

  // Connect to the Oracle table TMP_structure
  val spark = sparkSession.sqlContext
  val articles_Gold = spark.load("jdbc",
    Map("url" -> "jdbc:oracle:thin:System/maher@//localhost:1521/XE",
      "dbtable" -> "IPTECH.TMP_ARTICLE")).select("CODEARTICLE", "STRUCTURE", "DES", "TYPEMARK", "TYP", "IMPLOC", "MARQUE", "GAMME", "TAR")
  // Attempt: cast CODEARTICLE to an integer and keep the rows where the cast is not null
  val filteredData = articles_Gold.withColumn("test", 'CODEARTICLE.cast(IntegerType)).filter($"test" !== null)

Thanks a lot.

Use na.drop on the cast column:

articles_Gold.withColumn("test", 'CODEARTICLE.cast(IntegerType)).na.drop(Seq("test"))
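A minimal sketch of the na.drop approach on a small in-memory sample, in case it helps to try it without the Oracle connection. Note that na.drop takes the column names as a Seq; a bare string is read as the "how" argument, which must be "any" or "all". The object name NaDropSketch, the helper numericOnly and the sample values are assumptions made for illustration, and the sketch assumes spark.sql.ansi.enabled is false, so that a failed cast produces null rather than an error.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.IntegerType

object NaDropSketch {
  // Hypothetical helper: keep only rows whose CODEARTICLE casts to an integer.
  def numericOnly(df: DataFrame): DataFrame =
    df.withColumn("test", col("CODEARTICLE").cast(IntegerType)) // non-numeric strings become null
      .na.drop(Seq("test"))                                     // drop rows where "test" is null
      .drop("test")                                             // remove the helper column again

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("na-drop-sketch")
      .config("spark.sql.ansi.enabled", "false") // assumption: failed casts yield null
      .getOrCreate()
    import spark.implicits._

    // Made-up sample shaped like the CODEARTICLE column in the question.
    val sample = Seq("GENCFFRIST", "789600010", "GENCFFTNA", "799700040").toDF("CODEARTICLE")
    numericOnly(sample).show()
    spark.stop()
  }
}
```

Dropping the helper column at the end keeps the original schema, with CODEARTICLE still a string.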

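You can use .isNotNull on the column inside the filter function. You don't even need to create another column for the logic. You can simply do the following: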

val filteredData = articles_Gold.withColumn("CODEARTICLE",'CODEARTICLE.cast(IntegerType)).filter('CODEARTICLE.isNotNull)
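Some background on why the null comparison in the question returns nothing: in Spark SQL a comparison against null evaluates to null rather than true, and filter drops rows whose condition is null, so isNotNull is the right test. If the column should stay a string (for example to avoid integer overflow on long codes), a regular-expression filter is one possible alternative to casting. The sketch below is only an illustration: the object name RlikeSketch, the helper digitsOnly, the sample values and the pattern ^[0-9]+$ as the definition of "numeric" are all assumptions.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object RlikeSketch {
  // Hypothetical helper: keep rows whose CODEARTICLE consists of digits only.
  // The column is not cast, so it keeps its string type.
  def digitsOnly(df: DataFrame): DataFrame =
    df.filter(col("CODEARTICLE").rlike("^[0-9]+$"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("rlike-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample shaped like the CODEARTICLE column in the question.
    val sample = Seq("GENCFFRIST", "789600010", "GENCFFTNA", "799700040").toDF("CODEARTICLE")

    // Comparing against null gives null for every row, so this keeps no rows.
    println(sample.filter(col("CODEARTICLE") =!= null).count())

    // The pattern-based filter keeps only the all-digit codes.
    digitsOnly(sample).show()
    spark.stop()
  }
}
```

Unlike the cast-based version, this leaves the values untouched, which matters if the codes can carry leading zeros.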

I hope the answer is helpful.
