Removing rows from a Pandas DataFrame



My output looks like this:

Date    Sex  Race                  Cause of Death                     City state
07/17/2012 Female White     1,1-Difluoroethane Toxicity                      NaN   NaN
10/01/2012   Male White                 Heroin Toxicity                 PORTLAND    CT
04/28/2013   Male White           Acute Heroin Toxicity CT(41.575155 -72.738288)   NaN
04/06/2014   Male White Heroin and Cocaine Intoxication                WATERBURY    CT
04/27/2014   Male White       Acute Heroin Intoxication               NEW LONDON    CT

One row has the value CT(41.575155 -72.738288) in the City column, and I want to drop that row. How can I do that?

If you can assume that a city name never contains a digit:

>>> df[~df['City'].fillna('').str.findall(r'\d').astype(bool)]
Date     Sex   Race                   Cause of Death        City state
0  07/17/2012  Female  White      1,1-Difluoroethane Toxicity         NaN   NaN
1  10/01/2012    Male  White                  Heroin Toxicity    PORTLAND    CT
3  04/06/2014    Male  White  Heroin and Cocaine Intoxication   WATERBURY    CT
4  04/27/2014    Male  White        Acute Heroin Intoxication  NEW LONDON    CT

Step by Step:

# Convert NaN to '' because NaN is a float, not a string :-)
>>> df['City'].fillna('')
0                              
1                    PORTLAND
2    CT(41.575155 -72.738288)
3                   WATERBURY
4                  NEW LONDON
Name: City, dtype: object
# Now find every digit in the string
>>> df['City'].fillna('').str.findall(r'\d')
0                                                  []
1                                                  []
2    [4, 1, 5, 7, 5, 1, 5, 5, 7, 2, 7, 3, 8, 2, 8, 8]
3                                                  []
4                                                  []
Name: City, dtype: object
# Convert to boolean. An empty list becomes False
>>> df['City'].fillna('').str.findall(r'\d').astype(bool)
0    False
1    False
2     True
3    False
4    False
Name: City, dtype: bool
# Invert the mask with ~
>>> ~df['City'].fillna('').str.findall(r'\d').astype(bool)
0     True
1     True
2    False
3     True
4     True
Name: City, dtype: bool
# Finally, index df with this mask to keep the right rows (see the answer above)
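
For anyone who wants to run the steps end to end, here is a self-contained sketch. It rebuilds the sample frame from the question and, as a swapped-in alternative to findall(...).astype(bool), uses str.contains with na=False, which gives a boolean mask directly and treats NaN as "no digit". The names has_digit and cleaned are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data shown in the question.
df = pd.DataFrame({
    "Date": ["07/17/2012", "10/01/2012", "04/28/2013",
             "04/06/2014", "04/27/2014"],
    "Sex": ["Female", "Male", "Male", "Male", "Male"],
    "Race": ["White"] * 5,
    "Cause of Death": [
        "1,1-Difluoroethane Toxicity",
        "Heroin Toxicity",
        "Acute Heroin Toxicity",
        "Heroin and Cocaine Intoxication",
        "Acute Heroin Intoxication",
    ],
    "City": [np.nan, "PORTLAND", "CT(41.575155 -72.738288)",
             "WATERBURY", "NEW LONDON"],
    "state": [np.nan, "CT", np.nan, "CT", "CT"],
})

# True where City contains at least one digit; na=False maps NaN to False.
has_digit = df["City"].str.contains(r"\d", na=False)

# Keep every row whose City has no digit (the NaN row is kept too).
cleaned = df[~has_digit]
print(cleaned)
```

Both variants rely on the same assumption as the answer: a real city name never contains a digit.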
