This is my code:
import pandas as pd
df = pd.read_csv('E:/cnpj/socios.csv', quotechar='"', sep=',', usecols=["cnpj", "nome_socio"], warn_bad_lines=True, error_bad_lines=False, low_memory=False, nrows=100000)
#df.set_index(['cnpj'], inplace=True)
print (df.head)
result = df[df['cnpj'].isin(["191"])]
print(result)
It returns:
<bound method NDFrame.head of cnpj nome_socio
0 191 MARCIO HAMILTON FERREIRA
1 191 NILSON MARTINIANO MOREIRA
2 191 WALTER MALIENI JUNIOR
3 191 CARLOS ALBERTO ARAUJO NETTO
4 191 ANTONIO MAURICIO MAURANO
... ... ...
99995 172561 CARLOS ALBERTO ARAUJO NETTO
99996 172561 ANTONIO MAURICIO MAURANO
99997 172561 MARCELO AUGUSTO DUTRA LABUTO
99998 172561 ROGERIO MAGNO PANCA
99999 172561 TARCISIO HUBNER
[100000 rows x 2 columns]>
Empty DataFrame
Columns: [cnpj, nome_socio]
Index: []
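The thing is: I want to take the column 'cnpj', compare it with a value (in this case '191'), and, for every row where the column has this value, send that row to the dataframe 'result' and write it to a CSV file.
However, as you can see, pandas reads the file correctly, but the code I use to do the comparison and build the 'result' dataframe always returns an empty DataFrame.
Any insights?
PS: a sample of the file looks like this: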
"cnpj","nome_socio"
"191","MARCIO HAMILTON FERREIRA"
"191","NILSON MARTINIANO MOREIRA"
...
"172561","CARLOS ALBERTO ARAUJO NETTO"
"172561","ANTONIO MAURICIO MAURANO"
"172561","MARCELO AUGUSTO DUTRA LABUTO"
"172561","TARCISIO HUBNER"
My friend, you are comparing against "191" as a string, and I think your data was read as integers. Try changing the code like this:
result = df[df['cnpj'].isin([191])]
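To check this on your side, here is a minimal sketch. It assumes a small in-memory stand-in for socios.csv built from a few rows shown in the question; `csv_text`, `empty`, `matched`, `df_text` and `matched_text` are names made up for the illustration. It prints the dtype pandas inferred for 'cnpj', shows that the string lookup matches nothing while the integer lookup does, and, as an alternative, reads the column as text with the `dtype` argument of `read_csv` so that a string comparison works.

```python
import io

import pandas as pd

# Stand-in for socios.csv (assumption): a few rows copied from the question.
csv_text = '''"cnpj","nome_socio"
"191","MARCIO HAMILTON FERREIRA"
"191","NILSON MARTINIANO MOREIRA"
"172561","TARCISIO HUBNER"
'''

df = pd.read_csv(io.StringIO(csv_text))

# Check how pandas typed the column; quoted digits are still inferred as numbers.
print(df["cnpj"].dtype)

# String looked up in an integer column: nothing matches.
empty = df[df["cnpj"].isin(["191"])]

# Integer looked up in an integer column: the two 191 rows match.
matched = df[df["cnpj"].isin([191])]

# Alternative: keep cnpj as text while reading, then compare with a string.
df_text = pd.read_csv(io.StringIO(csv_text), dtype={"cnpj": str})
matched_text = df_text[df_text["cnpj"] == "191"]
```

Reading the column as text may be worth considering if the identifiers in your file can have leading zeros, since an integer column would drop them.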
Another suggestion: if you are not comparing against a list of possible values, try this instead:
result = df.loc[df['cnpj'] == 191]
I also suggest cleaning up your data before filtering: drop duplicates and so on.
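A rough sketch of that cleanup, assuming a tiny made-up frame in which one row is repeated on purpose (the repeat is invented for the illustration and is not taken from your file):

```python
import pandas as pd

# Illustrative data (assumption): the first row is duplicated deliberately.
df = pd.DataFrame({
    "cnpj": [191, 191, 191, 172561],
    "nome_socio": [
        "MARCIO HAMILTON FERREIRA",
        "MARCIO HAMILTON FERREIRA ",  # same name with a trailing space
        "NILSON MARTINIANO MOREIRA",
        "TARCISIO HUBNER",
    ],
})

# Strip stray whitespace so near-identical names compare equal.
df["nome_socio"] = df["nome_socio"].str.strip()

# Drop rows that are exact duplicates across all columns, then renumber.
deduped = df.drop_duplicates().reset_index(drop=True)

# Filter only after the cleanup.
result = deduped[deduped["cnpj"] == 191]
```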
To save the result as a CSV file, pandas has the DataFrame.to_csv function.
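For example, a minimal sketch, assuming a filtered frame like the one above; the output file name `result.csv` and the temporary folder are placeholders, so point `out_path` at wherever you want the file:

```python
import os
import tempfile

import pandas as pd

# Stand-in for the filtered frame (assumption): rows taken from the question.
result = pd.DataFrame({
    "cnpj": [191, 191],
    "nome_socio": ["MARCIO HAMILTON FERREIRA", "NILSON MARTINIANO MOREIRA"],
})

# Hypothetical output location; replace it with your own path.
out_path = os.path.join(tempfile.mkdtemp(), "result.csv")

# index=False keeps the 0, 1, 2... row labels out of the file.
result.to_csv(out_path, index=False)

# Read the file back to confirm what was written.
round_trip = pd.read_csv(out_path)
print(round_trip)
```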
Regards!