How can I get the string "no comm" instead of null values using pandas in Python?


How can I use pandas to get the string "no comm" instead of a null value?

emp.csv

index   empno   ename   job mgr hiredate    sal comm    deptno
0,  7839,   KING,   PRESIDENT,  0,  1981-11-17,     5000,    ,  10
1,  7698,   BLAKE,  MANAGER,    7839,   1981-05-01, 2850,    ,  30
2,  7782,   CLARK,  MANAGER,    7839,   1981-05-09, 2450,    ,  10
3,  7566,   JONES,  MANAGER,    7839,   1981-04-01, 2975,    ,  20
4,  7654,   MARTIN, SALESMAN,   7698,   1981-09-10, 1250,   1400,   30
5,  7499,   ALLEN,  SALESMAN,   7698,   1981-02-11, 1600,    300,    30
6,  7844,   TURNER, SALESMAN,   7698,   1981-08-21, 1500,   0,  30
7,  7900,   JAMES,  CLERK,      7698,   1981-12-11, 950,     ,  30
8,  7521,   WARD,   SALESMAN,   7698,   1981-02-23, 1250,   500,    30
9,  7902,   FORD,   ANALYST,    7566,   1981-12-11, 3000,    ,  20
10, 7369,   SMITH,  CLERK,      7902,   1980-12-09, 800,     ,  20
11, 7788,   SCOTT,  ANALYST,    7566,    1982-12-22, 3000,   ,  20
12, 7876,   ADAMS,  CLERK,      7788,   1983-01-15, 1100,    ,  20
13, 7934,   MILLER, CLERK,      7782,   1982-01-11, 1300,    ,  10

I want to get the following result for the column comm using pandas.

Result:

no comm
no comm
no comm
no comm
1400
300
0
no comm
500
no comm
no comm
no comm
no comm
no comm

I want to get the result above using the code below.

Code:

import io
import pandas as pd
temp=u"""index   empno   ename   job mgr hiredate    sal comm    deptno
0,  7839,   KING,   PRESIDENT,  0,  1981-11-17,     5000,    ,  10
1,  7698,   BLAKE,  MANAGER,    7839,   1981-05-01, 2850,    ,  30
2,  7782,   CLARK,  MANAGER,    7839,   1981-05-09, 2450,    ,  10
3,  7566,   JONES,  MANAGER,    7839,   1981-04-01, 2975,    ,  20
4,  7654,   MARTIN, SALESMAN,   7698,   1981-09-10, 1250,   1400,   30
5,  7499,   ALLEN,  SALESMAN,   7698,   1981-02-11, 1600,    300,    30
6,  7844,   TURNER, SALESMAN,   7698,   1981-08-21, 1500,   0,  30
7,  7900,   JAMES,  CLERK,      7698,   1981-12-11, 950,     ,  30
8,  7521,   WARD,   SALESMAN,   7698,   1981-02-23, 1250,   500,    30
9,  7902,   FORD,   ANALYST,    7566,   1981-12-11, 3000,    ,  20
10, 7369,   SMITH,  CLERK,      7902,   1980-12-09, 800,     ,  20
11, 7788,   SCOTT,  ANALYST,    7566,    1982-12-22, 3000,   ,  20
12, 7876,   ADAMS,  CLERK,      7788,   1983-01-15, 1100,    ,  20
13, 7934,   MILLER, CLERK,      7782,   1982-01-11, 1300,    ,  10"""
# after testing, replace io.StringIO(temp) with the filename
emp = pd.read_csv(io.StringIO(temp), 
                 skipinitialspace=True,
                 skiprows=1, 
                 parse_dates=[5], 
                 names=['index','empno','ename', 'job','mgr','hiredate','sal','comm','deptno'])

# <--------------  ?  (what goes here so that missing comm values become "no comm"?)

print( emp['comm'])
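
It may just be the formatting on this site, but it looks like 1400, 300, 0 and 500 are at a different indentation level from the other numbers, which would be why it returns no comm.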
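
Here is a minimal sketch of one way to fill the step marked with "?" in the question. It assumes that, with skipinitialspace=True, the blank comm fields are read as missing values (NaN), so the column ends up as floats. The three sample rows and the variable names sample and comm are made up for illustration; only the column names come from the question.

```python
import io
import pandas as pd

# Three rows in the same layout as emp.csv (illustrative subset, not the full file).
sample = u"""0,  7839,   KING,   PRESIDENT,  0,  1981-11-17,     5000,    ,  10
4,  7654,   MARTIN, SALESMAN,   7698,   1981-09-10, 1250,   1400,   30
6,  7844,   TURNER, SALESMAN,   7698,   1981-08-21, 1500,   0,  30"""

emp = pd.read_csv(io.StringIO(sample),
                  skipinitialspace=True,   # drop the spaces that follow each comma
                  parse_dates=[5],
                  names=['index', 'empno', 'ename', 'job', 'mgr',
                         'hiredate', 'sal', 'comm', 'deptno'])

# Blank comm fields come back as NaN; map them to 'no comm' and
# render the remaining values as whole numbers.
comm = emp['comm'].map(lambda v: 'no comm' if pd.isna(v) else str(int(v)))

print(comm)
```

emp['comm'].fillna('no comm') is a shorter alternative, though with a float column the remaining numbers would likely print as 1400.0 rather than 1400.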
