Insert a zero in a pandas DataFrame where the .count() result is < 1



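I am trying to find a way to insert a zero into a pandas DataFrame where the result of the .count() aggregate function is < 1. I have tried setting a condition that looks for null/None values, and I have tried a simple < 1 comparison. So far I can only count instances where the categorical value actually exists. Here is some sample code to demonstrate my problem: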


import pandas as pd

data = {'Person': ['Jim', 'Jim', 'Jim', 'Jim', 'Jim', 'Bob','Bob','Bob','Bob','Bob',], 'Result': ['Good', 'Good','Good','Good','Good','Good','Bad','Good','Bad','Bad',]}
dtf = pd.DataFrame.from_dict(data)
names = ['Jim','Bob']
append = []
for i in names:
    # Rows for this person with a 'Good' result, tagged with their count
    good = dtf[dtf['Person']==i]
    good = good[good['Result']=='Good']
    if good['Result'].count() > 0:
        good.insert(2,"Count",good['Result'].count())
    elif good['Result'].count() < 1:
        good.insert(2,"Count",0)
    # Same for 'Bad'; when no rows match, the frame is empty, so no zero row appears
    bad = dtf[dtf['Person']==i]
    bad = bad[bad['Result']=='Bad']
    if bad['Result'].count() > 0:
        bad.insert(2,"Count",bad['Result'].count())
    elif bad['Result'].count() < 1:
        bad.insert(2,"Count",0)
    res = [good,bad]
    res = pd.concat(res)
    append.append(res)

    print(res)

The current output is:

Person Result  Count
0    Jim   Good      5
1    Jim   Good      5
2    Jim   Good      5
3    Jim   Good      5
4    Jim   Good      5
Person Result  Count
5    Bob   Good      2
7    Bob   Good      2
6    Bob    Bad      3
8    Bob    Bad      3
9    Bob    Bad      3

What I am trying to achieve is a count of zero for Jim for the "Bad" value in the dtf["Result"] column. Like this:

Person Result  Count
0    Jim   Good      5
1    Jim   Good      5
2    Jim   Good      5
3    Jim   Good      5
4    Jim   Good      5
5    Jim    Bad      0
Person Result  Count
6    Bob   Good      2
7    Bob   Good      2
8    Bob    Bad      3
9    Bob    Bad      3
10   Bob    Bad      3

I hope this makes sense. Vive la résistance! └[⑪┌]└[⑪]┘[┐∵]┘

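First, create a MultiIndex mi from the product of the unique Person and Result values, so that combinations missing from df are retained (df here is the question's dtf). Then count all groups with size() and reindex the counts by that MultiIndex, filling the missing combinations with 0. Finally, merge the count table with the original DataFrame using the union of keys from both (how="outer").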

mi = pd.MultiIndex.from_product([df["Person"].unique(),
                                 df["Result"].unique()],
                                names=["Person", "Result"])
out = df.groupby(["Person", "Result"]) \
        .size() \
        .reindex(mi, fill_value=0) \
        .rename("Count") \
        .reset_index()
out = out.merge(df, on=["Person", "Result"], how="outer")
>>> out
Person Result  Count
0     Jim   Good      5
1     Jim   Good      5
2     Jim   Good      5
3     Jim   Good      5
4     Jim   Good      5
5     Jim    Bad      0
6     Bob   Good      2
7     Bob   Good      2
8     Bob    Bad      3
9     Bob    Bad      3
10    Bob    Bad      3
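
To see why the merge produces a zero row, it can help to look at the count table before merging. The following is a minimal sketch that rebuilds the question's data so it runs on its own; the intermediate name counts is introduced here for illustration and is not part of the answer's code.

```python
import pandas as pd

# Same data as in the question, written more compactly
data = {
    "Person": ["Jim"] * 5 + ["Bob"] * 5,
    "Result": ["Good"] * 5 + ["Good", "Bad", "Good", "Bad", "Bad"],
}
df = pd.DataFrame(data)

# Every (Person, Result) pair, including pairs that never occur in df
mi = pd.MultiIndex.from_product(
    [df["Person"].unique(), df["Result"].unique()],
    names=["Person", "Result"],
)

# size() only reports pairs that exist; reindex adds the missing ones as 0
counts = (
    df.groupby(["Person", "Result"])
    .size()
    .reindex(mi, fill_value=0)
    .rename("Count")
)
print(counts)
```

The (Jim, Bad) pair is absent from the groupby result and only appears after the reindex, which is what gives the merge a row to carry the 0.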

Splitting the result by person gives names and append as in the question:

names, append = list(zip(*out.groupby("Person")))
>>> names
('Bob', 'Jim')
>>> append
(   Person Result  Count
6     Bob   Good      2
7     Bob   Good      2
8     Bob    Bad      3
9     Bob    Bad      3
10    Bob    Bad      3,
Person Result  Count
0    Jim   Good      5
1    Jim   Good      5
2    Jim   Good      5
3    Jim   Good      5
4    Jim   Good      5
5    Jim    Bad      0)
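
Note that groupby sorts the group keys by default, which is why names comes back as ('Bob', 'Jim') rather than in the question's order. If the original order matters, one possible variation is to pass sort=False and collect the groups in a dict. This is a sketch, not the answer's code; the out frame is rebuilt by hand from the output shown above, and the name groups is hypothetical.

```python
import pandas as pd

# Hand-built copy of the merged result shown above, so the sketch runs standalone
out = pd.DataFrame({
    "Person": ["Jim"] * 6 + ["Bob"] * 5,
    "Result": ["Good"] * 5 + ["Bad"] + ["Good"] * 2 + ["Bad"] * 3,
    "Count": [5] * 5 + [0] + [2] * 2 + [3] * 3,
})

# sort=False keeps groups in order of first appearance instead of sorting keys
groups = {name: frame for name, frame in out.groupby("Person", sort=False)}
names = list(groups)

print(names)
print(groups["Jim"])
```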
