I'm currently trying to count the distinct values in a DataFrame. Example: the *.csv I import looks like this:
market | availability |
---|---|
dach | available |
dach | available |
nl | offline |
fr | available |
nl | offline |
fr | in_call |
dach | available |
fr | in_call |
How about using a Counter?
from collections import Counter

# One Counter per market, built from that market's availability column.
for market, group in df.groupby('market'):
    print(market)
    print(Counter(group.availability))
    print()
dach
Counter({'available': 3})
fr
Counter({'in_call': 2, 'available': 1})
nl
Counter({'offline': 2})
Or use value_counts():
for market, group in df.groupby('market'):
    print(market)
    print(group.availability.value_counts())
    print()
dach
available 3
Name: availability, dtype: int64
fr
in_call 2
available 1
Name: availability, dtype: int64
nl
offline 2
Name: availability, dtype: int64
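If you want all markets in a single table instead of a loop, `pd.crosstab` is another option. This is only a sketch: the DataFrame below is rebuilt by hand from the table in the question, and the column names `market` and `availability` are assumed from the code above.

```python
import pandas as pd

# Sample data mirroring the table in the question (column names assumed).
df = pd.DataFrame({
    "market": ["dach", "dach", "nl", "fr", "nl", "fr", "dach", "fr"],
    "availability": ["available", "available", "offline", "available",
                     "offline", "in_call", "available", "in_call"],
})

# One row per market, one column per status.
# Combinations that never occur are filled with 0.
counts = pd.crosstab(df["market"], df["availability"])
print(counts)
```

A single count can then be read with something like `counts.loc["fr", "in_call"]`.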
As a function, I would do something like this:
from collections import Counter

def market_status(market, df):
    # Start every known status at 0 so that absent ones still show up.
    d = Counter(
        {
            "available": 0,
            "in_call": 0,
            "offline": 0,
            "do_not_disturb": 0,
            "after_call_work": 0,
        }
    )
    # Add the observed counts for the requested market.
    d.update(Counter(df.loc[df.market == market].availability))
    return d

print(market_status('fr', df))
Counter({'in_call': 2, 'available': 1, 'offline': 0, 'do_not_disturb': 0, 'after_call_work': 0})
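The same idea also works without `Counter`, by reindexing the result of `value_counts()` against a fixed list of statuses. Treat this as a rough sketch: `market_status_series` and `STATUSES` are names I made up for illustration, and the sample DataFrame is rebuilt from the question's table.

```python
import pandas as pd

# Hypothetical list of every status that should appear in the result.
STATUSES = ["available", "in_call", "offline", "do_not_disturb", "after_call_work"]

def market_status_series(market, df, statuses=STATUSES):
    """Count each status for one market, reporting 0 for statuses not seen."""
    selected = df.loc[df["market"] == market, "availability"]
    # reindex adds the missing statuses and fills their counts with 0.
    return selected.value_counts().reindex(statuses, fill_value=0)

# Sample data mirroring the table in the question (column names assumed).
df = pd.DataFrame({
    "market": ["dach", "dach", "nl", "fr", "nl", "fr", "dach", "fr"],
    "availability": ["available", "available", "offline", "available",
                     "offline", "in_call", "available", "in_call"],
})

result = market_status_series("fr", df)
print(result)
```

The result is a Series in the order given by `STATUSES`, which can be handy if the counts need to go back into a DataFrame later.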
This will give you the grouping I think you are looking for:
import pandas as pd

dfa = pd.read_csv("groupbycolumn.csv", encoding='ISO-8859-1')
# Count the rows for each (Market, Availability) pair.
dfa.groupby(["Market", "Availability"])["Availability"].count()
Output:
Market  Availability
A       Available       1
        dont dist       2
        in_call         1
B       Available       1
        dont dist       1
        in_call         3
C       Available       1
        dont dist       2
H       Available       1
        in_call         3
J       in_call         1
Name: Availability, dtype: int64
This is from a sample CSV file that I created to illustrate what the output looks like.
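If the MultiIndex result is awkward to work with, the grouped counts can be flattened into an ordinary DataFrame. A minimal sketch, assuming the lowercase column names `market` and `availability` from the question rather than the ones in my sample file; the `count` column name is my own choice:

```python
import pandas as pd

# Sample data mirroring the table in the question (column names assumed).
df = pd.DataFrame({
    "market": ["dach", "dach", "nl", "fr", "nl", "fr", "dach", "fr"],
    "availability": ["available", "available", "offline", "available",
                     "offline", "in_call", "available", "in_call"],
})

# size() counts the rows in each (market, availability) group;
# reset_index turns the two index levels back into regular columns.
flat = (
    df.groupby(["market", "availability"])
      .size()
      .reset_index(name="count")
)
print(flat)
```

Each row of `flat` is then one market/status pair with its count, which is easy to filter or export back to CSV.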