r - How do I check how many kinds of values are in a pandas column

I have a column that should only contain "a" or "b".

How can I check whether the column contains any other entries?

PS: I think in R you would use this:

table(df$column_name)

How can I get similar output in pandas?

I think you can use groupby() first and then size():

import pandas as pd
data = [
{"colA": "John", "colB": "a"},
{"colA": "Jane", "colB": "b"},
{"colA": "Bob", "colB": "c"},
{"colA": "Rob", "colB": "a"},
{"colA": "Hobb", "colB": "b"},
{"colA": "Greg", "colB": "b"},
{"colA": "Jennie", "colB": "a"},
{"colA": "Joe", "colB": "a"},
{"colA": "Howard", "colB": "x"},
{"colA": "Dave", "colB": "a"},
]
dataframe = pd.DataFrame(data)
print(dataframe.groupby("colB").size())

Output:

colB
a    5
b    3
c    1
x    1
dtype: int64

Assuming there are no NaN values in the column:

df["your column name"].value_counts()  # gives the unique values and how many times each occurs in the column

or

df["your column name"].nunique()  # gives only the number of unique values

To check whether your column has NaN values:

df["your column name"].isna().sum()

Hope this helps.
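
Both value_counts() and nunique() skip missing values by default, so a NaN in the column would not show up in the counts. A minimal sketch of counting NaN as its own category, assuming a made-up sample frame (the data and the column name colB are only for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one missing value and one unexpected value ("x").
df = pd.DataFrame({"colB": ["a", "b", "a", np.nan, "x", "a"]})

# dropna=False keeps NaN as its own row in the result;
# without it, value_counts() leaves missing values out.
counts = df["colB"].value_counts(dropna=False)
print(counts)

# nunique() also ignores NaN unless dropna=False is passed.
print(df["colB"].nunique())              # distinct non-missing values
print(df["colB"].nunique(dropna=False))  # NaN counted as one more value
```

This is probably the closest match to R's table(df$column_name, useNA = "ifany"), since a single call shows both the unexpected values and the missing ones.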

You can use:

df['column_name'].isin(['a', 'b']).all()

If this outputs True, all values are a or b.

If you want to see which values are invalid:

df[~df['column_name'].isin(['a', 'b'])]

To do both at once, you can save the mask in a variable:

m = df['column_name'].isin(['a', 'b'])
print(m.all())
df[~m]
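
The mask approach above can also be wrapped in a small helper that reports which unexpected values occur and how often. This is only a sketch: the function name unexpected_values and the sample data are assumptions made for illustration, not part of pandas.

```python
import pandas as pd

def unexpected_values(series: pd.Series, allowed) -> pd.Series:
    """Return counts of the values in `series` that are not in `allowed`.

    Hypothetical helper: NaN is not in `allowed`, so isin() marks it
    False and it is reported as unexpected as well.
    """
    mask = series.isin(allowed)
    return series[~mask].value_counts(dropna=False)

# Same colB values as the example data in the question.
df = pd.DataFrame({"colB": ["a", "b", "c", "a", "b", "b", "a", "a", "x", "a"]})

bad = unexpected_values(df["colB"], ["a", "b"])
print(bad)        # one row per unexpected value, with its count
print(bad.empty)  # True only if every value is in the allowed list
```

An empty result plays the same role as m.all() returning True, while a non-empty one lists the offending values directly.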
