Get the count of leading NaN values and trailing non-NaN values in a pandas DataFrame

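I have a DataFrame whose rows contain NaN values. The df contains the original columns, namely Heading 1, Heading 2 and Heading 3, plus extra columns named Unnamed: 1, Unnamed: 2 and Unnamed: 3, as shown: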

Heading 1	Heading 2	Heading 3	Unnamed: 1	Unnamed: 2	Unnamed: 3
NaN	34	24	45	NaN	NaN
NaN	NaN	24	45	11	NaN
NaN	NaN	NaN	45	45
	NaN	24	NaN	NaN	NaN
NaN	NaN	4	NaN	NaN	NaN
NaN	34	24	NaN	NaN	NaN
22	34	24	NaN	NaN	NaN
NaN	34	NaN	45	NaN	NaN

As an alternative:

df['Count'] = df[['Heading 1', 'Heading 2']].apply(lambda x: sum(x.isnull()), axis=1)
df['Count2'] = df[['Unnamed: 1', 'Unnamed: 2']].apply(lambda x: sum(x.notnull()), axis=1)
df['total']=df[['Count','Count2']].values.tolist()
output=dict(zip(df.index, df.total))
'''
{0: [1, 1], 1: [2, 1], 2: [1, 0], 3: [0, 0], 4: [2, 2], 5: [2, 1]}
'''

Or:

mask=list(map(list, zip(df[['Heading 1', 'Heading 2']].isnull().sum(axis=1), df[['Unnamed: 1', 'Unnamed: 2']].notnull().sum(axis=1))))
output=dict(zip(df.index,mask))
#{0: [1, 1], 1: [2, 1], 2: [1, 0], 3: [0, 0], 4: [2, 2], 5: [2, 1]}

.isna() (in Cyzanfar's answer) raised an exception for me:

AttributeError: 'numpy.float64' object has no attribute 'isna'
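
One possible cause, assuming the method was called on a single cell rather than on a whole Series: a scalar such as numpy.float64 has no .isna() method, while the top-level pandas.isna() function accepts scalars. The sketch below uses made-up values (cell and row are hypothetical names, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical single cell value, as you would get when iterating element by element
cell = np.float64(np.nan)

has_method = hasattr(cell, "isna")   # scalars do not carry the Series method
scalar_check = pd.isna(cell)         # the module-level function handles scalars

# On a Series, the .isna() method is available and can be summed directly
row = pd.Series([np.nan, 34.0, 24.0])
nan_count = int(row.isna().sum())

print(has_method, scalar_check, nan_count)
```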

You can try the following approach:

import numpy as np

counts = {}
for index, row in df.iterrows():
    # Count the number of NaN values in the original columns
    num_nan_orig = np.sum(np.isnan(row[['Heading 1', 'Heading 2']]))
    # Count the number of non-NaN values in the extra columns
    num_non_nan_extra = np.sum(~np.isnan(row[['Unnamed: 1', 'Unnamed: 2']]))
    counts[index] = [num_nan_orig, num_non_nan_extra]
print(counts)

This outputs the following:

# {0: [1, 1], 1: [2, 1], 2: [1, 0], 3: [0, 0], 4: [2, 2], 5: [2, 1]}

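The ~ operator (third line from the end of the code) is the bitwise NOT operator in Python; applied to a boolean array, it inverts each element. Here it inverts the boolean values returned by np.isnan() so that the non-NaN values can be counted.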
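
As a small illustration of that inversion, here is a minimal sketch on a made-up array (values is a hypothetical name and the numbers are not taken from the question's DataFrame):

```python
import numpy as np

# Hypothetical sample values for illustration only
values = np.array([np.nan, 45.0, 11.0, np.nan])

is_nan = np.isnan(values)   # True where the element is NaN
not_nan = ~is_nan           # element-wise inversion of the boolean array

num_nan = int(is_nan.sum())
num_non_nan = int(not_nan.sum())
print(num_nan, num_non_nan)  # 2 2
```

Note that this only acts as a logical NOT on NumPy or pandas boolean arrays; on a plain Python bool, ~True evaluates to -2.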
