Pandas: selecting by column and row



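I want to sum all the values I select based on some function of the column and the row.

Put another way, I want to use a function of the row index and the column index to decide whether a value should be included in a sum along an axis.

Is there a simple way to do this?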

Columns can be selected with the syntax dataframe[<list of columns>]. Rows can be filtered with a boolean condition built from the dataframe.index attribute.

import pandas as pd
df = pd.DataFrame({'a': [0.1, 0.2], 'b': [0.2, 0.1]})
odd_a = df['a'][df.index % 2 == 1]
even_b = df['b'][df.index % 2 == 0]
# odd_a: 
# 1    0.2
# Name: a, dtype: float64
# even_b: 
# 0    0.2
# Name: b, dtype: float64
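To go from those selections to the sum the question asks for, one option is to build a boolean mask from the row and column labels and sum only the values it keeps. This is a minimal sketch, not a definitive implementation: the function name `include` and the rule it encodes (odd rows of 'a', even rows of 'b') are assumptions chosen to match the example above.

```python
import pandas as pd

df = pd.DataFrame({'a': [0.1, 0.2], 'b': [0.2, 0.1]})

def include(row_label, col_label):
    # Hypothetical rule: keep odd rows of 'a' and even rows of every other column
    if col_label == 'a':
        return bool(row_label % 2 == 1)
    return bool(row_label % 2 == 0)

# Evaluate the rule for every (row, column) pair
mask = pd.DataFrame(
    [[include(r, c) for c in df.columns] for r in df.index],
    index=df.index,
    columns=df.columns,
)

selected = df.where(mask)         # values not selected become NaN
col_sums = selected.sum(axis=0)   # sum down each column, NaN is skipped
row_sums = selected.sum(axis=1)   # sum across each row, NaN is skipped
total = selected.sum().sum()      # sum of everything that was selected
```

The list comprehension calls `include` once per cell, so it is easy to read but slow on large frames; for simple rules a vectorized mask built from `df.index` and `df.columns` is usually preferable.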

If df is your DataFrame:

In [477]: df
Out[477]: 
   A   s2  B
0  1    5  5
1  2    3  5
2  4    5  5

You can access the odd rows like this:

In [478]: df.loc[1::2]
Out[478]: 
   A   s2  B
1  2    3  5

And the even rows like this:

In [479]: df.loc[::2]
Out[479]: 
   A   s2  B
0  1    5  5
2  4    5  5

To answer your question, the even rows of column B would be:

In [480]: df.loc[::2,'B']
Out[480]: 
0    5
2    5
Name: B, dtype: int64

And the odd rows of column A can be selected with:

In [481]: df.loc[1::2,'A']
Out[481]: 
1    2
Name: A, dtype: int64
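Since the question is about a sum, the two selections can be summed and added together. A short sketch using the same example frame; the variable names are illustrative. Note that .loc slices by label, so this relies on the default integer index; for other index types, df.iloc[::2] and df.iloc[1::2] select by position instead.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 4], 's2': [5, 3, 5], 'B': [5, 5, 5]})

even_b_sum = df.loc[::2, 'B'].sum()   # rows 0 and 2 of column B
odd_a_sum = df.loc[1::2, 'A'].sum()   # row 1 of column A

total = even_b_sum + odd_a_sum
print(total)
```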

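I think this should be fairly general, even if it is not the cleanest implementation. It lets you apply separate conditions to rows and columns, depending on which group (defined here in dictionaries) each one belongs to.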

import numpy as np
import pandas as pd
ran = np.random.randint(0,10,size=(5,5))
df = pd.DataFrame(ran,columns = ["a","b","c","d","e"])
# A dictionary to define what function is passed
d_col = {"high":["a","c","e"], "low":["b","d"]}
d_row = {"high":[1,2,3], "low":[0,4]}
# Generate list of Pandas boolean Series
i_col = [df[i].apply(lambda x: x>5) if i in d_col["high"] else df[i].apply(lambda x: x<5) for i in df.columns]
# Pass the series as a matrix
df = df[pd.concat(i_col,axis=1)]
# Now do this again for rows
i_row = [df.T[i].apply(lambda x: x>5) if i in d_row["high"] else df.T[i].apply(lambda x: x<5) for i in df.T.columns]
# Return back the DataFrame in original shape
df = df.T[pd.concat(i_row,axis=1)].T
# Perform the final operation such as sum on the returned DataFrame
print(df.sum().sum())
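The same selection can probably be expressed without apply or transposes by building the two masks with NumPy broadcasting. This is a sketch under the assumption that the intent is to keep a value only when it passes both its column's condition and its row's condition, which is what the code above does; the fixed seed and the names `col_high`, `row_high` and `keep` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, assumed here only for reproducibility
df = pd.DataFrame(rng.integers(0, 10, size=(5, 5)), columns=list("abcde"))

d_col = {"high": ["a", "c", "e"], "low": ["b", "d"]}
d_row = {"high": [1, 2, 3], "low": [0, 4]}

values = df.to_numpy()
is_high, is_low = values > 5, values < 5

# One flag per column and per row: does the "high" condition apply?
col_high = df.columns.isin(d_col["high"])          # shape (n_cols,)
row_high = df.index.isin(d_row["high"])[:, None]   # shape (n_rows, 1)

# Broadcast the flags over the full grid to pick the condition for each cell
col_mask = np.where(col_high, is_high, is_low)
row_mask = np.where(row_high, is_high, is_low)

# A value is kept only if it satisfies both its column and its row condition
keep = col_mask & row_mask
total = values[keep].sum()
print(total)
```

Because both conditions must hold, only values above 5 at a "high" row and "high" column, or below 5 at a "low" row and "low" column, can ever be kept.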
