How to filter words from a list/Series/DataFrame using the first letter as a parameter

Example:

list = [word, christmas, banana, jupyter] # it can also be a Series or DataFrame.

I want to select the words that start with "W" or "C" and put them into a new list/Series/DataFrame.

I tried something like this, but it doesn't work:

wc_words = []
for word in list:
    if word.str.startswith(('W','C')):
        wc_words.append(word)

You can try Python's filter function. It is a built-in:

filtered_list = list(filter(lambda string: string.startswith(('W','C')), original_list))

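filter(function, iterable) returns an iterator, and list() converts that iterator into the list you want.

The function has to take a single argument and return something truthy or falsy, so the lambda here is just a wrapper that puts startswith into that form.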
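As a small sketch of the point above, the lambda can be swapped for a named one-argument function, and the iterator that filter returns can be held in a variable before it is converted. The name starts_with_w_or_c and the sample data are made up for illustration.

```python
def starts_with_w_or_c(s):
    """Return True if s starts with 'W' or 'C' (case-sensitive)."""
    return s.startswith(('W', 'C'))

original_list = ['Word', 'Christmas', 'banana', 'jupyter']

# filter() is lazy: it returns an iterator, not a list
matches = filter(starts_with_w_or_c, original_list)

# list() consumes the iterator and builds the actual list
filtered_list = list(matches)
print(filtered_list)  # ['Word', 'Christmas']
```

Because the iterator is consumed by list(), calling list(matches) a second time would give an empty list.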

I would go with the solution proposed by victrid, but here is an alternative:

list(filter(lambda x: x[0].lower() in ('w', 'c'), my_list))

Note: this solution is case-insensitive.
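One caveat with indexing x[0] is that it raises IndexError on an empty string. A minimal sketch of a variant that stays case-insensitive but tolerates empty strings, assuming my_list holds only strings (the sample data is invented):

```python
my_list = ['Word', 'christmas', '', 'banana', 'Cat']

# lower() first, then startswith() with a tuple of prefixes;
# startswith() simply returns False for an empty string
wc_words = [x for x in my_list if x.lower().startswith(('w', 'c'))]
print(wc_words)  # ['Word', 'christmas', 'Cat']
```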

If you have a list, try a list comprehension:

out=[x for x in lst if x[0].lower() in ['w','c']]

If you have a Series:

out=ser[ser.str[0].str.lower().isin(['w','c'])]
# You can also use str.startswith(), but then you need two calls combined with |

If you have a DataFrame:

df.loc[df['objects'].str[0].str.lower().isin(['w','c'])]
#OR
#df.loc[df['objects'].str.startswith('w') | df['objects'].str.startswith('c')]

Sample data:

import pandas as pd

ser = pd.Series(['word', 'christmas', 'banana','jupyter'])
lst = ['word', 'christmas', 'banana','jupyter']
df=pd.DataFrame(['word', 'christmas', 'banana','jupyter'],columns=['objects'])
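Putting the sample data and the filters together, a runnable sketch might look like this. It assumes pandas is installed. The names mask, df_mask, out_ser and out_df are illustrative only, and the two masks agree here only because the sample data is all lowercase.

```python
import pandas as pd

ser = pd.Series(['word', 'christmas', 'banana', 'jupyter'])
df = pd.DataFrame({'objects': ['word', 'christmas', 'banana', 'jupyter']})

# Boolean mask built from the lowercased first character
mask = ser.str[0].str.lower().isin(['w', 'c'])
out_ser = ser[mask]

# Mask built from two startswith() calls joined with | (case-sensitive)
df_mask = df['objects'].str.startswith('w') | df['objects'].str.startswith('c')
out_df = df.loc[df_mask]

print(out_ser.tolist())            # ['word', 'christmas']
print(out_df['objects'].tolist())  # ['word', 'christmas']
```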

If you want to filter a list of strings, you can use:

my_words=['word', 'christmas', 'banana', 'jupyter']
wc_words = list(filter(lambda x: x.startswith(('w', 'c')), my_words))

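But I noticed that you are not using strings in your list. If those are variable names and you want to keep the variables whose names start with "w" or "c", you need to get the variable names as strings and then filter on them: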

words = 'string'
christmas = 3
banana = False
jupyter = {'a': 'b'}
my_words = [words, christmas, banana, jupyter]
wc_values = my_words.copy()
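for word in my_words:
    v = vars()  # mapping of variable names to their current values
    # find the first variable name bound to this value and test that name
    if not list(filter(lambda x: v.get(x)==word, v))[0].startswith(('w', 'c')):
        wc_values.remove(word)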
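Looking names up by value in vars() is fragile: two variables with equal values cannot be told apart, and the first match wins. If the names matter, one alternative (a different technique from the loop above, sketched under the assumption that you control how the data is stored) is to keep the names as dictionary keys and filter on those. The name named_values is hypothetical.

```python
# Store each value under its name, so the name is available as a string
named_values = {
    'words': 'string',
    'christmas': 3,
    'banana': False,
    'jupyter': {'a': 'b'},
}

# Keep only the entries whose key starts with 'w' or 'c'
wc_values = {
    name: value
    for name, value in named_values.items()
    if name.startswith(('w', 'c'))
}
print(wc_values)  # {'words': 'string', 'christmas': 3}
```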
