How to limit the number of characters/words in the strings of a DataFrame column



I am trying to limit the number of characters shown in a DataFrame's output.

Here is a sample of the DataFrame:

     Abc                       XYZ
0  Hello   How are you doing today
1   Good   This is a job well done
2    Bye          See you tomorrow
3  Books  Read chapter 1 to 5 only

Expected output:

     Abc                       XYZ
0  Hello                   How are 
1   Good                   This is
2    Bye                   See you
3  Books              Read chapter

Here is what I tried:

pd.set_option('display.max_info_rows', 2)
pd.set_option('display.max_info_columns', 2)
pd.set_option('display.max_colwidth', 2)

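max_info_rows and max_info_columns did nothing, while max_colwidth actually made the displayed strings wider.

Is there any way to limit the number of characters shown in a DataFrame?

Thanks!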

Try this:

df.XYZ.apply(lambda x : x.rsplit(maxsplit=len(x.split())-2)[0])
0         How are
1         This is
2         See you
3    Read chapter

Then simply assign the result back to the column:

df.XYZ = df.XYZ.apply(lambda x : x.rsplit(maxsplit=len(x.split())-2)[0])
print(df)
     Abc           XYZ
0  Hello       How are
1   Good       This is
2    Bye       See you
3  Books  Read chapter
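The same truncation can also be written with pandas' vectorized string methods. The following is a minimal sketch, not part of the original answer. It rebuilds the sample frame from the question, and the 7-character cutoff is an arbitrary value chosen for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Abc": ["Hello", "Good", "Bye", "Books"],
    "XYZ": [
        "How are you doing today",
        "This is a job well done",
        "See you tomorrow",
        "Read chapter 1 to 5 only",
    ],
})

# Keep the first two whitespace-separated words of each string:
# split into a list of words, slice the list, join it back together.
first_two_words = df["XYZ"].str.split().str[:2].str.join(" ")

# Keep the first 7 characters of each string (7 is an arbitrary choice).
first_seven_chars = df["XYZ"].str.slice(0, 7)

print(first_two_words)
print(first_seven_chars)
```

Both results are new Series, so the original column stays unchanged unless you assign one of them back to df["XYZ"].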

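Getting pandas to display only two words from each string is tricky, because Python strings have no built-in notion of separate "words". What you can do is split each string into a list of strings (one string per word), then use the 'display.max_seq_items' option to limit how many list items pandas prints: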

import io
import pandas as pd
d = '''     Abc                       XYZ
0  Hello   "How are you doing today"
1   Good   "This is a job well done"
2    Bye          "See you tomorrow"
3  Books  "Read chapter 1 to 5 only"'''
# io.StringIO replaces pd.compat.StringIO, which was removed from pandas
df = pd.read_csv(io.StringIO(d), sep=r'\s+')
# convert the XYZ values from str to list of str
df['XYZ'] = df['XYZ'].str.split()
# only display the first 2 values in each list of word strings
with pd.option_context('display.max_seq_items', 2):
    print(df)

Output:

     Abc                   XYZ
0  Hello       [How, are, ...]
1   Good       [This, is, ...]
2    Bye       [See, you, ...]
3  Books  [Read, chapter, ...]
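If the goal is only to change what gets printed while leaving the stored strings intact, one option is to pass a per-column formatter to DataFrame.to_string. This is a sketch under that assumption rather than the approach from the answers above. The helper name first_two_words is hypothetical and introduced here for illustration only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Abc": ["Hello", "Good", "Bye", "Books"],
    "XYZ": [
        "How are you doing today",
        "This is a job well done",
        "See you tomorrow",
        "Read chapter 1 to 5 only",
    ],
})

def first_two_words(text):
    """Return only the first two whitespace-separated words of text."""
    return " ".join(text.split()[:2])

# formatters maps a column name to a function applied to each cell
# when the frame is rendered; the underlying data is not modified.
rendered = df.to_string(formatters={"XYZ": first_two_words})
print(rendered)
```

Because the formatter runs only at render time, df["XYZ"] still holds the full sentences afterwards.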
