Python word-processing function with word wrap

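I am building a word processor and am trying to implement a word-wrap feature.

Given the maximum number of characters per line, followed by a list of words, I want to return a collection of strings in which each line contains as many words as possible, joined by spaces. No string may be longer than the maximum length.

There must be exactly one space between words in each output string.
Each word consists of lowercase letters of the English alphabet.
There will be no punctuation.
The maximum length of each word can be assumed to be constant.
No single word will be longer than the given maximum line length.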

import sys
# Prints to standard output.
def wrapLines(line_length, words):
  curr_line = ""
  for word in words:
    if len(curr_line) + len(word) >= line_length:
      curr_line = ""
    else:
      curr_line += word
    print curr_line

def main():
  first_line = None
  words = []
  first_arg = True
  for line in sys.stdin:
    if len(line.strip()) == 0:
      continue
    line = line.rstrip()
    if first_arg:
      lineLength = line
      first_arg = False
    else:
      words.append(line)
  wrapLines(lineLength, words)
main()

Input:

13
abc
xyz
foobar
cuckoo
seven
hello

My output keeps printing all the words joined together instead of wrapping the lines.

abc
abcxyz
abcxyzfoobar
abcxyzfoobarcuckoo
abcxyzfoobarcuckooseven
abcxyzfoobarcuckoosevenhello

Expected output:

abc xyz
foobar cuckoo
seven hello

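There are several problems there. The most important is that you read the first line from stdin and use it as lineLength, but you never convert it to a number. So the value in your lineLength variable (and in line_length inside the wrapping function) is a string, and the comparison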

if len(curr_line) + len(word) >= line_length:

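always compares the length of the proposed output line with a string. On a current version of Python (Python 3) this line raises an error, because comparing numbers with strings is now (correctly) forbidden. In Python 2, however, this expression is always False, since numbers are always considered smaller than strings, so the code for lines that exceed the limit never runs.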
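A minimal Python 3 sketch of that first problem, assuming the limit arrives as the string "13" exactly as it would from sys.stdin. The names candidate, outcome and fits are illustrative only and do not come from the question.

```python
# What sys.stdin yields for the first line: a string, not a number.
line_length = "13"

# Length of a proposed line made of "abc" and "xyz" (no separator counted).
candidate = len("abc") + len("xyz")

try:
    # Python 3 refuses to order an int against a str.
    candidate >= line_length
    outcome = "compared"
except TypeError:
    outcome = "TypeError"

# Converting the limit once, up front, makes the comparison meaningful.
fits = candidate < int(line_length)

print(outcome, fits)
```

Converting with int() as soon as the value is read keeps every later comparison numeric.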

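The second error is simply that you never add a space to the line string: you just append each word with +=, with no space.

The third error is that the line being built is always printed inside the loop, whether or not the line length has been exceeded.

Last but not least, as I said in the comment above: stop using Python 2. Python 3 was made for a reason, which is that the language has evolved.

Also, less an error than a recommendation: your function should only process the text and return data. If you want to print the result, do that from the calling function. That way the function is generic enough to be used in other contexts.

In addition, the recommended indentation size for Python applications is 4. Although 2 spaces is valid code, it is not really used anywhere (except in the private code of some well-known companies, but that is their business).

Your fixed code, plus the recommendations, which works in both Python 2 and 3: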

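import sys

def wrapLines(line_length, words):
    curr_line = ""
    result = []
    for word in words:
        if curr_line and len(curr_line) + 1 + len(word) > line_length:
            # The word does not fit: store the finished line and
            # start the next line with this word so it is not lost.
            result.append(curr_line)
            curr_line = word
        elif curr_line:
            curr_line += " " + word
        else:
            # First word of a line: no leading space.
            curr_line = word
    if curr_line:
        result.append(curr_line)
    return result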

def main():
    words = []
    first_arg = True
    for line in sys.stdin:
        if len(line.strip()) == 0:
            continue
        line = line.rstrip()
        if first_arg:
            # The first non-empty line is the limit; convert it to a number.
            line_length = int(line)
            first_arg = False
        else:
            words.append(line)
    print("\n".join(wrapLines(line_length, words)))

main()

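First, as far as I can tell you did not specify the lineLength you want, so I will assume 14 based on your expected output. Personally I think all of this can be reduced to one function that loops over your list of input words: if it can add a word without exceeding the line length, it adds it to the string; otherwise it appends the string to our output list (because it cannot take the next word) and then resets the string.

I used a while loop so that, on an iteration where a reset is needed, the counter (i in my case) is simply not incremented. The next iteration then indexes the same position, and that word becomes the first one added to the freshly reset string. I did this in Python 3.X, so it may not work in 2.X, but if that is the case the culprit would be '{}'.format, and you could use the % operator instead. After the loop there is one more wrapped_words.append(current_line.strip()) so that we also catch the last line.

My solution: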

words_input = ['13', 'abc', 'xyz', 'foobar', 'cuckoo', 'seven', 'hello']

def wrap_words(words_to_wrap, max_line_length):
    wrapped_words = []
    current_line = ''
    i = 0
    while i < len(words_to_wrap):
        if len(current_line) + len(words_to_wrap[i]) + 1 > max_line_length:  # +1 for the space
            wrapped_words.append(current_line.strip())
            current_line = ''
        else:
            current_line += '{} '.format(words_to_wrap[i])
            i += 1
    if len(current_line):
        wrapped_words.append(current_line.strip())
    return wrapped_words

print(wrap_words(words_input, 14))

Output:

['13 abc xyz', 'foobar cuckoo', 'seven hello']
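As a cross-check on either hand-written version, here is a short sketch that uses the standard library's textwrap.wrap rather than the loops shown above. It is a different technique, not what either answer does, and it assumes the limit of 13 from the question's input and that the words are joined with single spaces before wrapping. The names words and lines are illustrative.

```python
import textwrap

# The words from the question's input, without the leading "13".
words = ["abc", "xyz", "foobar", "cuckoo", "seven", "hello"]

# textwrap.wrap takes one string and returns a list of lines,
# each at most `width` characters long.
lines = textwrap.wrap(" ".join(words), width=13)

print("\n".join(lines))
```

Comparing this list with the result of a custom wrapper is a quick way to spot off-by-one mistakes in the length check.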
