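I'm looking for a Python module, or some existing Python code, that can wrap text which uses ">" line prefixes to mark quoted text (see the example below).
I know I can use Python's textwrap module to wrap paragraphs of text. However, that module knows nothing about this kind of quote prefix.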
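A quick way to see the limitation, as a minimal sketch (the sample text and the width of 30 are arbitrary choices for illustration):

```python
import textwrap

quoted = "> The quick\n> brown fox jumped over the lazy sleeping dog."

# textwrap treats ">" as just another word, so the marker from the
# second input line is re-flowed into the running text.
wrapped = textwrap.fill(quoted, width=30)
print(wrapped)
```

The second ">" ends up in the middle of a line, and continuation lines get no quote prefix at all.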
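I know how to write a routine that does this kind of wrapping, and I'm not asking for advice on how to write it. Rather, I'd like to know whether anyone is aware of existing Python code or a Python module that can already perform this kind of wrapping on email-style quoted text.
I've been searching, but I haven't found anything for Python.
I just don't want to "reinvent the wheel" if something like this has already been written.
Here is an example of the wrapping I want to perform. Suppose I have the following text from a hypothetical email message: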
Abc defg hijk lmnop.

Mary had a little lamb.
Her fleas were white as snow,
> Now is the time for all good men to come to the aid of their party.
>
> The quick
> brown fox jumped over the lazy sleeping dog.
>> When in the Course of human
>> events it
>> becomes necessary for one people to dissolve the political
>> bands
>> which have
>> connected them ...
and everywhere that Mary went,
her fleas were sure to go
... and to reproduce.
> What do you mean by this?
>> with another
>> and to assume among
>> the powers of the earth ...
> Doo wah diddy, diddy dum, diddy doo.
>> Text text text text text text text text text text text text text text text text text text text text text text text text text text text.
Suppose I want to wrap at column 52. The resulting text should look like this:
Abc defg hijk lmnop.

Mary had a little lamb. Her fleas were white as
snow,
> Now is the time for all good men to come to the
> aid of their party.
>
> The quick brown fox jumped over the lazy sleeping
> dog.
>> When in the Course of human events it becomes
>> necessary for one people to dissolve the
>> political bands which have connected them ...
and everywhere that Mary went, her fleas were sure
to go ... and to reproduce.
> What do you mean by this?
>> with another and to assume among the powers of
>> the earth ...
> Doo wah diddy, diddy dum, diddy doo.
>> Text text text text text text text text text text
>> text text text text text text text text text text
>> text text text text text text text.
Thank you for any pointers to existing Python code.
If nothing like this exists "in the wild", I'll write it myself and post my code here.
Thanks very much.
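I couldn't find any existing code that wraps this kind of quoted text, so here is the code I wrote. It uses the re and textwrap modules.
I split the text into "paragraphs" based on the leading quote markers or indentation characters of each line. Then I strip the quote or indentation prefix from every line and use textwrap to wrap each "paragraph". After wrapping, I put the prefix back on every line of the "paragraph".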
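To illustrate just the first step, here is a minimal sketch of splitting a single line into its quote prefix and its text. `split_prefix` and `QUOTE_RE` are hypothetical names invented for this illustration and are not part of my code; the sketch handles only ">" markers, not whitespace indentation, and it always normalizes the prefix to the markers followed by one space.

```python
import re

# Hypothetical pattern: any run of ">" markers (optionally separated by
# whitespace), followed by the rest of the line.
QUOTE_RE = re.compile(r'^((?:>\s*)*)(.*)$')

def split_prefix(line):
    """Return (prefix, body) for one line of email-style text."""
    markers, body = QUOTE_RE.match(line.rstrip()).groups()
    depth = markers.count('>')          # quoting depth, 0 for unquoted lines
    prefix = '>' * depth + ' ' if depth else ''
    return prefix, body

for sample in ["plain text", "> quoted once", ">> quoted twice", "> > spaced"]:
    print(repr(split_prefix(sample)))
```

Lines that share the same prefix can then be collected into one "paragraph" before wrapping.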
Someday I'll clean the code up and make it more elegant, but at least it seems to work well:
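import re
import textwrap


def wrapemail(text, wrap=72):
    """Re-wrap email-style text, preserving '>' quote and indent prefixes."""
    if not text:
        return ''

    prefix = None
    prev_prefix = None
    paragraph = []
    paragraphs = []  # list of (prefix, [lines]) tuples

    for line in text.rstrip().split('\n'):
        line = line.rstrip()
        # Quoted line: one or more '>' markers, possibly separated by spaces.
        # Note: a quoted line whose text itself contains '>' does not match
        # and is handled by the unquoted branches below.
        m = wrapemail.qprefixpat.search(line)
        if m:
            # Normalize the markers by removing embedded whitespace ...
            prefix = wrapemail.whitepat.sub('', m.group(1))
            text = m.group(2)
            # ... but keep one whitespace character after them, if present.
            if text and wrapemail.whitepat.search(text[0]):
                prefix += text[0]
                text = text[1:]
        else:
            # Indented line: the leading whitespace is the prefix.
            m = wrapemail.wprefixpat.search(line)
            if m:
                prefix = m.group(1)
                text = m.group(2)
            else:
                prefix = ''
                text = line

        if not text:
            # Blank line: flush the current paragraph and keep the blank line.
            if paragraph and prev_prefix is not None:
                paragraphs.append((prev_prefix, paragraph))
            paragraphs.append((prefix, ['']))
            prev_prefix = None
            paragraph = []
            continue  # already recorded; avoids emitting a final bare '>' twice
        if prefix != prev_prefix:
            # Prefix changed: flush and start a new paragraph.
            if paragraph and prev_prefix is not None:
                paragraphs.append((prev_prefix, paragraph))
            prev_prefix = prefix
            paragraph = []

        paragraph.append(text)

    if paragraph and prefix is not None:
        paragraphs.append((prefix, paragraph))

    result = ''
    for paragraph in paragraphs:
        prefix = paragraph[0]
        text = '\n'.join(paragraph[1]).rstrip()
        wraplen = wrap - len(prefix)
        if wraplen < 1:
            # The prefix alone fills the width: emit the text unwrapped.
            result += '{}{}\n'.format(prefix, text)
        elif text:
            for line in textwrap.wrap(text, wraplen):
                result += '{}{}\n'.format(prefix, line.rstrip())
        else:
            result += '{}\n'.format(prefix)

    return result


# Compiled patterns are stored as attributes of the function itself.
wrapemail.qprefixpat = re.compile(r'^([\s>]*>)([^>]*)$')
wrapemail.wprefixpat = re.compile(r'^(\s+)(\S.*)?$')
wrapemail.whitepat = re.compile(r'\s')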
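Feeding the text from my original message into it, with "wrap" set to 52, does indeed produce the output I specified above.
Feel free to improve it or steal it. :)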
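For anyone who wants a shorter starting point, here is a minimal sketch of the same split / wrap / re-prefix idea, built on itertools.groupby and textwrap's initial_indent and subsequent_indent arguments. The names `rewrap`, `split_prefix` and `QUOTE_RE` are invented for this illustration. It is not a drop-in replacement for the code above: I am assuming only ">" prefixes matter (whitespace indentation is not treated as a prefix), and "> >" is normalized to ">> ".

```python
import re
import textwrap
from itertools import groupby

QUOTE_RE = re.compile(r'^((?:>\s*)*)(.*)$')

def split_prefix(line):
    """Return (prefix, body); prefix is '' or '>' * depth + ' '."""
    markers, body = QUOTE_RE.match(line.rstrip()).groups()
    depth = markers.count('>')
    return ('>' * depth + ' ' if depth else ''), body

def rewrap(text, width=72):
    """Re-wrap text so that every output line keeps its quote prefix."""
    out = []
    parts = [split_prefix(line) for line in text.splitlines()]
    # Consecutive lines with the same prefix and the same blank/non-blank
    # status form one group.
    for (prefix, blank), group in groupby(parts, key=lambda p: (p[0], not p[1])):
        group = list(group)
        if blank:
            # Keep blank (or bare ">") lines as paragraph separators.
            out.extend(prefix.rstrip() for _ in group)
            continue
        body = ' '.join(b for _, b in group)
        # width counts the prefix, so textwrap handles the arithmetic.
        out.extend(textwrap.wrap(body, width,
                                 initial_indent=prefix,
                                 subsequent_indent=prefix))
    return '\n'.join(out)

sample = "\n".join([
    "> The quick",
    "> brown fox jumped over the lazy sleeping dog.",
    ">",
    ">> with another",
    ">> and to assume among",
    ">> the powers of the earth ...",
])
print(rewrap(sample, width=52))
```

Letting textwrap add the prefix through its indent arguments avoids subtracting the prefix length from the width by hand.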