I am trying to convert the following multi-line records into single-line records.
Before =>
Item ID:
504246
Teddy Ruxpin,
Stuffed Animal, Bear
Item Price:
$34.50
Status:
Discontinued
Ages:
4-9
Qty:
895
Item ID:
783927
Monopoly,
Board Game
Item Price:
$29.67
Status:
Active
Ages:
8+
Qty:
190200
=> After
Item ID: 504246, Teddy Ruxpin, Stuffed Animal, Bear, Item Price: $34.50, Status: Discontinued, Ages: 4-9, Qty: 895
Item ID: 783927, Monopoly, Board Game, Item Price: $29.67, Status: Active, Ages: 8+, Qty: 190200
However, whenever I look through the various Python libraries, I only find examples of replacing words, not newlines.
Maybe this is what you want:
import re
datastring = """Item ID:
504246
Teddy Ruxpin,
Stuffed Animal, Bear
Item Price:
$34.50
Status:
Discontinued
Ages:
4-9
Qty:
895
Item ID:
783927
Monopoly,
Board Game
Item Price:
$29.67
Status:
Active
Ages:
8+
Qty:
190200
"""
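separator = ";"
for line in datastring.split("Item ID:"):
    line = line.strip()
    if not line:
        continue  # skip the empty chunk before the first "Item ID:"
    # Put back the label that split() removed.
    line = "Item ID: %s" % line
    # A label (colon, optional whitespace, newline) stays on one line with its value.
    line = re.sub(r":\s*\n", ": ", line)
    # Every remaining newline becomes the separator.
    line = re.sub(r"\n", "%s " % separator, line)
    print(line)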
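First we split the records on "Item ID:". Each chunk is stripped of leading and trailing whitespace, and empty chunks are skipped. For the remaining chunks we prepend "Item ID:" again, because the split removed it. Then we perform two regular-expression substitutions:
- replace the newline after a "label", i.e. a colon followed by optional whitespace and a newline, with ": "
- replace all remaining newlines with the chosen separator (I used a semicolon in the code)
As the last step of the for loop, I print the line. The output looks like this: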
Item ID: 504246; Teddy Ruxpin, ; Stuffed Animal, Bear; Item Price: $34.50; Status: Discontinued; Ages: 4-9; Qty: 895
Item ID: 783927; Monopoly, ; Board Game; Item Price: $29.67; Status: Active; Ages: 8+; Qty: 190200
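If you want exactly the comma-separated form from the question, the same idea can be wrapped in a function. This is only a sketch under a few assumptions: the function name `flatten_records` and its `label` and `separator` parameters are made up for illustration, every record is assumed to start with "Item ID:", and a comma at the end of a source line is dropped so that it is not doubled by the separator. Note that the fourth positional argument of `re.sub` is `count`, not `flags`, so no flag is passed here; none is needed because the patterns do not use `^` or `$`.

```python
import re


def flatten_records(text, label="Item ID:", separator=", "):
    """Return one single-line string per record found in `text`."""
    records = []
    for chunk in text.split(label):
        chunk = chunk.strip()
        if not chunk:
            continue  # nothing precedes the first label
        record = "%s %s" % (label, chunk)
        # Keep each "Label:" on the same line as its value.
        record = re.sub(r":[ \t]*\n", ": ", record)
        # Turn remaining newlines into the separator, dropping a comma
        # that already ends the line so it is not doubled.
        record = re.sub(r",?[ \t]*\n", separator, record)
        records.append(record)
    return records


sample = """Item ID:
504246
Teddy Ruxpin,
Stuffed Animal, Bear
Item Price:
$34.50
Status:
Discontinued
Ages:
4-9
Qty:
895
"""

for record in flatten_records(sample):
    print(record)
```

Returning a list instead of printing inside the loop makes it easy to write the records to a file or pass them on to other code.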