Parsing polynomial coefficients from a string



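I'm trying to build a regular expression to parse the coefficients of a polynomial from a string. I thought I had a solution until I found one particular example, which I suspect is badly formatted, that breaks my regex. I'm also not sure my solution is the most elegant.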

Here are some examples of the strings I need to parse:

polys = ['1x',
         '3.8546E-27x^4-2.4333E-20x^3+5.1165E-14x^2+3.7718E-6x-6.1561E-1',
         '6.13159E-3x+0.348',
         '0.0100708x-40',
         '6.103516E-3x',
         '1E-6x',
         '1.4846859E-6x',
         '2435',
         '2.7883E-27x^4-2.2164E-20x^3+5.8443E-14x^2+7.5773E-6x-1.3147E+']

And here is the pattern I tried, together with the matching code:

import re

pattern = r'([+-]?\d+\.?\d+[Ee]?[+-]?\d*)[x^\d+]?|([+-]?\d+\.?\d*[Ee]?[+-]?\d*)x'
for poly in polys:
    coeffs = []
    for match in re.finditer(pattern, poly):
        groups = match.groups()
        # Exactly one of the two alternatives captured the coefficient.
        coeff = groups[0] if groups[0] is not None else groups[1]
        coeffs.append(float(coeff))
    print(coeffs)

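This seems to work for every polynomial in the list except the last one, which fails only at the conversion to float. If I append the expected 0 to the end of that string, I get the results below, which are what I'm looking for.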

[1.0]
[3.8546e-27, -2.4333e-20, 5.1165e-14, 3.7718e-06, -0.61561]
[0.00613159, 0.348]
[0.0100708, -40.0]
[0.006103516]
[1e-06]
[1.4846859e-06]
[2435.0]
[2.7883e-27, -2.2164e-20, 5.8443e-14, 7.5773e-06, -1.3147]

I could assume the last entry is malformed and either ignore it or special-case it, but I can't help thinking there's a better/neater solution.

The error comes from the last number of the last entry, -1.3147E+. That isn't valid notation: the exponent digits after the E+ are missing.
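As a quick illustration (a minimal sketch; the helper name `can_parse_float` is made up for this example), Python's `float` only accepts an exponent when digits follow it, so the truncated form is rejected with a `ValueError`:

```python
def can_parse_float(text):
    """Return True if float() accepts the string, False otherwise."""
    try:
        float(text)
    except ValueError:
        return False
    return True


print(can_parse_float('-1.3147E+0'))  # True: the exponent is complete
print(can_parse_float('-1.3147E+'))   # False: the exponent digits are missing
```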

One solution could be to strip the dangling E+ from the end of the string before applying your matching step:

poly = re.sub(r"(E\+)$", '', poly)

The code becomes:

import re

pattern = r'([+-]?\d+\.?\d+[Ee]?[+-]?\d*)[x^\d+]?|([+-]?\d+\.?\d*[Ee]?[+-]?\d*)x'
for poly in polys:
    # Drop a trailing "E+" that has no exponent digits after it.
    poly = re.sub(r"(E\+)$", '', poly)
    coeffs = []
    for match in re.finditer(pattern, poly):
        groups = match.groups()
        coeff = groups[0] if groups[0] is not None else groups[1]
        coeffs.append(float(coeff))
    print(coeffs)
# [1.0]
# [3.8546e-27, -2.4333e-20, 5.1165e-14, 3.7718e-06, -0.61561]
# [0.00613159, 0.348]
# [0.0100708, -40.0]
# [0.006103516]
# [1e-06]
# [1.4846859e-06]
# [2435.0]
# [2.7883e-27, -2.2164e-20, 5.8443e-14, 7.5773e-06, -1.3147]

Here is another way to do it, using split:

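import re

out = []
for poly in polys:
    # Drop a trailing "E+" that has no exponent digits after it.
    poly = re.sub(r"(E\+)$", '', poly)
    # Split on each "x" and its optional "^<power>" suffix.
    list_coef = re.split(r'x[\^\d]*', poly)
    # Skip the empty string left when the polynomial ends in "x".
    list_coef = [float(elt) for elt in list_coef if elt]
    out.append(list_coef)
for o in out:
    print(o)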
# [1.0]
# [3.8546e-27, -2.4333e-20, 5.1165e-14, 3.7718e-06, -0.61561]
# [0.00613159, 0.348]
# [0.0100708, -40.0]
# [0.006103516]
# [1e-06]
# [1.4846859e-06]
# [2435.0]
# [2.7883e-27, -2.2164e-20, 5.8443e-14, 7.5773e-06, -1.3147]
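If you'd rather not patch the input first, a possible alternative is to make the exponent an all-or-nothing optional group, so a dangling E+ never becomes part of the match. This is only a sketch checked against the sample strings in the question; `NUMBER`, `TERM` and `parse_coeffs` are names I've made up here, and it assumes every term carries an explicit coefficient (a bare x^2 would be missed).

```python
import re

# A number: optional sign, digits, optional fraction, and an exponent that
# only matches when it is complete (E or e, optional sign, at least one digit).
NUMBER = r'[+-]?\d+(?:\.\d*)?(?:[Ee][+-]?\d+)?'

# A term: the captured coefficient, optionally followed by "x" or "x^<power>".
# Consuming the power here keeps it from being picked up as a coefficient.
TERM = re.compile(rf'({NUMBER})(?:x(?:\^\d+)?)?')


def parse_coeffs(poly):
    """Return the coefficients of a polynomial string, highest term first."""
    return [float(m.group(1)) for m in TERM.finditer(poly)]


print(parse_coeffs('2.7883E-27x^4-2.2164E-20x^3+5.8443E-14x^2+7.5773E-6x-1.3147E+'))
# [2.7883e-27, -2.2164e-20, 5.8443e-14, 7.5773e-06, -1.3147]
```

The trade-off is that the malformed E+ is silently ignored rather than reported, which may or may not be what you want for your data.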
