How can I fix this string conversion error?

I need to convert a base string into a target string. I have code that works, but it breaks whenever tvg-name contains a "," character. How can I fix this?

Base string that works: {tvg-id: , tvg-name: A beautiful Day - 2016, tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/hZgsmIYUAtdUOUFKROq6rNyWXVa.jpg, group-title: 2017-16-15 Germany Cinema}

Base string that fails: {tvg-id: , tvg-name: Antonio, ihm schmeckt's nicht! (2016), tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg, group-title: 2017-16-15 Germany Cinema}

Target string: {"tvg-id": "None", "tvg-name": "Antonio, ihm schmeckt's nicht! (2016)", "tvg-logo": "https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg", "group-title": "2017-16-15 Germany Cinema"}

My Convert Function

def convert(example):
    # split the string into a list
    example = example.replace("{", "").replace("}", "").split(",")
    # create a dictionary
    final = {}
    # loop through the list
    for i in example:
        # split the item into key and value parts
        i = i.split(":")
        # if http or https is in the value, merge it with the next item
        if "http" in i[1] or "https" in i[1]:
            i[1] = i[1] + ":" + i[2]
            i.pop(2)

        # remove leading whitespace from the key
        if i[0][0] == " ":
            i[0] = i[0][1:]
        # remove leading whitespace from the value
        if i[1][0] == " ":
            i[1] = i[1][1:]

        final[i[0]] = i[1]

    # return the dictionary
    return final

We can use a regular expression instead of a plain .split(',') to handle the splitting.

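import re

def convert(example):
    # split only on ", " that is followed by a key such as "tvg-logo:"
    kv_pairs = re.split(r', (?=\w+-?\w+:)', example[1:-1])
    result = {}
    for kv_pair in kv_pairs:
        # maxsplit=1 leaves colons inside the value (e.g. in URLs) alone
        key, value = kv_pair.split(': ', 1)
        result[key] = value
    return result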

In re.split(r', (?=\w+-?\w+:)', example[1:-1]), we split only on commas that are followed by the lookahead pattern (?=\w+-?\w+:), for example the comma before tvg-logo:.

In key, value = kv_pair.split(': ', 1), we pass maxsplit=1 so that we don't have to worry about colons inside the value (such as in a URL).
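As a quick illustration of those two steps, here is a small sketch on a shortened string. The sample string and the variable names are made up for the demo; only the pattern and the maxsplit argument come from the answer.

```python
import re

# Shortened, made-up sample in the same "key: value" layout as the question.
sample = "tvg-name: Antonio, ihm (2016), tvg-logo: https://example.com/a.jpg"

# Split only on ", " that is directly followed by something shaped like "key:".
pairs = re.split(r", (?=\w+-?\w+:)", sample)
print(pairs)

# maxsplit=1 keeps the colon inside the URL as part of the value.
key, value = pairs[1].split(": ", 1)
print(key, "->", value)
```

The comma after "Antonio" is left alone because the text that follows it ("ihm ...") is not a word immediately followed by a colon.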

Hope this helps.

You really can't do this without some heuristics.

Here is one heuristic version:

from typing import Dict, Optional

def convert(input: str) -> Dict[str, Optional[str]]:
    input = input.strip()[1:-1]  # Remove the curly braces {...}
    result: Dict[str, Optional[str]] = {}
    carryover = ''
    for pair in input.split(','):
        kv = (carryover + pair).strip().split(':', 1)
        if len(kv) == 1:
            # No ':' yet: hold this chunk back and prepend it to the next one
            carryover += pair + ','
            continue
        result[kv[0]] = kv[1] if kv[1] else None
        carryover = ''
    return result

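A chunk that contains no ':' is not emitted on its own; it is held back and prepended to the next chunk. This keeps a comma that appears before the first ':' of a pair, but a comma inside a value, as in the tvg-name above, still ends up in front of the next key.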

Note that this breaks on a string like '{ab,cd:ef,gh}', because it doesn't know what to do with 'gh'. That input is genuinely ambiguous.

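The only way to handle every case correctly is to change the input source so that it quotes the strings, if that is possible. If it is not possible, or if this is a one-off job, you can try extending the heuristics to cover all of your cases.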
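As one possible extension of the heuristic, the sketch below treats any chunk without a ':' as a continuation of the previous value instead of the next key. The name convert_append and the rule itself are assumptions for illustration; a chunk without a ':' that comes before any key is simply ignored.

```python
from typing import Dict, Optional


def convert_append(text: str) -> Dict[str, Optional[str]]:
    """Hypothetical variant: colon-less chunks extend the previous value."""
    body = text.strip()[1:-1]  # drop the surrounding braces
    result: Dict[str, Optional[str]] = {}
    last_key: Optional[str] = None
    for chunk in body.split(","):
        if ":" in chunk:
            # Looks like "key: value"; split on the first colon only.
            key, value = chunk.split(":", 1)
            last_key = key.strip()
            result[last_key] = value.strip() or None
        elif last_key is not None:
            # No colon: assume the comma belonged to the previous value.
            previous = result[last_key] or ""
            result[last_key] = previous + "," + chunk.rstrip()
    return result


x = "{tvg-id: , tvg-name: Antonio, ihm schmeckt's nicht! (2016), group-title: 2017-16-15 Germany Cinema}"
print(convert_append(x))
```

This is still a heuristic: a value that contains both a comma and a later colon (for example "Title, part 2: the sequel") would be split in the wrong place.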

Regex to the rescue:

import re

def convert(s):
    s = s[1:-1]  # Remove {}
    # Split on commas followed by a space and then a run of non-whitespace characters ending in ':'
    s = re.split(r', (?=\S+:)', s)
    # Split each of these groups on the first ': '. Now it's basically a dict.
    return dict(i.split(': ', 1) for i in s)

>>> x = '{tvg-id: , tvg-name: A beautiful Day - 2016, tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/hZgsmIYUAtdUOUFKROq6rNyWXVa.jpg, group-title: 2017-16-15 Germany Cinema}'
>>> print(convert(x))
# Output:
{'tvg-id': '', 'tvg-name': 'A beautiful Day - 2016', 'tvg-logo': 'https://image.tmdb.org/t/p/w600_and_h900_bestv2/hZgsmIYUAtdUOUFKROq6rNyWXVa.jpg', 'group-title': '2017-16-15 Germany Cinema'}
>>> x = "{tvg-id: , tvg-name: Antonio, ihm schmeckt's nicht! (2016), tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg, group-title: 2017-16-15 Germany Cinema}"
>>> print(convert(x))
# Output:
{'tvg-id': '', 'tvg-name': "Antonio, ihm schmeckt's nicht! (2016)", 'tvg-logo': 'https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg', 'group-title': '2017-16-15 Germany Cinema'}

You can check that the string starts with { and ends with }, and then match the key-value pairs.

Pattern to match the keys and values:

([^\s:,{}]+):\s*([^,{}]*)

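([^\s:,{}]+) Capture group 1: match 1+ characters other than whitespace, :, ,, { and }
:\s* Match a colon followed by optional whitespace characters
([^,{}]*) Capture group 2: match optional characters other than ,, { and }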
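To make the two capture groups concrete, here is a small sketch that runs the pattern over a short made-up string (sample is only illustrative). It also shows that group 2 stops at the first comma it meets.

```python
import re

pattern = r"([^\s:,{}]+):\s*([^,{}]*)"

# Made-up sample: an empty value, a value containing a comma, and a URL.
sample = "{id: , name: Antonio, ihm (2016), logo: https://example.com/a.jpg}"

# findall returns one (key, value) tuple per match.
matches = re.findall(pattern, sample)
print(matches)
```

The colons inside the URL are kept because group 2 only excludes commas and braces, while the text after "Antonio," is dropped because nothing in it is directly followed by a colon.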

See the regex demo and the Python demo.

import re

strings = [
    "{tvg-id: , tvg-name: A beautiful Day - 2016, tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/hZgsmIYUAtdUOUFKROq6rNyWXVa.jpg, group-title: 2017-16-15 Germany Cinema}",
    "{tvg-id: , tvg-name: Antonio, ihm schmeckt's nicht! (2016), tvg-logo: https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg, group-title: 2017-16-15 Germany Cinema}"
]

def convert(example):
    pattern = r"([^\s:,{}]+):\s*([^,{}]*)"
    dct = {}
    # only parse strings that are wrapped in curly braces
    if example.startswith("{") and example.endswith("}"):
        for t in re.findall(pattern, example):
            # blank values are stored as None
            if t[1].strip():
                dct[t[0]] = t[1]
            else:
                dct[t[0]] = None
    return dct

for s in strings:
    print(convert(s))

Output

{'tvg-id': None, 'tvg-name': 'A beautiful Day - 2016', 'tvg-logo': 'https://image.tmdb.org/t/p/w600_and_h900_bestv2/hZgsmIYUAtdUOUFKROq6rNyWXVa.jpg', 'group-title': '2017-16-15 Germany Cinema'}
{'tvg-id': None, 'tvg-name': 'Antonio', 'tvg-logo': 'https://image.tmdb.org/t/p/w600_and_h900_bestv2/dyLfGb1mF2PUd0Rz5kqKiYtQl3r.jpg', 'group-title': '2017-16-15 Germany Cinema'}
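In the second result, tvg-name is cut off at the comma because group 2 excludes commas. A possible adjustment, sketched below, matches the value lazily and ends it only at a comma followed by something shaped like "key:", or at the closing brace. The pattern and the name convert_keep_commas are assumptions for illustration, not part of the answer above.

```python
import re

# Hypothetical variant of the pattern: the value is matched lazily and ends
# only at a comma that is followed by "key:", or at the closing brace.
PAIR = re.compile(r"([^\s:,{}]+):\s*(.*?)(?=,\s*[^\s:,{}]+:|\}$)")


def convert_keep_commas(example):
    # Only parse strings that are wrapped in curly braces.
    if not (example.startswith("{") and example.endswith("}")):
        return {}
    # Empty values become None, as in the function above.
    return {key: (value.strip() or None) for key, value in PAIR.findall(example)}


s = "{tvg-id: , tvg-name: Antonio, ihm schmeckt's nicht! (2016), group-title: 2017-16-15 Germany Cinema}"
print(convert_keep_commas(s))
```

Like the other approaches, this would still misread a value that contains a comma followed by a word and a colon.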
