Converting a custom string to a dict



Hello, I need to convert a string like this into the dict shown below.

string = "OS: Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)Processor: Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or betterMemory: 6 GB RAMGraphics: NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List"

Dict:

requirements = {
    'Os': 'Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)',
    'Processor': 'Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or better',
    'Memory': '6 GB RAM',
    'Graphics': 'NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List',
}

I tried this:

string = string.split(':')

and stored each list element in a dict like this:

requirements['Os'] = string[0]
requirements['Processor'] = string[1]

But this is not the right approach! It only gives me more errors. Is there a function or module designed for this kind of thing?

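I would use a regular expression to capture the text you want, since the actual format of the input string will not change. This should give you what you want: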


import re
string = "OS: Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)Processor: Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or betterMemory: 6 GB RAMGraphics: NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List"
matches = re.match(r'OS: (.+)Processor: (.+)Memory: (.+)Graphics: (.+)', string)
requirements = {
    'Os': matches.group(1),
    'Processor': matches.group(2),
    'Memory': matches.group(3),
    'Graphics': matches.group(4),
}
print(requirements)
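
One caveat worth noting: `re.match` returns `None` when the pattern does not match, so `matches.group(1)` would raise `AttributeError` on an input that deviates from the expected layout. A minimal sketch of a guarded version follows. The function name `parse_strict` and the choice to return `None` on failure are assumptions made for illustration, not part of the snippet above.

```python
import re

# Same pattern as in the snippet above, compiled once for reuse.
PATTERN = re.compile(r'OS: (.+)Processor: (.+)Memory: (.+)Graphics: (.+)')

def parse_strict(text):
    """Return the requirements dict, or None if text does not fit the layout."""
    match = PATTERN.match(text)
    if match is None:
        # The input lacks one of the labels or has them in another order.
        return None
    os_name, processor, memory, graphics = match.groups()
    return {
        'Os': os_name,
        'Processor': processor,
        'Memory': memory,
        'Graphics': graphics,
    }

print(parse_strict("Memory: 6 GB RAM"))  # prints None: the other labels are missing
```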

The regex is somewhat inflexible, so I suggest treating it as a starting point.

See the documentation for re.match.
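
One possible way to loosen the fixed pattern, sketched here under assumptions, is to build an alternation from a list of labels and split on it with `re.split`. The helper name `parse_requirements` and its `keys` parameter are hypothetical, and the sketch assumes that a label followed by a colon never occurs inside a value.

```python
import re

def parse_requirements(text, keys=("OS", "Processor", "Memory", "Graphics")):
    """Split text on the known labels and return a {label: value} dict."""
    # Alternation of the escaped labels, each followed by a colon.
    pattern = r'(' + '|'.join(re.escape(k) for k in keys) + r'):'
    # Because the pattern has a capturing group, re.split keeps the labels:
    # [prefix, label1, value1, label2, value2, ...]
    parts = re.split(pattern, text)
    # parts[0] is whatever precedes the first label; the rest alternate.
    return {label: value.strip() for label, value in zip(parts[1::2], parts[2::2])}

string = "OS: Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)Processor: Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or betterMemory: 6 GB RAMGraphics: NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List"
requirements = parse_requirements(string)
print(requirements)
```

In this sketch a label that is absent from the input is simply missing from the result, and the labels may appear in any order, with or without whitespace in front of them. Note that the keys come out as written in the input (`'OS'`, not `'Os'`).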

Here is an alternative solution without regex, although a regex may in principle be more efficient and cleaner:

input_string = "OS: Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)Processor: Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or betterMemory: 6 GB RAMGraphics: NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List"
# Split on whitespace
input_string = input_string.split()
# Assumes the keys are exactly as listed, including uppercase letters
key_list = ["OS", "Processor", "Memory", "Graphics"]
key_ind = []
output = {}
# Collect the index of the chunk that contains each key
for key in key_list:
    for idx, el in enumerate(input_string):
        if key in el:
            key_ind.append(idx)
            break
# Build the dictionary
for idx, key in enumerate(key_list):
    if idx + 1 >= len(key_list):
        # Last key: take everything up to the end
        output[key] = ' '.join(input_string[key_ind[idx]+1:])
    else:
        # Text glued to the front of the next key, e.g. "only)" in "only)Processor:"
        lp_idx = input_string[key_ind[idx+1]].find(key_list[idx+1])
        lp = input_string[key_ind[idx+1]][:lp_idx]
        output[key] = ' '.join(input_string[key_ind[idx]+1:key_ind[idx+1]]) + ' ' + lp
print(output)

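Here the string is first split on whitespace, and the code then finds the position of each chunk that contains one of the key labels of the future dictionary. Once the index of each key is stored, the code builds the dictionary from those indices, treating the last element as a special case.

For every element except the last, the code also extracts the text that sits in front of the next key within the same chunk. This assumes there is no space between the next key and the last part of the text you want to store for the current key, i.e. it is always (64bit versions only)Processor: and never (64bit versions only) Processor:. If you cannot make that assumption, you will need to extend this code to cover the cases with a space.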
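
If that assumption cannot be made, one option is to skip the whitespace split and locate the labels with `str.find` instead, which behaves the same whether or not a space precedes a label. The following is only a sketch under assumptions: the function name `parse_by_labels` is hypothetical, and it assumes each label followed by a colon occurs at most once and never inside a value.

```python
def parse_by_labels(text, keys=("OS", "Processor", "Memory", "Graphics")):
    """Slice text between known "Label:" markers, with or without spaces."""
    # Record (marker start, key, value start) for every label that is present.
    positions = []
    for key in keys:
        marker = key + ":"
        start = text.find(marker)
        if start != -1:
            positions.append((start, key, start + len(marker)))
    positions.sort()

    output = {}
    for i, (start, key, value_start) in enumerate(positions):
        # A value runs up to the next label, or to the end of the text.
        value_end = positions[i + 1][0] if i + 1 < len(positions) else len(text)
        output[key] = text[value_start:value_end].strip()
    return output

# Mixed input: a space before "Processor:", none before "Memory:".
sample = "OS: Windows 10 (64bit versions only) Processor: Intel Core i5Memory: 6 GB RAM"
print(parse_by_labels(sample))
```

Because the values are stripped, both `only)Processor:` and `only) Processor:` yield the same result in this sketch.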
