I'm opening a .dat file and trying to extract the numbers from it:
self.text= (open("circles.dat", "r")).readlines()
print (self.text)
Output:
['200 200 100\n', '75\t200\t15\n', ' 325\t200\t15\n', '\n', '\t200\t\t75 10\n', '200 325 10\n']
Is there a way to extract only the integers and nothing else? Edit: I can't use eval(). I'd like the output to look like this:
[200,200,100,75,200,15,325,200,15,200,75,10,200,325,10]
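Assuming everything in the file is an int and the values are separated only by whitespace (e.g. spaces or tabs), you can use a simple list comprehension with str.split():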
>>> with open("circles.dat", "r") as f:
...     d = [int(a) for l in f for a in l.split()]
...
>>> d
[200, 200, 100, 75, 200, 15, 325, 200, 15, 200, 75, 10, 200, 325, 10]
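The same idea can be wrapped in a small function. This is only a sketch: the name read_ints and the temporary sample file are made up for illustration, and it assumes every whitespace-separated token in the file is a valid integer (anything else makes int() raise ValueError).

```python
import os
import tempfile

# Sample content mirroring the readlines() output shown in the question.
SAMPLE = "200 200 100\n75\t200\t15\n 325\t200\t15\n\n\t200\t\t75 10\n200 325 10\n"


def read_ints(path):
    """Return every whitespace-separated integer in the file at `path`."""
    with open(path, "r") as f:
        # Iterating over f yields one line at a time. split() with no
        # arguments splits on any run of spaces, tabs, or newlines and
        # yields nothing at all for a blank line.
        return [int(token) for line in f for token in line.split()]


# Write the sample to a temporary file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as tmp:
    tmp.write(SAMPLE)

numbers = read_ints(tmp.name)
os.remove(tmp.name)
print(numbers)
```

The with block closes the file even if int() raises part-way through, which the bare open(...).readlines() in the question does not guarantee.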
>>> import re
>>> num_list = list(map(int, re.findall(r'\d+', open("circles.dat", "r").read())))
>>> num_list
[200, 200, 100, 75, 200, 15, 325, 200, 15, 200, 75, 10, 200, 325, 10]
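Use .read() rather than .readlines(), because read() returns the whole file content as a single string (which the regex can search directly), unlike readlines(), which returns a list of strings.
Once you have the list of numbers (as strings), use map() to convert each item to int. In Python 3, map() returns an iterator, so wrap it in list() to get a list.
Step by step: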
>>> import re
>>> file_content = open("circles.dat", "r").read()  # Read the file as a single string
>>> num_list = re.findall(r'\d+', file_content)  # Fetch all runs of digits from the string
>>> num_list
['200', '200', '100', '75', '200', '15', '325', '200', '15', '200', '75', '10', '200', '325', '10']
>>> list(map(int, num_list))  # Convert the list of str to a list of int
[200, 200, 100, 75, 200, 15, 325, 200, 15, 200, 75, 10, 200, 325, 10]
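One difference between the regex approach and split() is worth sketching: r'\d+' picks digits out of any surrounding text, but it drops a leading minus sign. The function below is a hypothetical helper (extract_ints is not from the answer above) that uses the pattern r'-?\d+' instead, assuming you want negative integers preserved.

```python
import re


def extract_ints(text):
    """Return all integers found in `text`, keeping a leading minus sign."""
    # -?  matches an optional minus sign
    # \d+ matches one or more digits
    return [int(match) for match in re.findall(r"-?\d+", text)]


# Data shaped like the file in the question: spaces, tabs, blank lines.
sample = "200 200 100\n75\t200\t15\n\n\t200\t\t75 10\n"
from_sample = extract_ints(sample)
print(from_sample)  # [200, 200, 100, 75, 200, 15, 200, 75, 10]

# Digits embedded in other text are still found, which split() + int()
# would reject with a ValueError.
from_noisy = extract_ints("r=-5, x10")
print(from_noisy)  # [-5, 10]
```

Whether pulling digits out of arbitrary text is what you want depends on the data: for a clean file of whitespace-separated integers, the stricter split() version reports malformed input instead of silently skipping it.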
>>> self.text = (open("circles.dat", "r")).readlines()
>>> print(self.text)
['200 200 100\n', '75\t200\t15\n', ' 325\t200\t15\n', '\n', '\t200\t\t75 10\n', '200 325 10\n']
>>>
>>> ans = map(lambda s: s.rstrip().replace("\t", " "), self.text)  # strip newlines, tabs to spaces
>>> ans = " ".join(ans)
>>> ans = ans.split()
>>>
>>> final_ans = [int(a) for a in ans]
>>> final_ans = list(map(int, ans))  # alternative
>>> print(final_ans)
[200, 200, 100, 75, 200, 15, 325, 200, 15, 200, 75, 10, 200, 325, 10]
A solution without any modules:
>>> x = ['200 200 100\n', '75\t200\t15\n', ' 325\t200\t15\n', '\n', '\t200\t\t75 10\n', '200 325 10\n']
>>>
>>> y = "".join(x)  # join together
>>> y
'200 200 100\n75\t200\t15\n 325\t200\t15\n\n\t200\t\t75 10\n200 325 10\n'
>>>
>>> z = y.replace("\t", " ").replace("\n", " ")  # replace tabs and newlines
>>> z
'200 200 100 75 200 15  325 200 15   200  75 10 200 325 10 '
>>>
>>> z = z.split()  # splits on any run of whitespace by default
>>> print(z)
['200', '200', '100', '75', '200', '15', '325', '200', '15', '200', '75', '10', '200', '325', '10']
>>>
>>> res = list(map(int, z))  # convert all to integers
>>> print(res)
[200, 200, 100, 75, 200, 15, 325, 200, 15, 200, 75, 10, 200, 325, 10]
The same solution as an ugly one-liner (only 80 characters in Python 2; Python 3 needs the extra list() wrapper):
res = list(map(int, "".join(self.text).replace("\t", " ").replace("\n", " ").split()))
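As a side note, the two replace() calls are not strictly required, because split() with no arguments already treats tabs and newlines as separators. A quick sketch comparing both forms follows; the variable names are invented for illustration, and it assumes each line still ends with its newline, as readlines() returns them, so that joining with "" cannot fuse two numbers.

```python
# Lines as readlines() returns them for the file in the question.
lines = ['200 200 100\n', '75\t200\t15\n', ' 325\t200\t15\n', '\n',
         '\t200\t\t75 10\n', '200 325 10\n']

# The one-liner as written: turn tabs and newlines into spaces, then split.
with_replace = list(map(int, "".join(lines).replace("\t", " ").replace("\n", " ").split()))

# split() with no arguments splits on any run of whitespace,
# so the replace() calls can be dropped.
split_only = [int(token) for token in "".join(lines).split()]

print(with_replace == split_only)  # True
print(split_only)
```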