Keeping only the ints in Python



I am opening a .dat file and trying to extract the numbers from it:

self.text= (open("circles.dat", "r")).readlines()
print (self.text)

Output:

['200 200 100\n', '75\t200\t15\n', '   325\t200\t15\n', '\n', '\t200\t\t75 10\n', '200 325 10\n']

Is there a way to extract only the integers and nothing else? Edit: eval() cannot be used. I would like the output to look like this:

[200,200,100,75,200,15,325,200,15,200,75,10,200,325,10]

Assuming every token is an int and only whitespace (spaces or tabs, for example) separates them, you can use a simple list comprehension with str.split():

>>> with open("circles.dat", "r") as f:
...     d = [int(a) for l in f for a in l.split()]
>>> d
[200, 200, 100, 75, 200, 15, 325, 200, 15, 200, 75, 10, 200, 325, 10]
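
The comprehension above can be tried without the file. This is a minimal sketch, assuming the sample lines from the question are fed through io.StringIO in place of circles.dat; the helper name read_ints is hypothetical.

```python
import io

# Sample contents copied from the question; stands in for circles.dat.
SAMPLE = "200 200 100\n75\t200\t15\n   325\t200\t15\n\n\t200\t\t75 10\n200 325 10\n"

def read_ints(lines):
    """Return every whitespace-separated token in `lines` as an int."""
    # str.split() with no argument splits on any run of whitespace
    # (spaces, tabs, newlines) and never yields empty strings.
    return [int(token) for line in lines for token in line.split()]

numbers = read_ints(io.StringIO(SAMPLE))
print(numbers)
```

A token that is not an integer, such as 3.5 or abc, makes int() raise ValueError, so this only fits files that match the stated assumption.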

Alternatively, with a regular expression:

>>> import re
>>> num_list = list(map(int, re.findall(r'\d+', open("circles.dat", "r").read())))
>>> num_list
[200, 200, 100, 75, 200, 15, 325, 200, 15, 200, 75, 10, 200, 325, 10]

Use .read() rather than .readlines(): read() returns the whole file as a single string, which a regular expression can search directly, whereas readlines() returns a list of strings.

Once you have the list of numbers (as strings), use map() to convert each item to int.
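
To see the regular-expression approach on its own, here is a small sketch that runs the pattern on an in-memory string. The constant SAMPLE stands in for the result of read() and is an assumption. The second pattern, -?\d+, is not part of the answer; it is one possible variant if the file could contain negative numbers.

```python
import re

# Stand-in for open("circles.dat").read(); contents copied from the question.
SAMPLE = "200 200 100\n75\t200\t15\n   325\t200\t15\n\n\t200\t\t75 10\n200 325 10\n"

# \d+ matches each maximal run of digits; everything else is skipped.
digits_only = [int(m) for m in re.findall(r"\d+", SAMPLE)]

# Hypothetical variant: an optional leading minus sign keeps negatives.
signed = [int(m) for m in re.findall(r"-?\d+", "10 -20\t30")]

print(digits_only)
print(signed)  # [10, -20, 30]
```

Unlike the split()-based approach, \d+ also pulls digits out of mixed tokens: 'x12y3' yields 12 and 3 rather than an error, which may or may not be what you want.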

Step-by-step explanation

>>> import re
>>> file_content = open("circles.dat", "r").read()  # Read file as single string
>>> num_list = re.findall(r'\d+', file_content)  # Fetch all numbers from string
>>> num_list
['200', '200', '100', '75', '200', '15', '325', '200', '15', '200', '75', '10', '200', '325', '10']
>>> list(map(int, num_list))  # Convert list of str to list of int
[200, 200, 100, 75, 200, 15, 325, 200, 15, 200, 75, 10, 200, 325, 10]

Another approach normalizes the whitespace by hand and then splits:

>>> self.text = (open("circles.dat", "r")).readlines()
>>> print(self.text)
['200 200 100\n', '75\t200\t15\n', '   325\t200\t15\n', '\n', '\t200\t\t75 10\n', '200 325 10\n']
>>>
>>> ans = map(lambda s: s.rstrip().replace("\t", " "), self.text)
>>> ans = " ".join(ans)
>>> ans = ans.split()
>>>
>>> final_ans = [int(a) for a in ans]
>>> final_ans = list(map(int, ans))  # alternative
>>> print(final_ans)
[200, 200, 100, 75, 200, 15, 325, 200, 15, 200, 75, 10, 200, 325, 10]

A solution without any modules

>>> x = ['200 200 100\n', '75\t200\t15\n', '   325\t200\t15\n', '\n', '\t200\t\t75 10\n', '200 325 10\n']
>>>
>>> y = "".join(x)  # join together
>>> y
'200 200 100\n75\t200\t15\n   325\t200\t15\n\n\t200\t\t75 10\n200 325 10\n'
>>>
>>> z = y.replace("\t", " ").replace("\n", " ")  # replace tabs and newlines
>>> z
'200 200 100 75 200 15    325 200 15   200  75 10 200 325 10 '
>>>
>>> z = z.split()  # splits on runs of whitespace by default
>>> z
['200', '200', '100', '75', '200', '15', '325', '200', '15', '200', '75', '10', '200', '325', '10']
>>>
>>> res = list(map(int, z))  # convert all to integers
>>> res
[200, 200, 100, 75, 200, 15, 325, 200, 15, 200, 75, 10, 200, 325, 10]

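The solution as an ugly one-liner (only 80 characters!). On Python 3, wrap it in list() to get a list rather than a lazy map object:

res = map(int, "".join(self.text).replace("\t", " ").replace("\n", " ").split())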
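
As a side note, str.split() with no argument already treats tabs and newlines as separators, so the two replace() calls are not strictly needed. A short check, assuming the variable text (a name chosen here for illustration) holds the list that readlines() returned in the question:

```python
# Lines as readlines() returned them in the question.
text = ['200 200 100\n', '75\t200\t15\n', '   325\t200\t15\n', '\n',
        '\t200\t\t75 10\n', '200 325 10\n']

# Approach from the one-liner: turn tabs and newlines into spaces, then split.
with_replace = list(map(int, "".join(text).replace("\t", " ").replace("\n", " ").split()))

# Shorter: split() alone handles any run of whitespace.
without_replace = list(map(int, "".join(text).split()))

print(with_replace == without_replace)  # True
```

Both forms rely on every line ending in a newline; if the last character of one line and the first of the next were both digits with nothing between them, "".join() would fuse them into one number.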
