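I have a file generated by AFL, the automated testing/fuzzing tool. The file represents a set of input data that triggers a bug in the program under test.
I know this file should contain 7 floating-point numbers, but if I read it with cat, I get this: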
6.5
06.5
088.1
16.5
08.3
12.6
0.88.1
16.5
08.3
12.6
0.7@��25
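Clearly, the list above has more than 7 floating-point numbers, and it even contains unrecognizable characters, so I assume this is some kind of raw data. How can I write a Python script (or a bash command line) to recover the values in their original format, which in this case is 7 floating-point numbers? For reference, I can write a C program that does the job: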
#include <stdio.h>

int
main(void)
{
    double x0, x1, x2, x3, x4, x5, x6;

    /* Read exactly seven doubles from stdin; fail if fewer were converted */
    if (scanf("%lf %lf %lf %lf %lf %lf %lf", &x0, &x1, &x2, &x3, &x4, &x5, &x6) != 7)
        return 2;
    printf("%g,%g,%g,%g,%g,%g,%g\n", x0, x1, x2, x3, x4, x5, x6);
    return 0;
}
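Running the C program on the input above does produce 7 floating-point numbers, "6.5,6.5,88.1,16.5,8.3,12.6,0.88", but I am looking for a simpler, perhaps more elegant Python/bash solution. Any ideas?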
The best approach is to use a loop and make it robust by checking everything. Here is a simple example:
# Characters that may appear in a plain decimal number
allowed_chars = set("0123456789.")
# Numbers recovered from the cleaned-up lines
legalized_lines = []
# Open the raw data file; errors="replace" avoids a UnicodeDecodeError on stray bytes
with open("path/to/file.extension", "r", errors="replace") as file:
    # Get all the lines in the file as a list
    lines = file.read().splitlines()
# Loop through each line and drop any illegal characters
for line in lines:
    legalized_line = ""
    for char in line:
        if char in allowed_chars:
            legalized_line += char
    # Keep only the first decimal point if there is more than one
    point_count = legalized_line.count(".")
    if point_count > 1:
        # Reverse the string so that replace() removes the trailing points first
        legalized_line = legalized_line[::-1]
        legalized_line = legalized_line.replace(".", "", point_count - 1)
        legalized_line = legalized_line[::-1]
    # Skip lines that are not a valid number after cleaning (for example, empty lines)
    try:
        legalized_lines.append(float(legalized_line))
    except ValueError:
        continue
for value in legalized_lines:
    print(value)
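
Note that the loop above prints one number per line of the file, whereas the C program in the question stops after seven conversions, and scanf stops reading a number at the second decimal point (so 0.88.1 yields 0.88). If that scanf-like behavior is what is wanted, a minimal sketch could look like the following. The function name read_leading_floats and the sample bytes are hypothetical: the real bytes of the AFL file are not shown in the question, so the sample only approximates the cat output. The pattern is also a simplification of %lf (no exponents, inf, or nan).

```python
import re

# Optional whitespace followed by a plain decimal number.
# Simplified compared to scanf's %lf: no exponents, hex floats, inf or nan.
FLOAT_PATTERN = re.compile(rb"\s*([+-]?(?:\d+\.?\d*|\.\d+))")


def read_leading_floats(data, count=7):
    """Parse up to `count` floats from the start of `data` (a bytes object)."""
    values = []
    pos = 0
    while len(values) < count:
        match = FLOAT_PATTERN.match(data, pos)
        if match is None:
            # Stop at the first position where no number can be read,
            # as scanf does on a conversion failure.
            break
        values.append(float(match.group(1).decode("ascii")))
        pos = match.end()
    return values


# Hypothetical sample approximating the cat output in the question.
sample = (
    b"6.5\n06.5\n088.1\n16.5\n08.3\n12.6\n0.88.1\n"
    b"16.5\n08.3\n12.6\n0.7@\xff\xfe25\n"
)
print(read_leading_floats(sample))
```

For a real file, the bytes can be read with open(path, "rb").read() and passed to the function; working on bytes avoids decoding errors from the unrecognizable characters.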