How to get floating-point numbers from raw data using Python or bash?



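I have a file generated by AFL, an automated testing/fuzzing tool. The file represents a set of input data that triggers a bug in the program under test.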

I know the file should contain 7 floating-point numbers, but when I read it with cat, I get this:

6.5
06.5
088.1
16.5
08.3
12.6
0.88.1
16.5
08.3
12.6
0.7@��25

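Clearly, the list above contains more than 7 floating-point numbers, and even some unrecognizable characters, so I guess this is some kind of raw data. How can I write a Python script (or a bash command line) to get the values in their original format, which in this case is 7 floats? For reference, I can write a C program that does the job: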

#include <stdio.h>

int
main(void)
{
  double x0, x1, x2, x3, x4, x5, x6;
  if (scanf("%lf %lf %lf %lf %lf %lf %lf", &x0, &x1, &x2, &x3, &x4, &x5, &x6) != 7) return 2;
  printf ("%g,%g,%g,%g,%g,%g,%g\n",   x0, x1, x2, x3, x4, x5, x6);
  return 0;
}

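Running the C program on the input above does produce 7 floating-point numbers, "6.5,6.5,88.1,16.5,8.3,12.6,0.88", but I am looking for a simpler, perhaps more elegant Python/bash solution. Any ideas?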

The best approach is to use a loop and make it robust by checking everything. Here is a simple example:

# Characters that may appear in a decimal number
allowed_chars = set("0123456789.")
# Numbers recovered from the cleaned-up lines
legalized_lines = []
# Open the raw data file; errors="replace" keeps undecodable bytes from raising
with open("path/to/file.extension", "r", errors="replace") as file:
    # Loop through each line and drop any characters that are not allowed
    for line in file.read().splitlines():
        legalized_line = "".join(char for char in line if char in allowed_chars)
        # Keep only the first decimal point if there is more than one
        point_count = legalized_line.count(".")
        if point_count > 1:
            # Reverse the string so that replace() removes the trailing points first
            legalized_line = legalized_line[::-1]
            legalized_line = legalized_line.replace(".", "", point_count - 1)
            legalized_line = legalized_line[::-1]
        # Skip lines that have no digits left after filtering
        if any(char.isdigit() for char in legalized_line):
            legalized_lines.append(float(legalized_line))
for line in legalized_lines:
    print(line)
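
The loop above cleans each line on its own, so it yields one number per line rather than stopping after seven values the way the C program does. If the goal is to reproduce what scanf reads, a different technique may fit better: scan the raw bytes from the start, skip whitespace, and take the longest numeric prefix each time. The following is only a minimal sketch of that idea. The function name `scan_floats`, the regular expression, and the sample bytes (including the two stand-in bytes for the unreadable characters) are assumptions for illustration, and the pattern only approximates `%lf` (no hex floats, `inf`, or `nan`).

```python
import re

# Approximation of what scanf's %lf accepts for plain decimal input
# (assumption: hex floats, "inf" and "nan" do not occur in the file).
FLOAT_RE = re.compile(rb"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")
WHITESPACE_RE = re.compile(rb"\s*")


def scan_floats(data, count):
    """Read up to `count` floats from the start of `data`, scanf-style."""
    values = []
    pos = 0
    while len(values) < count:
        # Skip leading whitespace, as a space in a scanf format string does
        pos = WHITESPACE_RE.match(data, pos).end()
        match = FLOAT_RE.match(data, pos)
        if match is None:
            # Not a number at this position: stop, like a failed conversion
            break
        values.append(float(match.group().decode("ascii")))
        pos = match.end()
    return values


if __name__ == "__main__":
    # Hypothetical stand-in for the file contents shown in the question;
    # \xff\xfe are placeholders for the unreadable bytes.
    sample = (b"6.5\n06.5\n088.1\n16.5\n08.3\n12.6\n0.88.1\n"
              b"16.5\n08.3\n12.6\n0.7@\xff\xfe25\n")
    print(scan_floats(sample, 7))
    # prints [6.5, 6.5, 88.1, 16.5, 8.3, 12.6, 0.88]
```

For a real file, the bytes can be read with `open("path/to/file.extension", "rb").read()` and passed to `scan_floats`. Reading in binary mode avoids decoding errors on the non-text bytes, and the seventh value comes out as 0.88 because the match stops at the second decimal point of `0.88.1`, as scanf does.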
