Getting a percentage from bytes to GB in Python



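I'm working on a program that checks the size of a folder and then prints the percentage used out of a 50 GB maximum. The problem is that if the data is only 1 MB, or some other small amount well below a gigabyte, I don't get an accurate percentage. How can I improve my code to fix this?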

import math, os
def get(fold):
        total_size = 0
        for dirpath, dirnames, filenames in os.walk(fold):
            for f in filenames:
                fp = os.path.join(dirpath, f)
                size = os.path.getsize(fp)
                total_size += size
        size_name = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
        i = int(math.floor(math.log(total_size, 1024)))
        p = math.pow(1024, i)
        s = round(total_size / p, 2)
        return "%s %s" % (s, size_name[i])
per = 100*float(get(fold))/float(5e+10)
print(per)

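One place where you may be coming up short is that you're adding up file sizes without taking block size into account. For example, on my system the allocation block size is 4096 bytes. So if I 'echo 1> test.txt', this 1-byte file takes up 4096 bytes. We can rework your code to try to account for blocks: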

import math
import os

SIZE_NAMES = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")

def get(fold):
    total_size = 0
    for dirpath, _, filenames in os.walk(fold):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            stat = os.stat(fp)
            # Round each file up to a whole number of blocks.
            # st_blksize is platform-dependent and may be missing, e.g. on Windows.
            size = stat.st_blksize * math.ceil(stat.st_size / float(stat.st_blksize))
            total_size += size
    if total_size == 0:
        return "0 B"  # math.log(0) would raise ValueError
    i = int(math.floor(math.log(total_size, 1024)))  # index of the unit to use
    p = math.pow(1024, i)
    s = round(total_size / p, 2)
    return "%s %s" % (s, SIZE_NAMES[i])
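The rounding step can be looked at in isolation. The following is only a sketch: the helper name allocated_size and the 4096-byte default are assumptions for illustration, and the real block size depends on the file system.

```python
def allocated_size(size_in_bytes, block_size=4096):
    """Round a size up to a whole number of blocks (block_size is an assumed value)."""
    blocks = -(-size_in_bytes // block_size)  # integer ceiling division
    return blocks * block_size

# Hypothetical file sizes, in bytes
sizes = [1, 4096, 4097, 10000]
apparent = sum(sizes)                            # what summing getsize() would give
on_disk = sum(allocated_size(s) for s in sizes)  # rounded up per file
print(apparent, on_disk)
```

With these made-up sizes the two totals differ noticeably, because most of the files are small relative to one block.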

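Although the undercount from getsize() affects every file, percentage-wise it matters most for small files. And of course directory nodes take up space too. Separately, this calculation has several problems: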

per = 100*float(get(fold))/float(5e+10)

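First, it fails because get() returns a string like '122.23 MB', which float() doesn't like. Second, it doesn't account for the units of the number: the value was scaled down in the get() code, but it is not scaled back up here. Finally, it doesn't address the gigabyte vs. gibibyte issue (in a comment, if nothing else). That is, the get() code reduces the size by powers of 1024, but here you divide by a power of 1000. My rework: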

number, unit = get(fold).split()  # "2.34 MB" -> ["2.34", "MB"]
number = float(number) * 1024 ** SIZE_NAMES.index(unit)  # 2.34 * 1024 ** 2, back to bytes
print("{0:%}".format(number / 50e9))  # percentage of 50 GB (decimal gigabytes, as in the question)
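To make the gigabyte vs. gibibyte point concrete, here is a minimal sketch that stays in raw bytes and only differs in how the capacity is defined. The names GB, GIB and percent_used are made up for this illustration.

```python
GB = 1000 ** 3   # gigabyte (decimal)
GIB = 1024 ** 3  # gibibyte (binary)

def percent_used(total_bytes, capacity_bytes):
    """Percentage of capacity in use; both arguments are plain byte counts."""
    return 100.0 * total_bytes / capacity_bytes

one_mib = 1024 ** 2
print("{:.6f} % of 50 GB".format(percent_used(one_mib, 50 * GB)))
print("{:.6f} % of 50 GiB".format(percent_used(one_mib, 50 * GIB)))
```

Since the byte count never goes through a formatted string, there is no unit to parse back, and small folders still give a meaningful nonzero percentage.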

You're mixing a few things up in your code; for example, your function get() returns a string, but you then try to cast it to a float.

I suggest splitting it up a bit. First, a function that formats a size (I took some ideas from other Stack Overflow questions):

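import math

SIZE_UNITS = ['', 'K', 'M', 'G', 'T']

def format_size(size_in_bytes):
    # log(0) is undefined, so handle an empty directory up front
    if size_in_bytes == 0:
        return '0.0 B'
    # Largest power of 1024 that fits, capped at the last known unit.
    # int() matters on Python 2, where math.floor returns a float.
    exp = min(int(math.floor(math.log(size_in_bytes, 1024))), len(SIZE_UNITS) - 1)
    size = size_in_bytes / math.pow(1024, exp)
    return '{:.1f} {}B'.format(size, SIZE_UNITS[exp])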

Then you have one function that calculates the size of a directory, and another that prints the information nicely:

import os

def get_size_of_dir(dir_path):
    # Sum the apparent size (in bytes) of every file below dir_path
    total_size = 0
    for root, dir_list, file_list in os.walk(dir_path):
        for filename in file_list:
            f = os.path.join(root, filename)
            size = os.path.getsize(f)
            total_size += size
    return total_size

def print_info(dir_path, capacity):
    # capacity is given in bytes, like total_size
    total_size = get_size_of_dir(dir_path)
    percent = total_size * 100.0 / capacity
    print()
    print('Directory:     "{}"'.format(dir_path))
    print('capacity       {:>10s}'.format(format_size(capacity)))
    print('total_size     {:>10s}'.format(format_size(total_size)))
    print('percent used   {:8.1f} %'.format(percent))

On my machine it looks like this:

# 1024**1 == > 1 KB
# 1024**2 == > 1 MB
# 1024**3 == > 1 GB
>>> capacity = 5 * 1024**3
>>> for folder in ('/home/ralf/Documents/', '/home/ralf/Downloads/'):
...     print_info(folder, capacity)
Directory:     "/home/ralf/Documents/"
capacity           5.0 GB
total_size       721.7 MB
percent used       14.1 %
Directory:     "/home/ralf/Downloads/"
capacity           5.0 GB
total_size         1.3 GB
percent used       25.7 %
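One thing to watch with a fixed one-decimal format is that very small folders round to 0.0 %. As a possible refinement, the sketch below picks the number of decimals from the magnitude of the value. The helper format_percent and its sig_digits parameter are hypothetical and not part of the code above.

```python
import math

def format_percent(percent, sig_digits=3):
    """Format a percentage with enough decimals to show about sig_digits significant digits."""
    if percent <= 0:
        return '0.0 %'
    magnitude = int(math.floor(math.log10(percent)))  # e.g. -2 for 0.0195
    decimals = max(1, sig_digits - 1 - magnitude)     # always keep at least one decimal
    return '{:.{}f} %'.format(percent, decimals)

# 1 MiB used out of a 5 GiB capacity, both in bytes
tiny = 100.0 * 1024 ** 2 / (5 * 1024 ** 3)
print(format_percent(tiny))
print(format_percent(14.1))
```

Larger values still print with one decimal, so output for ordinary folders would look the same as before.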
