How do I search for recurrences of an error string when reading multiple files with Python?

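I've just started using Python and I'm trying to run some tests in my environment. The idea is to create a simple script that finds recurrences of an error within a given period of time.

Basically, I want to count the number of server failures in my daily logs, and if a failure happens more than a given number of times (say 10) within a given period (say 30 days), I should be able to raise an alert in a log. However, I'm not trying to simply count repetitions of the error within the 30-day interval. What I really want is to count the number of times the error occurred, recovered, and occurred again, so that I avoid reporting it more than once when a problem persists for several days.

For example, let's say: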

file_2016_Oct_01.txt@hostname@YES
file_2016_Oct_02.txt@hostname@YES
file_2016_Oct_03.txt@hostname@NO
file_2016_Oct_04.txt@hostname@NO
file_2016_Oct_05.txt@hostname@YES
file_2016_Oct_06.txt@hostname@NO
file_2016_Oct_07.txt@hostname@NO

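For the scenario above, I want the script to interpret it as 2 failures instead of 3 (the total number of YES lines), because sometimes a server may show the same status for several days before recovering, and I want to be able to identify recurrences of the problem rather than just count the total number of failures.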
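To make the counting rule concrete, here is a minimal sketch of it applied to the seven sample lines above. The function name `count_recurrences` is hypothetical, and the sketch assumes the server was healthy before the first record, so a YES on the first day counts as a new failure.

```python
def count_recurrences(statuses):
    """Count how many times a failure starts, i.e. a YES not preceded by a YES."""
    count = 0
    previous = "NO"  # assumption: the server was healthy before the first record
    for status in statuses:
        if status == "YES" and previous != "YES":
            count += 1
        previous = status
    return count


# The sample records from the question, oldest first.
records = [
    "file_2016_Oct_01.txt@hostname@YES",
    "file_2016_Oct_02.txt@hostname@YES",
    "file_2016_Oct_03.txt@hostname@NO",
    "file_2016_Oct_04.txt@hostname@NO",
    "file_2016_Oct_05.txt@hostname@YES",
    "file_2016_Oct_06.txt@hostname@NO",
    "file_2016_Oct_07.txt@hostname@NO",
]
statuses = [record.split("@")[2] for record in records]
print(count_recurrences(statuses))  # prints 2
```

The rule only works if the records are processed in chronological order, since each status is compared with the one from the previous day.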

For the record, this is how I'm going through the files:

import datetime
import glob
import os

# List of history files modified within the last 30 days
history_list = []

# Function to find the files from the last 30 days
def f_findfiles():
    # First define the cut-off day, i.e. how many days back
    # the script will consider for the analysis
    cut_off_day = datetime.datetime.now() - datetime.timedelta(days=30)
    # We'll now loop through all history files from the last 30 days
    for path in glob.iglob("/opt/hc/*.txt"):
        filetime = datetime.datetime.fromtimestamp(os.path.getmtime(path))
        if filetime > cut_off_day:
            history_list.append(path)

# Just included the function below to show how I'm going
# through the files, this is where I got stuck...
def f_openfiles(arg):
    for path in arg:
        # Use a separate name for the file object so it does not
        # shadow the loop variable holding the path
        with open(path, "r") as handle:
            for line in handle:
                clean_line = line.strip().split("@")

# Main function
def main():
    f_findfiles()
    f_openfiles(history_list)

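I open the files using 'with' and read every line of every file inside a 'for' loop, but I'm not sure how to navigate through the data in order to compare the value from one file with the value from older files.

I've tried putting all the data in a dictionary, in a list, or just enumerating and comparing, but I failed with all of these approaches :-(

What would be the best way to do this? Thank you!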
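One possible way to compare each file with the older ones is to process the files oldest first and remember only the last status seen for each host. The sketch below is an illustration under assumptions, not the asker's code: the function name `count_failures_per_host` is hypothetical, every line is assumed to have the form `name@hostname@STATUS` as in the example above, and the modification time (already used in the question's script) is assumed to reflect the chronological order of the files.

```python
import os
from collections import defaultdict


def count_failures_per_host(paths):
    """Return {hostname: number of times a failure started} across the files."""
    last_status = {}             # hostname -> status seen in the previous file
    failures = defaultdict(int)  # hostname -> count of NO-to-YES transitions
    # Assumption: modification time gives the chronological order of the files.
    for path in sorted(paths, key=os.path.getmtime):
        with open(path, "r") as handle:
            for line in handle:
                fields = line.strip().split("@")
                if len(fields) < 3:
                    continue  # skip blank or malformed lines
                hostname, status = fields[1], fields[2]
                # A failure is counted only when it starts, not while it persists.
                if status == "YES" and last_status.get(hostname) != "YES":
                    failures[hostname] += 1
                last_status[hostname] = status
    return dict(failures)
```

With the question's `history_list`, the result could be checked against the threshold, for example `[host for host, n in count_failures_per_host(history_list).items() if n > 10]`. Hosts that never failed do not appear in the returned dictionary.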

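I would rather use shell utilities (e.g. uniq) for this kind of thing, but since you prefer Python:

With minimal effort, you can handle this by creating a dict object with strings (such as 'file_2016_Oct_01.txt@hostname@YES') as keys. Iterate over the log, check whether the corresponding key exists in the dictionary (using if 'file_2016_Oct_01.txt@hostname@YES' in my_log_dict), then assign or increment the dictionary value as appropriate.

A short example: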

data_log = {}
lookup_string = 'foobar'
if lookup_string in data_log:
    data_log[lookup_string] += 1
else:
    data_log[lookup_string] = 1

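Or as a single expression (which looks ugly most of the time in Python; I've edited it to use line breaks for display, wrapped in parentheses so that the line breaks are valid syntax):

data_log[lookup_string] = (data_log[lookup_string] + 1
    if lookup_string in data_log
    else 1)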
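For what it's worth, the same increment-or-initialize pattern can also be written with `dict.get` or with `collections.Counter` from the standard library. This is just a sketch with made-up sample strings; note that, like the examples above, it counts the total number of occurrences of each string, not the number of times a failure recurs after a recovery.

```python
from collections import Counter

# Hypothetical sample lines, only for illustration.
lines = ["foobar", "baz", "foobar"]

# dict.get returns 0 when the key is missing, so no explicit membership test is needed.
data_log = {}
for lookup_string in lines:
    data_log[lookup_string] = data_log.get(lookup_string, 0) + 1

# Counter does the same bookkeeping in one call.
counter_log = Counter(lines)

print(data_log)               # prints {'foobar': 2, 'baz': 1}
print(counter_log["foobar"])  # prints 2
```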
