Merge CSV rows and keep data from a single arbitrary column using Python 2



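I know there are many questions on this topic, but the answers are not well explained, so they are hard to adapt to my use case. One here seemed promising, but the syntax is fairly complex and I'm having trouble understanding and adapting it.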

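I need to convert raw CSV output from Nessus into a standard format, which essentially drops most of the columns and keeps only the severity, IP address, and output of each finding. I put together a script that does just that, but if a finding is on multiple hosts/ports, there is a separate row for each host/port.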

What I need is to merge the rows based on the vulnerability name while keeping the IP address data from every merged row.

Example input - shortened for ease

High,10.10.10.10,MS12-345(this is the name),Hackers can do bad things
High,10.10.10.11,MS12-345(this is the name),Hackers can do bad things

Example output

High,10.10.10.10 10.10.10.11,MS12-345(this is the name),Hackers can do bad things

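Below is my script so far. If you make your answer easy to adapt (read: dumbed down) for future readers, I would greatly appreciate it, and I'm sure they would too.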

Bonus:

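Sometimes the output field differs between findings with the same name, and sometimes it is identical. If you have some time on your hands, why not help a man out by checking for this and appending the differing outputs in the same way as the IP addresses?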

import sys
import csv
def manipulate(inFile):
    with open(inFile, 'rb') as csvFile:
        fileReader = csv.reader(csvFile, dialect='excel')
        # Check for multiple instances of findings and merge the rows
        # This happens when the finding is on multiple hosts/ports
        # YOUR CODE WILL GO HERE (Probably...)
        # Place findings into lists: crits, highs, meds, lows for sorting later
        crits = []
        highs = []
        meds = []
        lows = []
        for row in fileReader:
            if row[3] == "Critical":    
                crits.append(row)
            elif row[3] == "High":
                highs.append(row)
            elif row[3] == "Medium":
                meds.append(row)
            elif row[3] == "Low":
                lows.append(row)
        # Open an output file for writing
        with open('output.csv', 'wb') as outFile: 
            fileWriter = csv.writer(outFile)
            # Add in findings from lists in order of severity. Only relevant columns included
            for c in crits:
                fileWriter.writerow( (c[3], c[4], c[7], c[12]) )
            for h in highs:
                fileWriter.writerow( (h[3], h[4], h[7], h[12]) )
            for m in meds:
                fileWriter.writerow( (m[3], m[4], m[7], m[12]) )
            for l in lows:
                fileWriter.writerow( (l[3], l[4], l[7], l[12]) )

# Input validation
if len(sys.argv) != 2:
    print 'You must provide a csv file to process'
    raw_input('Example: python nesscsv.py foo.csv')
else:
    print "Working..."
    # Store filename for use in manipulate function
    inFile = str(sys.argv[1])
    # Call manipulate function passing csv
    manipulate(inFile)
print "Done!"   
raw_input("Output in output.csv. Hit return to finish.")

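Here's a solution that uses an OrderedDict to collect the rows in a way that preserves their order, while also allowing any row to be looked up by its vulnerability name.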

import sys
import csv
from collections import OrderedDict
def manipulate(inFile):
    with open(inFile, 'rb') as csvFile:
        fileReader = csv.reader(csvFile, dialect='excel')
        # Check for multiple instances of findings and merge the rows
        # This happens when the finding is on multiple hosts/ports
        # Dictionary mapping vulns to merged rows.
        # It's ordered to preserve the order of rows.
        mergedRows = OrderedDict()
        for newRow in fileReader:
            vuln = newRow[7]
            if vuln not in mergedRows:
                # Convert the host and output fields into lists so we can easily
                # append values from rows that get merged with this one.
                newRow[4] = [newRow[4], ]
                newRow[12] = [newRow[12], ]
                # Add row for new vuln to dict.
                mergedRows[vuln] = newRow
            else:
                # Look up existing row for merging.
                mergedRow = mergedRows[vuln]
                # Append values of host and output fields, if they're new.
                if newRow[4] not in mergedRow[4]:
                    mergedRow[4].append(newRow[4])
                if newRow[12] not in mergedRow[12]:
                    mergedRow[12].append(newRow[12])
        # Flatten the lists of host and output field values into strings.
        for row in mergedRows.values():
            row[4] = ' '.join(row[4])
            row[12] = ' // '.join(row[12])
        # Place findings into lists: crits, highs, meds, lows for sorting later
        crits = []
        highs = []
        meds = []
        lows = []
        for row in mergedRows.values():
            if row[3] == "Critical":
                crits.append(row)
            elif row[3] == "High":
                highs.append(row)
            elif row[3] == "Medium":
                meds.append(row)
            elif row[3] == "Low":
                lows.append(row)
        # Open an output file for writing
        with open('output.csv', 'wb') as outFile:
            fileWriter = csv.writer(outFile)
            # Add in findings from lists in order of severity. Only relevant columns included
            for c in crits:
                fileWriter.writerow( (c[3], c[4], c[7], c[12]) )
            for h in highs:
                fileWriter.writerow( (h[3], h[4], h[7], h[12]) )
            for m in meds:
                fileWriter.writerow( (m[3], m[4], m[7], m[12]) )
            for l in lows:
                fileWriter.writerow( (l[3], l[4], l[7], l[12]) )

# Input validation
if len(sys.argv) != 2:
    print 'You must provide a csv file to process'
    raw_input('Example: python nesscsv.py foo.csv')
else:
    print "Working..."
    # Store filename for use in manipulate function
    inFile = str(sys.argv[1])
    # Call manipulate function passing csv
    manipulate(inFile)
    # Only report success when a file was actually processed
    print "Done!"
    raw_input("Output in output.csv. Hit return to finish.")
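
To see the merging step on its own, here is a minimal sketch run against the shortened sample rows from the question. The helper name `merge_rows` and its parameters `key_col` and `separators` are made up for illustration and are not part of the script above. The column indices (1 for the IP address, 2 for the name, 3 for the output) match the shortened sample only; for the real Nessus export the script above uses 4, 7 and 12. It runs on both Python 2.7 and Python 3.

```python
from collections import OrderedDict


def merge_rows(rows, key_col, separators):
    """Merge rows that share the same value in column key_col.

    separators maps a column index to the string used to join the
    distinct values collected for that column. Every other column
    keeps the value from the first row seen for that key.
    (Hypothetical helper, for illustration only.)
    """
    merged = OrderedDict()  # key -> merged row, in first-seen order
    for row in rows:
        key = row[key_col]
        if key not in merged:
            new_row = list(row)  # copy so the input rows are not modified
            for col in separators:
                new_row[col] = [row[col]]
            merged[key] = new_row
        else:
            existing = merged[key]
            for col in separators:
                # Only append values we have not seen for this key yet
                if row[col] not in existing[col]:
                    existing[col].append(row[col])
    # Flatten the collected lists back into plain strings
    result = []
    for row in merged.values():
        for col, sep in separators.items():
            row[col] = sep.join(row[col])
        result.append(row)
    return result


sample = [
    ["High", "10.10.10.10", "MS12-345(this is the name)", "Hackers can do bad things"],
    ["High", "10.10.10.11", "MS12-345(this is the name)", "Hackers can do bad things"],
]

# Join IP addresses with a space and differing outputs with ' // '
result = merge_rows(sample, key_col=2, separators={1: " ", 3: " // "})
```

Because the output text is identical in both sample rows, it appears only once in the merged row, while the two IP addresses are joined with a space. This covers the bonus case as well: outputs that differ would be appended with ' // ' between them.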
