Compare each line in two text files



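I hope I can get some help with this problem:

I have two text files (say File1 and File2) of about 10000 lines each, which come from an FEM analysis. The structure of the files is: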

File1

        ....
     Element           Facet            Node  CNORMF.Magnitude     CNORMF.CNF1     CNORMF.CNF2     CNORMF.CNF3          CPRESS         CSHEAR1         CSHEAR2  CSHEARF.Magnitude    CSHEARF.CSF1    CSHEARF.CSF2    CSHEARF.CSF3
         881               3            6619              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
         881               3            6648              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
         881               3            6653              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
         930               3            6452              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
         930               3            6483              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
         930               3            6488              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        1244               2            7722              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        1244               2            7724              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        1244               2            7754              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        2380               2            3757     304.326E-06    -123.097E-06    -203.689E-06    -189.663E-06     564.697E-06    -281.448E-06     22.5357E-06     152.710E-06     144.843E-06    -26.7177E-06    -40.3387E-06
        2380               2            3826     226.603E-06    -85.9859E-06    -161.270E-06    -133.967E-06     270.594E-06    -134.865E-06     10.7988E-06     117.700E-06     116.217E-06    -4.67318E-06    -18.0298E-06
        2380               2            3848     10.4740E-03    -2.01174E-03    -6.63900E-03    -7.84743E-03     771.739E-06    -384.638E-06     30.7983E-06     5.24148E-03     5.12795E-03    -541.446E-06    -940.251E-06
        2894               2            8253              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        2894               2            8255              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        2894               2            8270              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3372               2            5920              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3372               2            5961     52.7705E-03     12.2948E-03    -40.8019E-03    -31.1251E-03     7.36309E-03    -2.56505E-03    -502.055E-06     18.8167E-03     17.9038E-03     2.12060E-03     5.38774E-03
        3372               2            5996              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3936               3            6782              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3936               3            6852              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3936               3            6857              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3937               4            6410              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3937               4            6452              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3937               4            6488              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3955               2            6940              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3955               2            6941              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3955               2            6993              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        4024               2            8027              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        4024               2            8050              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0. 
        ....

File2

        ....
        Node  COORD.Magnitude     COORD.COOR1     COORD.COOR2     COORD.COOR3     U.Magnitude            U.U1            U.U2            U.U3
           1         131.691         14.5010        -92.2190        -92.8868         1.93638     188.252E-03        -1.64949    -996.662E-03
           2         131.336         10.9038        -92.2281        -92.8663         1.93341     188.250E-03        -1.64672    -995.468E-03
           3         132.130         18.7534        -92.4681        -92.5002         1.93968     188.190E-03        -1.65258    -997.959E-03
           4         130.769         1.97638        -92.5186        -92.3953         1.92580     188.179E-03        -1.63965    -992.387E-03
           5         130.560        -4.04517        -93.1433        -91.3993         1.92030     188.026E-03        -1.63459    -990.122E-03
           6         132.422         24.0768        -93.9662        -90.1454         1.94282     187.819E-03        -1.65564    -999.062E-03
           7         130.377        -8.39503        -94.1640        -89.7827         1.91586     187.774E-03        -1.63054    -988.235E-03
           8         126.321         13.6556        -88.0641        -89.5278         1.93579     192.554E-03        -1.64736    -998.202E-03
           9         125.963         4.31065        -88.6558        -89.3771         1.92786     192.145E-03        -1.64012    -994.852E-03
          10         130.037         3.02359        -94.4877        -89.2894         1.92501     187.692E-03        -1.63909    -991.871E-03
          11         126.692         18.5888        -88.1164        -89.1107         1.93970     192.653E-03        -1.65097    -999.810E-03
          12         125.751        -1.96189        -89.1238        -88.6928         1.92231     192.010E-03        -1.63500    -992.572E-03
          13         125.719        -3.46723        -89.2798        -88.4437         1.92094     191.971E-03        -1.63373    -992.005E-03
          14         130.026         7.42596        -95.0372        -88.4289         1.92818     187.556E-03        -1.64210    -993.086E-03
          15         130.736         16.3557        -95.3755        -87.9092         1.93527     187.472E-03        -1.64873    -995.891E-03
          16         130.251        -12.8122        -95.5572        -87.5783         1.91105     187.430E-03        -1.62618    -986.163E-03
          17         130.250         12.8770        -95.6602        -87.4548         1.93216     187.401E-03        -1.64586    -994.616E-03
          18         125.609        -7.73838        -90.1949        -87.0785         1.91668     191.718E-03        -1.62985    -990.191E-03
          19         124.466        -6.21492        -88.8834        -86.9075         1.91827     192.783E-03        -1.63095    -991.270E-03
          20         126.958         23.9470        -89.5421        -86.7584         1.94289     192.337E-03        -1.65406        -1.00096
          21         121.210         6.64491        -84.7929        -86.3587         1.92993     196.112E-03        -1.64059    -997.316E-03
          22         121.369         12.5781        -84.3620        -86.3434         1.93495     196.450E-03        -1.64514    -999.468E-03 
        ....

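I would like to do the following steps:

  1. Remove the first two columns from File1
  2. Compare the node labels of the two files
  3. Write an output text file in "rpt" format that contains, side by side, the rows with the same node label

This is the code I have used. It seems to work for small files, but for large files it takes a very long time.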

nodEl = open("P:/File1.rpt", "r")
uniNod = open("P:/File2.rpt", "r")
row_nodEl  = nodEl.readlines()
row_uniNod = uniNod.readlines()
nodEl.close()
uniNod.close()
output = open("P:/output.rpt", "w")
for index, line in enumerate(row_nodEl):
    if index > 23081 and index < 40572 and index !=23083 and index !=23084:
        temp  = line.strip()
        temp2 = " ".join(temp.split()) 
        var   = temp2.split(" ",3) 
        for index2, line2 in enumerate(row_uniNod):
            if index2 > 11412 and index2 < 21258 and index2 != 11414 and index2 !=11415: 
                temp3 = line2.strip()
                temp4 = " ".join(temp3.split())
                var2  = temp4.split(" ",1)
                if var[2] == var2[0]:
                    output.write("%s %s %s\n" % (var[2], var[3], var2[1]))
output.close()

Any suggestions are welcome!

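You are comparing every line of one file (with m lines) against every line of the other file (with n lines). This results in a time complexity of O(m*n). For two files of 10000 lines each, that means 100,000,000 comparisons.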
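To make the arithmetic concrete, here is a small sketch that just counts how often the inner loop body runs. The function name `count_nested_comparisons` is made up for this illustration and is not part of the asker's code.

```python
def count_nested_comparisons(m, n):
    """Count how many times the body of two nested loops runs."""
    comparisons = 0
    for _ in range(m):        # outer loop: lines of the first file
        for _ in range(n):    # inner loop: lines of the second file
            comparisons += 1
    return comparisons


# Small sizes keep the demonstration fast; the count is always m * n.
print(count_nested_comparisons(100, 200))  # 20000
print(10_000 * 10_000)                     # 100000000
```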

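You can speed up your code by changing the way you read the values. Consider reading the files into dictionaries instead of lists. Each key in a dictionary would be a node number, and each value the full line.

With this approach, you can do the following:

  1. Load the first file into a dictionary
  2. Load the second file into a dictionary
  3. For each node in the first dictionary, find the corresponding node in the second dictionary

In Python, it would look something like this: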

file_contents_1 = load_file("P:/File1.rpt")
file_contents_2 = load_file("P:/File2.rpt")
for node_label in file_contents_1:
    # Skip processing node which doesn't have corresponding values in the second file
    if node_label not in file_contents_2:
        continue
    # Do something
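One way this could be filled in is sketched below. It is only a sketch under some assumptions: the helper names (`parse_rows`, `join_on_node`, `write_joined`) are invented here, and data rows are recognised simply by having an integer in the node column, rather than by the fixed line-index ranges used in the question. Also note that in the File1 sample the same node can appear under more than one element (6452 is listed for elements 930 and 3937), so a dictionary keyed by node would keep only one of those rows. For that reason the sketch indexes only File2 and streams through File1.

```python
def parse_rows(lines, key_column):
    """Yield (node_label, rest_of_row) for every data row in `lines`.

    Assumption: a data row has an integer in `key_column`; headers,
    blank lines and "...." separators are skipped. Columns before
    the key column are dropped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) <= key_column or not fields[key_column].isdigit():
            continue
        yield fields[key_column], " ".join(fields[key_column + 1:])


def join_on_node(element_lines, node_lines):
    """Pair each File1 row with the File2 row that has the same node label."""
    # Index File2 once: node label -> remaining columns of that row.
    node_values = dict(parse_rows(node_lines, key_column=0))
    joined = []
    # In File1 the node label is the third column (Element, Facet, Node).
    for node, contact_values in parse_rows(element_lines, key_column=2):
        other = node_values.get(node)
        if other is None:
            continue  # node has no row in File2
        joined.append(f"{node} {contact_values} {other}")
    return joined


def write_joined(path1, path2, out_path):
    """Read both report files and write the joined rows to `out_path`."""
    with open(path1) as f1, open(path2) as f2:
        rows = join_on_node(f1, f2)
    with open(out_path, "w") as out:
        out.write("\n".join(rows) + "\n")
```

With the paths from the question, this would be called as `write_joined("P:/File1.rpt", "P:/File2.rpt", "P:/output.rpt")`. If the real reports contain other tables whose rows also start with integers, the row filter would need to be tightened, for example by reinstating the index ranges.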

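The benefit of this approach is that each file is loaded separately in a single pass, so the overall time complexity becomes linear, O(m+n). Looking up the corresponding node in the second file takes constant time on average, because dictionaries are implemented as hash tables.

This will make your code considerably faster.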
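If you want to see the difference in lookup cost for yourself, a rough measurement like the one below can help. The sizes and the variable names are arbitrary choices for this illustration, and the timings will vary from machine to machine, so treat the numbers as indicative only.

```python
import timeit

# 10000 node labels, stored once as a list and once as dictionary keys.
labels = [str(i) for i in range(10_000)]
as_list = labels
as_dict = {label: None for label in labels}

target = "9999"  # last element: the worst case for a linear scan of the list

# `in` on a list scans element by element; `in` on a dict uses a hash lookup.
list_time = timeit.timeit(lambda: target in as_list, number=1_000)
dict_time = timeit.timeit(lambda: target in as_dict, number=1_000)

print(f"list lookup: {list_time:.4f}s  dict lookup: {dict_time:.4f}s")
```

Both lookups give the same answer; only the amount of work needed to find it differs.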
