在字符串中搜索子字符串



嘿,伙计们,我有下面的代码可以在一个大约有70万个字母的文件中搜索子字符串。我相信,它对ArrayList来说很好,但对LinkedList来说,它需要很长时间才能完成。有人知道为什么要花那么多时间吗=S

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.Scanner;
public class CountSubstrings {
    private static int sumAL=0;
    private static int sumLL=0;
    private static List<Character> sAList= new ArrayList<Character>();
    private static List<Character> sLList= new LinkedList<Character>();
    private static List<Character> pattAL= new ArrayList<Character>();
    private static List<Character> pattLL= new LinkedList<Character>();
    private static int index=0;
    private static double timer=0;
    private static double Otimer=0;
    /*
     * Returns the lowest index at which substring pattern begins in text (or
     * else -1).
     */
    private static int findBrute(List<Character> text, List<Character> pattern,  int position) {
        int n = text.size();
        int m = pattern.size();
        for (int i = position; i <= n - m; i++) { // try every starting index 
                                 // within text
            int k = 0; // k is index into pattern
            while (k < m && (text.get(i + k) == pattern.get(k)))
            {   // kth character of pattern matches
                k++;
                if (k == m )
                {   index=i;
                    return i;} // substring text[i..i+m-1] is a match
                }
            }
        return -1; // search failed
    }
    public static void main (String[] args)
    {
        Scanner sc1= new Scanner(System.in);
        Scanner sc2= new Scanner(System.in);
        System.out.print("Please enter the path for the input file: ");
        String fileName= sc1.next();
        System.out.print("Enter the pattern to look for: ");
        String subString= sc2.next();
        for(char c: subString.toCharArray())
        {
            pattAL.add(c);
            pattLL.add(c);
        }
        System.out.println("current time "+System.currentTimeMillis()+" milliseconds");
        try (BufferedReader OpenFile = new BufferedReader(new FileReader(fileName)))
        {
            // file is opened here and we can access everything in there.
            String sSLine;
            String content = new Scanner(new File(fileName)).useDelimiter("\Z").next();
            //System.out.println(content);
     // find int answer line by line not complete
            while ((sSLine = OpenFile.readLine()) != null) {
                sSLine.replace('n', ',');// making sure we add every word alone even when we encounter n
                for(char c: sSLine.toCharArray())
                {
                    sAList.add(c);
                    sLList.add(c);
                }
            }
        } catch (IOException e) 
        {
            e.printStackTrace();
        } 
        //Array List by pointer
        //starting ARRAY LIST
        Otimer=System.currentTimeMillis();
         while(findBrute(sAList,pattAL,index)!=-1)
         {
                index=index+pattAL.size();
                sumAL++;
         }
         timer=System.currentTimeMillis()-Otimer;
         Otimer=System.currentTimeMillis();
         index=0; // resetting the index  OR  we can make 2 other variables indexAL  indexLL  if magic numbers were so bad
         System.out.println("Using ArrayList: "+sumAL+" matches, derived in "+timer+ " milliseconds");
         while(findBrute(sLList,pattLL,index)!=-1)
         {
            System.out.println("index"+index+" char: "+sLList.get(index));
            index=index+pattLL.size();
            //if(sLList.get(index))
            sumLL++;
            System.out.println("index"+index+" char: "+sLList.get(index+1));
         }
         timer=System.currentTimeMillis()-Otimer;
    System.out.println("Using Linked List: matches "+sumLL+" time, derived in "+timer+ " milliseconds");
      }
}

我认为您需要了解链表是如何工作的。链表中的每一项都引用列表中的下一项(在Java的情况下,也引用上一项)。因此,要获取链接列表中特定索引处的项目,必须从列表的任一端遍历所有项目,直到找到正确的索引。相比之下,ArrayList是建立在数组上的,因此可以非常快速地访问任意索引。

让我们看看LinkedList:的文档

所有操作的执行都与双链表所预期的一样索引到列表中的操作将从开始或结束遍历列表,以更接近指定索引的为准。

对于ArrayList:

size、isEmpty、get、set、迭代器和listIterator操作在恒定时间中运行。

在代码中,在findBrute方法的循环中使用get方法。

...                    ↓                     ↓
while (k < m && (text.get(i + k) == pattern.get(k)))
...

以及在main方法中的while循环中:

...                                                ↓
System.out.println("index"+index+" char: "+sLList.get(index));
...

因此,由于链表的工作方式,与ArrayList相比,此代码使用链表将花费更多时间。

相关内容

  • 没有找到相关文章

最新更新