LastIndexOf and java.lang.IndexOutOfBoundsException



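I have the string CCAATA CCGT, and I am trying to get every contiguous subsequence of a fixed length n. I then want to get something like this:

The index range of each subsequence within the string: 0-3, 1-4, 2-5, and so on.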

0 thru 3 : CCAA 
1 thru 4 : CAAT 
2 thru 5 : AATA 
3 thru 6 : ATAC 
4 thru 7 : TACC 
5 thru 8 : ACCG 
6 thru 9 : CCGT 

The list size is 7. Here I am looping over the list and getting the indexOf and lastIndexOf. After 3 thru 6 : ATAC, I get

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 7, Size: 7

for (int i = 0; i < list.size(); i++) {
    System.out.println(ss.indexOf(list.get(i))
            + " thru " + ss.lastIndexOf(list.get(i + n - 1)) + " : "
            + list.get(i));
}

Demo:

import java.util.ArrayList;
public class Subsequences {
    public static void main(String[] args) {
        String s = "CCAATA CCGT";
        ArrayList<String> list = new ArrayList<String>(); // list of subsequence
        int n = 4; // subsequences of length
        String ss = s.replaceAll("\\s+", "");
        String substr = null;
        for (int i = 0; i <= ss.length() - n; i++) {
            substr = ss.substring(i, i + n);
            list.add(substr);
        }
        for (int i = 0; i < list.size(); i++) {
            System.out.println(ss.indexOf(list.get(i)) 
             + " thru " + ss.lastIndexOf(list.get(i + n - 1)) + " : " 
            + list.get(i));
        }
    }
}

Any hints?

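You don't need to add n inside the lastIndexOf call, because each substring is already 4 characters long. Every entry in the List consists of 4 characters. Change the index computation to this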

(ss.lastIndexOf(list.get(i)) + n - 1)

In the end it looks like this

 for (int i = 0; i < list.size(); i++) {
        System.out.println(ss.indexOf(list.get(i))
                + " thru " + (ss.lastIndexOf(list.get(i)) + n - 1) + " : "
                + list.get(i));
    }

Output:

0 thru 3 : CCAA   
1 thru 4 : CAAT   
2 thru 5 : AATA   
3 thru 6 : ATAC   
4 thru 7 : TACC   
5 thru 8 : ACCG  
6 thru 9 : CCGT   
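
One caveat worth hedging here: indexOf reports the first occurrence of a substring and lastIndexOf the last, so the two calls only agree when a window occurs once in the string, as every 4-character window of the sample input does. The sketch below is an illustration, not part of the original answer. The class WindowRanges, the helper describeWindows, and the input "AAAAAA" are all assumptions made for the example. It derives both ends of the range from the loop position instead of searching by content.

```java
import java.util.ArrayList;
import java.util.List;

class WindowRanges {
    // Hypothetical helper: formats every window of length n as "start thru end : text".
    static List<String> describeWindows(String ss, int n) {
        List<String> lines = new ArrayList<>();
        for (int start = 0; start + n <= ss.length(); start++) {
            int end = start + n - 1; // inclusive end index of this window
            lines.add(start + " thru " + end + " : " + ss.substring(start, end + 1));
        }
        return lines;
    }

    public static void main(String[] args) {
        String repeated = "AAAAAA"; // assumed input: every 4-character window is "AAAA"
        // Searching by content: the first occurrence is at 0, the last at 2.
        System.out.println(repeated.indexOf("AAAA") + " vs " + repeated.lastIndexOf("AAAA"));
        // Tracking the position directly keeps start and end consistent.
        for (String line : describeWindows(repeated, 4)) {
            System.out.println(line);
        }
    }
}
```

For the input in the question, describeWindows("CCAATACCGT", 4) should yield the same seven lines as the output listed here.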
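
I believe your problem is in list.get(i + n - 1). You are currently iterating so that the start of each subsequence ranges from 0 to list.size() - 1. The last subsequence that makes sense is the n characters from position list.size() - n to list.size() - 1.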

for (int i = 0; i < list.size() - n; i++) {
    System.out.println(ss.indexOf(list.get(i))
        + " thru " + ss.lastIndexOf(list.get(i + n - 1)) + " : "
        + list.get(i));
}

Remove all whitespace, then loop:

String data = "CCAATA CCGT";
String replaced = data.replaceAll("\\s", "");
for (int i = 0; i < replaced.length() - 4 + 1; i++) {
    System.out.println(replaced.subSequence(i, i + 4));
}

Output:

CCAA
CAAT
AATA
ATAC
TACC
ACCG
CCGT

In your loop

for (int i = 0; i < list.size(); i++) { 
   System.out.println(ss.indexOf(list.get(i)) 
   + " thru " + ss.lastIndexOf(list.get(i + n - 1))
   + " : " + list.get(i));
}

when you call list.get(i + n - 1) and i is 4, the sum is 4 + 4 - 1 = 7. You cannot get a list element at an index equal to or greater than list.size(), so the system throws the exception.
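
The arithmetic above can be checked with a small sketch. The class BoundsCheck and the helper firstFailingIndex are hypothetical names introduced only for this illustration. The helper finds the first i for which i + n - 1 falls outside a list of the given size.

```java
class BoundsCheck {
    // Hypothetical helper: smallest i in [0, size) with i + n - 1 >= size, or -1 if none.
    static int firstFailingIndex(int size, int n) {
        for (int i = 0; i < size; i++) {
            if (i + n - 1 >= size) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int size = 7; // number of subsequences in the question
        int n = 4;    // subsequence length
        int i = firstFailingIndex(size, n);
        System.out.println("first failing i = " + i + ", requested index = " + (i + n - 1));
    }
}
```

With size 7 and n 4 the first failing i is 4 and the requested index is 7. That matches the "Index: 7, Size: 7" message and explains why the last line printed before the exception was the one for i = 3.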

To get the expected result, you can do the following:

import java.util.ArrayList;

public class Subsequences {
    public static void main(String[] args) {
        String s = "CCAATA CCGT";
        ArrayList<String> list = new ArrayList<String>(); // list of subsequences
        int n = 4; // length of each subsequence
        String ss = s.replaceAll("\\s+", "");
        String substr = null;
        for (int i = 0; i <= ss.length() - n; i++) {
            substr = ss.substring(i, i + n);
            list.add(substr);
        }
        // --------Here the edits-------
        // Entry i starts at position i in ss, so it ends at i + n - 1.
        for (int i = 0; i < list.size(); i++) {
            System.out.println(i + " thru " + (i + n - 1) + " : " + list.get(i));
        }
        // -----------------------------
    }
}

You can also do this with a simple regular expression. Remove the whitespace and run this regex:

(?=(.{4}))

Sample:

package com.see;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexTest {
    private static final String TEST_STR = "CCAATA CCGT";
    public ArrayList<String> getMatchedStrings(String input) {
        ArrayList<String> matches = new ArrayList<String>();
        input = input.replaceAll("\\s", "");
        Pattern pattern = Pattern.compile("(?=(.{4}))");
        Matcher matcher = pattern.matcher(input);
        while (matcher.find())
            matches.add(matcher.group(1));
        return matches;
    }
    public static void main(String[] args) {
        RegexTest rt = new RegexTest();
        for (String string : rt.getMatchedStrings(TEST_STR)) {
            System.out.println(string);
        }
    }
}
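
If the index ranges from the question are also needed, the same lookahead can supply them, because Matcher.start() reports where each zero-width match begins. This is a minimal sketch under assumptions. The class RegexRanges and the method indexedMatches are hypothetical names, and the window length is passed as a parameter rather than hard-coded in the pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexRanges {
    // Hypothetical helper: returns "start thru end : text" for each window of length n.
    static List<String> indexedMatches(String input, int n) {
        List<String> lines = new ArrayList<>();
        String cleaned = input.replaceAll("\\s", "");
        // Lookahead that captures the next n characters without consuming them.
        Pattern pattern = Pattern.compile("(?=(.{" + n + "}))");
        Matcher matcher = pattern.matcher(cleaned);
        while (matcher.find()) {
            int start = matcher.start(); // position of the zero-width match
            lines.add(start + " thru " + (start + n - 1) + " : " + matcher.group(1));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : indexedMatches("CCAATA CCGT", 4)) {
            System.out.println(line);
        }
    }
}
```

Because the position comes from the matcher rather than from searching for the text again, this variant stays correct when the same window occurs more than once.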
