C# string array word filter: my array index is out of bounds



I'm having a brain fart... what am I doing wrong? Is my array broken?

public static string CleanBadwordsFromString(string text) {
    string badWords = "bunch,of,words,that,do,not,need,to,be,seen";
    string[] badChars = badWords.Split(',');
    string[] words = text.Split(' ');
    int iLength = 0;
    string sAttachtoEnd = null;
    string cleanedString = "";
    int x = 0;
    int i = 0;
    //loop through our array of bad words
    for (i = 0; i <= badChars.Length; i++)
    {
        //get the length of the bad word
        iLength = badChars[i].Length;
        //we are going to keep the first letter of the bad word and replace all the other
        //letters with *, so we need to find out how many * to use
        for (x = 1; x <= iLength - 1; x++)
        {
            sAttachtoEnd = sAttachtoEnd + "*";
        }
        //replace any occurrences of the bad word with the first letter of it and the
        //rest of the letters replaced with *
        foreach (string s in words)
        {
            cleanedString = cleanedString + s.Replace(s, s.Substring(s.Length - 1) + sAttachtoEnd);  //should be: shit = s***
        }
        sAttachtoEnd = "";
    }
    return cleanedString;
}

I tried running your code with the i < badChars.Length fix. It ran without errors, but the result wasn't what I expected.

I tried running this:

CleanBadwordsFromString("Seen or not seen: Bunch, bunching, or bunched?")

And I got:

n****r****t****:****,****,****r****?****n*r*t*:*,*,*r*?*n****r****t****:****,****,****r****?****n***r***t***:***,***,***r***?***n*r*t*:*,*,*r*?*n**r**t**:**,**,**r**?**n***r***t***:***,***,***r***?***n*r*t*:*,*,*r*?*n*r*t*:*,*,*r*?*n***r***t***:***,***,***r***?***

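Clearly that's not right.

I know your question is about the array index, but I figured you'd want the code to work correctly, so I thought about how to rewrite it. Here's what I came up with: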

public static string CleanBadwordsFromString(string text)
{
    var badWords =
        "bunch,of,words,that,do,not,need,to,be,seen"
            .Split(',').Select(w => w.ToLowerInvariant()).ToArray();
    var query =
        from i in Enumerable.Range(0, text.Length)
        let rl = text.Length - i
        from bw in badWords
        let part = text
            .Substring(i, Math.Min(rl, bw.Length))
        where bw == part.ToLowerInvariant()
        select new
        {
            Index = i,
            Replacement = part
                .Substring(0, 1)
                .PadRight(part.Length, '*')
                .ToCharArray(),
        };
    var textChars = text.ToCharArray();
    foreach (var x in query)
    {
        Array.Copy(
            x.Replacement, 0,
            textChars, x.Index, x.Replacement.Length);
    }
    return new String(textChars);
}

Now my result is:

S*** or n** s***: B****, b****ing, or b****ed?

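That looks good to me.

My approach doesn't rely on splitting on spaces, so it also catches bad words that are followed by punctuation or a suffix. It also works when the source text contains uppercase letters.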
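For comparison, here is a minimal sketch of the same idea using a different technique, Regex.Replace with a case-insensitive alternation instead of the LINQ query above. The class name BadWordMasker and the method name Mask are made up for illustration. One difference to keep in mind: the regex engine scans left to right and returns non-overlapping matches, while the LINQ version tests every start position. For the sample sentence the two agree.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class BadWordMasker
{
    private static readonly string[] BadWords =
        "bunch,of,words,that,do,not,need,to,be,seen".Split(',');

    // Escape each word so it is matched literally, then join the
    // words into a single alternation: bunch|of|words|...
    private static readonly Regex Pattern = new Regex(
        string.Join("|", BadWords.Select(w => Regex.Escape(w))),
        RegexOptions.IgnoreCase);

    public static string Mask(string text)
    {
        // Keep the first matched character (with its original case)
        // and replace the remaining characters with '*'.
        return Pattern.Replace(
            text,
            m => m.Value[0] + new string('*', m.Length - 1));
    }

    public static void Main()
    {
        Console.WriteLine(
            Mask("Seen or not seen: Bunch, bunching, or bunched?"));
        // S*** or n** s***: B****, b****ing, or b****ed?
    }
}
```

Like the LINQ version, this matches bad words anywhere inside longer words. If that turns out to be too aggressive, wrapping the alternation in \b anchors would restrict it to whole words, at the cost of no longer catching suffixed forms such as "bunching".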

for (i = 0; i <= badChars.Length; i++) // Only < and not <=

The condition should simply be i < badChars.Length. If an array has length n, its valid indices run from 0 to n-1.

If the array has length 5, the last pass through the loop runs with i = 5 and tries to read index 5, which doesn't exist.

iLength = badChars[i].Length;  // 5 <= 5 => true. But valid index is from 0 to 4

That is what throws the out-of-bounds exception on your array (an IndexOutOfRangeException in C#).
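The bounds rule above can be sketched as a small standalone program. The class name LoopBoundsDemo and the method TotalLength are invented for this example, and the array is a shortened version of the one in the question.

```csharp
using System;

// Hypothetical demo class, only meant to illustrate the loop bounds.
public static class LoopBoundsDemo
{
    // Sums the lengths of all strings, visiting indices 0 .. items.Length - 1.
    public static int TotalLength(string[] items)
    {
        int total = 0;
        // '<' stops at the last valid index; with '<=' the final pass
        // would read items[items.Length] and throw IndexOutOfRangeException.
        for (int i = 0; i < items.Length; i++)
        {
            total += items[i].Length;
        }
        return total;
    }

    public static void Main()
    {
        string[] badChars = "bunch,of,words".Split(',');
        Console.WriteLine(TotalLength(badChars)); // 5 + 2 + 5 = 12

        // foreach sidesteps index arithmetic entirely.
        foreach (string word in badChars)
        {
            Console.WriteLine(word);
        }
    }
}
```

When the index itself isn't needed, foreach is usually the safer choice, since there is no bound to get wrong.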
