Error with multithreaded WebRequests in C#



I'm trying to make WebRequests from multiple threads, but if I use more than 2 threads I get the error

Index was outside the bounds of the array

on this line:

string username = ScrapeBox1.Lines[NamesCounter].ToString();

Here is the code:

while (working)
{
    while (usernamescount > NamesCounter)
    {
        string username = ScrapeBox1.Lines[NamesCounter].ToString();
        string url = "http://www.someforum.com/members/" + username + ".html";
        var request = (HttpWebRequest)(WebRequest.Create(url));
        var response = request.GetResponse();
        request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; rv:16.0) Gecko/20100101 Firefox/16.0";
        using (var responseStream = response.GetResponseStream())
        {
            using (var responseStreamReader = new StreamReader(responseStream))
            {
                var serverResponse = responseStreamReader.ReadToEnd();
                int startpoint = serverResponse.IndexOf("Contact Info</span>");
                try
                {
                    string strippedResponse = serverResponse.Remove(0, startpoint);
                    ExtractEmails(strippedResponse);
                }
                catch { }

            }
        }
        NamesCounter++;
        textBox1.Text = NamesCounter.ToString();
    }
}

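This code is not thread-safe. Every thread reads and increments the shared NamesCounter, so one thread can pass the usernamescount > NamesCounter check and then index past the end of the array after another thread has incremented the counter.

The code that performs the HttpWebRequest needs to be atomic, and it needs to live outside the context of the loop that iterates through the collection.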

For example:

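// Requires: using System.IO; using System.Net;
public void MakeHttpWebRequest(string userName)
{
    string url = "http://www.someforum.com/members/" + userName + ".html";
    var request = (HttpWebRequest)WebRequest.Create(url);
    // Set headers before GetResponse(): the request is sent when GetResponse() is called.
    request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; rv:16.0) Gecko/20100101 Firefox/16.0";
    // Dispose the response, stream and reader so the connection is released.
    using (var response = request.GetResponse())
    using (var responseStream = response.GetResponseStream())
    using (var responseStreamReader = new StreamReader(responseStream))
    {
        var serverResponse = responseStreamReader.ReadToEnd();
        int startpoint = serverResponse.IndexOf("Contact Info</span>");
        // IndexOf returns -1 when the marker is missing; skip those pages
        // instead of letting Remove throw and silently swallowing the exception.
        if (startpoint >= 0)
        {
            string strippedResponse = serverResponse.Remove(0, startpoint);
            ExtractEmails(strippedResponse);
        }
    }
}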

Assuming ScrapeBox1.Lines implements IEnumerable, I would use Parallel.ForEach and pass ScrapeBox1.Lines as the IEnumerable to iterate over.
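A minimal sketch of that iteration, assuming the lines are available as a string[] snapshot (for a Windows Forms TextBox, Lines is a string[]) read on the UI thread before the loop starts. The class name, the names array, and ProcessUser are hypothetical stand-ins so the sketch runs without a form or network access; in the real program the loop body would call MakeHttpWebRequest(name).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelScrapeSketch
{
    private static int processed;

    // Hypothetical stand-in for MakeHttpWebRequest(userName): it only counts calls.
    public static void ProcessUser(string userName)
    {
        Interlocked.Increment(ref processed);
    }

    public static int Run(string[] names)
    {
        processed = 0;

        // Cap the number of concurrent workers; 4 is an arbitrary illustrative value.
        var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };

        // Each name is handed to exactly one worker, so there is no shared index
        // to read or increment and no way to run past the end of the array.
        Parallel.ForEach(names, options, name => ProcessUser(name));

        // Parallel.ForEach returns only after every item has been processed.
        return processed;
    }

    public static void Main()
    {
        // In the real form this would be a snapshot such as ScrapeBox1.Lines.
        string[] names = { "alice", "bob", "carol" };
        Console.WriteLine(Run(names));
    }
}
```

Because the loop no longer touches NamesCounter, the index check and the increment that raced in the original code disappear entirely.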

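Now, there is one other problem: the code that reads the HttpWebRequest response still needs to write its output to a shared location, and it has to do so in a thread-safe way. A common approach is mutual exclusion with a lock. You need an object that every thread can access. A class-level private variable, private object sharedMutex = new object();, will work. Then the code ExtractEmails(strippedResponse); should be changed to lock(sharedMutex) { ExtractEmails(strippedResponse); }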
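The locking pattern can be sketched as follows. The real ExtractEmails is not shown in the question, so the EmailCollector class, its emails list, and the ExtractEmails body here are hypothetical stand-ins that simply record their input; only the sharedMutex field and the lock around the call come from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class EmailCollector
{
    // One lock object shared by every worker thread that uses this instance.
    private readonly object sharedMutex = new object();

    // Hypothetical shared output location; List<string> is not thread-safe on its own.
    private readonly List<string> emails = new List<string>();

    // Hypothetical stand-in for the real ExtractEmails: it just records its input.
    private void ExtractEmails(string strippedResponse)
    {
        emails.Add(strippedResponse);
    }

    public void HandleResponse(string strippedResponse)
    {
        // Only one thread at a time may run the code inside the lock.
        lock (sharedMutex)
        {
            ExtractEmails(strippedResponse);
        }
    }

    public int Count
    {
        get { lock (sharedMutex) { return emails.Count; } }
    }

    public static void Main()
    {
        var collector = new EmailCollector();

        // Simulate many workers writing results concurrently.
        Parallel.For(0, 1000, i => collector.HandleResponse("user" + i + "@example.com"));

        Console.WriteLine(collector.Count);
    }
}
```

The lock serializes only the call to ExtractEmails, so the HTTP requests themselves can still run in parallel.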

Without the code for the ExtractEmails(<string>) method I can't provide a thread-safe implementation of it, so this solution may still cause problems.
