How to improve foreach loop performance in C#

class Student
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
}

class Program
{
    private static object _lockObj = new object();

    static void Main(string[] args)
    {
        List<int> collection = Enumerable.Range(1, 100000).ToList();

        List<Student> students = new List<Student>(100000);
        var options = new ParallelOptions()
        {
            MaxDegreeOfParallelism = 1
        };
        var sp = System.Diagnostics.Stopwatch.StartNew();
        sp.Start();
        Parallel.ForEach(collection, options, action =>
        {
            lock (_lockObj)
            {
                var dt = collection.FirstOrDefault(x => x == action);
                if (dt > 0)
                {
                    Student student = new Student();
                    student.ID = dt;
                    student.Name = "Zoyeb";
                    student.Email = "ShaikhZoyeb@Gmail.com";
                    students.Add(student);
                    Console.WriteLine(@"value of i = {0}, thread = {1}",
                        action, Thread.CurrentThread.ManagedThreadId);
                }
            }
        });
        sp.Stop();
        double data = Convert.ToDouble(sp.ElapsedMilliseconds / 1000);
        Console.WriteLine(data);
    }
}

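I want to loop through 100,000 records as fast as possible. I tried a plain foreach loop, but it does not cope well with 100,000 records, so I then tried Parallel.ForEach() to improve performance. In the real scenario I will have a collection of IDs, and I need to check whether each ID exists in the collection and, if it does, add it. The performance hit is in the condition: with the condition commented out, execution takes about 3 seconds; with it uncommented, it takes about 24 seconds. So my question is: is there any way to improve performance when looking up an ID in the collection?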

//var dt = collection.FirstOrDefault(x => x == action);
//if (dt > 0)
//{
    Student student = new Student();
    student.ID = 1;
    student.Name = "Zoyeb";
    student.Email = "ShaikhZoyeb@Gmail.com";
    students.Add(student);
    Console.WriteLine(@"value of i = {0}, thread = {1}",
        action, Thread.CurrentThread.ManagedThreadId);
//}

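Your original code takes a lock inside Parallel.ForEach. That essentially takes parallel code and forces it to run serially.

It takes 40 seconds on my machine.

It is effectively equivalent to doing this: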

foreach (var action in collection)
{
    var dt = collection.FirstOrDefault(x => x == action);
    if (dt > 0)
    {
        Student student = new Student();
        student.ID = dt;
        student.Name = "Zoyeb";
        student.Email = "ShaikhZoyeb@Gmail.com";
        students.Add(student);
    }
}

That also takes 40 seconds.

However, if you just do this:

foreach (var action in collection)
{
    Student student = new Student();
    student.ID = action;
    student.Name = "Zoyeb";
    student.Email = "ShaikhZoyeb@Gmail.com";
    students.Add(student);
}

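That takes 1 millisecond to run. It is roughly 40,000 times faster.

In this case you get a much faster loop by iterating the collection once, rather than in a nested fashion, and by not using Parallel.ForEach.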
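
To make the cost of the nested version concrete, here is a minimal sketch that counts the equality checks a FirstOrDefault-style linear scan performs when every element of a list is looked up in that same list. The class and method names (ComparisonCount, NestedScan) are made up for illustration. For the values 1..n in order, finding the value k takes k checks, so the total is n(n+1)/2, which is about 5 billion for n = 100,000.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ComparisonCount
{
    // Counts the equality checks made when each element of the list
    // is searched for in that same list with a linear scan.
    public static long NestedScan(List<int> items)
    {
        long comparisons = 0;
        foreach (var wanted in items)
        {
            foreach (var candidate in items)
            {
                comparisons++;
                if (candidate == wanted)
                {
                    break; // like FirstOrDefault, stop at the first match
                }
            }
        }
        return comparisons;
    }

    static void Main()
    {
        var items = Enumerable.Range(1, 1000).ToList();
        Console.WriteLine(NestedScan(items)); // n(n+1)/2 = 500500 for n = 1000
    }
}
```

The single-pass loop, by contrast, touches each of the n elements exactly once.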

My apologies for missing the part about the ID possibly not existing.

Try this:

HashSet<int> hashSet = new HashSet<int>(collection);
List<Student> students = new List<Student>(100000);
var sp = System.Diagnostics.Stopwatch.StartNew();
sp.Start();
foreach (var action in collection)
{
    if (hashSet.Contains(action))
    {
        Student student = new Student();
        student.ID = action;
        student.Name = "Zoyeb";
        student.Email = "ShaikhZoyeb@Gmail.com";
        students.Add(student);
    }
}
sp.Stop();

It runs in 3 milliseconds.
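
In the snippet above the HashSet is built from the same collection that is iterated, so every lookup succeeds. As a rough sketch of the case the question describes, where some IDs may not exist, the version below checks a separate sequence of incoming IDs against the set. The incoming range (50001 to 150000) and the names HashSetLookupDemo and BuildStudents are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Student
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
}

static class HashSetLookupDemo
{
    // Returns one Student for each incoming ID that is present in knownIds.
    public static List<Student> BuildStudents(IEnumerable<int> incomingIds, HashSet<int> knownIds)
    {
        var students = new List<Student>();
        foreach (var id in incomingIds)
        {
            // HashSet<T>.Contains is a hash lookup, not a scan of the whole set.
            if (knownIds.Contains(id))
            {
                students.Add(new Student
                {
                    ID = id,
                    Name = "Zoyeb",
                    Email = "ShaikhZoyeb@Gmail.com"
                });
            }
        }
        return students;
    }

    static void Main()
    {
        var knownIds = new HashSet<int>(Enumerable.Range(1, 100000));
        // Hypothetical input: IDs 50001..150000, so only the first half exist.
        var incomingIds = Enumerable.Range(50001, 100000);
        var students = BuildStudents(incomingIds, knownIds);
        Console.WriteLine(students.Count); // 50000
    }
}
```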

Another option is to use a join, like this:

foreach (var action in
    from c in collection
    join dt in collection on c equals dt
    select dt)
{
    Student student = new Student();
    student.ID = action;
    student.Name = "Zoyeb";
    student.Email = "ShaikhZoyeb@Gmail.com";
    students.Add(student);
}

It runs in 25 milliseconds.

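Issue 1

You are using a lock inside the parallel foreach rather than a concurrent collection. This forces each iteration to wait for the lock, so the iterations execute one at a time.

Change your list to a ConcurrentBag and remove the lock from Parallel.ForEach: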

// using System.Collections.Concurrent; // at the top
var students = new ConcurrentBag<Student>();
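
Putting that together, a minimal sketch of the loop without the lock might look like the following. The names ConcurrentBagDemo and BuildStudents are made up for illustration, and the ID lookup is left out so that only the collection change is shown. Note that a ConcurrentBag does not preserve insertion order, and for work this cheap the parallel version is not guaranteed to beat a plain foreach.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class Student
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
}

static class ConcurrentBagDemo
{
    // Creates one Student per ID in parallel. ConcurrentBag<T>.Add is
    // thread-safe, so no explicit lock is needed around it.
    public static ConcurrentBag<Student> BuildStudents(IEnumerable<int> ids)
    {
        var students = new ConcurrentBag<Student>();
        Parallel.ForEach(ids, id =>
        {
            students.Add(new Student
            {
                ID = id,
                Name = "Zoyeb",
                Email = "ShaikhZoyeb@Gmail.com"
            });
        });
        return students;
    }

    static void Main()
    {
        var students = BuildStudents(Enumerable.Range(1, 100000));
        Console.WriteLine(students.Count); // 100000
    }
}
```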

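Issue 2

If you want to select by ID, FirstOrDefault() does not perform well. Use a dictionary instead, because a dictionary's hash lookup is much faster than the linear scan FirstOrDefault performs. See this question.

Change your collection to a dictionary: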

var collection = Enumerable.Range(1, 100000)
    .ToDictionary(x => x);

Change the access in the loop to:

if (collection.TryGetValue(action, out var dt))
{
    //....
}
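
A dictionary pays off most when the value is a whole record rather than the key itself. As a rough sketch under that assumption, the example below indexes Student objects by ID and fetches them with TryGetValue. The names DictionaryLookupDemo and FindNames and the generated email addresses are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Student
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
}

static class DictionaryLookupDemo
{
    // Returns the names of the students whose IDs are present in the index.
    public static List<string> FindNames(Dictionary<int, Student> byId, IEnumerable<int> wantedIds)
    {
        var names = new List<string>();
        foreach (var id in wantedIds)
        {
            // TryGetValue does a single hash lookup and reports a miss
            // through its return value instead of throwing.
            if (byId.TryGetValue(id, out var student))
            {
                names.Add(student.Name);
            }
        }
        return names;
    }

    static void Main()
    {
        // Build the index once, keyed by ID.
        Dictionary<int, Student> byId = Enumerable.Range(1, 100000)
            .Select(i => new Student { ID = i, Name = "Student" + i, Email = i + "@example.com" })
            .ToDictionary(s => s.ID);

        // 100001 is not in the index, so it is skipped.
        var names = FindNames(byId, new[] { 1, 100000, 100001 });
        Console.WriteLine(string.Join(", ", names)); // Student1, Student100000
    }
}
```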

Issue 3

Stopwatch is not a benchmarking tool. Use BenchmarkDotNet or another benchmarking library instead.
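
As an illustration, a benchmark comparing the two lookups could be sketched roughly as follows. This assumes the BenchmarkDotNet NuGet package is referenced and the project is built in Release mode. The class name LookupBenchmarks and the probed value 99999 are arbitrary choices for the example, not something from the question.

```csharp
// Assumes the BenchmarkDotNet NuGet package is installed.
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class LookupBenchmarks
{
    private List<int> _list = new List<int>();
    private HashSet<int> _set = new HashSet<int>();

    // Runs once before the benchmarks, so setup cost is not measured.
    [GlobalSetup]
    public void Setup()
    {
        _list = Enumerable.Range(1, 100000).ToList();
        _set = new HashSet<int>(_list);
    }

    // Linear scan: walks the list until the value is found.
    [Benchmark(Baseline = true)]
    public int ListFirstOrDefault() => _list.FirstOrDefault(x => x == 99999);

    // Hash lookup on the same data.
    [Benchmark]
    public bool HashSetContains() => _set.Contains(99999);
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<LookupBenchmarks>();
}
```

BenchmarkDotNet handles warm-up and repeated iterations and prints a summary table, which avoids most of the noise in a single Stopwatch measurement.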
