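I have a list of Tuple<string,string> objects and I want to remove duplicates, where, for example, the tuples (a,b) and (b,a) are considered the same (these are the edges of a graph). What is a good way to do this?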
You need to create a comparer that compares tuples in such a way that the order of the items doesn't matter:
public class UnorderedTupleComparer<T> : IEqualityComparer<Tuple<T, T>>
{
    private IEqualityComparer<T> comparer;

    public UnorderedTupleComparer(IEqualityComparer<T> comparer = null)
    {
        this.comparer = comparer ?? EqualityComparer<T>.Default;
    }

    public bool Equals(Tuple<T, T> x, Tuple<T, T> y)
    {
        return comparer.Equals(x.Item1, y.Item1) && comparer.Equals(x.Item2, y.Item2) ||
               comparer.Equals(x.Item1, y.Item2) && comparer.Equals(x.Item2, y.Item1);
    }

    public int GetHashCode(Tuple<T, T> obj)
    {
        return comparer.GetHashCode(obj.Item1) ^ comparer.GetHashCode(obj.Item2);
    }
}
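Note that exclusive or (XOR) of the hash codes is an operation that yields the same result regardless of the order of its operands, which is why it is desirable here (in most hash code algorithms it is not, because order-independence is usually an unwanted property). As for Equals, you only need to check the two possible pairings.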
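To make the point about XOR concrete, here is a minimal sketch showing that the combined hash does not depend on the order of the items. The names `XorDemo` and `UnorderedHash` are made up for this illustration and are not part of the answer's code. It also shows a side effect worth knowing about: a pair whose two items are equal always hashes to 0, which does not affect correctness but means all such pairs land in the same bucket.

```csharp
using System;

public static class XorDemo
{
    // Hypothetical helper: combines two hash codes so that argument order does not matter.
    public static int UnorderedHash(string first, string second)
    {
        return StringComparer.Ordinal.GetHashCode(first) ^ StringComparer.Ordinal.GetHashCode(second);
    }

    public static void Main()
    {
        // XOR is commutative, so both orderings give the same value.
        Console.WriteLine(UnorderedHash("a", "b") == UnorderedHash("b", "a")); // True

        // x ^ x is always 0, so a self-loop such as (a, a) hashes to 0.
        Console.WriteLine(UnorderedHash("a", "a")); // 0
    }
}
```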
Once you have that, you can do:
var query = data.Distinct(new UnorderedTupleComparer<string>());
You may need to create a class that implements IEqualityComparer<Tuple<string, string>>:
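public class TupleComparer : IEqualityComparer<Tuple<string, string>>
{
    public bool Equals(Tuple<string, string> x, Tuple<string, string> y)
    {
        if (ReferenceEquals(x, y))
        {
            return true;
        }
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null))
        {
            return false;
        }
        // (a, b) is equal to (b, a) ...
        if (string.Equals(x.Item1, y.Item2) && string.Equals(x.Item2, y.Item1))
        {
            return true;
        }
        // ... and to (a, b).
        return string.Equals(x.Item1, y.Item1) && string.Equals(x.Item2, y.Item2);
    }

    public int GetHashCode(Tuple<string, string> tuple)
    {
        // The original answer left this body unimplemented. One possible choice
        // (an assumption, not the author's code) is XOR, which gives the same
        // value for (a, b) and (b, a), as required when Equals treats them as equal.
        if (tuple == null)
        {
            return 0;
        }
        int h1 = tuple.Item1 == null ? 0 : tuple.Item1.GetHashCode();
        int h2 = tuple.Item2 == null ? 0 : tuple.Item2.GetHashCode();
        return h1 ^ h2;
    }
}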
You can then use the LINQ Distinct() method as shown below:
List<Tuple<string, string>> list = new List<Tuple<string, string>> { Tuple.Create("a", "b"), Tuple.Create("a", "c"), Tuple.Create("b", "a") };
var result = list.Distinct(new TupleComparer());
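Try using a dictionary and composing a key that represents each tuple. Do you have a character that will never appear in your strings and can be used as a separator? In this example I chose ":":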
static void Main(string[] args)
{
    // original list of data
    var list = new List<Tuple<string, string>> { };
    list.Add(new Tuple<string, string>("a", "b"));
    list.Add(new Tuple<string, string>("b", "a"));

    // dictionary to hold unique tuples
    var dict = new Dictionary<string, Tuple<string, string>>();
    foreach (var item in list)
    {
        var key1 = string.Concat(item.Item1, ":", item.Item2);
        var key2 = string.Concat(item.Item2, ":", item.Item1);
        // if dict doesn't contain the tuple in either order, add it.
        if (!dict.ContainsKey(key1) && !dict.ContainsKey(key2))
            dict.Add(key1, item);
    }

    // print unique tuples
    foreach (var item in dict)
    {
        var tuple = item.Value;
        Console.WriteLine(string.Concat(tuple.Item1, ":", tuple.Item2));
    }
    Console.ReadKey();
}
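A variation on the dictionary approach above, offered as a sketch rather than a drop-in replacement: if the two items are put into a fixed order before the key is built, (a,b) and (b,a) produce the same key and a single lookup is enough. The names `EdgeKeyDemo`, `MakeKey` and `Deduplicate` are invented for this example, and it still assumes that ":" never appears in the strings.

```csharp
using System;
using System.Collections.Generic;

public static class EdgeKeyDemo
{
    // Hypothetical helper: builds the same key for (a, b) and (b, a)
    // by always placing the ordinally smaller string first.
    public static string MakeKey(Tuple<string, string> edge)
    {
        return string.CompareOrdinal(edge.Item1, edge.Item2) <= 0
            ? edge.Item1 + ":" + edge.Item2
            : edge.Item2 + ":" + edge.Item1;
    }

    // Keeps the first occurrence of each unordered pair, in input order.
    public static List<Tuple<string, string>> Deduplicate(IEnumerable<Tuple<string, string>> edges)
    {
        var seen = new HashSet<string>();
        var unique = new List<Tuple<string, string>>();
        foreach (var edge in edges)
        {
            // HashSet<T>.Add returns false when the key is already present.
            if (seen.Add(MakeKey(edge)))
                unique.Add(edge);
        }
        return unique;
    }

    public static void Main()
    {
        var list = new List<Tuple<string, string>>
        {
            Tuple.Create("a", "b"),
            Tuple.Create("b", "a"),
            Tuple.Create("a", "c")
        };
        foreach (var edge in Deduplicate(list))
            Console.WriteLine(edge.Item1 + ":" + edge.Item2);
        // prints a:b, then a:c
    }
}
```

Because the original tuple is stored rather than the normalized key, the output keeps whichever orientation appeared first in the input.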
To keep the original elements, use group by instead of Distinct, so that we can still access the first element of each group:
Live code: https://dotnetfiddle.net/LYZItb
using System;
using System.Collections.Generic;
using System.Linq;

public class Program
{
    static List<Tuple<string, string>> myList = new List<Tuple<string, string>>()
    {
        Tuple.Create<string, string>("B", "A"),
        Tuple.Create<string, string>("A", "B"), // duplicate
        Tuple.Create<string, string>("C", "B"),
        Tuple.Create<string, string>("C", "B"), // duplicate
        Tuple.Create<string, string>("A", "D"),
        Tuple.Create<string, string>("E", "F"),
        Tuple.Create<string, string>("F", "E"), // duplicate
    };

    public static void Main()
    {
        var result =
            from y in
                from x in myList
                select new { Original = x, SortedPair = new[] { x.Item1, x.Item2 }.OrderBy(s => s).ToArray() }
            group y by new { NormalizedTuple = Tuple.Create<string,string>(y.SortedPair[0], y.SortedPair[1]) } into grp
            select new { Pair = grp.Key.NormalizedTuple, Original = grp.First().Original };

        foreach(var item in result)
        {
            Console.WriteLine("Pair: {0} {1}", item.Original.Item1, item.Original.Item2);
        }
    }
}
Output:
Pair: B A
Pair: C B
Pair: A D
Pair: E F
Live code: https://dotnetfiddle.net/LUErFj
Sort each tuple pair first, then apply Distinct:
using System;
using System.Collections.Generic;
using System.Linq;

public class Program
{
    static List<Tuple<string, string>> myList = new List<Tuple<string, string>>()
    {
        Tuple.Create<string, string>("A", "B"),
        Tuple.Create<string, string>("B", "A"), // duplicate
        Tuple.Create<string, string>("C", "B"),
        Tuple.Create<string, string>("C", "B"), // duplicate
        Tuple.Create<string, string>("A", "D")
    };

    public static void Main()
    {
        myList
            .Select(x => new[] { x.Item1, x.Item2 }.OrderBy(s => s).ToArray())
            .Select(x => Tuple.Create<string,string>(x[0], x[1]))
            .Distinct()
            .Dump();
    }
}
Output:
Dumping object(System.Linq.<DistinctIterator>d__81`1[Tuple`2[String,String]])
[
{
Item1 : A
Item2 : B
ToString(): (A, B)
},
{
Item1 : B
Item2 : C
ToString(): (B, C)
},
{
Item1 : A
Item2 : D
ToString(): (A, D)
}
]