我有一个代码,它在数据表中搜索特定字符串,并根据特定条件(唯一id字段)返回整个数据表的子集。下面的示例代码在数据表中搜索"First"one_answers"Employee",只返回具有相同l_id值的条目。
//Get all Id's that satisfy all conditions:
List<int> results = dtResult.AsEnumerable()
// Get all Id's:
.Select(dataRow => dataRow.Field<int>("l_id"))
// Filter the Id's :
.Where(id =>
// the Id should be greater than one.
id > 1
// and the datatable rows with this Id should have a record with W_Text = First
&& dtResult.AsEnumerable().Any(dataRow => dataRow.Field<string>("W_Text") == "First" && dataRow.Field<int>("l_id") == id)
// and the datatable rows with this Id should have a record with W_Text = Employee
&& dtResult.AsEnumerable().Any(dataRow => dataRow.Field<string>("W_Text") == "Employee" && dataRow.Field<int>("l_id") == id))
.Distinct().ToList();
// Get all datatable rows filtered by the list of Id's.
dtCopy = dtResult.AsEnumerable().Where(dataRow => results.Contains((dataRow.Field<int>("l_id")))).CopyToDataTable();
现在,如果我已经动态地获得了搜索文本,那么我该如何进行呢?对于例如,在上面的代码中,如果我必须通过"First"、"Employee"one_answers"salary",那么我该如何处理?
如何使其通用?
提前感谢您的帮助!
编辑-看起来我在帖子中还不清楚,所以让我再次改写
我有一个数据表,看起来像这个
l | t | r | b | x | y | w_text | l_id
-------------------------------------------------------------
70 | 314 | 141 | 328 | 1 | 5 | First | 4
149 | 318 | 194 | 328 | 2 | 5 | Employe| 4
204 | 311 | 254 | 326 | 3 | 5 | John | 4
264 | 311 | 325 | 326 | 4 | 5 | Smith | 4
1924 | 310 | 2000 | 329 | 5 | 5 | First | 5
70 | 341 | 109 | 355 | 1 | 6 | step | 5
115 | 340 | 130 | 355 | 2 | 6 | of | 5
136 | 340 | 175 | 355 | 3 | 6 | Linq | 5
185 | 339 | 320 | 356 | 4 | 6 | Last | 6
70 | 394 | 101 | 411 | 1 | 8 | Employe| 6
114 | 390 | 199 | 405 | 2 | 8 | John | 6
210 | 390 | 269 | 405 | 3 | 8 | Doe | 6
我手头上唯一的搜索条件是"W_Text"。所以我想搜索一个独特的短语"第一个员工"。只有一个l_id(在本例中为l_id=4)同时包含单词"First"one_answers"Employee"。如果我分别搜索"First"one_answers"Employee",那么我会得到一个更大的数据集,这并不能解决我的目的。我的目标是在搜索"第一员工"时获得以下唯一数据集
l | t | r | b | x | y | w_text | l_id
-------------------------------------------------------------
70 | 314 | 141 | 328 | 1 | 5 | First | 4
149 | 318 | 194 | 328 | 2 | 5 | Employe| 4
204 | 311 | 254 | 326 | 3 | 5 | John | 4
264 | 311 | 325 | 326 | 4 | 5 | Smith | 4
在SQL术语中,这类似于
Select * From Table where l_id in (Select l_id from Table where W_Text in ('First','Employee') group by l_id having count(l_id) > 1)
我提到的上面的代码(得到了一位好心人的帮助)运行得很好,并返回了上面的数据集。问题是它只适用于"第一员工"。我在搜索"林克的第一步"时遇到了麻烦。搜索短语在运行时传递给程序,并且可以是有多少单词。我曾尝试分离Wheres,但"have"条件未命中,整个数据集再次返回到该条件。
因此,我请求大家帮助我解决这个问题。我是林克的新手,我正在努力克服困难。在此期间,如果我能得到任何帮助,我们将不胜感激。谢谢
---EDIT使用这个代码(在某人的帮助下)
List<string> wTextFilter = new List<string>();
foreach (string sf in strInputString.Split(' ')) //array by splitting on white space
{
wTextFilter.Add(sf);
}
// Get all Id's that satisfy all conditions:
List<int> results = dtResult.AsEnumerable()
// Get all Id's:
.Select(dataRow => dataRow.Field<int>("l_id"))
// Filter the Id's :
.Where(id =>
// the Id should be greater than one.
id > 1 &&
// check if all W_Text entries has a record in the datatable with the same Id.
wTextFilter.All(W_Text => dtResult.AsEnumerable().Any(dataRow => dataRow.Field<string>("W_Text") == W_Text && dataRow.Field<int>("l_id") == id)))
.Distinct().ToList();
// Get all datatable rows filtered by the list of Id's.
dtCopy = dtResult.AsEnumerable().Where(dataRow => results.Contains((dataRow.Field<int>("l_id")))).CopyToDataTable();
您的代码很少有问题:
-
您在筛选之前选择了ID。这意味着您最终会得到一个
int
的集合,这意味着您无法按另一列进行筛选。您应该过滤,然后选择所需的列 -
您不需要最后检查列
l_id
。我们已经检查了l_id == id
,所以很明显该列存在 -
您当前的查询不正确。如果数据集中的任何行匹配,则返回该行:
dtResult.AsEnumerable().Any(dataRow => dataRow.Field<string>("W_Text") == "First" && dataRow.Field<int>("l_id") == id)
这意味着,对于每一行,检查是否有任何行匹配。如果是,则返回该行。您的查询将返回整个数据集,或者什么也不返回。
您可以链接.Where()
子句。例如:
public List<int> DoIt(int id, params string[] searchFor)
{
var results = dtResult.AsEnumerable()
// Filter the Id's :
.Where(dr => dr.id > 1)
.Where(dr => dr.Field<int>("l_id") == id);
foreach (var sf in searchFor)
results = results.Where(dr => dr.Field<string>("W_Text") == sf);
results = results.Select(dataRow => dataRow.Field<int>("l_id"))
return results.Distinct().CopyToDataTable();
}
从基本子句开始:
results = results.Where(id => id > 1);
然后根据需要动态添加子句:
if (/**some condition**/)
results = results.Where(id => dtResult.AsEnumerable().Any(dataRow => dataRow.Field<string>("W_Text") == "First" && dataRow.Field<int>("l_id") == id));
if (/**another condition**/)
results = results.Where(id => dtResult.AsEnumerable().Any(dataRow => dataRow.Field<string>("W_Text") == "Employee" && dataRow.Field<int>("l_id") == id))
等等。对于要动态添加的每个条件,添加一个新的.Where()
子句。您可以随心所欲地链接其中的任意多个子句,从逻辑上讲,它与单个.Where()
中的一系列&&
子句相同。
以下是更接近SQL子查询的内容:
var q = dtResult.AsEnumerable() // from Table
.Where(r => new[] { "First", "Employee" }.Contains(r["W_Text"])) // where W_Text in ('First','Employee')
.GroupBy(r => (int)r["l_id"]) // group by l_id
.Where(g => g.Count() > 1) // having count(l_id) > 1
.Select(g => g.Key); // Select l_id
这里有一个更高效的版本:
var words = new HashSet<string> { "First", "Employee" }; // optional HashSet instead of List for a bit faster .Contains
int iId = dtResult.Columns.IndexOf("l_id");
int iText = dtResult.Columns.IndexOf("W_Text");
var iRows = dtResult.Rows.Cast<DataRow>(); // a bit faster than dtResult.AsEnumerable()
var results = new HashSet<int>( // optional HashSet instead of List for faster .Contains
iRows
.Where(r => words.Contains(r[iText])) // filter the rows that contain the words
.ToLookup(r => (int)r[iId]) // group by l_id
.Where(g => g.Count() >= words.Count) // filter the groups that contain all words
.Select(g => g.Key) // select l_id
);
var dtCopy = iRows.Where(r => results.Contains((int)r[iId])).CopyToDataTable(); // InvalidOperationException if no DataRows
但如果所有数据都已经在DataTable中,那么您可以按l_id
分组,并获得包含所有单词的组:
string[] words = { "First", "Employee" };
int iId = dtResult.Columns.IndexOf("l_id");
int iText = dtResult.Columns.IndexOf("W_Text");
var iRows = dtResult.Rows.Cast<DataRow>();
var idGroups = iRows.ToLookup(r => (int)r[iId]); // group by id
var result = idGroups.Where(g => !words.Except(g.Select(r => r[iText] as string)).Any());
var dtCopy = result.SelectMany(g => g).CopyToDataTable();