In my C# program I am trying to search the HTML input for all nodes whose id starts with searchResult1, searchResult2, and so on up to searchResult10. This is my code:
var results = hdoc.DocumentNode
    .Descendants("div")
    .Where(x => x.Attributes.Contains("id") &&
                x.Attributes["id"].Value.Contains("searchResult"))
    .ToList();
for (int i = 0; i < results.Count; i++)
{
    rawdata[i] = results[i].InnerHtml.Trim();
}
My HTML looks like this:
<div id="searchResultTable" class="searchReturnData"> some junk html
<li id="searchResult1" class="searchResult searchResultsData_OFF"> searchResult1 html </li>
<li id="searchResult2" class="searchResult searchResultsData_OFF">searchResult2 html </li>
<li id="searchResult3" class="searchResult searchResultsData_OFF">searchResult3 html </li>
</div>
I want to print only the HTML of searchResult1, searchResult2, and searchResult3, not the junk HTML. How can I do that?
Thanks, Rashmi
If you can use HtmlAgilityPack to parse the HTML, you can do it like this:
HtmlDocument doc = new HtmlDocument();
doc.Load(@"C:file.html");
var root = doc.DocumentNode;
var a_nodes = root.Descendants("li")
    .Where(c => c.GetAttributeValue("id", "").Contains("searchResult"))
    .ToList();
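Note that Contains("searchResult") would also match an id such as searchResultTable if it ever appeared on an li. If you only want ids of the form searchResult followed by a number, a stricter filter may help. The following is a minimal sketch, not a definitive implementation: it assumes the markup from the question is available as a string (inlined here so the example is self-contained), and the names html and idPattern are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Sample markup copied from the question (assumption: in real code
        // this would come from your own HTML input).
        const string html = @"<div id=""searchResultTable"" class=""searchReturnData""> some junk html
<li id=""searchResult1"" class=""searchResult searchResultsData_OFF""> searchResult1 html </li>
<li id=""searchResult2"" class=""searchResult searchResultsData_OFF"">searchResult2 html </li>
<li id=""searchResult3"" class=""searchResult searchResultsData_OFF"">searchResult3 html </li>
</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Accept only ids that are "searchResult" followed by one or more digits,
        // so "searchResultTable" is rejected.
        var idPattern = new Regex(@"^searchResult\d+$");

        var rawdata = doc.DocumentNode
            .Descendants("li")
            .Where(li => idPattern.IsMatch(li.GetAttributeValue("id", "")))
            .Select(li => li.InnerHtml.Trim())
            .ToList();

        // Print the inner HTML of each matching li, one per line.
        foreach (var item in rawdata)
        {
            Console.WriteLine(item);
        }
    }
}
```

Collecting into a List with Select also avoids having to size the rawdata array up front, which the loop in the question depends on.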