Author: 乔克斯

[Tutorial] [Parsing HTML nodes] Parsing HTML with C#

jefferic posted on 2014-9-26 14:41:39
This post was last edited by jefferic on 2014-9-26 20:43

Using HTML Agility Pack together with ScrapySharp makes this more convenient.
ScrapySharp lets you write selectors in a jQuery-like style:
[C#]
using System;
using System.Collections.Generic;
using HtmlAgilityPack;
using ScrapySharp.Extensions; // CssSelect, CssSelectAncestors
using ScrapySharp.Network;    // ScrapingBrowser

// Download the page and load it into an HTML Agility Pack document
Uri uri = new Uri("http://www.baidu.com");
ScrapingBrowser browser1 = new ScrapingBrowser();
string htmlStr = browser1.DownloadString(uri);
HtmlAgilityPack.HtmlDocument htmlDocument = new HtmlAgilityPack.HtmlDocument();
htmlDocument.LoadHtml(htmlStr);

HtmlNode html = htmlDocument.DocumentNode;
IEnumerable<HtmlNode> nodes = html.CssSelect("div");  // all div elements
//nodes = html.CssSelect("div.content"); // all div elements with CSS class 'content'
//nodes = html.CssSelect("div.widget.monthlist"); // all div elements with both CSS classes 'widget' and 'monthlist'
//nodes = html.CssSelect("#postPaging"); // all HTML elements with the id 'postPaging'
//nodes = html.CssSelect("div#postPaging.testClass"); // all div elements with the id 'postPaging' and CSS class 'testClass'
//nodes = html.CssSelect("div.content > p.para"); // p elements with CSS class 'para' that are direct children of div elements with CSS class 'content'
//nodes = html.CssSelect("input[type=text].login"); // text inputs with CSS class 'login'
//nodes = html.CssSelect("p.para").CssSelectAncestors("div.content > div.widget"); // ancestors of p.para elements that match 'div.content > div.widget'
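
Once you have the nodes, you will usually want their text or attribute values. The following is only a minimal sketch of how that might look. The sample markup, the class name CssSelectDemo and the href value are made up for illustration and are not from the original post. It parses an in-memory string instead of downloading a page, so it needs only the HtmlAgilityPack and ScrapySharp packages.

```csharp
using System;
using HtmlAgilityPack;
using ScrapySharp.Extensions; // provides the CssSelect extension method

class CssSelectDemo // hypothetical class name, for illustration only
{
    static void Main()
    {
        // Hypothetical sample markup used in place of a downloaded page.
        string sampleHtml =
            "<div class=\"content\">" +
            "<p class=\"para\">first</p>" +
            "<p class=\"para\">second</p>" +
            "<a href=\"/next\">next page</a>" +
            "</div>";

        HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(sampleHtml);
        HtmlNode root = doc.DocumentNode;

        // Print the text of each p.para that is a direct child of div.content.
        foreach (HtmlNode p in root.CssSelect("div.content > p.para"))
        {
            Console.WriteLine(p.InnerText);
        }

        // Print the href of each link; the empty string is the fallback
        // returned when the attribute is missing.
        foreach (HtmlNode a in root.CssSelect("a"))
        {
            Console.WriteLine(a.GetAttributeValue("href", ""));
        }
    }
}
```

CssSelect returns an IEnumerable<HtmlNode>, so the result can be passed to foreach or to LINQ operators in the same way.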


