An example method for scraping web page content in C#

Date: 2021-11-16 Source: Internet Editor: 宝哥软件园

This example scrapes the news section of sina.com.

View the page source in Google Chrome. Analysis of the source shows that the content we want lies between the following two markers, which are HTML comments:

    <!--publish_helper name='要闻-新闻' p_id='1' t_id='850' d_id='1'-->
    content...
    <!--publish_helper name='要闻-财经' p_id='30' t_id='98' d_id='1'-->

Create a website project in Visual Studio (VS).

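The network data is downloaded mainly with the WebClient class. The following code fetches the selected content:

    // Requires: using System; using System.Net; using System.Text;
    protected void Enter_Click(object sender, EventArgs e)
    {
        WebClient we = new WebClient(); // WebClient does the actual download
        byte[] myDataBuffer;
        myDataBuffer = we.DownloadData(txtURL.Text); // DownloadData returns a byte array, hence the byte[]
        // Decode the downloaded bytes; Encoding.Default is the system's default encoding
        string download = Encoding.Default.GetString(myDataBuffer);
        // From the page source, the news content lies between these two markers
        int startIndex = download.IndexOf("<!--publish_helper name='要闻-新闻' p_id='1' t_id='850' d_id='1'-->");
        int endIndex = download.IndexOf("<!--publish_helper name='要闻-财经' p_id='30' t_id='98' d_id='1'-->");
        if (startIndex < 0 || endIndex < startIndex)
        {
            lblMessage.Text = "News markers not found"; // IndexOf returns -1 when a marker is missing
            return;
        }
        // Extract the news content; the + 1 also takes the first character of the end marker
        string temp = download.Substring(startIndex, endIndex - startIndex + 1);
        lblMessage.Text = temp; // display the extracted news content
    }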
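The marker search can also be factored into a small reusable helper. The following is a minimal sketch, not code from the original article: the class name MarkerExtractor and the method ExtractBetween are hypothetical names chosen for illustration. Unlike the handler above, it returns only the text strictly between the two markers and returns null when either marker is missing.

```csharp
using System;

public static class MarkerExtractor
{
    // Hypothetical helper (an assumption, not from the original article):
    // returns the text strictly between startMarker and endMarker,
    // or null if either marker cannot be found.
    public static string ExtractBetween(string html, string startMarker, string endMarker)
    {
        if (html == null) throw new ArgumentNullException(nameof(html));

        // Ordinal comparison: the markers are literal markup, not linguistic text.
        int start = html.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0) return null;

        // Skip past the start marker so it is not part of the result.
        int contentStart = start + startMarker.Length;

        // Only look for the end marker after the start marker.
        int end = html.IndexOf(endMarker, contentStart, StringComparison.Ordinal);
        if (end < 0) return null;

        return html.Substring(contentStart, end - contentStart);
    }
}
```

In the click handler, the downloaded string and the two publish_helper comments would be passed in as the three arguments, and a null result would be treated as "markers not found".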

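Finally, the downloaded data does not have to be handled as text only; it can also be saved to a file or read as a stream.

Saving to a file:

    WebClient wc = new WebClient();
    wc.DownloadFile(TextBox1.Text, @"f:\test.txt"); // download the URL straight to a local file
    Label1.Text = "File download complete";

Reading as a stream (requires using System.IO; and using System.Net;):

    WebClient wc = new WebClient();
    using (Stream s = wc.OpenRead(TextBox1.Text)) // open a readable stream for the URL
    using (StreamReader sr = new StreamReader(s))
    {
        Label1.Text = sr.ReadToEnd(); // read the whole response as text
    }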
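When reading a stream as text, the encoding can be passed to StreamReader explicitly instead of relying on its default. The sketch below illustrates this under stated assumptions: StreamText and ReadAll are hypothetical names, and in the test setting a MemoryStream stands in for the network stream returned by OpenRead so the code can run without network access.

```csharp
using System.IO;
using System.Text;

public static class StreamText
{
    // Hypothetical helper (an assumption, not from the original article):
    // reads the entire stream as text using the given encoding.
    public static string ReadAll(Stream source, Encoding encoding)
    {
        // Disposing the reader also closes the underlying stream.
        using (var reader = new StreamReader(source, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With WebClient, the stream from wc.OpenRead(url) would be passed as the first argument, along with whichever encoding the target page actually uses.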
