Using HtmlAgilityPack in VB.Net to Get Text from a Website

Source: Stack Overflow, http://stackoverflow.com/questions/31528980/. This content is licensed under CC BY-SA 4.0; if you reuse it, you must keep the same license and attribute it to the original authors.

Tags: vb.net, html-agility-pack

Asked by RockGuitarist1

I am writing a program for my girlfriend: when she opens it, it should automatically fetch her daily horoscope from a horoscope website and display that text in a TextBox.

Right now it displays the entire HTML source of the page, which is not what I want. This is the HTML element that I need to grab:

<div class="fontdef1" style="padding-right:10px;" id="textline">
You might have the desire for travel, perhaps to visit a friend who lives far away, Gemini. You may actually set the wheels in motion to make it happen. Social events could take up your time this evening, and you could meet some interesting people. A friend might need a sympathetic ear. Today you're especially sensitive to others, so be prepared to hear a sad story. Otherwise, your day should go well.
</div>

This is the code that I have so far:

Imports System.Net
Imports System.IO
Imports HtmlAgilityPack

Public Class Form1

    Private Function getHTML(ByVal Address As String) As String
        Dim rt As String = ""

        Dim wRequest As WebRequest
        Dim wResponse As WebResponse

        Dim SR As StreamReader

        wRequest = WebRequest.Create(Address)
        wResponse = wRequest.GetResponse

        SR = New StreamReader(wResponse.GetResponseStream)

        rt = SR.ReadToEnd
        SR.Close()

        Return rt
    End Function

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Label2.Text = Date.Now.ToString("MM/dd/yyyy")
        TextBox1.Text = getHTML("http://my.horoscope.com/astrology/free-daily-horoscope-gemini.html")
    End Sub
End Class

Thank you for any help that I can get. I honestly have no idea where to go with this program now. It's been 3 days with no progress.

Accepted answer by har07

Learn XPath or LINQ to pull specific information out of an HTML document with HtmlAgilityPack. Here is a console application example that uses an XPath selector:

Imports System
Imports System.Xml
Imports HtmlAgilityPack

Public Module Module1
    Public Sub Main()
        Dim link As String = "http://my.horoscope.com/astrology/free-daily-horoscope-gemini.html"
        ' Download the page at the link into an HtmlDocument
        Dim doc As HtmlDocument = New HtmlWeb().Load(link)
        ' Select the first <div> whose class attribute equals "fontdef1"
        Dim div As HtmlNode = doc.DocumentNode.SelectSingleNode("//div[@class='fontdef1']")
        ' If the div is found, print its inner text
        If div IsNot Nothing Then
            Console.WriteLine(div.InnerText.Trim())
        End If
    End Sub
End Module

Dotnetfiddle Demo

Output:

You might have the desire for travel, perhaps to visit a friend who lives far away, Gemini. You may actually set the wheels in motion to make it happen. Social events could take up your time this evening, and you could meet some interesting people. A friend might need a sympathetic ear. Today you're especially sensitive to others, so be prepared to hear a sad story. Otherwise, your day should go well.
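
The answer mentions LINQ but only shows XPath, and the question already downloads the page as a string. As a rough sketch of both points, the module below parses an HTML string instead of loading a URL, and extracts the text once with XPath (by the id="textline" attribute from the question's snippet) and once with LINQ (by the class "fontdef1"). The names HoroscopeExtractor, ExtractByXPath, ExtractByLinq and CleanText are made up for this illustration, and the sample HTML is a shortened stand-in for the real page, whose markup may have changed since the question was asked.

```vbnet
Imports System
Imports System.Linq
Imports HtmlAgilityPack

' Hypothetical helper module; none of these names come from the original post.
Public Module HoroscopeExtractor

    ' XPath variant: select the <div> by its id attribute
    ' (id="textline" in the HTML shown in the question).
    Public Function ExtractByXPath(html As String) As String
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)
        Dim div As HtmlNode = doc.DocumentNode.SelectSingleNode("//div[@id='textline']")
        Return CleanText(div)
    End Function

    ' LINQ variant: enumerate all <div> descendants and take the first one
    ' whose class attribute is exactly "fontdef1".
    Public Function ExtractByLinq(html As String) As String
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)
        Dim div As HtmlNode = doc.DocumentNode.Descendants("div").
            FirstOrDefault(Function(d) d.GetAttributeValue("class", "") = "fontdef1")
        Return CleanText(div)
    End Function

    ' Returns the node's text with HTML entities decoded and surrounding
    ' whitespace trimmed, or an empty string when the node was not found.
    Private Function CleanText(node As HtmlNode) As String
        If node Is Nothing Then Return String.Empty
        Return HtmlEntity.DeEntitize(node.InnerText).Trim()
    End Function

    Public Sub Main()
        ' Inline sample modeled on the question's snippet, so no network access is needed.
        Dim sample As String =
            "<html><body><div class=""fontdef1"" style=""padding-right:10px;"" id=""textline"">" &
            " Otherwise, your day should go well. </div></body></html>"
        Console.WriteLine(ExtractByXPath(sample))
        Console.WriteLine(ExtractByLinq(sample))
    End Sub
End Module
```

In the question's form, something like TextBox1.Text = ExtractByXPath(getHTML(url)) inside Form1_Load would probably be enough, reusing the existing getHTML function. Returning an empty string when the element is missing is just one possible choice; showing an error message instead may suit a desktop program better.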
