Search Web Content with C#

Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original address and author information, and attribute it to the original authors (not me): StackOverflow, original address: http://stackoverflow.com/questions/537214/

Time: 2020-08-04 07:15:09  Source: igfitidea

c#

Asked by localhost

How do you search a website's source code with C#? It's hard to explain, so here's the source for doing it in Python:

import urllib2, re
word = "How to ask"
source = urllib2.urlopen("http://stackoverflow.com").read()
if re.search(word,source):
     print "Found it "+word

Answered by Canavar

Here is the code for getting the HTML of a page; you can add your search method afterwards:

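// Requires: using System.IO; using System.Net;
string url = "http://someurl.com/default.aspx";
WebRequest webRequest = WebRequest.Create(url);
string source;

// Dispose the response and the reader once the body has been read.
using (WebResponse response = webRequest.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    source = reader.ReadToEnd();
}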

Hope this helps.
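
The search step left open above could be sketched as follows. This is a minimal sketch and not part of the original answer: the `PageSearch` class and its method names are hypothetical, and the hard-coded `source` string stands in for the HTML read from the response stream. `Regex.IsMatch` treats the word as a pattern, which is how the `re.search` call in the question behaves; `Regex.Escape` is applied when the word should be matched as plain text.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names here are illustrative, not from the answer.
static class PageSearch
{
    // Treats 'pattern' as a regular expression, like Python's re.search.
    public static bool MatchesPattern(string source, string pattern)
    {
        return Regex.IsMatch(source, pattern);
    }

    // Treats 'word' as plain text by escaping regex metacharacters first.
    public static bool ContainsLiteral(string source, string word)
    {
        return Regex.IsMatch(source, Regex.Escape(word));
    }
}

class Program
{
    static void Main()
    {
        // Stand-in for the 'source' string read from the response stream.
        string source = "<html><body><a href=\"/help\">How to ask</a> (C#)</body></html>";

        if (PageSearch.MatchesPattern(source, "How to ask"))
            Console.WriteLine("Found it How to ask");

        // As a pattern, "(C#)" would be a capture group, so escape it.
        if (PageSearch.ContainsLiteral(source, "(C#)"))
            Console.WriteLine("Found it (C#)");
    }
}
```

If the search word never contains regex metacharacters, `source.Contains(word)` gives the same result with less overhead.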

Answered by Wolfwyrd

If you want to access the raw HTML from a web page you need to do the following:

  1. Use an HttpWebRequest to connect to the file
  2. Open the connection and read the response stream into a string
  3. Search the response for your content

So the code looks something like:

// Requires: using System.IO; using System.Net;
string pageContent = null;
HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create("http://example.com/page.html");

// Dispose the response as well as the reader once the body has been read.
using (HttpWebResponse myres = (HttpWebResponse)myReq.GetResponse())
using (StreamReader sr = new StreamReader(myres.GetResponseStream()))
{
    pageContent = sr.ReadToEnd();
}

if (pageContent.Contains("YourSearchWord"))
{
    //Found It
}
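
The three steps above can also be wrapped in a small helper that reports a failed request instead of letting the exception escape. This is a minimal sketch under stated assumptions: `PageFetcher`, `TryDownload`, and `ContainsWord` are hypothetical names, and the URL is the placeholder from the answer. `GetResponse` throws a `WebException` when the server cannot be reached or returns an error status, so that is the only exception the sketch catches.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper; the names here are illustrative, not from the answer.
static class PageFetcher
{
    // Steps 1 and 2: returns true and fills 'content' on success,
    // or false if the request fails with a WebException.
    public static bool TryDownload(string url, out string content)
    {
        content = null;
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                content = reader.ReadToEnd();
                return true;
            }
        }
        catch (WebException ex)
        {
            Console.WriteLine("Request failed: " + ex.Message);
            return false;
        }
    }

    // Step 3: a plain, case-sensitive substring search.
    public static bool ContainsWord(string content, string word)
    {
        return content != null && content.Contains(word);
    }
}

class Program
{
    static void Main()
    {
        string content;
        if (PageFetcher.TryDownload("http://example.com/page.html", out content)
            && PageFetcher.ContainsWord(content, "YourSearchWord"))
        {
            Console.WriteLine("Found it");
        }
    }
}
```

Keeping the search in its own method means it can be checked against a fixed string without any network access.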

Answered by JohannesH

I guess this is as close as you'll get in C# to your Python code.

using System;
using System.Net;

class Program
{
    static void Main()
    {
        string word = "How to ask";
        string source = (new WebClient()).DownloadString("http://stackoverflow.com/");
        if(source.Contains(word))
            Console.WriteLine("Found it " + word);
    }
}

I'm not sure whether re.search(#, #) is case-sensitive. If it's not, you could use...

if(source.IndexOf(word, StringComparison.InvariantCultureIgnoreCase) > -1)

instead.
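
For reference, Python's `re.search` is case-sensitive unless the `re.IGNORECASE` flag is passed, so `Contains` matches its default behavior. The difference between the two checks can be sketched on a fixed string. This is only an illustration: the `CaseSearch` class is a hypothetical name and the sample HTML is made up, standing in for the downloaded page.

```csharp
using System;

// Hypothetical helper; the names here are illustrative, not from the answer.
static class CaseSearch
{
    // Case-sensitive check, as in the program above.
    public static bool ContainsExact(string source, string word)
    {
        return source.Contains(word);
    }

    // Case-insensitive check using the IndexOf overload from the answer.
    public static bool ContainsIgnoreCase(string source, string word)
    {
        return source.IndexOf(word, StringComparison.InvariantCultureIgnoreCase) > -1;
    }
}

class Program
{
    static void Main()
    {
        // Made-up sample standing in for the downloaded page source.
        string source = "<a href=\"/help\">How To Ask</a>";
        string word = "How to ask";

        Console.WriteLine(CaseSearch.ContainsExact(source, word));      // False: "To" vs "to"
        Console.WriteLine(CaseSearch.ContainsIgnoreCase(source, word)); // True
    }
}
```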
