Search Web Content with C#
Original question: http://stackoverflow.com/questions/537214/
Notice: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by localhost
How do you search a website's source code with C#? It's hard to explain, so here is the code for doing it in Python:
import urllib2, re
word = "How to ask"
source = urllib2.urlopen("http://stackoverflow.com").read()
if re.search(word,source):
    print "Found it "+word
Answered by Canavar
Here is the code for getting the HTML source of a page; you can add your search method later:
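// Requires: using System.IO; using System.Net;
string url = "http://someurl.com/default.aspx";
WebRequest webRequest = WebRequest.Create(url);
string source;
// Dispose the response and the reader once the body has been read.
using (WebResponse response = webRequest.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    source = reader.ReadToEnd();
}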
Hope this helps.
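The search step that this answer leaves open could be sketched as a small helper that takes the downloaded HTML as a string. The names `PageSearch` and `ContainsText` are hypothetical, chosen only for illustration, and the sample HTML stands in for the string returned by `reader.ReadToEnd()`. This is one possible approach under those assumptions, not the answerer's own code.

```csharp
using System;

static class PageSearch
{
    // Hypothetical helper: returns true if 'text' occurs anywhere in 'source'.
    // Ordinal comparison compares raw characters, which suits HTML markup.
    public static bool ContainsText(string source, string text, bool ignoreCase = false)
    {
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(text))
            return false;

        StringComparison mode = ignoreCase
            ? StringComparison.OrdinalIgnoreCase
            : StringComparison.Ordinal;
        return source.IndexOf(text, mode) >= 0;
    }
}

class Program
{
    static void Main()
    {
        // Stand-in for the page source downloaded with WebRequest.
        string source = "<html><body><a href=\"/help\">How to ask</a></body></html>";

        if (PageSearch.ContainsText(source, "How to ask"))
            Console.WriteLine("Found it");
    }
}
```

Passing `true` as the third argument makes the search ignore case, which may be useful when the exact capitalization on the page is not known.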
Answered by Wolfwyrd
If you want to access the raw HTML of a web page, you need to do the following:
- Use an HttpWebRequest to connect to the page
- Open the connection and read the response stream into a string
- Search the response for your content
So the code looks something like:
// Requires: using System.IO; using System.Net;
string pageContent = null;
HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create("http://example.com/page.html");
// Dispose the response and the reader once the body has been read.
using (HttpWebResponse myres = (HttpWebResponse)myReq.GetResponse())
using (StreamReader sr = new StreamReader(myres.GetResponseStream()))
{
    pageContent = sr.ReadToEnd();
}
if (pageContent.Contains("YourSearchWord"))
{
    // Found it
}
Answered by JohannesH
I guess this is as close as you'll get in C# to your Python code.
using System;
using System.Net;

class Program
{
    static void Main()
    {
        string word = "How to ask";
        string source = (new WebClient()).DownloadString("http://stackoverflow.com/");
        if (source.Contains(word))
            Console.WriteLine("Found it " + word);
    }
}
I'm not sure whether re.search(#, #) is case-sensitive. If it's not, you could use...
if(source.IndexOf(word, StringComparison.InvariantCultureIgnoreCase) > -1)
instead.
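Python's re.search treats its first argument as a regular expression and is case-sensitive unless re.IGNORECASE is passed, so a closer counterpart to the Python original might use System.Text.RegularExpressions.Regex. The following is a minimal sketch; the class name `RegexSearch`, its method names, and the sample string are assumptions for illustration, with the string standing in for the downloaded page.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexSearch
{
    // Treats 'pattern' as a regular expression, as re.search does in Python.
    // Note the argument order of Regex.IsMatch: input first, pattern second.
    public static bool Matches(string source, string pattern, bool ignoreCase = false)
    {
        RegexOptions options = ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;
        return Regex.IsMatch(source, pattern, options);
    }

    // Searches for 'word' literally by escaping regex metacharacters first.
    public static bool MatchesLiteral(string source, string word, bool ignoreCase = false)
    {
        return Matches(source, Regex.Escape(word), ignoreCase);
    }
}

class Program
{
    static void Main()
    {
        // Stand-in for the string returned by DownloadString.
        string source = "<title>How to ask</title><p>v1x0</p>";

        Console.WriteLine(RegexSearch.Matches(source, "how to ask"));        // False
        Console.WriteLine(RegexSearch.Matches(source, "how to ask", true));  // True
        Console.WriteLine(RegexSearch.Matches(source, "v1.0"));              // True: '.' matches any character
        Console.WriteLine(RegexSearch.MatchesLiteral(source, "v1.0"));       // False: '.' is matched literally
    }
}
```

If the search word is plain text rather than a pattern, escaping it (or using Contains or IndexOf as shown earlier) avoids surprises when it contains characters such as `.` or `?`.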