Searching through a webpage with Python
Original source: http://stackoverflow.com/questions/4925966/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Searching through webpage
Asked by AustinM
Hey, I'm working on a Python project that requires me to look through a webpage. I want to search the page for a specific text and, if the text is found, print something out. If not, it should print an error message. I've already tried different modules such as libxml, but I can't figure out how to do it.
Could anybody lend some help?
Accepted answer by dplouffe
You could do something simple like:
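import re
import urllib.request  # Python 3 replacement for urllib2

# Download the page and decode the bytes so the regex can search text
response = urllib.request.urlopen('http://www.domain.com')
html_content = response.read().decode('utf-8', errors='replace')
# findall returns a list with every match of the pattern
matches = re.findall('regex of string to find', html_content)
if len(matches) == 0:
    print('I did not find anything')
else:
    print('My string is in the html')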
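One thing to keep in mind with a regex over the raw HTML is that it also matches text inside tags, attributes, and scripts. Here is a rough sketch of a standard-library alternative, assuming you only care about the visible text. The names `TextExtractor` and `page_contains` are made up for illustration, and the `sample` string stands in for a downloaded page.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def page_contains(html, needle):
    """Return True if `needle` occurs in the collected text of `html`."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return needle in " ".join(parser.chunks)


# Hypothetical sample page; in practice this would be the downloaded HTML.
sample = (
    "<html><body><p class='price'>Total: 42</p>"
    "<script>var price = 1;</script></body></html>"
)
print(page_contains(sample, "Total: 42"))  # text inside <p>
print(page_contains(sample, "var price"))  # text that only appears in <script>
```

A plain `in` test is used instead of a regex here because the question only asks whether a specific text is present; `re.search` would drop in the same way if a pattern is needed.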
Answer by Bassdread
lxml is awesome: http://lxml.de/parsing.html
I use it regularly with xpath for extracting data from the html.
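As a rough illustration of the lxml plus XPath approach, here is a small sketch that finds the elements whose own text contains a given string. It assumes lxml is installed; the helper name `elements_containing` and the `sample` page are invented for the example and are not from the answer.

```python
from lxml import html  # third-party package: pip install lxml


def elements_containing(page_source, needle):
    """Return elements that have a direct text node containing `needle`."""
    tree = html.fromstring(page_source)
    # $needle is an XPath variable, filled in from the keyword argument
    return tree.xpath("//*[text()[contains(., $needle)]]", needle=needle)


# Hypothetical sample page; in practice this would be the downloaded HTML.
sample = "<html><body><h1>Report</h1><p>Status: OK</p></body></html>"
hits = elements_containing(sample, "Status")
if hits:
    print("Found in:", [el.tag for el in hits])
else:
    print("I did not find anything")
```

Passing the search string as an XPath variable avoids having to escape quotes inside the expression.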
The other option is BeautifulSoup (http://www.crummy.com/software/BeautifulSoup/), which is great as well.
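For comparison, a similar check with BeautifulSoup might look like the sketch below. It assumes the `beautifulsoup4` package is installed and uses Python's built-in `html.parser` backend; the helper name `text_contains` and the `sample` page are invented for the example.

```python
from bs4 import BeautifulSoup  # third-party package: pip install beautifulsoup4


def text_contains(page_source, needle):
    """Return True if `needle` occurs in the text extracted from the page."""
    soup = BeautifulSoup(page_source, "html.parser")
    # get_text() joins the document's text nodes; " " is the separator
    # and strip=True trims whitespace around each piece
    return needle in soup.get_text(" ", strip=True)


# Hypothetical sample page; in practice this would be the downloaded HTML.
sample = "<html><body><h1>Report</h1><p>Status: <b>OK</b></p></body></html>"
if text_contains(sample, "Status: OK"):
    print("My string is in the html")
else:
    print("I did not find anything")
```

Because the text is extracted first, the match still works when the phrase is split across inline tags such as `<b>`.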

