Python: searching through a webpage

Note: this content is from StackOverflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/4925966/

Date: 2020-08-18 18:10:17  Source: igfitidea

Searching through webpage

Tags: python, search, text, find, webpage

Asked by AustinM

Hey, I'm working on a Python project that requires me to look through a webpage. I want to search the page for a specific piece of text; if the text is found, the program prints something out. If not, it prints an error message. I've already tried different modules such as libxml, but I can't figure out how to do it.

Could anybody lend some help?

Accepted answer by dplouffe

You could do something simple like:


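import re
import urllib.request  # Python 3 replacement for the Python 2-only urllib2

# Download the page and decode the bytes to text (UTF-8 is assumed here)
html_content = urllib.request.urlopen('http://www.domain.com').read().decode('utf-8', errors='replace')

# findall returns a list of every non-overlapping match in the raw markup
matches = re.findall('regex of string to find', html_content)

if not matches:
    print('I did not find anything')
else:
    print('My string is in the html')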
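
A regex over the raw page source also matches text inside tags and scripts. If only the visible text should count, a rough standard-library sketch could look like the one below. The names `VisibleTextExtractor`, `page_contains`, and `sample` are made up for illustration, and the inline HTML string stands in for a downloaded page.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element
        if self._skip_depth == 0:
            self.chunks.append(data)


def page_contains(html, needle):
    """Return True if needle occurs in the visible text (case-insensitive)."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so text split across tags still matches
    text = " ".join(" ".join(parser.chunks).split())
    return needle.lower() in text.lower()


# Hypothetical sample page; in practice this would be the downloaded HTML
sample = "<html><body><script>var x = 'secret';</script><p>Hello <b>world</b></p></body></html>"

if page_contains(sample, "hello world"):
    print("My string is in the page text")
else:
    print("I did not find anything")
```

Here "secret" would not be reported as found, because it only appears inside a script element.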

Answer by Bassdread

lxml is awesome: http://lxml.de/parsing.html

I use it regularly with XPath to extract data from HTML.
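
As a rough illustration of that lxml plus XPath approach (not necessarily how the answerer uses it), the sketch below looks for text nodes containing a given string. It assumes lxml is installed (`pip install lxml`); `find_text_nodes` and `sample` are hypothetical names, and the inline HTML stands in for a fetched page.

```python
from lxml import html  # third-party package, assumed to be installed


def find_text_nodes(page_source, needle):
    """Return every text node that contains needle (case-sensitive)."""
    tree = html.fromstring(page_source)
    # $needle is an XPath variable, filled in from the keyword argument
    return tree.xpath("//text()[contains(., $needle)]", needle=needle)


# Hypothetical sample page; in practice this would be the downloaded HTML
sample = "<html><body><h1>Title</h1><p>Price: 42 USD</p><p>Other</p></body></html>"

hits = find_text_nodes(sample, "Price")
if hits:
    print("Found:", [str(h) for h in hits])
else:
    print("Error: text not found")
```

Passing the search string as an XPath variable avoids having to escape quotes inside the expression.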

The other option is http://www.crummy.com/software/BeautifulSoup/ which is great as well.
