Python: searching through a web page

Question

Asked by AustinM

Hey, I'm working on a Python project that requires me to look through a webpage. I want to search the page for a specific piece of text: if the text is found, the program prints something out; if not, it prints an error message. I've already tried different modules such as libxml, but I can't figure out how to do it.

Could anybody lend some help?

Answer 1

Accepted answer by dplouffe

You could do something simple like:


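# Python 2 code: urllib2 and the print statement do not exist in Python 3.
import urllib2
import re

# Download the raw HTML of the page.
html_content = urllib2.urlopen('http://www.domain.com').read()

# Collect every match of the regular expression in the page source.
matches = re.findall('regex of string to find', html_content)

if not matches:
    print 'I did not find anything'
else:
    print 'My string is in the html'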
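
For anyone on Python 3, here is a rough sketch of the same idea using `urllib.request` in place of `urllib2`. The helper names `fetch`, `page_contains`, and `report` are made up for this example, and the UTF-8 decoding is an assumption (a real page may declare a different encoding).

```python
import re
import urllib.request


def fetch(url):
    """Download a page and decode it to str.

    Assumption: the page is UTF-8; undecodable bytes are replaced.
    """
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def page_contains(html_text, pattern):
    """Return True if the regex pattern matches anywhere in html_text."""
    return re.search(pattern, html_text) is not None


def report(url, needle):
    """Print one message if the literal text is in the page, another if not."""
    # re.escape makes the needle match literally instead of as a regex.
    if page_contains(fetch(url), re.escape(needle)):
        print("My string is in the html")
    else:
        print("I did not find anything")
```

Calling `report('http://www.domain.com', 'text to find')` would reproduce the behaviour of the snippet above. Note that this searches the raw HTML source, so text that is split by tags (for example `Hello <b>world</b>`) will not match `Hello world`.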

Answer 2

Answer by Bassdread

lxml is awesome: http://lxml.de/parsing.html

I use it regularly with xpath for extracting data from the html.

The other option is http://www.crummy.com/software/BeautifulSoup/ which is great as well.
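
To illustrate why a parser can help, here is a small sketch that searches the visible text of a page rather than its raw source. It deliberately uses the standard library's `html.parser` as a stand-in so it runs without installing anything; it is not lxml or BeautifulSoup, which would do the same job with less code and cope better with messy markup. The names `TextExtractor`, `visible_text`, and `text_in_page` are invented for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_text(html_text):
    """Return the page text with tags removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def text_in_page(html_text, needle):
    """Return True if needle occurs in the visible text of the page."""
    return needle in visible_text(html_text)
```

With this, `text_in_page('<p>Hello <b>world</b></p>', 'Hello world')` finds the phrase even though a tag sits in the middle of it, which a plain regex over the raw HTML would miss.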
