Python WebDriver: how to print the whole page source (HTML)

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/27411915/

Date: 2020-08-19 01:46:33  Source: igfitidea

Tags: python, selenium-webdriver, webdriver

Asked by wmarchewka

I'm using Python 2.7 with Selenium WebDriver. My question is how to print the whole page source with the print method. There is a webdriver method, page_source, but it returns WebDriver and I don't know how to convert it to a String or just print it in the terminal.

Accepted answer by alecxe

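.page_source on a webdriver instance is what you need. It is a property that returns the page's HTML as a string, so it can be passed to print directly: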

>>> from selenium import webdriver
>>> driver = webdriver.Firefox()
>>> driver.get('http://google.com')
>>> print(driver.page_source)
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" itemtype="http://schema.org/WebPage" itemscope=""><head><meta name="descri
...
:before,.vscl.vslru div.vspib{top:-4px}</style></body></html>
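
Since the page source is often long, and on Python 2.7 printing non-ASCII text to a terminal can raise UnicodeEncodeError, it may be more convenient to write it to a file. The following is only a minimal sketch: the helper name save_page_source and the file name are hypothetical, and it assumes nothing about the driver beyond the page_source property shown above.

```python
import io


def save_page_source(driver, path):
    """Write driver.page_source to `path` as UTF-8.

    `driver` is assumed to be any object exposing a `page_source`
    string attribute, such as a Selenium WebDriver instance.
    Returns the number of characters written.
    """
    html = driver.page_source
    # io.open accepts an explicit encoding on both Python 2.7 and Python 3.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(html)
    return len(html)
```

With the driver from the session above, this could be called as save_page_source(driver, 'page.html').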

Answer by Myke

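You can also get the HTML page source without using a browser; the requests module lets you do that. Note that this gives the HTML as sent by the server, so content added afterwards by JavaScript is not included.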

import requests

res = requests.get('https://google.com')
# Raises requests.exceptions.HTTPError if the server responded
# with an error status code (4xx or 5xx).
res.raise_for_status()
print(res.text)
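
The same steps can be wrapped in a small function with a timeout, so a slow server does not hang the script. This is a rough sketch, not part of the original answer: the name fetch_html and the get parameter are hypothetical, and get exists only so the function can be tried with a stand-in instead of a real network call.

```python
def fetch_html(url, timeout=10, get=None):
    """Return the HTML text served at `url`.

    `get` is a hypothetical hook for this sketch: any callable with the
    same call shape as requests.get. When it is omitted, requests.get
    itself is used.
    """
    if get is None:
        import requests  # third-party package, imported only when needed
        get = requests.get
    res = get(url, timeout=timeout)
    # Raises an exception if the server answered with a 4xx or 5xx status.
    res.raise_for_status()
    return res.text
```

A typical call would be print(fetch_html('https://google.com')).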