Saving a web page in Firefox using Selenium in Python

Question

Asked by Tommy N

I am trying to use Selenium in Python to save web pages from Firefox on macOS.

So far, I have managed to send COMMAND + S to bring up the Save As window. However, I don't know how to:

change the directory of the file,
change the name of the file, and
click the Save button.

Could someone help?

Below is the code I have used to press COMMAND + S:

ActionChains(browser).key_down(Keys.COMMAND).send_keys("s").key_up(Keys.COMMAND).perform()

Besides, the reason I am using this method is that I run into a UnicodeEncodeError when I:

write the page_source to an HTML file, and
store scraped information in a CSV file.

Writing to an HTML file:

file_object = open(completeName, "w")
html = browser.page_source
file_object.write(html)
file_object.close()

Writing to a CSV file:

csv_file_write.writerow(to_write)

Error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in position 1: ordinal not in range(128)
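The error above appears whenever text containing a non-ASCII character (here u'\xf8', the letter ø) is written through an ASCII encoder. The following is a minimal sketch of one way around it, assuming Python 3 and UTF-8 output; the helper names write_html_utf8 and write_rows_utf8 are hypothetical and not part of the question's code.

```python
import csv
import io


def write_html_utf8(path, html):
    """Write text to `path`, encoded as UTF-8 rather than the platform default."""
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(html)


def write_rows_utf8(path, rows):
    """Write rows of text to a CSV file encoded as UTF-8 (Python 3)."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


# The failure from the question, reproduced without a browser:
try:
    "\xf8".encode("ascii")
except UnicodeEncodeError as exc:
    print("ascii encoding failed:", exc.reason)

# The same character encodes without error as UTF-8.
print("\xf8".encode("utf-8"))
```

In the question's code, browser.page_source would be passed as html and to_write would be one of the rows.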

Answer 1

Answered by misantroop

with open('page.html', 'w') as f:
    f.write(driver.page_source)

Answer 2

Answered by RemcoW

What you are trying to achieve is not possible with Selenium. The dialog that opens is a native system dialog, which Selenium cannot interact with.

The closest thing you could do is collect the page_source, which gives you the entire HTML of a single page, and save this to a file.

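import codecs
import os

# save_path and file_name are the target directory and file name
completeName = os.path.join(save_path, file_name)
html = browser.page_source
# codecs.open encodes the text as UTF-8; the with block closes the file
with codecs.open(completeName, "w", "utf-8") as file_object:
    file_object.write(html)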

If you really need to save the entire website, you should look into using a tool like AutoIt, which makes it possible to interact with the save dialog. Note that AutoIt itself runs only on Windows, so on macOS a comparable GUI automation tool would be needed.

Answer 3

Answered by Mobrockers

You cannot interact with system dialogs such as the save file dialog. If you want to save the page HTML, you can do something like this:

page = driver.page_source
file_ = open('page.html', 'w')
file_.write(page)
file_.close()
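Since the Save As dialog is out of reach, the directory and file name the question asks about can be chosen in Python instead. This is a rough sketch under that assumption; save_page_source is a hypothetical helper, not something Selenium provides, and it writes UTF-8 to avoid the encoding error from the question.

```python
import io
import os


def save_page_source(html, directory, file_name):
    """Save `html` as UTF-8 under directory/file_name and return the full path."""
    # Create the target directory if it does not exist yet.
    os.makedirs(directory, exist_ok=True)
    full_path = os.path.join(directory, file_name)
    with io.open(full_path, "w", encoding="utf-8") as f:
        f.write(html)
    return full_path
```

With a running driver, a call could look like save_page_source(driver.page_source, os.path.expanduser("~/saved_pages"), "page.html"), where the directory and file name are placeholders. Only the HTML is saved this way, not the images or stylesheets the page links to.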

Answer 4

Answered by Martin Thoma

This is a complete, working example of the answer RemcoW provided.

You first have to install a webdriver, e.g. pip install selenium chromedriver_installer.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

# core modules
import codecs
import os

# 3rd party modules
from selenium import webdriver


def get_browser():
    """Get the browser (a "driver")."""
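    # find the path with 'which chromedriver'
    path_to_chromedriver = '/usr/local/bin/chromedriver'
    # Note: executable_path works in Selenium 3; newer Selenium 4 releases
    # expect webdriver.Chrome(service=Service(path_to_chromedriver)) instead.
    browser = webdriver.Chrome(executable_path=path_to_chromedriver)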
    return browser


save_path = os.path.expanduser('~')
file_name = 'index.html'
browser = get_browser()

url = "https://martin-thoma.com/"
browser.get(url)

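complete_name = os.path.join(save_path, file_name)
html = browser.page_source
# Write the page as UTF-8; the with block closes the file afterwards
with codecs.open(complete_name, "w", "utf-8") as file_object:
    file_object.write(html)
# quit() closes the browser and ends the driver session
browser.quit()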
