Open a web page in a new tab with Selenium + Python

Question

Asked by Robert W. Hunter

I am trying to open websites in new tabs inside my WebDriver. I want to do this because opening a new WebDriver for each website takes about 3.5 seconds with PhantomJS, and I want more speed.

I'm using a multiprocess Python script, and I want to get some elements from each page, so the workflow is like this:

Open Browser

Loop through my array
For element in array -> Open website in new tab -> do my business -> close it

But I can't find any way to achieve this.

Here's the code I'm using. It takes forever between websites, and I need it to be fast. Other tools are allowed, but I don't know many tools for scraping website content that is loaded with JavaScript (divs created when some event is triggered on load, etc.). That's why I need Selenium; BeautifulSoup can't be used for some of my pages.

#!/usr/bin/env python
import multiprocessing, time, pika, json, traceback, logging, sys, os, itertools, urllib, urllib2, cStringIO, mysql.connector, shutil, hashlib, socket, urllib2, re
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from PIL import Image
from os import listdir
from os.path import isfile, join
from bs4 import BeautifulSoup
from pprint import pprint

def getPhantomData(parameters):
    browser = None
    try:
        # Create the WebDriver
        browser = webdriver.Firefox()
        # Navigate to the URL
        browser.get(parameters['target_url'])
        # Find all links by CSS selector
        links = browser.find_elements_by_css_selector(parameters['selector'])

        result = []
        for link in links:
            # Extract the requested attribute and append it to our list
            result.append(link.get_attribute(parameters['attribute']))
        return json.dumps({'data': result})
    except Exception, err:
        print err
    finally:
        # quit() closes every window and ends the session; skip it when
        # the driver itself failed to start
        if browser is not None:
            browser.quit()

def callback(ch, method, properties, body):
    parameters = json.loads(body)
    # getPhantomData returns a JSON string, or None when it failed
    message = getPhantomData(parameters)

    if message and json.loads(message)['data']:
        ch.basic_ack(delivery_tag=method.delivery_tag)
    else:
        ch.basic_reject(delivery_tag=method.delivery_tag, requeue=True)

def consume():
    credentials = pika.PlainCredentials('invitado', 'invitado')
    rabbit = pika.ConnectionParameters('localhost',5672,'/',credentials)
    connection = pika.BlockingConnection(rabbit)
    channel = connection.channel()

    # Connect to the channel
    channel.queue_declare(queue='com.stuff.images', durable=True)
    channel.basic_consume(callback,queue='com.stuff.images')

    print ' [*] Waiting for messages. To exit press CTRL^C'
    try:
        channel.start_consuming()
    except KeyboardInterrupt:
        pass

workers = 5
pool = multiprocessing.Pool(processes=workers)
for i in xrange(0, workers):
    pool.apply_async(consume)

try:
    while True:
        continue
except KeyboardInterrupt:
    print ' [*] Exiting...'
    pool.terminate()
    pool.join()

Answer 1

Accepted answer by aberna

You can open and close a tab with the key combinations COMMAND + T and COMMAND + W (OSX). On other OSs you can use CONTROL + T / CONTROL + W.

In Selenium you can emulate this behavior. You will need to create one WebDriver and as many tabs as you need tests.

Here is the code.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Firefox()
driver.get("http://www.google.com/")

#open tab
driver.find_element_by_tag_name('body').send_keys(Keys.COMMAND + 't') 
# You can use (Keys.CONTROL + 't') on other OSs

# Load a page 
driver.get('http://stackoverflow.com/')
# Make the tests...

# close the tab
# (Keys.CONTROL + 'w') on other OSs.
driver.find_element_by_tag_name('body').send_keys(Keys.COMMAND + 'w') 


driver.close()
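
The choice between COMMAND and CONTROL could also be made at run time instead of editing the script per OS. This is only a sketch (Python 3): `new_tab_chord` is a made-up helper name, the `keys` argument is expected to be Selenium's `Keys` class, and `sys.platform == "darwin"` is assumed to mean OSX. It only builds the key string; whether the browser reacts to the shortcut is a separate matter.

```python
import sys


def new_tab_chord(keys, platform=None):
    """Return the key chord that opens a new tab on the given platform.

    `keys` is expected to be selenium.webdriver.common.keys.Keys; any object
    exposing COMMAND and CONTROL string attributes works.
    `new_tab_chord` is a hypothetical helper, not part of Selenium.
    """
    if platform is None:
        platform = sys.platform
    # OSX reports itself as "darwin"; every other platform uses CONTROL.
    modifier = keys.COMMAND if platform == "darwin" else keys.CONTROL
    return modifier + "t"
```

With the driver from the code above it would be used as `driver.find_element_by_tag_name('body').send_keys(new_tab_chord(Keys))`.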

Answer 2

Answered by Supratik Majumdar

browser.execute_script('''window.open("http://bings.com","_blank");''')

Where browser is the WebDriver.
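
Note that `window.open` only creates the tab; the driver keeps talking to the old one until you switch. A minimal sketch of opening and switching in one step (Python 3), assuming `driver` is any Selenium WebDriver; `open_tab_and_switch` is a made-up helper name, and the sketch does not wait for slow browsers.

```python
def open_tab_and_switch(driver, url):
    """Open `url` in a new tab via JavaScript and point the driver at it.

    `driver` is any Selenium WebDriver instance. `open_tab_and_switch` is a
    hypothetical helper name. Returns the handle of the new tab.
    """
    before = set(driver.window_handles)
    # Pass the URL as a script argument instead of pasting it into the JS.
    driver.execute_script("window.open(arguments[0], '_blank');", url)
    new_handles = [h for h in driver.window_handles if h not in before]
    if not new_handles:
        raise RuntimeError("no new tab appeared (popup blocked?)")
    driver.switch_to.window(new_handles[0])
    return new_handles[0]
```

The new handle is found by comparing the handle list before and after, so the code does not depend on where the browser puts the new tab in `window_handles`.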

Answer 3

Answered by Ziad abbas

After struggling for so long the below method worked for me:

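import time
from selenium.webdriver.common.keys import Keys

driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + 't')
driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + Keys.TAB)

# Give the browser time to create the tab before reading the handles
time.sleep(3)
windows = driver.window_handles
driver.switch_to.window(windows[1])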
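
The fixed `time.sleep(3)` can probably be replaced by polling until a new handle actually shows up. A rough sketch (Python 3); `wait_for_new_window` is a made-up helper, not a Selenium API, and the default timeout and poll interval are arbitrary assumptions.

```python
import time


def wait_for_new_window(driver, handles_before, timeout=10.0, poll=0.1):
    """Poll until a handle not in `handles_before` appears and return it.

    Raises TimeoutError if nothing new shows up within `timeout` seconds.
    `wait_for_new_window` is a hypothetical helper, not a Selenium API.
    """
    known = set(handles_before)
    deadline = time.monotonic() + timeout
    while True:
        for handle in driver.window_handles:
            if handle not in known:
                return handle
        if time.monotonic() >= deadline:
            raise TimeoutError("no new window within %.1f s" % timeout)
        time.sleep(poll)
```

You would read `driver.window_handles` before sending the keys, then call `driver.switch_to.window(wait_for_new_window(driver, handles_before))` afterwards, which also avoids assuming the new tab sits at index 1.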

Answer 4

Answered by yucer

This is common code adapted from other examples:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Firefox()
driver.get("http://www.google.com/")

#open tab
# ... take the code from the options below

# Load a page 
driver.get('http://bings.com')
# Make the tests...

# close the browser (quit() ends the whole session)
driver.quit()

The possible ways were:

Sending <CTRL> + <T> to one element

#open tab
driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + 't')

Sending <CTRL> + <T> via Action chains

from selenium.webdriver.common.action_chains import ActionChains
ActionChains(driver).key_down(Keys.CONTROL).send_keys('t').key_up(Keys.CONTROL).perform()

Executing a JavaScript snippet
```
driver.execute_script('''window.open("http://bings.com","_blank");''')
```
In order to achieve this you need to ensure that the preferences browser.link.open_newwindow and browser.link.open_newwindow.restriction are properly set. The default values in recent versions are OK; otherwise you supposedly need:
```
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.link.open_newwindow", 3)
fp.set_preference("browser.link.open_newwindow.restriction", 2)

driver = webdriver.Firefox(firefox_profile=fp)
```
The problem is that those preferences are preset to other values and are frozen, at least in Selenium 3.4.0. When you use the profile to set them, the Java binding throws an exception and the Python binding ignores the new values.
In Java there is a way to set those preferences without specifying a profile object when talking to geckodriver, but it seems not to be implemented yet in the Python binding:
```
FirefoxOptions options = new FirefoxOptions().setProfile(fp);
options.addPreference("browser.link.open_newwindow", 3);
options.addPreference("browser.link.open_newwindow.restriction", 2);
FirefoxDriver driver = new FirefoxDriver(options);
```

The third option did stop working for Python in Selenium 3.4.0.

The first two options also seemed to stop working in Selenium 3.4.0. They depend on sending a CTRL key event to an element. At first glance it seems to be a problem with the CTRL key, but it fails because of the new multiprocess feature of Firefox. It might be that this new architecture imposes new ways of doing that, or maybe it is a temporary implementation problem. Anyway, we can disable it via:

fp = webdriver.FirefoxProfile()
fp.set_preference("browser.tabs.remote.autostart", False)
fp.set_preference("browser.tabs.remote.autostart.1", False)
fp.set_preference("browser.tabs.remote.autostart.2", False)

driver = webdriver.Firefox(firefox_profile=fp)

... and then you can successfully use the first way.

Answer 5

Answered by DebanjanB

In a discussion, Simon clearly mentioned that:

While the datatype used for storing the list of handles may be ordered by insertion, the order in which the WebDriver implementation iterates over the window handles to insert them has no requirement to be stable. The ordering is arbitrary.

Using Selenium v3.x, opening a website in a new tab through Python is much easier now. We have to induce a WebDriverWait for number_of_windows_to_be(2), then collect the window handles every time we open a new tab/window, and finally iterate through the window handles and switch_to.window(newly_opened) as required. Here is a solution where you open http://www.google.co.in in the initial tab and https://www.yahoo.com in the adjacent tab:

Code Block:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions() 
options.add_argument("start-maximized")
options.add_argument('disable-infobars')
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("http://www.google.co.in")
print("Initial Page Title is : %s" %driver.title)
windows_before  = driver.current_window_handle
print("First Window Handle is : %s" %windows_before)
driver.execute_script("window.open('https://www.yahoo.com')")
WebDriverWait(driver, 10).until(EC.number_of_windows_to_be(2))
windows_after = driver.window_handles
new_window = [x for x in windows_after if x != windows_before][0]
driver.switch_to.window(new_window)
print("Page Title after Tab Switching is : %s" %driver.title)
print("Second Window Handle is : %s" %new_window)

Console Output:

Initial Page Title is : Google
First Window Handle is : CDwindow-B2B3DE3A222B3DA5237840FA574AF780
Page Title after Tab Switching is : Yahoo
Second Window Handle is : CDwindow-D7DA7666A0008ED91991C623105A2EC4
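
Since the ordering of the handles is arbitrary, with more than two tabs it may help to remember each handle under your own label at the moment the tab is opened. A small sketch of that idea (Python 3); the `TabRegistry` class and its method names are made up for illustration, and it leaves out the WebDriverWait step for brevity.

```python
class TabRegistry:
    """Remember which window handle belongs to which label.

    Hypothetical helper: it avoids relying on the order of
    `driver.window_handles`, which is not guaranteed to be stable.
    """

    def __init__(self, driver, first_label="main"):
        self.driver = driver
        # The tab that is current at construction time gets `first_label`.
        self.handles = {first_label: driver.current_window_handle}

    def open(self, label, url):
        """Open `url` in a new tab, record its handle under `label`, switch to it."""
        before = set(self.driver.window_handles)
        self.driver.execute_script("window.open(arguments[0]);", url)
        new = [h for h in self.driver.window_handles if h not in before]
        if len(new) != 1:
            raise RuntimeError("expected exactly one new window, got %d" % len(new))
        self.handles[label] = new[0]
        self.driver.switch_to.window(new[0])
        return new[0]

    def switch(self, label):
        """Switch the driver to the tab recorded under `label`."""
        self.driver.switch_to.window(self.handles[label])
```

After `tabs = TabRegistry(driver)` and `tabs.open("yahoo", "https://www.yahoo.com")`, calling `tabs.switch("main")` goes back to the first tab without indexing into `window_handles`.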

Browser Snapshot:

multiple__tabs

Outro

You can find the Java-based discussion in "Best way to keep track and iterate through tabs and windows using WindowHandles using Selenium".

Answer 6

Answered by astroben

I tried for a very long time to duplicate tabs in Chrome using action_keys and send_keys on body. The only thing that worked for me was an answer here. This is what my duplicate_tabs def ended up looking like; it is probably not the best, but it works fine for me.

def duplicate_tabs(number, chromewebdriver):
    # Once on the page we want to open a bunch of tabs
    url = chromewebdriver.current_url
    for i in range(number):
        print('opened tab: '+str(i))
        chromewebdriver.execute_script("window.open('"+url+"', 'new_window"+str(i)+"')")

It basically runs some JavaScript from inside Python, which is incredibly useful. Hope this helps somebody.

Note: I am using Ubuntu. It shouldn't make a difference, but if it doesn't work for you this could be the reason.
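
One thing to watch with building the script by string concatenation: a quote character in the URL would break the JavaScript. A possible way around it (a sketch in Python 3, with the made-up helper name `window_open_script`) is to let `json.dumps` do the quoting.

```python
import json


def window_open_script(url, name):
    """Build a `window.open(...)` call with both arguments safely quoted.

    json.dumps produces valid JavaScript string literals for ordinary text,
    so quotes or backslashes in `url` cannot break out of the string.
    `window_open_script` is a hypothetical helper name.
    """
    return "window.open(%s, %s);" % (json.dumps(url), json.dumps(name))
```

Inside the loop this would be `chromewebdriver.execute_script(window_open_script(url, 'new_window' + str(i)))`.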

Answer 7

Answered by KOLANICH

Strangely, there are so many answers, and all of them use surrogates like JS and keyboard shortcuts instead of just using a Selenium feature:

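from selenium.webdriver.remote.command import Command

def newTab(driver, url="about:blank"):
    # Needs a Selenium version that defines Command.NEW_WINDOW;
    # the "type" hint asks for a tab rather than a window
    wnd = driver.execute(Command.NEW_WINDOW, {"type": "tab"})
    handle = wnd["value"]["handle"]
    driver.switch_to.window(handle)
    driver.get(url) # changes the handle
    return driver.current_window_handle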
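
In Selenium 4 and later the same command is exposed as `driver.switch_to.new_window("tab")`, which creates the tab and switches to it. A minimal sketch assuming Selenium 4+ (Python 3); `open_in_new_tab` is just a made-up wrapper name.

```python
def open_in_new_tab(driver, url="about:blank"):
    """Open `url` in a new tab using Selenium 4's `switch_to.new_window`.

    Assumes Selenium 4 or later, where `driver.switch_to.new_window("tab")`
    creates the tab and makes it the current window. `open_in_new_tab` is a
    hypothetical wrapper name.
    """
    driver.switch_to.new_window("tab")
    driver.get(url)
    return driver.current_window_handle
```

On older Selenium versions `switch_to.new_window` does not exist, so the JS-based approaches from the other answers are still needed there.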

Answer 8

Answered by Jeremy Anifacc

OS: Win 10,
Python 3.8.1
- selenium==3.141.0

from selenium import webdriver
import time

driver = webdriver.Firefox(executable_path=r'TO\Your\Path\geckodriver.exe')
driver.get('https://www.google.com/')

# Open a new window
driver.execute_script("window.open('');")
# Switch to the new window
driver.switch_to.window(driver.window_handles[1])
driver.get("http://stackoverflow.com")
time.sleep(3)

# Open a new window
driver.execute_script("window.open('');")
# Switch to the new window
driver.switch_to.window(driver.window_handles[2])
driver.get("https://www.reddit.com/")
time.sleep(3)
# close the active tab
driver.close()
time.sleep(3)

# Switch back to the first tab
driver.switch_to.window(driver.window_handles[0])
driver.get("https://bing.com")
time.sleep(3)

# Close the only tab, will also close the browser.
driver.close()

Reference: Need Help Opening A New Tab in Selenium
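
The open / work / close / switch back cycle from the code above could be wrapped in a context manager, which matches the "open website in new tab -> do my business -> close it" workflow from the question. This is only a sketch (Python 3); `temporary_tab` is a made-up name and it does no waiting.

```python
from contextlib import contextmanager


@contextmanager
def temporary_tab(driver, url):
    """Open `url` in a new tab, yield the driver, then close the tab.

    On exit the new tab is closed and the driver is switched back to the
    tab that was current before. `temporary_tab` is a hypothetical helper.
    """
    original = driver.current_window_handle
    before = set(driver.window_handles)
    driver.execute_script("window.open('');")
    new_handles = [h for h in driver.window_handles if h not in before]
    if not new_handles:
        raise RuntimeError("no new tab appeared")
    driver.switch_to.window(new_handles[0])
    try:
        driver.get(url)
        yield driver
    finally:
        # close() only closes the current (new) tab; the driver does not
        # switch back by itself, so do it explicitly.
        driver.close()
        driver.switch_to.window(original)
```

Each URL would then be handled inside `with temporary_tab(driver, url) as tab:`, and the browser stays open between URLs, so the startup cost is paid once.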

Answer 9

Answered by Capitaine

The other solutions do not work for chrome driver v83.

Instead, it works as follows:

driver.execute_script("window.open('');")
driver.switch_to.window(driver.window_handles[1])
driver.get("https://www.google.com")
