Scrolling down a page with Selenium WebDriver in Python
Original question: http://stackoverflow.com/questions/21753130/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Scrolling down a page with Selenium Webdriver
Asked by Saheb
I have a dynamic page that loads products when the user scrolls down a page. I want to get the total number of products rendered on the display page. Currently I am using the following code to get to the bottom until all the products are being displayed.
    elems = WebDriverWait(self.driver, 30).until(EC.presence_of_all_elements_located((By.CLASS_NAME, "x")))
    print(len(elems))
    a = len(elems)
    self.driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(4)
    elem1 = WebDriverWait(self.driver, 30).until(EC.presence_of_all_elements_located((By.CLASS_NAME, "x")))
    b = len(elem1)
    # Keep scrolling for as long as each scroll loads more products.
    while b > a:
        self.driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(4)
        elem1 = WebDriverWait(self.driver, 30).until(EC.presence_of_all_elements_located((By.CLASS_NAME, "x")))
        a = b
        b = len(elem1)
    print(b)
This is working nicely, but I want to know whether there is any better option of doing this?
Answered by ArtOfWarfare
I think you could condense your code down to this:
    prior = 0
    while True:
        self.driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        current = len(WebDriverWait(self.driver, 30).until(EC.presence_of_all_elements_located((By.CLASS_NAME, "x"))))
        if current == prior:
            return current
        prior = current
I did away with all the identical lines by moving them into the loop, which necessitated making the loop a while True: and moving the condition check into the loop (because, unfortunately, Python lacks any do-while).
I also threw out the sleep and print statements. I'm not sure what their purpose was, but on my own page, I have found that the same number of elements load whether I sleep between scrolls or not. Further, in my own case, I don't need to know the count at any point, I just need to know when it has exhausted the list (but I added a return value so you can get the final count if you happen to need it). If you really want to print every intermediate count, you can print current right after it's assigned in the loop.
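For use outside a class method, the same loop could be wrapped in a standalone function. The following is only a sketch under a few assumptions that are not in the answer: the function name, the max_rounds safety limit, and the optional pause are hypothetical, and the elements are counted with a small JavaScript snippet instead of WebDriverWait so that the function only needs an object with an execute_script method.

```python
import time

SCROLL_TO_BOTTOM = "window.scrollTo(0, document.body.scrollHeight);"
# Assumption: counting in the browser instead of using WebDriverWait.
COUNT_BY_CLASS = "return document.getElementsByClassName(arguments[0]).length;"


def count_after_scrolling(driver, class_name, pause=0.0, max_rounds=100):
    """Scroll down until the number of matching elements stops growing.

    `max_rounds` and `pause` are hypothetical knobs added for this sketch:
    the first guards against pages that never stop loading, the second
    optionally waits after each scroll.
    """
    prior = 0
    for _ in range(max_rounds):
        driver.execute_script(SCROLL_TO_BOTTOM)
        if pause:
            time.sleep(pause)
        current = driver.execute_script(COUNT_BY_CLASS, class_name)
        if current == prior:
            # A scroll that adds nothing is taken to mean the list is exhausted.
            return current
        prior = current
    # Gave up after max_rounds scrolls; report the last count seen.
    return prior
```

With a real browser session this might be called as count_after_scrolling(driver, "x"), where "x" is the class name from the question. Note that without a pause, a slow page may be reported as finished too early.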
Answered by aomoore3
If you have no idea how many elements might be added to the page, but you just want to get all of them, it might be good to loop thusly:
- scroll down as described above
- wait a few seconds
- save the size of the page source (xxx.page_source)
- if the size of the page source is larger than the last page source size saved, loop back and scroll down some more
I suppose comparing screenshot sizes might work too, depending on the page you're loading, but the page-source approach is working in my current program.
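The steps listed in this answer could be sketched roughly as follows. The function name, the default pause of 3 seconds, and the max_rounds limit are assumptions made for illustration. The driver argument is expected to behave like a Selenium WebDriver, exposing page_source and execute_script.

```python
import time


def scroll_until_source_stops_growing(driver, pause=3.0, max_rounds=50):
    """Scroll down until the page source no longer gets longer.

    Returns the final length of the page source. `pause` and `max_rounds`
    are hypothetical defaults chosen for this sketch.
    """
    last_size = len(driver.page_source)
    for _ in range(max_rounds):
        # Scroll down as described in the earlier answers.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Wait a few seconds so that new content has a chance to load.
        time.sleep(pause)
        size = len(driver.page_source)
        if size <= last_size:
            # Nothing was added by the last scroll, so stop.
            break
        last_size = size
    return last_size
```

Comparing source lengths is a heuristic: a page that rewrites its markup without adding items, or swaps old items out as new ones load, could fool it.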
Answered by Ayoub
You can perform this action easily using this line of code:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
And if you want to keep scrolling down forever, you could try this:
    from selenium import webdriver
    import time

    driver = webdriver.Firefox()
    driver.get("https://twitter.com/BarackObama")

    # Scroll to the bottom forever, pausing so that new content can load.
    while True:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(3)
I am not sure what value to pass to time.sleep(), because loading the data may take more or less time. For more information, please check the official documentation.
Have fun :)
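Since the right value for time.sleep() is hard to guess, one alternative is to poll the page height after each scroll and move on as soon as it changes. This is a sketch of that idea, not part of the answer: the function name and the timeout and poll parameters are assumptions, and the loop stops once a scroll no longer makes the page taller within the timeout.

```python
import time

GET_HEIGHT = "return document.body.scrollHeight;"
SCROLL_TO_BOTTOM = "window.scrollTo(0, document.body.scrollHeight);"


def scroll_to_end(driver, timeout=10.0, poll=0.5):
    """Scroll until the page height stops growing; return the scroll count.

    After each scroll, the height is re-read every `poll` seconds for up
    to `timeout` seconds (both hypothetical defaults) instead of sleeping
    for a fixed time.
    """
    height = driver.execute_script(GET_HEIGHT)
    scrolls = 0
    while True:
        driver.execute_script(SCROLL_TO_BOTTOM)
        scrolls += 1
        deadline = time.monotonic() + timeout
        new_height = driver.execute_script(GET_HEIGHT)
        # Poll until the page grows or the timeout expires.
        while new_height == height and time.monotonic() < deadline:
            time.sleep(poll)
            new_height = driver.execute_script(GET_HEIGHT)
        if new_height == height:
            # No growth within the timeout: assume the bottom was reached.
            return scrolls
        height = new_height
```

On a feed that never runs out of content, this function would keep scrolling indefinitely, just like the while True: loop in the answer.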

