Python Beautiful Soup: 'ResultSet' object has no attribute 'find_all'?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/24108507/

Date: 2020-08-19 03:59:52  Source: igfitidea

Tags: python, beautifulsoup

Asked by Anton

I am trying to scrape a simple table using Beautiful Soup. Here is my code:

import requests
from bs4 import BeautifulSoup

url = 'https://gist.githubusercontent.com/anonymous/c8eedd8bf41098a8940b/raw/c7e01a76d753f6e8700b54821e26ee5dde3199ab/gistfile1.txt'
r = requests.get(url)

soup = BeautifulSoup(r.text)
table = soup.find_all(class_='dataframe')

first_name = []
last_name = []
age = []
preTestScore = []
postTestScore = []

for row in table.find_all('tr'):
    col = table.find_all('td')

    column_1 = col[0].string.strip()
    first_name.append(column_1)

    column_2 = col[1].string.strip()
    last_name.append(column_2)

    column_3 = col[2].string.strip()
    age.append(column_3)

    column_4 = col[3].string.strip()
    preTestScore.append(column_4)

    column_5 = col[4].string.strip()
    postTestScore.append(column_5)

columns = {'first_name': first_name, 'last_name': last_name, 'age': age, 'preTestScore': preTestScore, 'postTestScore': postTestScore}
df = pd.DataFrame(columns)
df

However, whenever I run it, I get this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-116-a900c2872793> in <module>()
     14 postTestScore = []
     15 
---> 16 for row in table.find_all('tr'):
     17     col = table.find_all('td')
     18 

AttributeError: 'ResultSet' object has no attribute 'find_all'

I have read around a dozen StackOverflow questions about this error, and I cannot figure out what I am doing wrong.

Answered by otus

table = soup.find_all(class_='dataframe')

This gives you a result set, i.e. all the elements that match the class. You can either iterate over them or, if you know there is only one element with the dataframe class, you can use find instead. From your code it seems the latter is what you need to deal with the immediate problem:

table = soup.find(class_='dataframe')

However, that is not all:

for row in table.find_all('tr'):
    col = table.find_all('td')

You probably want to iterate over the tds in the row here, rather than the whole table. (Otherwise you'll just see the first row over and over.)

for row in table.find_all('tr'):
    for col in row.find_all('td'):
        ...  # process each cell of this row
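
Putting both fixes together, a minimal sketch might look like the following. The inline HTML string is a made-up stand-in for the page at the question's URL (its real contents are not reproduced here), only three of the five columns are shown, and 'html.parser' is chosen just so the example runs without extra dependencies.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page in the question; the real data may differ.
html = """
<table class="dataframe">
  <tr><th>first_name</th><th>last_name</th><th>age</th></tr>
  <tr><td>Jason</td><td>Miller</td><td>42</td></tr>
  <tr><td>Molly</td><td>Jacobson</td><td>52</td></tr>
</table>
"""

soup = BeautifulSoup(html, 'html.parser')
table = soup.find(class_='dataframe')  # find() returns a single Tag, not a ResultSet

first_name, last_name, age = [], [], []

for row in table.find_all('tr'):
    cells = row.find_all('td')  # search within this row only, not the whole table
    if not cells:
        continue  # the header row has <th> cells and no <td> cells
    first_name.append(cells[0].get_text(strip=True))
    last_name.append(cells[1].get_text(strip=True))
    age.append(cells[2].get_text(strip=True))

print(first_name, last_name, age)
```

From here the lists can be passed to pd.DataFrame as in the question. Note that the question's snippet uses pd without showing an import, so presumably pandas was imported earlier in that session.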

Answered by Ralf Haring

The table variable contains an array (a ResultSet, which behaves like a list). You would need to call find_all on its members (even though you know it's an array with only one member), not on the entire thing.

>>> type(table)
<class 'bs4.element.ResultSet'>
>>> type(table[0])
<class 'bs4.element.Tag'>
>>> len(table[0].find_all('tr'))
6
>>>
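
The same check can be reproduced without fetching the page. This is a small sketch that assumes a made-up one-table document in place of the question's data, and uses the name tables (not in the original) for the result of find_all:

```python
from bs4 import BeautifulSoup
from bs4.element import ResultSet, Tag

# Hypothetical markup, used only to illustrate the types involved.
html = '<table class="dataframe"><tr><td>a</td></tr><tr><td>b</td></tr></table>'
soup = BeautifulSoup(html, 'html.parser')

tables = soup.find_all(class_='dataframe')

print(isinstance(tables, ResultSet))  # the container holding every match
print(isinstance(tables, list))       # ResultSet is a list subclass
print(isinstance(tables[0], Tag))     # each member is a Tag with its own find_all

# Index into the result set first, then search inside that single Tag.
rows = tables[0].find_all('tr')
print(len(rows))
```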

Answered by Padraic Cunningham

Iterate over table and use row.find_all('td')

for row in table:
    col = row.find_all('td')
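
Each item yielded by iterating over the result set is a whole matched element (here a table), so its find_all('td') returns every cell in that table. If the cells should stay grouped by row, or the page may contain several matching tables, an inner loop over the rows helps. A minimal sketch, using made-up markup and the hypothetical names tables and all_rows:

```python
from bs4 import BeautifulSoup

# Hypothetical document with two tables sharing the same class.
html = """
<table class="dataframe"><tr><td>1</td><td>2</td></tr></table>
<table class="dataframe"><tr><td>3</td><td>4</td></tr></table>
"""
soup = BeautifulSoup(html, 'html.parser')
tables = soup.find_all(class_='dataframe')  # ResultSet of Tag objects

all_rows = []
for table in tables:                      # each item is one matched <table>
    for row in table.find_all('tr'):      # rows of that table only
        cells = row.find_all('td')        # cells of that row only
        all_rows.append([td.get_text(strip=True) for td in cells])

print(all_rows)
```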