使用 Python Selenium 遍历表行并打印列文本

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/31812537/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-19 10:34:50  来源:igfitidea点击:

Iterate through table rows and print column text with Python Selenium

pythonseleniumselenium-webdriverhtml-table

提问by Riaz Ladhani

I have a table (<table>) with values in each row (<tr>) from its body (<tbody>).

我有一个表 ( <table>),<tr>它的主体 ( <tbody>) 的每一行( ) 都有值。

The value I would lile to print out is in the <span>inside a <div>tag.

我想打印出来的值在标签<span>内部<div>

Inspecting the html, I see the value e.g. "Name" is in row 1 (tr[1]), column 2 (td[2]):

检查 html,我看到值,例如“名称”在第 1 行(tr[1])第 2 列(td[2])中:

<tr class="GAT4PNUFG GAT4PNUMG" __gwt_subrow="0" __gwt_row="0">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUHG GAT4PNUNG">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUNG">
                <div __gwt_cell="cell-gwt-uid-324" style="outline-style:none;">
                    <span class="linkhover" title="Name" style="white-space:nowrap;overflow:hidden;text-overflow:ellipsis;empty-cells:show;display:block;color:#00A;cursor:pointer;">Name</span>
                </div>
            </td>

I would like to loop through the table each row and print out the value in columns 2, td[2]

我想遍历表格的每一行并打印出第 2 列 td[2] 中的值

I am using Python with Selenium Webdriver

我正在使用 Python 和 Selenium Webdriver

The full Xpath to the table row 1, column 2 is:

表第 1 行第 2 列的完整 Xpath 为:

html/body/div[2]/div[2]/div/div[4]/div/div[2]/div/div[3]/div/div[5]/div/div[3]/div/div[4]/div/div[2]/div/div[4]/div/div[3]/div/div[2]/div/div/table/tbody/tr[1]/td[2]/div/span

I was thinking if i can start from the table, xpath as follows:

我在想是否可以从表开始,xpath 如下:

html/body/div[2]/div[2]/div/div[4]/div/div[2]/div/div[3]/div/div[5]/div/div[3]/div/div[4]/div/div[2]/div/div[4]/div/div[3]/div/div[2]/div/div/table/tbody

I can then use a for loop and use an index for the tr and td e.g for row1 use tr[i], for col2 use td[2].

然后我可以使用 for 循环并为 tr 和 td 使用索引,例如对于 row1 使用 tr[i],对于 col2 使用 td[2]。

html/body/div[2]/div[2]/div/div[4]/div/div[2]/div/div[3]/div/div[5]/div/div[3]/div/div[4]/div/div[2]/div/div[4]/div/div[3]/div/div[2]/div/div/table/tbody/tr[i]/td[2]/div/span

How can i loop through this table and print out the value of the Span class tag which is always in column 2 of the table?

如何遍历该表并打印出始终位于表第 2 列的 Span 类标记的值?

I tried to get the start of the table into a variable and then I could maybe use this to loop through the rows and columns. I need some help please.

我试图将表的开头放入一个变量中,然后我可以使用它来遍历行和列。我需要一些帮助。

table = self.driver.find_element(By.XPATH, 'html/body/div[2]/div[2]/div/div[4]/div/div[2]/div/div[3]/div/div[5]/div/div[3]/div/div[4]/div/div[2]/div/div[4]/div/div[3]/div/div[2]/div/div/table/tbody')

Here's the full HTML:

这是完整的 HTML:

    <table cellspacing="0" style="table-layout: fixed; width: 100%;">
    <colgroup>
    <tbody>
        <tr class="GAT4PNUFG GAT4PNUMG" __gwt_subrow="0" __gwt_row="0">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUHG GAT4PNUNG">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUNG">
                <div __gwt_cell="cell-gwt-uid-324" style="outline-style:none;">
                    <span class="linkhover" title="Name" style="white-space:nowrap;overflow:hidden;text-overflow:ellipsis;empty-cells:show;display:block;color:#00A;cursor:pointer;">Name</span>
                </div>
            </td>
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUNG">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUNG">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUNG">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUBH GAT4PNUNG">
        </tr>
        <tr class="GAT4PNUEH" __gwt_subrow="0" __gwt_row="1">
            <td class="GAT4PNUEG GAT4PNUFH GAT4PNUHG">
            <td class="GAT4PNUEG GAT4PNUFH">
                <div __gwt_cell="cell-gwt-uid-324" style="outline-style:none;">
                    <span class="linkhover" title="Address" style="white-space:nowrap;overflow:hidden;text-overflow:ellipsis;empty-cells:show;display:block;color:#00A;cursor:pointer;">Address</span>
                </div>
            </td>
            <td class="GAT4PNUEG GAT4PNUFH">
            <td class="GAT4PNUEG GAT4PNUFH">
            <td class="GAT4PNUEG GAT4PNUFH">
            <td class="GAT4PNUEG GAT4PNUFH GAT4PNUBH">
        </tr>
        <tr class="GAT4PNUFG" __gwt_subrow="0" __gwt_row="2">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUHG">
            <td class="GAT4PNUEG GAT4PNUGG">
                <div __gwt_cell="cell-gwt-uid-324" style="outline-style:none;">
                    <span class="linkhover" title="DOB" style="white-space:nowrap;overflow:hidden;text-overflow:ellipsis;empty-cells:show;display:block;color:#00A;cursor:pointer;">DOB</span>
                </div>
            </td>
            <td class="GAT4PNUEG GAT4PNUGG">
            <td class="GAT4PNUEG GAT4PNUGG">
            <td class="GAT4PNUEG GAT4PNUGG">
            <td class="GAT4PNUEG GAT4PNUGG GAT4PNUBH">
        </tr>
        <tr class="GAT4PNUEH" __gwt_subrow="0" __gwt_row="3">
            ---
        <tr class="GAT4PNUFG" __gwt_subrow="0" __gwt_row="4">       
            ---
    </tbody>
</table>

采纳答案by Riaz Ladhani

The developer has put an ID into the table. I have it working now. It is printing all the cell values from column 2. The code is:

开发人员已将 ID 放入表中。我现在工作了。它正在打印第 2 列中的所有单元格值。代码是:

table_id = self.driver.find_element(By.ID, 'data_configuration_feeds_ct_fields_body0')
rows = table_id.find_elements(By.TAG_NAME, "tr") # get all of the rows in the table
for row in rows:
    # Get the columns (all the column 2)        
    col = row.find_elements(By.TAG_NAME, "td")[1] #note: index start from 0, 1 is col 2
    print col.text #prints text from the element

回答by alecxe

The XPath you currently using is quite fragilesince it depends on the complete document structure and the relative position of the elements. It can easily break in the future.

您当前使用的 XPath非常脆弱,因为它取决于完整的文档结构和元素的相对位置。它在未来很容易破裂。

Instead, locate the rows using their classor other attributes. For instance:

相反,使用行的class属性或其他属性来定位行。例如:

for row in driver.find_elements_by_css_selector("tr.GAT4PNUFG.GAT4PNUMG"):
    cell = row.find_elements_by_tag_name("td")[1]
    print(cell.text)