Note: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/28206556/

Use first row as column names? Pandas read_html

Tags: python, parsing, pandas

Asked by nicholas.reichel

I have this simple one-line script:

from pandas import read_html

print(read_html('http://money.cnn.com/data/hotstocks/', flavor='bs4'))

This works fine, but the column names are missing: the columns are labeled 0, 1, 2, 3 and the real names end up in the first data row. Is there an easy way to tell pandas to use the first row as the column names? I know I could store the names in a list, set them, and then skip the first row, but I am wondering if there is an easier or better way.

Currently it prints:

                           0       1       2         3
0                    Company   Price  Change  % Change
1             AAPL Apple Inc  115.31   +6.17    +5.65%
2   BAC Bank of America Corp   15.20   -0.43    -2.75%
3            YHOO Yahoo! Inc   46.46   -1.53    -3.19%
4        MSFT Microsoft Corp   41.19   -1.47    -3.45%
5            FB Facebook Inc   76.24   +0.46    +0.61%
6     GE General Electric Co   23.84   -0.54    -2.21%
7                 T AT&T Inc   32.68   -0.13    -0.40%
8            F Ford Motor Co   14.46   -0.24    -1.63%
9            INTC Intel Corp   33.78   -0.41    -1.20%
10    CSCO Cisco Systems Inc   26.80   -0.09    -0.35%

Answered by JAB

`read_html` takes a `header` parameter. You can pass the index of the row to use as the column names:

read_html('http://money.cnn.com/data/hotstocks/', header=0, flavor='bs4')
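
To see what `header=0` changes without depending on the live page, here is a minimal sketch that parses a small inline table. The HTML snippet is made up for illustration (it reuses two rows from the output in the question), and wrapping it in `io.StringIO` is an assumption on my side, since recent pandas versions expect a file-like object rather than a raw HTML string. It also assumes an HTML parser such as lxml (or bs4 with html5lib) is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the page: the header cells are plain <td>,
# so pandas cannot infer them as column names on its own.
HTML = """
<table>
  <tr><td>Company</td><td>Price</td><td>Change</td><td>% Change</td></tr>
  <tr><td>AAPL Apple Inc</td><td>115.31</td><td>+6.17</td><td>+5.65%</td></tr>
  <tr><td>BAC Bank of America Corp</td><td>15.20</td><td>-0.43</td><td>-2.75%</td></tr>
</table>
"""

# read_html returns a list of DataFrames, one per <table> it finds.
without_header = pd.read_html(StringIO(HTML))[0]
with_header = pd.read_html(StringIO(HTML), header=0)[0]

print(without_header.columns.tolist())  # integer labels
print(with_header.columns.tolist())     # names taken from the first row
```

Note that the result is a list, so the table still has to be picked out with `[0]` (or another index) before it can be used as a DataFrame.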

It is worth noting this caveat in the docs:

For example, you might need to manually assign column names if the column names are converted to NaN when you pass the header=0 argument

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.html.read_html.html
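
If `header=0` does not give usable names (the caveat quoted above), the manual route mentioned in the question can be sketched roughly as follows. The DataFrame is built by hand to mimic the headerless output, and the helper name `promote_first_row` is hypothetical, not a pandas function.

```python
import pandas as pd


def promote_first_row(df: pd.DataFrame) -> pd.DataFrame:
    """Return a new DataFrame whose column names come from df's first row."""
    # Drop the first row and renumber the remaining rows from 0.
    out = df.iloc[1:].reset_index(drop=True)
    # Use the values of the original first row as the column names.
    out.columns = df.iloc[0].tolist()
    return out


# Hand-built stand-in for what read_html returns when no header is detected.
raw = pd.DataFrame([
    ["Company", "Price", "Change", "% Change"],
    ["AAPL Apple Inc", "115.31", "+6.17", "+5.65%"],
    ["BAC Bank of America Corp", "15.20", "-0.43", "-2.75%"],
])

stocks = promote_first_row(raw)
print(stocks.columns.tolist())
```

The values stay as strings in this sketch, so numeric columns would still need converting afterwards, for example with `pd.to_numeric`.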
