Original question: http://stackoverflow.com/questions/28206556/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Use first row as column names? Pandas read_html
Asked by nicholas.reichel
I have this simple one line script:
from pandas import read_html
print(read_html('http://money.cnn.com/data/hotstocks/', flavor='bs4'))
This works fine, but the column names are missing: the columns are labeled 0, 1, 2, 3 instead. Is there an easy way to tell pandas to use the first row as the column names? I know I could store the names in a list, set them, and then skip the first row, but I am wondering if there is an easier or better way.
Currently it prints:
                           0       1       2         3
0                    Company   Price  Change  % Change
1             AAPL Apple Inc  115.31   +6.17    +5.65%
2   BAC Bank of America Corp   15.20   -0.43    -2.75%
3            YHOO Yahoo! Inc   46.46   -1.53    -3.19%
4        MSFT Microsoft Corp   41.19   -1.47    -3.45%
5            FB Facebook Inc   76.24   +0.46    +0.61%
6     GE General Electric Co   23.84   -0.54    -2.21%
7                 T AT&T Inc   32.68   -0.13    -0.40%
8            F Ford Motor Co   14.46   -0.24    -1.63%
9            INTC Intel Corp   33.78   -0.41    -1.20%
10    CSCO Cisco Systems Inc   26.80   -0.09    -0.35%
Answered by JAB
`read_html` takes a header parameter. You can pass a row index:
read_html('http://money.cnn.com/data/hotstocks/', header=0, flavor='bs4')
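To see the effect of the header argument without depending on the live page, here is a minimal sketch that parses a small inline table instead. The HTML string is a hypothetical stand-in for the CNN page, not its real markup, and the sketch assumes an HTML parser that pandas can use (lxml, or bs4 with html5lib) is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the page: the labels sit in the first row as plain
# <td> cells, so the parser has no <th> header row to detect on its own.
HTML = """
<table>
  <tr><td>Company</td><td>Price</td><td>Change</td><td>% Change</td></tr>
  <tr><td>AAPL Apple Inc</td><td>115.31</td><td>+6.17</td><td>+5.65%</td></tr>
  <tr><td>BAC Bank of America Corp</td><td>15.20</td><td>-0.43</td><td>-2.75%</td></tr>
</table>
"""

# read_html returns a list of DataFrames, one per <table> it finds.
raw = pd.read_html(StringIO(HTML))[0]              # integer column labels
named = pd.read_html(StringIO(HTML), header=0)[0]  # row 0 becomes the header

print(list(raw.columns))
print(list(named.columns))
```

Wrapping the string in StringIO keeps the call working on recent pandas releases, which discourage passing literal HTML directly.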
Worth noting this caveat in the docs:
For example, you might need to manually assign column names if the column names are converted to NaN when you pass the header=0 argument
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.html.read_html.html
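If header=0 does not produce usable names, the manual route mentioned in the question still works. The following is a rough sketch of it, assuming a frame shaped like the question's output; the DataFrame here is built by hand for illustration rather than scraped.

```python
import pandas as pd

# Hypothetical frame shaped like the question's output: integer column labels,
# with the real labels sitting in row 0.
df = pd.DataFrame([
    ["Company", "Price", "Change", "% Change"],
    ["AAPL Apple Inc", "115.31", "+6.17", "+5.65%"],
    ["BAC Bank of America Corp", "15.20", "-0.43", "-2.75%"],
])

names = df.iloc[0].tolist()              # store the first row as a list of names
df = df.iloc[1:].reset_index(drop=True)  # skip that row and renumber from 0
df.columns = names                       # assign the column names by hand

print(df)
```

The values stay as strings after this step, so a numeric column such as Price would still need converting, for example with pd.to_numeric.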

