在 Pandas Dataframe 中转换 HTML 表格
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/39120853/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
converting an HTML table in Pandas Dataframe
提问by Manu Sharma
I am reading an HTML table with pd.read_html but the result is coming in a list, I want to convert it inot a pandas dataframe, so I can continue further operations on the same. I am using the following script
我正在使用 pd.read_html 读取 HTML 表,但结果出现在一个列表中,我想将其转换为 Pandas 数据帧,以便我可以继续对其进行进一步操作。我正在使用以下脚本
import pandas as pd
import html5lib
data=pd.read_html('http://www.espn.com/nhl/statistics/player/_/stat/points/sort/points/year/2015/seasontype/2',skiprows=1)
and since My results are coming as 1 list, I tried to convert it into a data frame with
由于我的结果是 1 个列表,因此我尝试将其转换为数据框
data1=pd.DataFrame(Data)
and result came as 0
结果为 0
0 0 1 2 3 4...
and because of result as a list, I can't apply any functions such as rename, dropna, drop.
并且由于结果为列表,我无法应用任何函数,例如重命名、dropna、drop。
I will appreciate every help
我会感谢每一个帮助
回答by jezrael
I think you need add [0]
if need select first item of list, because read_html
return list of DataFrames
:
我认为[0]
如果需要选择列表的第一项,您需要添加,因为read_html
返回list of DataFrames
:
So you can use:
所以你可以使用:
import pandas as pd
data1 = pd.read_html('http://www.espn.com/nhl/statis??tics/player/??_/stat/point??s/sort/point??s/year/2015&??#47;seasontype/2??',skiprows=1)[0]
print (data1)
0 1 2 3 4 5 6 7 8 9 \
0 RK PLAYER TEAM GP G A PTS +/- PIM PTS/G
1 1 Jamie Benn, LW DAL 82 35 52 87 1 64 1.06
2 2 John Tavares, C NYI 82 38 48 86 5 46 1.05
3 3 Sidney Crosby, C PIT 77 28 56 84 5 47 1.09
4 4 Alex Ovechkin, LW WSH 81 53 28 81 10 58 1.00
5 NaN Jakub Voracek, RW PHI 82 22 59 81 1 78 0.99
6 6 Nicklas Backstrom, C WSH 82 18 60 78 5 40 0.95
7 7 Tyler Seguin, C DAL 71 37 40 77 -1 20 1.08
8 8 Jiri Hudler, LW CGY 78 31 45 76 17 14 0.97
9 NaN Daniel Sedin, LW VAN 82 20 56 76 5 18 0.93
10 10 Vladimir Tarasenko, RW STL 77 37 36 73 27 31 0.95
11 NaN PP SH NaN NaN NaN NaN NaN NaN NaN
12 RK PLAYER TEAM GP G A PTS +/- PIM PTS/G
13 NaN Nick Foligno, LW CBJ 79 31 42 73 16 50 0.92
14 NaN Claude Giroux, C PHI 81 25 48 73 -3 36 0.90
15 NaN Henrik Sedin, C VAN 82 18 55 73 11 22 0.89
16 14 Steven Stamkos, C TB 82 43 29 72 2 49 0.88
17 NaN Tyler Johnson, C TB 77 29 43 72 33 24 0.94
18 16 Ryan Johansen, C CBJ 82 26 45 71 -6 40 0.87
19 17 Joe Pavelski, C SJ 82 37 33 70 12 29 0.85
20 NaN Evgeni Malkin, C PIT 69 28 42 70 -2 60 1.01
21 NaN Ryan Getzlaf, C ANA 77 25 45 70 15 62 0.91
22 20 Rick Nash, LW NYR 79 42 27 69 29 36 0.87
23 NaN PP SH NaN NaN NaN NaN NaN NaN NaN
24 RK PLAYER TEAM GP G A PTS +/- PIM PTS/G
25 21 Max Pacioretty, LW MTL 80 37 30 67 38 32 0.84
26 NaN Logan Couture, C SJ 82 27 40 67 -6 12 0.82
27 23 Jonathan Toews, C CHI 81 28 38 66 30 36 0.81
28 NaN Erik Karlsson, D OTT 82 21 45 66 7 42 0.80
29 NaN Henrik Zetterberg, LW DET 77 17 49 66 -6 32 0.86
30 26 Pavel Datsyuk, C DET 63 26 39 65 12 8 1.03
31 NaN Joe Thornton, C SJ 78 16 49 65 -4 30 0.83
32 28 Nikita Kucherov, RW TB 82 28 36 64 38 37 0.78
33 NaN Patrick Kane, RW CHI 61 27 37 64 10 10 1.05
34 NaN Mark Stone, RW OTT 80 26 38 64 21 14 0.80
35 NaN PP SH NaN NaN NaN NaN NaN NaN NaN
36 RK PLAYER TEAM GP G A PTS +/- PIM PTS/G
37 NaN Alexander Steen, LW STL 74 24 40 64 8 33 0.86
38 NaN Kyle Turris, C OTT 82 24 40 64 5 36 0.78
39 NaN Johnny Gaudreau, LW CGY 80 24 40 64 11 14 0.80
40 NaN Anze Kopitar, C LA 79 16 48 64 -2 10 0.81
41 35 Radim Vrbata, RW VAN 79 31 32 63 6 20 0.80
42 NaN Jaden Schwartz, LW STL 75 28 35 63 13 16 0.84
43 NaN Filip Forsberg, C NSH 82 26 37 63 15 24 0.77
44 NaN Jordan Eberle, RW EDM 81 24 39 63 -16 24 0.78
45 NaN Ondrej Palat, LW TB 75 16 47 63 31 24 0.84
46 40 Zach Parise, LW MIN 74 33 29 62 21 41 0.84
10 11 12 13 14 15 16
0 SOG PCT GWG G A G A
1 253 13.8 6 10 13 2 3
2 278 13.7 8 13 18 0 1
3 237 11.8 3 10 21 0 0
4 395 13.4 11 25 9 0 0
5 221 10.0 3 11 22 0 0
6 153 11.8 3 3 30 0 0
7 280 13.2 5 13 16 0 0
8 158 19.6 5 6 10 0 0
9 226 8.9 5 4 21 0 0
10 264 14.0 6 8 10 0 0
11 NaN NaN NaN NaN NaN NaN NaN
12 SOG PCT GWG G A G A
13 182 17.0 3 11 15 0 0
14 279 9.0 4 14 23 0 0
15 101 17.8 0 5 20 0 0
16 268 16.0 6 13 12 0 0
17 203 14.3 6 8 9 0 0
18 202 12.9 0 7 19 2 0
19 261 14.2 5 19 12 0 0
20 212 13.2 4 9 17 0 0
21 191 13.1 6 3 10 0 2
22 304 13.8 8 6 6 4 1
23 NaN NaN NaN NaN NaN NaN NaN
24 SOG PCT GWG G A G A
25 302 12.3 10 7 4 3 2
26 263 10.3 4 6 18 2 0
27 192 14.6 7 6 11 2 1
28 292 7.2 3 6 24 0 0
29 227 7.5 3 4 24 0 0
30 165 15.8 5 8 16 0 0
31 131 12.2 0 4 18 0 0
32 190 14.7 2 2 13 0 0
33 186 14.5 5 6 16 0 0
34 157 16.6 6 5 8 1 0
35 NaN NaN NaN NaN NaN NaN NaN
36 SOG PCT GWG G A G A
37 223 10.8 5 8 16 0 0
38 215 11.2 6 4 12 1 0
39 167 14.4 4 8 13 0 0
40 134 11.9 4 6 18 0 0
41 267 11.6 7 12 11 0 0
42 184 15.2 4 8 8 0 2
43 237 11.0 6 6 13 0 0
44 183 13.1 2 6 15 0 0
45 139 11.5 5 3 8 1 1
46 259 12.7 3 11 5 0 0