Converting an HTML table into a Pandas DataFrame

Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/39120853/

Time: 2020-09-14 01:53:09  Source: igfitidea


python html pandas dataframe

Asked by Manu Sharma

I am reading an HTML table with pd.read_html, but the result comes back as a list. I want to convert it into a pandas DataFrame so that I can carry out further operations on it. I am using the following script:

import pandas as pd
import html5lib
data = pd.read_html('http://www.espn.com/nhl/statistics/player/_/stat/points/sort/points/year/2015/seasontype/2', skiprows=1)

Since the result is a single list, I tried to convert it into a DataFrame with:

data1 = pd.DataFrame(data)

and the result came out as:

0       0                       1     2    3    4...

Because the result is a list, I can't apply any functions such as rename, dropna, or drop.

I would appreciate any help.

Answered by jezrael

I think you need to add [0] to select the first item of the list, because read_html returns a list of DataFrames.
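As a small offline illustration of that behavior, the sketch below parses an inline HTML snippet instead of the ESPN page. The snippet and the names `html`, `tables`, and `players` are made up for this example, and it assumes an HTML parser such as lxml (or bs4 with html5lib) is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical page containing two tables, used only for illustration.
html = """
<table>
  <thead><tr><th>PLAYER</th><th>PTS</th></tr></thead>
  <tbody>
    <tr><td>Jamie Benn, LW</td><td>87</td></tr>
    <tr><td>John Tavares, C</td><td>86</td></tr>
  </tbody>
</table>
<table>
  <thead><tr><th>TEAM</th><th>GP</th></tr></thead>
  <tbody>
    <tr><td>DAL</td><td>82</td></tr>
  </tbody>
</table>
"""

# read_html returns a list with one DataFrame per <table> it finds.
tables = pd.read_html(StringIO(html))
print(type(tables), len(tables))

# Index the list to get an actual DataFrame.
players = tables[0]
print(type(players))
print(players.columns.tolist())
```

Indexing with [0] works here because the table of interest is the first one on the page; on other pages a different position may be needed.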

So you can use:

import pandas as pd

data1 = pd.read_html('http://www.espn.com/nhl/statistics/player/_/stat/points/sort/points/year/2015/seasontype/2', skiprows=1)[0]
print(data1)

     0                       1     2    3    4    5    6    7    8      9   \
0    RK                  PLAYER  TEAM   GP    G    A  PTS  +/-  PIM  PTS/G   
1     1          Jamie Benn, LW   DAL   82   35   52   87    1   64   1.06   
2     2         John Tavares, C   NYI   82   38   48   86    5   46   1.05   
3     3        Sidney Crosby, C   PIT   77   28   56   84    5   47   1.09   
4     4       Alex Ovechkin, LW   WSH   81   53   28   81   10   58   1.00   
5   NaN       Jakub Voracek, RW   PHI   82   22   59   81    1   78   0.99   
6     6    Nicklas Backstrom, C   WSH   82   18   60   78    5   40   0.95   
7     7         Tyler Seguin, C   DAL   71   37   40   77   -1   20   1.08   
8     8         Jiri Hudler, LW   CGY   78   31   45   76   17   14   0.97   
9   NaN        Daniel Sedin, LW   VAN   82   20   56   76    5   18   0.93   
10   10  Vladimir Tarasenko, RW   STL   77   37   36   73   27   31   0.95   
11  NaN                      PP    SH  NaN  NaN  NaN  NaN  NaN  NaN    NaN   
12   RK                  PLAYER  TEAM   GP    G    A  PTS  +/-  PIM  PTS/G   
13  NaN        Nick Foligno, LW   CBJ   79   31   42   73   16   50   0.92   
14  NaN        Claude Giroux, C   PHI   81   25   48   73   -3   36   0.90   
15  NaN         Henrik Sedin, C   VAN   82   18   55   73   11   22   0.89   
16   14       Steven Stamkos, C    TB   82   43   29   72    2   49   0.88   
17  NaN        Tyler Johnson, C    TB   77   29   43   72   33   24   0.94   
18   16        Ryan Johansen, C   CBJ   82   26   45   71   -6   40   0.87   
19   17         Joe Pavelski, C    SJ   82   37   33   70   12   29   0.85   
20  NaN        Evgeni Malkin, C   PIT   69   28   42   70   -2   60   1.01   
21  NaN         Ryan Getzlaf, C   ANA   77   25   45   70   15   62   0.91   
22   20           Rick Nash, LW   NYR   79   42   27   69   29   36   0.87   
23  NaN                      PP    SH  NaN  NaN  NaN  NaN  NaN  NaN    NaN   
24   RK                  PLAYER  TEAM   GP    G    A  PTS  +/-  PIM  PTS/G   
25   21      Max Pacioretty, LW   MTL   80   37   30   67   38   32   0.84   
26  NaN        Logan Couture, C    SJ   82   27   40   67   -6   12   0.82   
27   23       Jonathan Toews, C   CHI   81   28   38   66   30   36   0.81   
28  NaN        Erik Karlsson, D   OTT   82   21   45   66    7   42   0.80   
29  NaN   Henrik Zetterberg, LW   DET   77   17   49   66   -6   32   0.86   
30   26        Pavel Datsyuk, C   DET   63   26   39   65   12    8   1.03   
31  NaN         Joe Thornton, C    SJ   78   16   49   65   -4   30   0.83   
32   28     Nikita Kucherov, RW    TB   82   28   36   64   38   37   0.78   
33  NaN        Patrick Kane, RW   CHI   61   27   37   64   10   10   1.05   
34  NaN          Mark Stone, RW   OTT   80   26   38   64   21   14   0.80   
35  NaN                      PP    SH  NaN  NaN  NaN  NaN  NaN  NaN    NaN   
36   RK                  PLAYER  TEAM   GP    G    A  PTS  +/-  PIM  PTS/G   
37  NaN     Alexander Steen, LW   STL   74   24   40   64    8   33   0.86   
38  NaN          Kyle Turris, C   OTT   82   24   40   64    5   36   0.78   
39  NaN     Johnny Gaudreau, LW   CGY   80   24   40   64   11   14   0.80   
40  NaN         Anze Kopitar, C    LA   79   16   48   64   -2   10   0.81   
41   35        Radim Vrbata, RW   VAN   79   31   32   63    6   20   0.80   
42  NaN      Jaden Schwartz, LW   STL   75   28   35   63   13   16   0.84   
43  NaN       Filip Forsberg, C   NSH   82   26   37   63   15   24   0.77   
44  NaN       Jordan Eberle, RW   EDM   81   24   39   63  -16   24   0.78   
45  NaN        Ondrej Palat, LW    TB   75   16   47   63   31   24   0.84   
46   40         Zach Parise, LW   MIN   74   33   29   62   21   41   0.84   

     10    11   12   13   14   15   16  
0   SOG   PCT  GWG    G    A    G    A  
1   253  13.8    6   10   13    2    3  
2   278  13.7    8   13   18    0    1  
3   237  11.8    3   10   21    0    0  
4   395  13.4   11   25    9    0    0  
5   221  10.0    3   11   22    0    0  
6   153  11.8    3    3   30    0    0  
7   280  13.2    5   13   16    0    0  
8   158  19.6    5    6   10    0    0  
9   226   8.9    5    4   21    0    0  
10  264  14.0    6    8   10    0    0  
11  NaN   NaN  NaN  NaN  NaN  NaN  NaN  
12  SOG   PCT  GWG    G    A    G    A  
13  182  17.0    3   11   15    0    0  
14  279   9.0    4   14   23    0    0  
15  101  17.8    0    5   20    0    0  
16  268  16.0    6   13   12    0    0  
17  203  14.3    6    8    9    0    0  
18  202  12.9    0    7   19    2    0  
19  261  14.2    5   19   12    0    0  
20  212  13.2    4    9   17    0    0  
21  191  13.1    6    3   10    0    2  
22  304  13.8    8    6    6    4    1  
23  NaN   NaN  NaN  NaN  NaN  NaN  NaN  
24  SOG   PCT  GWG    G    A    G    A  
25  302  12.3   10    7    4    3    2  
26  263  10.3    4    6   18    2    0  
27  192  14.6    7    6   11    2    1  
28  292   7.2    3    6   24    0    0  
29  227   7.5    3    4   24    0    0  
30  165  15.8    5    8   16    0    0  
31  131  12.2    0    4   18    0    0  
32  190  14.7    2    2   13    0    0  
33  186  14.5    5    6   16    0    0  
34  157  16.6    6    5    8    1    0  
35  NaN   NaN  NaN  NaN  NaN  NaN  NaN  
36  SOG   PCT  GWG    G    A    G    A  
37  223  10.8    5    8   16    0    0  
38  215  11.2    6    4   12    1    0  
39  167  14.4    4    8   13    0    0  
40  134  11.9    4    6   18    0    0  
41  267  11.6    7   12   11    0    0  
42  184  15.2    4    8    8    0    2  
43  237  11.0    6    6   13    0    0  
44  183  13.1    2    6   15    0    0  
45  139  11.5    5    3    8    1    1  
46  259  12.7    3   11    5    0    0
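
Once the DataFrame has been picked out of the list, the usual methods work on it. As a rough sketch of one possible cleanup for output like the above, the example below uses a small hand-built stand-in (`raw`, a hypothetical name) with a few rows and columns copied from that output, promotes the first row to column names, and drops the repeated header rows and the PP/SH filler rows. Treating a blank RK as a tie with the previous rank is an assumption about the source page, not something the answer states.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the parsed table: integer column labels,
# the real header in row 0, a filler row, and a repeated header row.
raw = pd.DataFrame([
    ["RK", "PLAYER", "TEAM", "PTS"],
    ["1", "Jamie Benn, LW", "DAL", "87"],
    ["2", "John Tavares, C", "NYI", "86"],
    ["10", "Vladimir Tarasenko, RW", "STL", "73"],
    [np.nan, "PP", "SH", np.nan],
    ["RK", "PLAYER", "TEAM", "PTS"],
    [np.nan, "Nick Foligno, LW", "CBJ", "73"],
])

# Promote the first row to column names and keep the rest as data.
header = raw.iloc[0].tolist()
body = raw.iloc[1:].copy()
body.columns = header

# Drop the repeated header rows and the "PP / SH" filler rows.
is_noise = body["PLAYER"].isin(["PLAYER", "PP"])
df = body.loc[~is_noise].reset_index(drop=True)

# Assumption: a blank rank means "tied with the row above", so carry
# the previous rank forward after converting to numbers.
df = df.assign(
    RK=pd.to_numeric(df["RK"]).ffill().astype(int),
    PTS=pd.to_numeric(df["PTS"]),
)
print(df)
```

The full table repeats the G and A labels, so those columns would likely need renaming before they can be selected by name.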