pandas read ASCII formatted table

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me), citing the original address: http://stackoverflow.com/questions/30079299/

Date: 2020-09-13 23:19:17  Source: igfitidea

Tags: python, csv, pandas, ascii, ascii-art

Asked by denfromufa

EDIT:

I found a partial answer here:

https://stackoverflow.com/a/26551913/2230844

https://stackoverflow.com/a/15026839/2230844

How can I read an ASCII-formatted table like this into pandas?

      ----------------------------------------------------
      |   col1         col2         col3          col4   |
      ------------ ------------ ------------ -------------
 1002 0.402397E-01 0.883513E-02 0.450885E-01 0.118748E-02
 1003 0.105235     0.474509E-02 0.118508     0.168397E-03
 1004 0.102625     0.225842E-02 0.317864E-02 0.997383    
    1 0.603750     0.475112E-01 0.679590     0.114713E-02
    2 0.534171E-01 0.119815E-01 0.600187E-01 0.830949E-04
    3 0.283291E-01 0.119353E-01 0.317530E-01 0.243996E-04
  104 0.739759E-02 0.463873E-02 0.827061E-02 0.145207E-05
     -----------------------------------------------------

I noticed this answer, which uses read_fwf(), but it requires specifying the column widths manually:

Reading from file a hierarchical ascii table using Pandas
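
For illustration, here is a minimal sketch of the read_fwf() route in which the column boundaries are not typed by hand. It assumes that each run of dashes on the third line marks the extent of one data column, that everything to the left of the first run is the unlabeled row number, and that the file always has three header lines and one footer line. The names raw and id are made up for this example and do not come from the question.

```python
import re
from io import StringIO

import pandas as pd

# The sample table from the question, stored in a string (hypothetical name: raw).
raw = """\
      ----------------------------------------------------
      |   col1         col2         col3          col4   |
      ------------ ------------ ------------ -------------
 1002 0.402397E-01 0.883513E-02 0.450885E-01 0.118748E-02
 1003 0.105235     0.474509E-02 0.118508     0.168397E-03
 1004 0.102625     0.225842E-02 0.317864E-02 0.997383
    1 0.603750     0.475112E-01 0.679590     0.114713E-02
    2 0.534171E-01 0.119815E-01 0.600187E-01 0.830949E-04
    3 0.283291E-01 0.119353E-01 0.317530E-01 0.243996E-04
  104 0.739759E-02 0.463873E-02 0.827061E-02 0.145207E-05
     -----------------------------------------------------
"""

lines = raw.splitlines()
header_line, rule_line = lines[1], lines[2]

# Assumption: each run of dashes in the rule line spans exactly one data column.
data_specs = [m.span() for m in re.finditer(r"-+", rule_line)]

# Assumption: the text left of the first run is the unlabeled row-number column.
colspecs = [(0, data_specs[0][0])] + data_specs

# Column names come from the header line; "id" is an invented label.
names = ["id"] + header_line.replace("|", " ").split()

# Keep only the data rows (drop three header lines and the footer line).
body = "\n".join(lines[3:-1])

df = pd.read_fwf(StringIO(body), colspecs=colspecs, names=names, header=None)
print(df)
print(df.dtypes)
```

Whether this is preferable to typing the widths depends on how consistent the dashed rule is across files; if the rule line does not line up with the data, the derived boundaries will be wrong too.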

Answered by fixxxer

Assuming that your ascii data is in a string, x:

In [1099]: x
Out[1099]: '      ----------------------------------------------------\n      |   col1         col2         col3          col4   |\n      ------------ ------------ ------------ -------------\n 1002 0.402397E-01 0.883513E-02 0.450885E-01 0.118748E-02\n 1003 0.105235     0.474509E-02 0.118508     0.168397E-03\n 1004 0.102625     0.225842E-02 0.317864E-02 0.997383    \n    1 0.603750     0.475112E-01 0.679590     0.114713E-02\n    2 0.534171E-01 0.119815E-01 0.600187E-01 0.830949E-04\n    3 0.283291E-01 0.119353E-01 0.317530E-01 0.243996E-04\n  104 0.739759E-02 0.463873E-02 0.827061E-02 0.145207E-05\n     -----------------------------------------------------'

A few options available in pd.read_csv can get you to this dataframe:

In [1123]: pd.read_csv(StringIO(x), sep=' ', skipfooter=1, skiprows=1, skipinitialspace=True).drop([0])
Out[1123]: 
      |          col1          col2          col3      col4  |.1
1  1002  0.402397E-01  0.883513E-02  0.450885E-01  0.001187  NaN
2  1003      0.105235  0.474509E-02      0.118508  0.000168  NaN
3  1004      0.102625  0.225842E-02  0.317864E-02  0.997383  NaN
4     1      0.603750  0.475112E-01      0.679590  0.001147  NaN
5     2  0.534171E-01  0.119815E-01  0.600187E-01  0.000083  NaN
6     3  0.283291E-01  0.119353E-01  0.317530E-01  0.000024  NaN
7   104  0.739759E-02  0.463873E-02  0.827061E-02  0.000001  NaN
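
In the output above, the stray | header cells are still present and col1 through col3 print as unparsed text such as 0.402397E-01. One possible variation, sketched here under the assumption that the table always has three header lines and one footer line, is to skip those lines, split on runs of whitespace, and supply the column names directly. The id label is invented for this example, and engine="python" is passed because skipfooter is not supported by the default C parser.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the question, held in a string x.
x = """\
      ----------------------------------------------------
      |   col1         col2         col3          col4   |
      ------------ ------------ ------------ -------------
 1002 0.402397E-01 0.883513E-02 0.450885E-01 0.118748E-02
 1003 0.105235     0.474509E-02 0.118508     0.168397E-03
 1004 0.102625     0.225842E-02 0.317864E-02 0.997383
    1 0.603750     0.475112E-01 0.679590     0.114713E-02
    2 0.534171E-01 0.119815E-01 0.600187E-01 0.830949E-04
    3 0.283291E-01 0.119353E-01 0.317530E-01 0.243996E-04
  104 0.739759E-02 0.463873E-02 0.827061E-02 0.145207E-05
     -----------------------------------------------------
"""

df = pd.read_csv(
    StringIO(x),
    sep=r"\s+",        # split on any run of whitespace
    skiprows=3,        # assumption: rule, header, rule precede the data
    skipfooter=1,      # assumption: a single closing rule follows the data
    header=None,
    names=["id", "col1", "col2", "col3", "col4"],  # "id" is an invented label
    engine="python",   # skipfooter needs the Python parsing engine
)
print(df)
print(df.dtypes)
```

With the decorative lines skipped before parsing, no dashed row ends up in the data, so the numeric columns can be inferred as floats and the .drop([0]) step is not needed.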