pandas: reading tab-delimited data without a header

Notice: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original question on Stack Overflow: http://stackoverflow.com/questions/24582329/


reading tab-delimited data without header in pandas

Tags: python, pandas, dataframe, tab-delimited

Asked by biohazard

I'm having trouble using pandas to open tab-delimited data without headers.

My test data (actually contains 200 lines, of which I am showing the first 10):

Tag19184    CTAAC   hffef   1   a   36  -   chr1    10006   0   36M 36
Tag19184    CTAAC   hffef   1   a   36  -   chr1    10012   0   36M 36
Tag19184    CTAAC   hffef   1   a   36  -   chr1    10018   0   36M 36
Tag19184    CTAAC   hffef   1   a   36  -   chr1    10024   0   36M 36
Tag19184    CTAAC   hffef   1   a   36  -   chr1    10030   0   36M 36
Tag19184    CTAAC   hffef   1   a   36  -   chr1    10036   0   36M 36
Tag19184    CTAAC   hffef   1   a   36  -   chr1    10042   0   36M 36
Tag20198    CTAAC   hffef   1   a   36  -   chr1    10048   0   36M 36
Tag20198    CTAAC   hffef   1   a   36  -   chr1    10054   0   36M 36
Tag45093    CTAAC   hffef   1   a   36  -   chr1    10060   0   36M 36

My code:

import pandas as pd

# header=None tells pandas that the first line is data, not column names
df = pd.read_csv('in_test.txt', sep='\t', header=None)
print(df)

However, I get the following output, which I don't think I can use to further process data (?):

<class 'pandas.core.frame.DataFrame'>
Int64Index: 200 entries, 0 to 199
Data columns:
X.1     200  non-null values
X.2     200  non-null values
X.3     200  non-null values
X.4     200  non-null values
X.5     200  non-null values
X.6     200  non-null values
X.7     200  non-null values
X.8     200  non-null values
X.9     200  non-null values
X.10    200  non-null values
X.11    200  non-null values
X.12    200  non-null values
dtypes: int64(5), object(7)

The tutorial here suggests that printing df should just give me the corresponding data frame. What am I doing wrong?

Accepted answer by CT Zhu

I think the file is being read correctly, but:

  1. See: change pandas 0.13.0 "print dataframe" to print dataframe like in earlier versions. The summary view you are seeing is how older versions of pandas print a DataFrame, so updating pandas will solve it.
  2. You can use ipython notebook, where DataFrames show up as HTML tables.
  3. You can use df.head(5) (similar to R's head) to get the first few rows, just to make sure your DataFrame is correct.
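The checks suggested above can be tried on a small in-memory sample. The following is a minimal sketch, not the asker's exact setup: it assumes a reasonably recent pandas, and it builds three of the rows from the question with real tab characters instead of reading in_test.txt. The names rows and sample are made up for the illustration.

```python
import io

import pandas as pd

# Three rows copied from the question's sample data (illustrative subset).
rows = [
    ["Tag19184", "CTAAC", "hffef", "1", "a", "36", "-", "chr1", "10006", "0", "36M", "36"],
    ["Tag19184", "CTAAC", "hffef", "1", "a", "36", "-", "chr1", "10012", "0", "36M", "36"],
    ["Tag20198", "CTAAC", "hffef", "1", "a", "36", "-", "chr1", "10048", "0", "36M", "36"],
]

# Join the fields with real tab characters, one record per line.
sample = "\n".join("\t".join(fields) for fields in rows) + "\n"

# header=None keeps the first line as data; columns get integer labels.
df = pd.read_csv(io.StringIO(sample), sep="\t", header=None)

print(df.shape)        # (number of rows, number of columns)
print(df.head(2))      # first two rows, as a quick sanity check
print(df.to_string())  # full table as text, regardless of display limits
```

If the shape and the first rows look right, the data was parsed correctly and only the printed representation differs between pandas versions.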