Original source: http://stackoverflow.com/questions/24582329/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
reading tab-delimited data without header in pandas
Asked by biohazard
I'm having trouble using pandas to open tab-delimited data without headers.
My test data (actually contains 200 lines, of which I am showing the first 10):
Tag19184 CTAAC hffef 1 a 36 - chr1 10006 0 36M 36
Tag19184 CTAAC hffef 1 a 36 - chr1 10012 0 36M 36
Tag19184 CTAAC hffef 1 a 36 - chr1 10018 0 36M 36
Tag19184 CTAAC hffef 1 a 36 - chr1 10024 0 36M 36
Tag19184 CTAAC hffef 1 a 36 - chr1 10030 0 36M 36
Tag19184 CTAAC hffef 1 a 36 - chr1 10036 0 36M 36
Tag19184 CTAAC hffef 1 a 36 - chr1 10042 0 36M 36
Tag20198 CTAAC hffef 1 a 36 - chr1 10048 0 36M 36
Tag20198 CTAAC hffef 1 a 36 - chr1 10054 0 36M 36
Tag45093 CTAAC hffef 1 a 36 - chr1 10060 0 36M 36
My code:
import pandas as pd

# header=None because the file's first line is data, not column names
df = pd.read_csv('in_test.txt', sep='\t', header=None)
print(df)
However, I get the following output, which I don't think I can use to further process data (?):
<class 'pandas.core.frame.DataFrame'>
Int64Index: 200 entries, 0 to 199
Data columns:
X.1 200 non-null values
X.2 200 non-null values
X.3 200 non-null values
X.4 200 non-null values
X.5 200 non-null values
X.6 200 non-null values
X.7 200 non-null values
X.8 200 non-null values
X.9 200 non-null values
X.10 200 non-null values
X.11 200 non-null values
X.12 200 non-null values
dtypes: int64(5), object(7)
The tutorial here suggests that printing df should just give me the corresponding data frame. What am I doing wrong?
Accepted answer by CT Zhu
I think you are reading it in correctly, but:
- See: "change pandas 0.13.0 'print dataframe' to print dataframe like in earlier versions". The summary output you are getting is what pandas does in older versions, so updating pandas will solve it.
- You can use ipython notebook, where DataFrames will show up as HTML tables.
- You can use df.head(5) (similar to R's head) to get the first few rows, just to make sure your DataFrame is correct.
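As a rough illustration of the last point, here is a minimal sketch of checking the parsed data without relying on how a bare print renders the frame. The in-memory string raw is a hypothetical stand-in for in_test.txt (three rows copied from the question's sample, joined with tabs), and using to_string() to force a full text rendering is my own suggestion, not part of the answer above.

```python
import io

import pandas as pd

# Hypothetical stand-in for in_test.txt: three tab-delimited rows, no header line.
raw = (
    "Tag19184\tCTAAC\thffef\t1\ta\t36\t-\tchr1\t10006\t0\t36M\t36\n"
    "Tag19184\tCTAAC\thffef\t1\ta\t36\t-\tchr1\t10012\t0\t36M\t36\n"
    "Tag20198\tCTAAC\thffef\t1\ta\t36\t-\tchr1\t10048\t0\t36M\t36\n"
)

# header=None tells pandas the first line is data, so columns get integer labels.
df = pd.read_csv(io.StringIO(raw), sep="\t", header=None)

# head(n) returns the first n rows as a new DataFrame.
first_rows = df.head(2)
print(first_rows)

# to_string() renders the whole frame as plain text instead of a summary view.
full_text = df.to_string()
print(full_text)
```

If the values shown by head() line up with the columns in the file, the data was read correctly and only the printed representation differs between pandas versions.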

