python pandas not reading first column from csv file
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/21902080/
Asked by user308827
I have a simple 2 column csv file called st1.csv:
GRID St1
1457 614
1458 657
1459 679
1460 732
1461 754
1462 811
1463 748
However, when I try to read the csv file, the first column is not loaded:
a = pandas.DataFrame.from_csv('st1.csv')
a.columns
outputs:
Index([u'ST1'], dtype=object)
Why is the first column not being read?
Accepted answer by Ewan
Judging by your data, it looks like the delimiter you're using is a space (' ').
Try the following:
a = pandas.DataFrame.from_csv('st1.csv', sep=' ')
The other issue is that it's assuming your first column is an index, which we can also disable:
a = pandas.DataFrame.from_csv('st1.csv', index_col=None)
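For anyone on a pandas release where DataFrame.from_csv is no longer available, here is a minimal sketch of the same two points using pandas.read_csv. It assumes the file is whitespace-delimited exactly as shown in the question, and it reads the sample from an in-memory string (the variable names are only for illustration) rather than from st1.csv. Passing index_col=0 mimics the old from_csv default, which is why GRID ended up as the index instead of a column.

```python
import io

import pandas as pd

# Sample data copied from the question, kept in memory so the sketch is self-contained.
raw = """GRID St1
1457 614
1458 657
1459 679
1460 732
1461 754
1462 811
1463 748
"""

# index_col=0 reproduces the behaviour in the question:
# GRID is consumed as the index, so only St1 is left in .columns.
as_index = pd.read_csv(io.StringIO(raw), sep=r"\s+", index_col=0)
print(list(as_index.columns))   # ['St1']
print(as_index.index.name)      # GRID

# Leaving index_col unset keeps GRID as an ordinary column.
as_column = pd.read_csv(io.StringIO(raw), sep=r"\s+")
print(list(as_column.columns))  # ['GRID', 'St1']
```

The regular expression r"\s+" is used here instead of a single space so that runs of spaces or tabs are also accepted; if the real file uses a different delimiter, adjust sep accordingly.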
Answer by Muzaffar Omer
Based on the documentation, which compares read_csv and from_csv, it is possible to pass index_col=None. I tried the following and it worked:
pandas.DataFrame.from_csv('st1.csv', index_col=None)
This assumes that the data is comma-separated.
Please check the link below:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.from_csv.html
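To illustrate why the comma-separated assumption matters, here is a small sketch using pandas.read_csv on a shortened copy of the question's data (an in-memory string, not the real st1.csv). With the default comma separator and space-delimited input, each line is read as a single field, so index_col=None alone is not enough for a file like the one in the question.

```python
import io

import pandas as pd

# A shortened, space-delimited copy of the question's data (illustrative only).
raw = "GRID St1\n1457 614\n1458 657\n"

# Default sep=',' finds no commas, so every line becomes one field.
one_col = pd.read_csv(io.StringIO(raw), index_col=None)
print(list(one_col.columns))   # ['GRID St1']

# Naming the real delimiter splits the line into the two expected columns.
two_cols = pd.read_csv(io.StringIO(raw), sep=" ", index_col=None)
print(list(two_cols.columns))  # ['GRID', 'St1']
```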
Answer by Matt Messersmith
For newer versions of pandas, pd.DataFrame.from_csv doesn't exist anymore, and index_col=None no longer does the trick with pd.read_csv. You'll want to use pd.read_csv with index_col=False instead:
pd.read_csv('st1.csv', index_col=False)
Example:
(so) URSA-MattM-MacBook:stackoverflow mmessersmith$ cat input.csv
Date Employee Operation Order
2001-01-01 08:32:17 User1 Approved #00045
2001-01-01 08:36:23 User1 Edited #00045
2001-01-01 08:41:04 User1 Rejected #00046
2001-01-01 08:42:56 User1 Deleted #00046
2001-01-02 09:01:11 User1 Created #00047
2019-10-03 17:23:45 User1 Approved #72681
(so) URSA-MattM-MacBook:stackoverflow mmessersmith$ python
Python 3.7.4 (default, Aug 13 2019, 15:17:50)
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pd.__version__
'0.25.1'
>>> df_bad_index = pd.read_csv('input.csv', delim_whitespace=True)
>>> df_bad_index
                Date Employee Operation   Order
2001-01-01  08:32:17    User1  Approved  #00045
2001-01-01  08:36:23    User1    Edited  #00045
2001-01-01  08:41:04    User1  Rejected  #00046
2001-01-01  08:42:56    User1   Deleted  #00046
2001-01-02  09:01:11    User1   Created  #00047
2019-10-03  17:23:45    User1  Approved  #72681
>>> df_bad_index.index
Index(['2001-01-01', '2001-01-01', '2001-01-01', '2001-01-01', '2001-01-02',
       '2019-10-03'],
      dtype='object')
>>> df_still_bad_index = pd.read_csv('input.csv', delim_whitespace=True, index_col=None)
>>> df_still_bad_index
                Date Employee Operation   Order
2001-01-01  08:32:17    User1  Approved  #00045
2001-01-01  08:36:23    User1    Edited  #00045
2001-01-01  08:41:04    User1  Rejected  #00046
2001-01-01  08:42:56    User1   Deleted  #00046
2001-01-02  09:01:11    User1   Created  #00047
2019-10-03  17:23:45    User1  Approved  #72681
>>> df_still_bad_index.index
Index(['2001-01-01', '2001-01-01', '2001-01-01', '2001-01-01', '2001-01-02',
       '2019-10-03'],
      dtype='object')
>>> df_good_index = pd.read_csv('input.csv', delim_whitespace=True, index_col=False)
>>> df_good_index
         Date  Employee Operation     Order
0  2001-01-01  08:32:17     User1  Approved
1  2001-01-01  08:36:23     User1    Edited
2  2001-01-01  08:41:04     User1  Rejected
3  2001-01-01  08:42:56     User1   Deleted
4  2001-01-02  09:01:11     User1   Created
5  2019-10-03  17:23:45     User1  Approved
>>> df_good_index.index
RangeIndex(start=0, stop=6, step=1)
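Note that in the session above each data row has five whitespace-separated fields while the header has only four names, which is why pandas falls back to using the first field as the index. If you would rather keep every field, one possible alternative (not part of the original answer) is to supply your own column names. The sketch below assumes a two-row excerpt of input.csv held in memory, and the name "Time" is a hypothetical label I made up for the unnamed field.

```python
import io

import pandas as pd

# Two-row excerpt of input.csv from the session above (illustrative only).
raw = """Date Employee Operation Order
2001-01-01 08:32:17 User1 Approved #00045
2001-01-01 08:36:23 User1 Edited #00045
"""

# Four header names but five fields per row: the first field becomes the index.
implicit = pd.read_csv(io.StringIO(raw), sep=r"\s+")
print(implicit.index.tolist())  # ['2001-01-01', '2001-01-01']

# header=0 discards the file's header row; names= replaces it with five labels.
# "Time" is a hypothetical name for the field the header does not cover.
named = pd.read_csv(
    io.StringIO(raw),
    sep=r"\s+",
    header=0,
    names=["Date", "Time", "Employee", "Operation", "Order"],
)
print(named.index)               # RangeIndex(start=0, stop=2, step=1)
print(named.loc[0, "Order"])     # #00045
```

Compared with index_col=False, this keeps the Order values instead of dropping the last field, at the cost of hard-coding the column names.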

