Unable to index dates in a Pandas DataFrame from read_csv

Question

Asked by prre72

I came across a problem today that I am unable to solve. I read a CSV file using:

mydata = pd.read_csv(file_name, header=0, sep=",", index_col=[0], parse_dates=True)

The CSV looks like:

2009-12-10,5,6,7,8,9  
2009-12-11,7,6,6,7,9

Instead of getting an indexed DataFrame, I get the following output:

print(mydata)

Empty DataFrame
Columns: []
Index: [2009-12-10,5,6,7,8,9 2009-12-11,7,6,6,7,9]

Please help!! I have been trying for 2 hours now!

Many thanks

Answer 1

Answered by hernamesbarbara

I think your code works. Here's what I see:

The data:

import io
import pandas as pd

data = """2009-12-10,5,6,7,8,9
2009-12-11,7,6,6,7,9"""

Read the data from the CSV.

# io.StringIO lets read_csv treat the string as if it were a file
ts = pd.read_csv(io.StringIO(data),
    names=['timepoint', 'a','b','c','d','e'],
    parse_dates=True,
    index_col=0)

That looks like this:

In [59]: ts
Out[59]:
            a  b  c  d  e
timepoint
2009-12-10  5  6  7  8  9
2009-12-11  7  6  6  7  9

And the index is a time series (a DatetimeIndex):

In [60]: ts.index
Out[60]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2009-12-10 00:00:00, 2009-12-11 00:00:00]
Length: 2, Freq: None, Timezone: None

Can you give this a try and post an update if you get different results?

UPDATE: In response to @prre72's comment regarding column headers in the CSV file:

If the CSV has 5 column headers and the index column is unlabeled, you can do this:

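In [17]: 
import io
data = """"a","b","c","d","e"
2009-12-10,5,6,7,8,9
2009-12-11,7,6,6,7,9"""

# The header row has one field fewer than the data rows,
# so the first data column becomes the index
ts = pd.read_csv(io.StringIO(data),
    parse_dates=True,
    index_col=0)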

In [18]: ts
Out[18]:
            a  b  c  d  e
2009-12-10  5  6  7  8  9
2009-12-11  7  6  6  7  9

In [19]: ts.index
Out[19]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2009-12-10 00:00:00, 2009-12-11 00:00:00]
Length: 2, Freq: None, Timezone: None
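
Once the index is a DatetimeIndex, rows can be selected by date. The following is a minimal sketch rather than part of the original answer; it reuses the sample data from above, and the names row and later are only illustrative.

```python
import io
import pandas as pd

data = """2009-12-10,5,6,7,8,9
2009-12-11,7,6,6,7,9"""

ts = pd.read_csv(io.StringIO(data),
                 names=['timepoint', 'a', 'b', 'c', 'd', 'e'],
                 parse_dates=True,
                 index_col=0)

# Confirm that the index was actually parsed as dates
print(type(ts.index))

# Select a single row by its exact timestamp label
row = ts.loc[pd.Timestamp("2009-12-10")]

# Select every row from a given date onwards (label-based slice)
later = ts.loc["2009-12-11":]

print(row)
print(later)
```

If type(ts.index) reports a plain Index of strings instead, the dates were not parsed, and date-based selection like this will not behave as expected.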

Answer 2

Answered by Yeqing Zhang

You need to use parse_dates=[0] to specify the date columns you want to parse. You don't have to specify header=0. Use header=None instead, which tells pandas that the file has no header row, so the first data line is not consumed as column names. Try this:

mydata = pd.read_csv(file_name, header=None, sep=",", index_col=[0], 
    parse_dates=[0])
print(mydata)
            1  2  3  4  5
0                        
2009-12-10  5  6  7  8  9
2009-12-11  7  6  6  7  9

If you want to specify column names, just use this:

mydata.columns = list("abcde")  # list of column names
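
To see why header matters for a file without a header row, here is a small comparison sketch. It is an illustration under the assumption that the file contains exactly the two lines from the question; csv_text, with_header, and no_header are hypothetical names.

```python
import io
import pandas as pd

# Same two lines as in the question, with no header row
csv_text = "2009-12-10,5,6,7,8,9\n2009-12-11,7,6,6,7,9\n"

# header=0: the first line is consumed as column names,
# so only one data row is left
with_header = pd.read_csv(io.StringIO(csv_text), header=0,
                          index_col=[0], parse_dates=[0])

# header=None: both lines are treated as data and
# the columns get integer labels
no_header = pd.read_csv(io.StringIO(csv_text), header=None,
                        index_col=[0], parse_dates=[0])

print(len(with_header), len(no_header))

# Assign readable column names afterwards
no_header.columns = list("abcde")
print(no_header)
```

With header=0 the first date silently turns into a column label, which is easy to miss when the file is large.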

Answer 3

Answered by Vaibhav Taneja

import pandas as pd
# parse_dates=True asks pandas to try parsing the index as dates
raw_dt = pd.read_csv("fileName.csv", parse_dates=True, index_col=0)
raw_dt

Now, when you execute this code, index_col=0 will treat the first column of your file as the index column, and parse_dates=True will try to parse that index as dates.
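
If the index still comes back as plain strings after reading, one fallback is to convert it explicitly with pd.to_datetime. This is a sketch and not part of the original answer; it assumes a headerless file like the one in the question and uses an in-memory string in place of fileName.csv.

```python
import io
import pandas as pd

# Stand-in for the contents of the CSV file (assumed, headerless)
csv_text = "2009-12-10,5,6,7,8,9\n2009-12-11,7,6,6,7,9\n"

# Read without any date parsing, so the index holds raw strings
raw_dt = pd.read_csv(io.StringIO(csv_text), header=None, index_col=0)

# Convert the index to datetimes after the fact
raw_dt.index = pd.to_datetime(raw_dt.index)

print(type(raw_dt.index))
```

Converting after the read also makes it easier to spot which values fail to parse, since pd.to_datetime raises an error on entries it cannot interpret.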
