Original question: http://stackoverflow.com/questions/17797384/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Date and time in pandas read_csv
Asked by Bastian Löffler
My data looks like this:
06.02.2013;13:00;0,215;0,215;0,185;0,205;0,00
I try to read it this way:
s = pandas.read_csv(csv_file, sep=';', skiprows=3, index_col=[0],decimal=',',thousands='.',parse_dates={'Date': [0, 1]}, dayfirst=True)
(see http://www.nuclearphynance.com/Show%20Post.aspx?PostIDKey=164080 and https://github.com/pydata/pandas/issues/2586)
This is what I get:
6022013.0 13:00       0.215  0.215  0.185    0.205        0
What am I doing wrong?
Answered by Andy Hayden
This was a bug that was fixed in pandas 0.13+ (thanks to this issue):
In [11]: pd.read_csv(StringIO(s), sep=';', header=None, parse_dates={'Dates': [0, 1]},
                     index_col=0, decimal=',', thousands=".")
Out[11]:
                            2      3      4      5  6
Dates
2013-06-02 13:00:00  1000.215  0.215  0.185  0.205  0
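For readers on current pandas, here is a minimal sketch of one way to get the same kind of result without relying on read_csv to combine the date and time columns (recent releases have, as far as I know, deprecated that use of parse_dates). It assumes the two sample rows from this thread, day-first dates, and no header row; the variable names raw and stamp are illustrative only. The date and time columns are kept as strings so that thousands='.' cannot touch them, and the index is built afterwards with an explicit format.

```python
from io import StringIO

import pandas as pd

# Two rows copied from the thread: ';' separator, ',' decimal, '.' thousands.
raw = (
    "06.02.2013;13:00;0,215;0,215;0,185;0,205;0,00\n"
    "02.08.2013;14:00;8.428,58;8.431,67;8.376,28;8.406,94;73.393.682,00\n"
)

df = pd.read_csv(
    StringIO(raw),
    sep=";",
    header=None,
    decimal=",",
    thousands=".",
    dtype={0: str, 1: str},  # keep date and time as text, not numbers
)

# Combine the two text columns and parse them with an explicit day-first format.
stamp = pd.to_datetime(df[0] + " " + df[1], format="%d.%m.%Y %H:%M")
df.index = pd.DatetimeIndex(stamp, name="Date")
df = df.drop(columns=[0, 1])

print(df)
```

Spelling out the format avoids any guessing about whether 06.02 means 6 February or 2 June.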
Answered by nitin
I am not sure that this is a bug. See below.
My data file looks like this:
date; time; col1; col2; col3; col4; col5
06.02.2013 ; 13:00 ; 0,215 ; 0,215 ; 0,185 ; 0,205 ; 0,00
06.02.2013 ; 13:00 ; 0,215 ; 0,215 ; 0,185 ; 0,205 ; 0,00
I run the following code on it:
import pandas
s = pandas.read_csv('test.txt', decimal=',', sep=';', parse_dates=True, index_col=[0])
print(s)
This gives:
               time   col1   col2   col3   col4   col5
date                                                  
2013-06-02   13:00   0.215  0.215  0.185  0.205      0
2013-06-02   13:00   0.215  0.215  0.185  0.205      0
Is this the result you want?
Please make sure that you are using the latest pandas version:
'0.11.0'
To deal with the thousands separators, you could use a converter:
s = pandas.read_csv('test2.txt',sep=';',decimal=',', parse_dates=True, index_col=[0],converters={'col1':lambda x: float(x.replace('.','').replace(',','.'))})
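The converter idea can be extended to several columns at once. The following is only a sketch under stated assumptions: the helper parse_european_number and the list numeric_columns are hypothetical names, and the sample is a trimmed version of the German-format data in this thread with the column containing the umlaut left out.

```python
from io import StringIO

import pandas as pd


def parse_european_number(text):
    """Turn a string such as '8.428,58' into the float 8428.58 (hypothetical helper)."""
    return float(text.strip().replace(".", "").replace(",", "."))


# Trimmed sample in the German format: '.' for thousands, ',' for decimals.
raw = (
    "Datum;Zeit;Hoch;Tief;Schluss;Volumen\n"
    "02.08.2013;14:00;8.431,67;8.376,28;8.406,94;73.393.682,00\n"
    "01.08.2013;14:00;8.411,30;8.316,89;8.410,73;97.990.435,00\n"
)

numeric_columns = ["Hoch", "Tief", "Schluss", "Volumen"]

# One converter per numeric column; the date and time columns stay as text.
df = pd.read_csv(
    StringIO(raw),
    sep=";",
    converters={name: parse_european_number for name in numeric_columns},
)

print(df)
```

Converter keys must match the column names exactly, so a header written with spaces after each ';' would need skipinitialspace=True or keys that include the space.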
Answered by Bastian Löffler
OK, when I run your example file, date parsing works. However, my data looks like this:
Datum;Zeit;Eröffnung;Hoch;Tief;Schluss;Volumen
02.08.2013;14:00;8.428,58;8.431,67;8.376,28;8.406,94;73.393.682,00
01.08.2013;14:00;8.320,38;8.411,30;8.316,89;8.410,73;97.990.435,00
In that case, the date is not recognized:
import pandas as pd
s = pd.read_csv('test1.csv', decimal=',', sep=';', parse_dates=True, index_col=[0])
print(s)
....                                                            
02.08.2013  14:00  8.428,58  8.431,67  8.376,28  8.406,94   73.393.682,00
01.08.2013  14:00  8.320,38  8.411,30  8.316,89  8.410,73   97.990.435,00
As far as I can see, the only difference between your file and mine is that mine has no spaces around the ';' separators. Could that be the cause?
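One more point that may matter here: the parsed outputs earlier in the thread show 2013-06-02 for the input 06.02.2013, which is what a month-first reading produces. A small sketch of the difference, assuming the files use day.month.year order (the German column names suggest this, but the thread does not say so):

```python
import pandas as pd

text = "06.02.2013"  # date string from the question's sample row

month_first = pd.to_datetime(text)                      # default: 06 is read as the month
day_first = pd.to_datetime(text, dayfirst=True)         # 06 is read as the day
explicit = pd.to_datetime(text, format="%d.%m.%Y")      # no guessing at all

print(month_first.date(), day_first.date(), explicit.date())
```

Passing an explicit format is the stricter option, since a string that does not fit the pattern raises an error rather than being silently reinterpreted.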

