pandas Python: compare timestamp to input time

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/42213578/

Time: 2020-09-14 02:58:25  Source: igfitidea

Python compare timestamp to input time

python pandas dataframe timestamp

Asked by yusica

I have a dataframe with a timestamp column, and I want to filter the rows between 8:00:00 and 17:00:00 with np.where. I keep getting error messages about data/object types. Any help would be appreciated.

Example:

timestamp    volume
2013-03-01 07:59:00    5
2013-03-01 08:00:00    6
2013-03-01 08:01:00    7
2013-03-01 08:02:00    8

Basically, I want to end up with:

2013-03-01 08:00:00    6
2013-03-01 08:01:00    7
2013-03-01 08:02:00    8

by using a method along the lines of:

np.where(df['timestamp'] > dt.time('8:00:00'))

Accepted answer by MaxU

Try this:

In [226]: df
Out[226]:
             timestamp  volume
0  2013-03-01 07:59:00       5
1  2013-03-01 08:00:00       6
2  2013-03-01 08:01:00       7
3  2013-03-01 08:02:00       8

In [227]: df.dtypes
Out[227]:
timestamp    object
volume        int64
dtype: object

In [228]: df['timestamp'] = pd.to_datetime(df['timestamp'], errors='coerce')

In [229]: df.dtypes
Out[229]:
timestamp    datetime64[ns]  # <---- it's `datetime64[ns]` now
volume                int64
dtype: object

In [230]: df.set_index('timestamp').between_time('08:00','17:00').reset_index()
Out[230]:
            timestamp  volume
0 2013-03-01 08:00:00       6
1 2013-03-01 08:01:00       7
2 2013-03-01 08:02:00       8
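
For readers who want to run this outside an interactive session, here is a minimal sketch of the same steps as a script. It also shows one possible way to get the np.where form the question asked about, by comparing the time-of-day part of each timestamp against datetime.time objects. The variable names (via_index, clock, mask, positions, via_where) are illustrative assumptions, not part of the original answer.

```python
import datetime

import numpy as np
import pandas as pd

# Sample data from the question; timestamps start out as plain strings (dtype object).
df = pd.DataFrame({
    "timestamp": ["2013-03-01 07:59:00", "2013-03-01 08:00:00",
                  "2013-03-01 08:01:00", "2013-03-01 08:02:00"],
    "volume": [5, 6, 7, 8],
})

# Convert the strings to datetime64 so time-based operations work.
df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")

# Approach from the answer: between_time needs a DatetimeIndex.
via_index = df.set_index("timestamp").between_time("08:00", "17:00").reset_index()

# np.where variant (an assumption about what the asker was after):
# compare the time-of-day part against datetime.time objects.
clock = df["timestamp"].dt.time
mask = (clock >= datetime.time(8, 0)) & (clock <= datetime.time(17, 0))
positions = np.where(mask)[0]          # integer positions of matching rows
via_where = df.iloc[positions].reset_index(drop=True)

print(via_index)
print(via_where)
```

Both approaches include the 08:00 and 17:00 endpoints. The between_time version is shorter; the mask version avoids changing the index.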

Answer by saloua

You can use between.

I generated a sample dataframe with:

import datetime
import pandas as pd

# 20 hourly timestamps starting from now, with a matching integer volume
d = {'timestamp': pd.Series([datetime.datetime.now() +
          datetime.timedelta(hours=i) for i in range(20)]),
    'volume': pd.Series([s for s in range(20)])}
df = pd.DataFrame(d)

df['timestamp'] is:

0    2017-02-13 22:37:54.515840
1    2017-02-13 23:37:54.515859
2    2017-02-14 00:37:54.515865
3    2017-02-14 01:37:54.515870
4    2017-02-14 02:37:54.515878
5    2017-02-14 03:37:54.515884
6    2017-02-14 04:37:54.515888
...
17   2017-02-14 15:37:54.515939
18   2017-02-14 16:37:54.515943
19   2017-02-14 17:37:54.515948

df.dtypes is:

timestamp    datetime64[ns]
volume                int64
dtype: object

Since in your example the dtype of df['timestamp'] is object, you can do:

df['timestamp'] = pd.to_datetime(df['timestamp'], errors='coerce')

Passing errors='coerce' (older pandas versions spelled this coerce=True) sets any value that fails to convert to NaT instead of raising an error.
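
To illustrate the coercion behavior, here is a small sketch with one deliberately invalid entry. The sample strings and the names raw, parsed, bad_rows and clean are made up for this example.

```python
import pandas as pd

# Hypothetical input: the middle entry is not a valid timestamp.
raw = pd.Series(["2013-03-01 08:00:00", "not a timestamp", "2013-03-01 08:02:00"])

# errors="coerce" turns unparseable values into NaT instead of raising.
parsed = pd.to_datetime(raw, errors="coerce")

# NaT behaves like a missing value, so isna() and dropna() apply.
bad_rows = parsed.isna()
clean = parsed.dropna()

print(parsed)
print(bad_rows.tolist())
```

Checking bad_rows before filtering is a simple way to notice rows that were silently turned into NaT.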

Then filtering can be done using between, as below:

df[df.timestamp.dt.strftime('%H:%M:%S').between('11:00:00','18:00:00')]

which will return:

13 2017-02-14 11:37:54.515922      13
14 2017-02-14 12:37:54.515926      14
15 2017-02-14 13:37:54.515930      15
16 2017-02-14 14:37:54.515935      16
17 2017-02-14 15:37:54.515939      17
18 2017-02-14 16:37:54.515943      18
19 2017-02-14 17:37:54.515948      19
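
The string comparison works because '%H:%M:%S' produces zero-padded, fixed-width strings, which sort lexicographically in the same order as the times they represent. Below is a reproducible sketch of this filter; it uses a fixed start time instead of datetime.datetime.now(), and the names start, times and by_string are assumptions for illustration.

```python
import datetime

import pandas as pd

# Fixed start time (an assumption) so the result is reproducible.
start = datetime.datetime(2017, 2, 13, 22, 37, 54)
df = pd.DataFrame({
    "timestamp": [start + datetime.timedelta(hours=i) for i in range(20)],
    "volume": list(range(20)),
})

# Zero-padded HH:MM:SS strings sort in the same order as the times themselves,
# so Series.between on the strings acts as a time-of-day filter (inclusive ends).
times = df["timestamp"].dt.strftime("%H:%M:%S")
by_string = df[times.between("11:00:00", "18:00:00")]

print(by_string)
```

Note that the bounds must be written in the same zero-padded form, for example '08:00:00' rather than '8:00:00', or the string ordering no longer matches the time ordering.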

Answer by sameer_nubia

If you have a file with data as below:

timestamp    volume
2013-03-01 07:59:00    5
2013-03-01 08:00:00    6
2013-03-01 08:01:00    7
2013-03-01 08:02:00    8

then you can skip the first data row while reading it, and you will get this output:

timestamp    volume
2013-03-01 08:00:00    6
2013-03-01 08:01:00    7
2013-03-01 08:02:00    8

import pandas as pd
# skiprows=[1] skips only the first data row and keeps the header line;
# skiprows=1 would skip the header line itself.
df = pd.read_csv("filename", skiprows=[1])
print(df)
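
As a rough illustration of how read_csv treats skiprows, the sketch below reads an in-memory CSV instead of a file. The comma-separated text and the names csv_text, header_kept and header_lost are assumptions for this example. An integer skips that many lines from the top, header included, while a list skips specific line indices. Either way the rows are dropped by position, not by their time value.

```python
import io

import pandas as pd

# Hypothetical comma-separated version of the question's data.
csv_text = """timestamp,volume
2013-03-01 07:59:00,5
2013-03-01 08:00:00,6
2013-03-01 08:01:00,7
2013-03-01 08:02:00,8
"""

# A list skips the given line indices: index 1 is the first data row,
# so the header (index 0) is kept.
header_kept = pd.read_csv(io.StringIO(csv_text), skiprows=[1],
                          parse_dates=["timestamp"])

# An integer skips that many lines from the top, so the header is dropped
# and the 07:59 row is then read as the column names.
header_lost = pd.read_csv(io.StringIO(csv_text), skiprows=1)

print(header_kept)
print(list(header_lost.columns))
```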