From TimeDelta to float days in Pandas

Question

Asked by alpagarou

I have a TimeDelta column with values that look like this:

2 days 21:54:00.000000000

I would like a float representing the number of days, here 2 + 21/24 = 2.875, neglecting the minutes. Is there a simple way to do this? I saw an answer suggesting

res['Ecart_lacher_collecte'].apply(lambda x: float(x.item().days+x.item().hours/24.))

But I get "AttributeError: 'str' object has no attribute 'item'".

NumPy version is '1.10.4' and Pandas version is u'0.17.1'.

The columns were originally obtained with:

lac['DateHeureLacher'] = pd.to_datetime(lac['Date lacher']+' '+lac['Heure lacher'],format='%d/%m/%Y %H:%M:%S')
cap['DateCollecte'] = pd.to_datetime(cap['Date de collecte']+' '+cap['Heure de collecte'],format='%d/%m/%Y %H:%M:%S')

in a first script. Then in a second one:

res = pd.merge(lac, cap, how='inner', on=['Loc'])
res['DateHeureLacher']  = pd.to_datetime(res['DateHeureLacher'],format='%Y-%m-%d %H:%M:%S')
res['DateCollecte']  = pd.to_datetime(res['DateCollecte'],format='%Y-%m-%d %H:%M:%S')
res['Ecart_lacher_collecte'] = res['DateCollecte'] - res['DateHeureLacher']

Maybe saving it to CSV changes the types back to string? The transformation I'm trying to do is in a third script.

Sexe_x  PiegeLacher latL    longL   Loc Col_x   DateHeureLacher Nb envolees PiegeCapture    latC    longC   Col_y   Sexe_y  Effectif    DateCollecte    DatePose    Ecart_lacher_collecte   Dist_m
M   Q0-002  1629238 237877  H   Rouge   2011-02-04 17:15:00 928 Q0-002  1629238 237877  Rouge   M   1   2011-02-07 15:09:00 2011-02-07 12:14:00 2 days 21:54:00.000000000   0
M   Q0-002  1629238 237877  H   Rouge   2011-02-04 17:15:00 928 Q0-002  1629238 237877  Rouge   M   4   2011-02-07 12:14:00 2011-02-07 09:42:00 2 days 18:59:00.000000000   0
M   Q0-002  1629238 237877  H   Rouge   2011-02-04 17:15:00 928 Q0-003  1629244 237950  Rouge   M   1   2011-02-07 15:10:00 2011-02-07 12:16:00 2 days 21:55:00.000000000   75

res.info():

Sexe_x                   922 non-null object
PiegeLacher              922 non-null object
latL                     922 non-null int64
longL                    922 non-null int64
Loc                      922 non-null object
Col_x                    922 non-null object
DateHeureLacher          922 non-null object
Nb envolees              922 non-null int64
PiegeCapture             922 non-null object
latC                     922 non-null int64
longC                    922 non-null int64
Col_y                    922 non-null object
Sexe_y                   922 non-null object
Effectif                 922 non-null int64
DateCollecte             922 non-null object
DatePose                 922 non-null object
Ecart_lacher_collecte    922 non-null object
Dist_m                   922 non-null int64

Answer 1

Answered by jpp

You can use pd.to_timedelta or np.timedelta64 to define a one-day duration and divide by it:

# df is set up as in EdChum's answer; either line gives the same result
df['total_days_td'] = df['time_delta'] / pd.to_timedelta(1, unit='D')
df['total_days_td'] = df['time_delta'] / np.timedelta64(1, 'D')
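In the question, res.info() reports the column as object, so the division only works once the strings are parsed back into timedeltas. The following is a minimal sketch of that round trip, not the asker's actual pipeline: the column name `delta` and the in-memory buffer are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical frame with a real timedelta column (values taken from the question)
df = pd.DataFrame({"delta": pd.to_timedelta(["2 days 21:54:00", "2 days 18:59:00"])})

# Round-trip through CSV, using an in-memory buffer instead of a file
buf = io.StringIO()
df.to_csv(buf, index=False)
buf.seek(0)
back = pd.read_csv(buf)

# The column comes back as text, not as timedelta64
print(back["delta"].dtype)

# Parse the strings, then divide by one day to get float days
days = pd.to_timedelta(back["delta"]) / pd.Timedelta(days=1)
print(days)
```

Applying pd.to_timedelta to the column right after reading the CSV should also let the other answers on this page work unchanged.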

Answer 2

Answered by EdChum

You can use dt.total_seconds and divide the result by the number of seconds in a day, for example:

In [25]:
import datetime as dt
import pandas as pd

df = pd.DataFrame({'dates':pd.date_range(dt.datetime(2016,1,1, 12,15,3), periods=10)})
df

Out[25]:
                dates
0 2016-01-01 12:15:03
1 2016-01-02 12:15:03
2 2016-01-03 12:15:03
3 2016-01-04 12:15:03
4 2016-01-05 12:15:03
5 2016-01-06 12:15:03
6 2016-01-07 12:15:03
7 2016-01-08 12:15:03
8 2016-01-09 12:15:03
9 2016-01-10 12:15:03

In [26]:
df['time_delta'] = df['dates'] - dt.datetime(2015,11,6,8,10)  # pd.datetime was removed in recent pandas
df

Out[26]:
                dates       time_delta
0 2016-01-01 12:15:03 56 days 04:05:03
1 2016-01-02 12:15:03 57 days 04:05:03
2 2016-01-03 12:15:03 58 days 04:05:03
3 2016-01-04 12:15:03 59 days 04:05:03
4 2016-01-05 12:15:03 60 days 04:05:03
5 2016-01-06 12:15:03 61 days 04:05:03
6 2016-01-07 12:15:03 62 days 04:05:03
7 2016-01-08 12:15:03 63 days 04:05:03
8 2016-01-09 12:15:03 64 days 04:05:03
9 2016-01-10 12:15:03 65 days 04:05:03

In [27]:
df['total_days_td'] = df['time_delta'].dt.total_seconds() / (24 * 60 * 60)
df

Out[27]:
                dates       time_delta  total_days_td
0 2016-01-01 12:15:03 56 days 04:05:03      56.170174
1 2016-01-02 12:15:03 57 days 04:05:03      57.170174
2 2016-01-03 12:15:03 58 days 04:05:03      58.170174
3 2016-01-04 12:15:03 59 days 04:05:03      59.170174
4 2016-01-05 12:15:03 60 days 04:05:03      60.170174
5 2016-01-06 12:15:03 61 days 04:05:03      61.170174
6 2016-01-07 12:15:03 62 days 04:05:03      62.170174
7 2016-01-08 12:15:03 63 days 04:05:03      63.170174
8 2016-01-09 12:15:03 64 days 04:05:03      64.170174
9 2016-01-10 12:15:03 65 days 04:05:03      65.170174

Answer 3

Answered by sharinganSawant

Have you tried using this instead?

res['Ecart_lacher_collecte'].apply(lambda x: (x.total_seconds()//(3600*24)) + (x.total_seconds()%(3600*24)//3600)/24)

The first term is the number of whole days (2 in your case). The second term is the fraction of a day contributed by the whole hours, neglecting the minutes (21/24 in your case).

If you don't want the minutes and seconds data to be neglected, and rather need a ratio which considers all the seconds in the day, the code is as mentioned below:

res['Ecart_lacher_collecte'].apply(lambda x: x.total_seconds()/(3600*24))
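Both variants can probably be written without apply as well. This is a sketch under the assumption that the column already holds timedelta values (here built with pd.to_timedelta from the sample strings); the names `td`, `days_hours_only` and `days_exact` are illustrative only.

```python
import pandas as pd

# Sample values from the question, parsed into timedelta64
td = pd.to_timedelta(pd.Series(["2 days 21:54:00", "2 days 18:59:00"]))

# Whole hours elapsed (integer division drops minutes and seconds), expressed in days
days_hours_only = (td // pd.Timedelta(hours=1)) / 24

# Full-precision alternative that keeps minutes and seconds
days_exact = td.dt.total_seconds() / (24 * 3600)

print(days_hours_only)
print(days_exact)
```

For the first sample value this gives 69 whole hours, i.e. 69/24 = 2.875 days, which matches the figure asked for in the question.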
