Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/40744322/

Time: 2020-09-14 02:29:17  Source: igfitidea

Convert month,day,year to month,year with python/pandas?

Tags: python, date, datetime, pandas

Asked by Joan Triay

I have this kind of list of strings with 9000 rows where each row is month/day/year:

10/30/2009
12/19/2009
4/13/2009
8/18/2007
7/17/2008
6/16/2009
1/14/2009
12/18/2007
9/14/2009
2/13/2006
3/25/2009
2/23/2007

I want to convert it so that the list contains only month/year, as a date format if possible, like this:

10/2009
12/2009
4/2009
8/2007
7/2008
6/2009
1/2009
12/2007
9/2009
2/2006
3/2009
2/2007

Answered by jezrael

I think you can first use to_datetime and then to_period:

# 'M' is the documented alias for monthly periods
df.col = pd.to_datetime(df.col).dt.to_period('M')
print (df)
       col
0  2009-10
1  2009-12
2  2009-04
3  2007-08
4  2008-07
5  2009-06
6  2009-01
7  2007-12
8  2009-09
9  2006-02
10 2009-03
11 2007-02

print (type(df.loc[0,'col']))
<class 'pandas._period.Period'>
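A self-contained sketch of the to_period approach may help here. It assumes a DataFrame with a string column named col (as in the answer) and uses a few sample rows from the question. The explicit format argument and the separate period column are additions for illustration, not part of the original answer.

```python
import pandas as pd

# Sample rows from the question; the column name `col` follows the answer.
df = pd.DataFrame({'col': ['10/30/2009', '12/19/2009', '4/13/2009', '8/18/2007']})

# An explicit format avoids ambiguity between month/day and day/month order.
dates = pd.to_datetime(df['col'], format='%m/%d/%Y')

# Monthly periods keep date semantics: they sort and compare chronologically.
df['period'] = dates.dt.to_period('M')

print(df['period'].astype(str).tolist())
```

Because the values are Period objects rather than strings, they can be sorted or grouped by month directly.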

Or strftime:

df.col = pd.to_datetime(df.col).dt.strftime('%m/%Y')
print (df)
        col
0   10/2009
1   12/2009
2   04/2009
3   08/2007
4   07/2008
5   06/2009
6   01/2009
7   12/2007
8   09/2009
9   02/2006
10  03/2009
11  02/2007

print (type(df.loc[0,'col']))
<class 'str'>
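Note that strftime('%m/%Y') zero-pads the month (04/2009), while the question asks for 4/2009. One possible way to get unpadded strings, sketched here with sample rows from the question, is to combine the integer month and year from the .dt accessor. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'col': ['10/30/2009', '4/13/2009', '8/18/2007']})
dates = pd.to_datetime(df['col'], format='%m/%d/%Y')

# strftime('%m/%Y') zero-pads single-digit months ...
padded = dates.dt.strftime('%m/%Y')

# ... while joining the integer month and year as strings does not.
unpadded = dates.dt.month.astype(str) + '/' + dates.dt.year.astype(str)

print(padded.tolist())
print(unpadded.tolist())
```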

Or replace with a regex:

# regex=True is required in pandas 2.0+, where the pattern is literal by default
df.col = df.col.str.replace('/.+/', '/', regex=True)
print (df)
        col
0   10/2009
1   12/2009
2    4/2009
3    8/2007
4    7/2008
5    6/2009
6    1/2009
7   12/2007
8    9/2009
9    2/2006
10   3/2009
11   2/2007

print (type(df.loc[0,'col']))
<class 'str'>
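As a hedged variation on the regex approach: since pandas 2.0, str.replace treats the pattern as a literal string unless regex=True is passed, so the flag is worth spelling out. The sketch below also narrows the pattern to one or two digits between the slashes, which is an illustrative choice rather than the answer's original pattern.

```python
import pandas as pd

df = pd.DataFrame({'col': ['10/30/2009', '4/13/2009', '2/13/2006']})

# regex=True makes pandas interpret the pattern as a regular expression;
# r'/\d{1,2}/' matches only the day part between the two slashes.
df['col'] = df['col'].str.replace(r'/\d{1,2}/', '/', regex=True)

print(df['col'].tolist())
```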

Answered by EdChum

You can use str.split to build the strings:

In [32]:
df['date'] = df['date'].str.split('/').str[0] + '/' + df['date'].str.split('/').str[-1]
df

Out[32]:
       date
0   10/2009
1   12/2009
2    4/2009
3    8/2007
4    7/2008
5    6/2009
6    1/2009
7   12/2007
8    9/2009
9    2/2006
10   3/2009
11   2/2007
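The same idea can be sketched with a single split. Here expand=True returns the pieces as separate columns, so the month and year can be joined without splitting twice. The intermediate name parts is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'date': ['10/30/2009', '4/13/2009', '12/18/2007']})

# Split once; expand=True yields columns 0, 1, 2 for month, day, year.
parts = df['date'].str.split('/', expand=True)

# Join month and year, dropping the day column.
df['date'] = parts[0] + '/' + parts[2]

print(df['date'].tolist())
```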

Answered by internetional

Or you could use a regular expression, if you prefer that kind of solution. This would solve your problem:

import re

res = re.sub(r"/\d\d?/", "/", s)

(Note that s is the date string, either a single date or one long string containing all the dates, and that the result is bound to res.)
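A small standard-library sketch of both usages described in the note, using sample dates from the question. The names dates, day_part, month_year and joined are illustrative.

```python
import re

dates = ['10/30/2009', '12/19/2009', '4/13/2009']

# Compile once: a slash, one or two digits, and another slash (the day part).
day_part = re.compile(r'/\d\d?/')

# Case 1: apply the substitution to each date string separately.
month_year = [day_part.sub('/', s) for s in dates]

# Case 2: apply it to one long string containing all the dates.
joined = day_part.sub('/', '\n'.join(dates))

print(month_year)
print(joined)
```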