Pandas: convert date 'object' to int
Original question: http://stackoverflow.com/questions/50863691/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and attribute it to the original authors (not me): Stack Overflow
Asked by jabba
I have a Pandas DataFrame and I need to convert a column of dates to int, but every solution I have found ends with an error (shown below).
test_df.info()
<class 'pandas.core.frame.DataFrame'>
Data columns (total 4 columns):
Date 1505 non-null object
Avg 1505 non-null float64
TotalVol 1505 non-null float64
Ranked 1505 non-null int32
dtypes: float64(2), int32(1), object(1)
Sample data:
Date Avg TotalVol Ranked
0 2014-03-29 4400.000000 0.011364 1
1 2014-03-30 1495.785714 4.309310 1
2 2014-03-31 1595.666667 0.298571 1
3 2014-04-01 1523.166667 0.270000 1
4 2014-04-02 1511.428571 0.523792 1
I think I've tried everything, but nothing works:
test_df['Date'].astype(int):
TypeError: int() argument must be a string, a bytes-like object or a number, not 'datetime.date'
test_df['Date']=pd.to_numeric(test_df['Date']):
TypeError: Invalid object type at position 0
test_df['Date'].astype(str).astype(int):
ValueError: invalid literal for int() with base 10: '2014-03-29'
test_df['Date'].apply(pd.to_numeric, errors='coerce'):
Converts the entire column to NaNs
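The first TypeError above mentions 'datetime.date', which suggests the object column holds datetime.date instances rather than strings. A minimal sketch for checking this, assuming a hypothetical stand-in for test_df built from the first rows of the sample data:

```python
import datetime as dt

import pandas as pd

# Hypothetical stand-in for test_df (assumption: the real column holds datetime.date objects).
test_df = pd.DataFrame({
    "Date": [dt.date(2014, 3, 29), dt.date(2014, 3, 30), dt.date(2014, 3, 31)],
    "Avg": [4400.0, 1495.785714, 1595.666667],
})

# The column dtype is reported as "object" ...
print(test_df["Date"].dtype)

# ... while the individual elements are datetime.date instances.
element_types = test_df["Date"].map(type).unique().tolist()
print(element_types)
```

Knowing the element type matters here, because string-based fixes only apply when the elements are actually strings.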
Answered by Neroksi
The reason test_df['Date'].astype(int) gives you an error is that your dates still contain hyphens ("-"). First remove them with test_df['Date'].str.replace("-",""), then apply your first method to the resulting series. So the whole solution would be:
test_df['Date'].str.replace("-","").astype(int)
Note that this won't work if your "Date" column is not a string column, typically when Pandas has already parsed your series as Timestamp. In that case you can use:
test_df['Date'].dt.strftime("%Y%m%d").astype(int)
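If the column holds plain datetime.date objects, as the TypeError in the question suggests, neither the .str nor the .dt accessor is likely to be available on it directly. A minimal sketch of one way around this, assuming such a column and using a hypothetical DateInt output column: convert with pd.to_datetime first, then format and cast.

```python
import datetime as dt

import pandas as pd

# Hypothetical frame whose Date column holds datetime.date objects (object dtype).
test_df = pd.DataFrame({"Date": [dt.date(2014, 3, 29), dt.date(2014, 3, 30)]})

# pd.to_datetime accepts datetime.date objects and returns a datetime64 column,
# which makes the .dt accessor usable.
as_datetime = pd.to_datetime(test_df["Date"])

# Format as YYYYMMDD text, then cast the text to integers.
test_df["DateInt"] = as_datetime.dt.strftime("%Y%m%d").astype(int)
print(test_df["DateInt"].tolist())
```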
Answered by Rakesh
Looks like you need pd.to_datetime().dt.strftime("%Y%m%d").
Demo:
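import pandas as pd
# Sample dates stored as strings
df = pd.DataFrame({"Date": ["2014-03-29", "2014-03-30", "2014-03-31"]})
# Parse the strings as datetimes, then format each one as a YYYYMMDD string
df["Date"] = pd.to_datetime(df["Date"]).dt.strftime("%Y%m%d")
print(df)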
Output:
Date
0 20140329
1 20140330
2 20140331
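One thing to watch: the printed values look numeric, but strftime returns text, so the column still holds strings after this step. A short sketch of the extra cast, reusing the demo data (the variable names formatted, first_is_str and is_integer are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({"Date": ["2014-03-29", "2014-03-30", "2014-03-31"]})
formatted = pd.to_datetime(df["Date"]).dt.strftime("%Y%m%d")

# strftime yields text, so each element is a Python str at this point.
first_is_str = isinstance(formatted.iloc[0], str)

# Cast the text to integers to get a numeric column.
df["Date"] = formatted.astype(int)
is_integer = pd.api.types.is_integer_dtype(df["Date"])
print(first_is_str, is_integer)
```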
Answered by msolomon87
This should work:
df['Date'] = pd.to_numeric(df.Date.str.replace('-',''))
print(df['Date'])
0 20140329
1 20140330
2 20140331
3 20140401
4 20140402
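As an alternative to going through strings (not taken from any of the answers above), the same YYYYMMDD integer can be built arithmetically from the date components. A minimal sketch, assuming the column can be parsed by pd.to_datetime and using a hypothetical DateInt column name:

```python
import pandas as pd

df = pd.DataFrame({"Date": ["2014-03-29", "2014-04-01"]})
dates = pd.to_datetime(df["Date"])

# Compose YYYYMMDD from year, month and day without a string round trip.
df["DateInt"] = dates.dt.year * 10000 + dates.dt.month * 100 + dates.dt.day
print(df["DateInt"].tolist())
```

This avoids relying on the text format of the dates, at the cost of being a little less obvious to read.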