pandas: adding a date column to a pandas df using a constant value stored in a str

Question

Asked by Shubham R

I have a table in a pandas df:

    product_id_x    product_id_y    count
0   2727846            7872456       1
1   29234              2932348       2
2   29346              9137500       1
3   29453              91365738      1
4   2933666            91323494      1

I want to add a new column 'dates', whose value I have defined in a str.

dateSelect = "'2016-11-06'"

So I added a new constant column:

df['dates'] = dateSelect

But the result I get is:

   product_id_x   product_id_y    count   dates
0   2727846          7872456         1  '2016-11-06'
1   29234            2932348         2  '2016-11-06'
2   29346            9137500         1  '2016-11-06'
3   29453            91365738        1  '2016-11-06'
4   2933666          91323494        1  '2016-11-06'

The values in dates come out wrapped in quotes, and

type(df['dates']) = str

But I want it in date format, because I am going to store this table in my MySQL database next, and I want the column type to be date.
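
To make the symptom concrete, here is a minimal sketch of what the assignment stores. The two-row frame is a hypothetical stand-in for the table above; the point is only that the column is a Series whose elements are Python str values with the quote characters embedded.

```python
import pandas as pd

# Small stand-in for the table in the question (hypothetical data subset)
df = pd.DataFrame({
    "product_id_x": [2727846, 29234],
    "product_id_y": [7872456, 2932348],
    "count": [1, 2],
})

dateSelect = "'2016-11-06'"   # the inner quotes are part of the string
df["dates"] = dateSelect

first = df["dates"].iloc[0]
print(type(df["dates"]))  # the column itself is a pandas Series
print(type(first))        # each element is a plain Python str
print(repr(first))        # repr shows the embedded quote characters
```

So the quotes are data, not display formatting, and they have to be removed before the value can be parsed as a date.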

from sqlalchemy import create_engine
engine = create_engine('mysql+mysqldb://name:[email protected]/dbname', echo=False)
df.to_sql(name='tablename', con=engine, if_exists = 'append', index=False)

Answer 1

Answered by jezrael

I think you can first use replace to swap each ' for an empty string and then call to_datetime:

dateSelect = pd.to_datetime("'2016-11-06'".replace("'",""))
print (dateSelect)
2016-11-06 00:00:00

print (type(dateSelect))
<class 'pandas.tslib.Timestamp'>

df['dates'] = pd.to_datetime("'2016-11-06'".replace("'",""))

print (df)
   product_id_x  product_id_y  count      dates
0       2727846       7872456      1 2016-11-06
1         29234       2932348      2 2016-11-06
2         29346       9137500      1 2016-11-06
3         29453      91365738      1 2016-11-06
4       2933666      91323494      1 2016-11-06

print (df.dtypes)
product_id_x             int64
product_id_y             int64
count                    int64
dates           datetime64[ns]
dtype: object
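
If the quoted strings are already stored in the column, the same idea can be applied to the whole column at once. This is a sketch under the assumption that every value has the form 'YYYY-MM-DD' with surrounding single quotes; the small frame is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"count": [1, 2, 1]})
df["dates"] = "'2016-11-06'"          # quoted strings, as in the question

# Strip the surrounding quote characters, then parse the whole column
df["dates"] = pd.to_datetime(df["dates"].str.strip("'"), format="%Y-%m-%d")

print(df.dtypes)
```

Passing an explicit format is optional here; it just makes the expected layout of the string explicit.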

Answer 2

Answered by piRSquared

The most direct route:

df['dates'] = pd.Timestamp('2016-11-06')
df

   product_id_x  product_id_y  count      dates
0       2727846       7872456      1 2016-11-06
1         29234       2932348      2 2016-11-06
2         29346       9137500      1 2016-11-06
3         29453      91365738      1 2016-11-06
4       2933666      91323494      1 2016-11-06
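
Since the question's end goal is a date-typed column in the database, here is a hedged sketch of how the dtype argument of to_sql can request that. It uses Python's built-in sqlite3 with an in-memory database purely as a stand-in for the MySQL engine in the question, so the type is given as the string "DATE"; with a SQLAlchemy engine the value would instead be a SQLAlchemy type such as sqlalchemy.types.Date. The three-row frame is illustrative.

```python
import sqlite3
import pandas as pd

df = pd.DataFrame({
    "product_id_x": [2727846, 29234, 29346],
    "product_id_y": [7872456, 2932348, 9137500],
    "count": [1, 2, 1],
})
df["dates"] = pd.Timestamp("2016-11-06")

# In-memory SQLite database, used only as a stand-in for MySQL
con = sqlite3.connect(":memory:")

# With a plain sqlite3 connection, dtype values are SQL type names as strings
df.to_sql(name="tablename", con=con, if_exists="append", index=False,
          dtype={"dates": "DATE"})

# Inspect the CREATE TABLE statement that pandas generated
schema = con.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'tablename'"
).fetchone()[0]
n_rows = con.execute("SELECT COUNT(*) FROM tablename").fetchone()[0]
print(schema)
print(n_rows)
con.close()
```

Without the dtype argument pandas picks a timestamp-style column type for datetime64 data, so naming the type explicitly is the part that matters for this use case.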

Answer 3

Answered by Vivek Kalyanarangan

Ahh! @jezrael got there first...

import timeit

# Using the standard-library datetime
print(timeit.timeit("""
import pandas as pd
import datetime as dt
df = pd.read_csv('date_time_pandas.csv')
dateSelect_str = "2016-11-06"

# parse with datetime.strptime, then assign as a constant column
dateSelect = dt.datetime.strptime(dateSelect_str, "%Y-%m-%d")
df['dates'] = dateSelect
""", number=100))


# Alternate method using pandas datetime
print(timeit.timeit("""
import pandas as pd
import datetime as dt
df = pd.read_csv('date_time_pandas.csv')
dateSelect_str = "2016-11-06"

# parse with pd.to_datetime, then assign as a constant column
dateSelect = pd.to_datetime(dateSelect_str, format='%Y-%m-%d')
df['dates'] = dateSelect
""", number=100))

This gives the output:

0.228258825751
0.167258402887

on average.

Conclusion: in this case, using pd.to_datetime is more efficient.
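
The timed statements above also include the imports and the CSV read, which may dominate the measurement. A possible variation, sketched here with a small in-memory frame in place of date_time_pandas.csv (an assumption, since that file is not shown), moves that work into timeit's setup argument so that only the parsing and the column assignment are timed.

```python
import timeit

# Build the DataFrame once in setup so only the conversion and the
# column assignment are timed (no imports, no file reading).
setup = """
import datetime as dt
import pandas as pd
df = pd.DataFrame({'count': range(5)})
dateSelect_str = '2016-11-06'
"""

t_strptime = timeit.timeit(
    "df['dates'] = dt.datetime.strptime(dateSelect_str, '%Y-%m-%d')",
    setup=setup, number=1000)

t_to_datetime = timeit.timeit(
    "df['dates'] = pd.to_datetime(dateSelect_str, format='%Y-%m-%d')",
    setup=setup, number=1000)

print("datetime.strptime:", t_strptime)
print("pd.to_datetime:   ", t_to_datetime)
```

The numbers depend on the machine and the pandas version, so no particular ordering is claimed here.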

Answer 4

Answered by Chandan

Don't wrap the value in the extra pair of double quotes; that way the single quotes are not stored as part of the string.

dateSelect = '2016-11-06'  
df['dates'] = dateSelect
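
A quick check of what this produces, as a sketch on a made-up three-row frame: the quotes are gone, but the column still holds text, so a pd.to_datetime call is still needed if a real date type is wanted.

```python
import pandas as pd

df = pd.DataFrame({"count": [1, 2, 1]})

dateSelect = '2016-11-06'
df['dates'] = dateSelect
is_str_before = isinstance(df['dates'].iloc[0], str)   # still text at this point

# Parsing the column turns the text into real timestamps
df['dates'] = pd.to_datetime(df['dates'], format='%Y-%m-%d')
print(df.dtypes)
```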
