Python: Change multiple columns in a pandas DataFrame to datetime

Question

Asked by kwashington122

I have a dataframe with 13 columns and 55,000 rows. I am trying to convert 5 of those columns to datetime; right now they have dtype 'object', and I need to transform this data for machine learning. I know that if I do

data['birth_date'] = pd.to_datetime(data['birth_date'], errors='coerce')

it will return a datetime column, but I want to do the same for 4 other columns as well. Is there one line I can write to convert all of them? I don't think I can index like

data[:,7:12]

thanks!

Answer 1

Answered by Ted Petrou

You can use apply to iterate over each column, passing pd.to_datetime as the function:

data.iloc[:, 7:12] = data.iloc[:, 7:12].apply(pd.to_datetime, errors='coerce')
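
A minimal sketch of the same idea using column names instead of positions. The sample frame and the names birth_date and signup_date are assumptions made up for illustration. On some pandas versions, assigning through iloc may write the values into the existing object columns and leave the dtype as object, so this variant assigns by name, which replaces the columns.

```python
import pandas as pd

# Hypothetical sample data; the column names are made up for illustration.
data = pd.DataFrame({
    "name": ["a", "b", "c"],
    "birth_date": ["1990-01-15", "1985-07-30", "not a date"],
    "signup_date": ["2020-03-01", "2021-11-12", "2019-06-05"],
})

date_cols = ["birth_date", "signup_date"]

# apply calls pd.to_datetime once per column; errors="coerce" turns
# unparseable strings into NaT instead of raising.
# Assigning by column name replaces the columns, so the new dtype is kept.
data[date_cols] = data[date_cols].apply(pd.to_datetime, errors="coerce")

print(data.dtypes)
```

Checking data.dtypes afterwards is a quick way to confirm that the conversion actually took effect.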

Answer 2

Answered by mel el

my_df[['column1','column2']] = my_df[['column1','column2']].apply(pd.to_datetime, format='%Y-%m-%d %H:%M:%S.%f')

Note: of course the format can be changed as required.
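
As a rough illustration of passing an explicit format, here is a small self-contained sketch. The sample values are invented, and errors="coerce" is an addition not present in the answer, included only to show what happens when a value does not match the format.

```python
import pandas as pd

# Hypothetical sample data matching the '%Y-%m-%d %H:%M:%S.%f' layout.
my_df = pd.DataFrame({
    "column1": ["2021-01-05 13:45:10.250000", "2021-02-10 08:00:00.000000"],
    "column2": ["2022-03-15 23:59:59.999000", "bad value"],
})

cols = ["column1", "column2"]

# Extra keyword arguments given to apply are forwarded to pd.to_datetime.
# errors="coerce" (an assumption, not in the original answer) yields NaT
# for strings that do not match the format.
my_df[cols] = my_df[cols].apply(
    pd.to_datetime, format="%Y-%m-%d %H:%M:%S.%f", errors="coerce"
)

print(my_df)
```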

Answer 3

Answered by SerialDev

If performance is a concern, I would advise using the following function to convert those columns to datetime:

def lookup(s):
    """
    This is an extremely fast approach to datetime parsing.
    For large data, the same dates are often repeated. Rather than
    re-parse these, we store all unique dates, parse them, and
    use a lookup to convert all dates.
    """
    dates = {date:pd.to_datetime(date) for date in s.unique()}
    return s.apply(lambda v: dates[v])

to_datetime: 5799 ms
dateutil:    5162 ms
strptime:    1651 ms
manual:       242 ms
lookup:        32 ms

Source: https://github.com/sanand0/benchmarks/tree/master/date-parse
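
A sketch of how a lookup-style function could be applied to several columns at once. The frame and the column names start and end are hypothetical, and Series.map is used here in place of the lambda as a small variation. Any speed-up depends on how often values repeat; pd.to_datetime also has a cache parameter that serves a similar purpose.

```python
import pandas as pd

def lookup(s):
    """Parse each unique string once, then map the results back onto the Series."""
    dates = {date: pd.to_datetime(date) for date in s.unique()}
    return s.map(dates)

# Hypothetical data with many repeated date strings.
raw = pd.Series(["2020-01-01", "2020-01-02", "2020-01-01", "2020-01-02"])
parsed = lookup(raw)

# Applying the same function column by column to a (made-up) frame.
df = pd.DataFrame({"start": raw, "end": raw})
df[["start", "end"]] = df[["start", "end"]].apply(lookup)

print(df.dtypes)
```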

Answer 4

Answered by smishra

If you would rather convert at load time, you could do something like this:

date_columns = ['c1','c2', 'c3', 'c4', 'c5']
data = pd.read_csv('file_to_read.csv', parse_dates=date_columns)
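
To try this without a file on disk, the following sketch reads an in-memory CSV. The CSV contents are invented for illustration; the column names c1 and c2 follow the snippet above.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for 'file_to_read.csv'.
csv_text = (
    "id,c1,c2\n"
    "1,2020-01-01,2021-05-06\n"
    "2,2020-02-15,2021-07-08\n"
)

date_columns = ["c1", "c2"]

# parse_dates tells read_csv which columns to parse as datetimes while loading.
data = pd.read_csv(io.StringIO(csv_text), parse_dates=date_columns)

print(data.dtypes)
```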

Answer 5

Answered by sgDysregulation

First, extract all the columns you are interested in from data; then you can use pandas applymap to apply to_datetime to each element in the extracted frame. I assume you know the positions of the columns you want to extract. In the code below, the names of the third through fifteenth columns are extracted. Alternatively, you can define a list of column names and use that instead. You may also need to pass the date/time format of the DateTime entries.

import pandas as pd

cols_2_extract = data.columns[2:15]

# Note: %m is the month directive; %M would be parsed as minutes.
data[cols_2_extract] = data[cols_2_extract].applymap(lambda x : pd.to_datetime(x, format = '%d %m %Y'))
