Pandas - Python, deleting rows based on Date column

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/28629154/

Time: 2020-08-19 03:32:22  Source: igfitidea

Tags: python, date, datetime, pandas

Asked by Colin O'Brien

I'm trying to delete rows of a dataframe based on one date column, [Delivery Date].

I need to delete rows that are more than 6 months old, except rows whose year is 1970, which should be kept.

I've created 2 variables:

from datetime import date, timedelta
sixmonthago = date.today() - timedelta(188)

import time
nineteen_seventy = time.strptime('01-01-70', '%d-%m-%y')

but I don't know how to delete rows based on these two variables, using the [Delivery Date] column.

Could anyone provide the correct solution?

Accepted answer by EdChum

You can just filter them out:

df[(df['Delivery Date'].dt.year == 1970) | (df['Delivery Date'] >= sixmonthago)]

This returns all rows where the year is 1970 or the date is less than 6 months old.
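
As a rough sketch of that filter in action, assuming a small made-up dataframe and a fixed reference date instead of date.today() so the result is reproducible. The sample rows, the `Item` column and the name `today` are illustrative assumptions. The cutoff is built from a pd.Timestamp because, depending on your pandas version, comparing a datetime64 column against a plain datetime.date may raise a TypeError.

```python
from datetime import timedelta

import pandas as pd

# Hypothetical sample data; only the column name 'Delivery Date' comes from the question.
df = pd.DataFrame({
    'Delivery Date': pd.to_datetime(
        ['1970-01-01', '2019-01-15', '2020-03-01', '2020-08-01']
    ),
    'Item': ['a', 'b', 'c', 'd'],
})

# Fixed "today" so the example is reproducible (assumption, not in the question).
today = pd.Timestamp('2020-08-19')
sixmonthago = today - timedelta(188)

# Keep rows from 1970, or rows dated on or after the cutoff.
kept = df[(df['Delivery Date'].dt.year == 1970) | (df['Delivery Date'] >= sixmonthago)]
print(kept['Item'].tolist())  # ['a', 'c', 'd']
```

Only row 'b' (January 2019) is filtered out: it is neither from 1970 nor newer than the cutoff.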

You can use boolean indexing and pass multiple conditions to filter the df. For multiple conditions you need to use the array operators, so | instead of or, and you need parentheses around each condition because of operator precedence.

Check the pandas docs for an explanation of boolean indexing.
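
To make the point about | and parentheses concrete, here is a small sketch on a made-up two-column frame (the columns `a` and `b` are hypothetical). It shows the working form, then the two common mistakes, both of which end up asking Python for the truth value of a whole Series.

```python
import pandas as pd

# Hypothetical two-column frame, used only to show operator behaviour.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# Element-wise OR with each condition parenthesised: works.
mask = (df['a'] == 1) | (df['b'] > 20)
print(mask.tolist())  # [True, False, True]

# Python's `or` needs a single truth value for a whole Series, which is ambiguous.
try:
    df[(df['a'] == 1) or (df['b'] > 20)]
    or_failed = False
except ValueError:
    or_failed = True

# Without parentheses, `|` binds tighter than `==` and `>`, so this is parsed as
# df['a'] == (1 | df['b']) > 20, a chained comparison that fails the same way.
try:
    df[df['a'] == 1 | df['b'] > 20]
    parens_failed = False
except ValueError:
    parens_failed = True

print(or_failed, parens_failed)  # True True
```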

Answer by andrewwowens

Be sure the calculation itself is accurate for "6 months" prior. You may not want to hardcode 188 days, because not all months are the same length.

from datetime import date
from dateutil.relativedelta import relativedelta

#http://stackoverflow.com/questions/546321/how-do-i-calculate-the-date-six-months-from-the-current-date-using-the-datetime
six_months = date.today() - relativedelta( months = +6 )
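
To see how much the two approaches can differ, here is a quick comparison with fixed reference dates (chosen only for illustration, so the printed values are reproducible). It also shows that relativedelta clips to the last valid day when the target month is shorter.

```python
from datetime import date, timedelta

from dateutil.relativedelta import relativedelta

# Fixed reference date so the output is reproducible (illustrative assumption).
ref = date(2020, 8, 19)

by_days = ref - timedelta(188)             # a fixed number of days
by_months = ref - relativedelta(months=6)  # six calendar months

print(by_days)    # 2020-02-13
print(by_months)  # 2020-02-19

# When the target month is shorter, relativedelta clips to its last day.
month_end = date(2020, 8, 31) - relativedelta(months=6)
print(month_end)  # 2020-02-29
```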

Then you can apply the following logic.

import time
nineteen_seventy = time.strptime('01-01-70', '%d-%m-%y')

df = df[(df['Delivery Date'].dt.year == nineteen_seventy.tm_year) | (df['Delivery Date'] >= six_months)]
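
One possible variation, if you would rather stay in pandas types throughout: pd.DateOffset(months=6) also does calendar-month arithmetic, and subtracting it from a pd.Timestamp gives a Timestamp that compares cleanly against a datetime64 column. This is a sketch with made-up rows and a fixed `today` (both assumptions), not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({
    'Delivery Date': pd.to_datetime(
        ['1970-01-01', '2019-12-31', '2020-02-19', '2020-06-30']
    ),
})

# Fixed reference date for reproducibility; in practice this could be
# pd.Timestamp.today().normalize().
today = pd.Timestamp('2020-08-19')
six_months = today - pd.DateOffset(months=6)

# Keep rows from 1970, or rows dated on or after the cutoff.
df = df[(df['Delivery Date'].dt.year == 1970) | (df['Delivery Date'] >= six_months)]
print(df['Delivery Date'].dt.strftime('%Y-%m-%d').tolist())
# ['1970-01-01', '2020-02-19', '2020-06-30']
```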

If you truly want to drop sections of the dataframe, you can do the following:

# Drop, by index label, the rows that are not from 1970 AND are older than six months.
df = df.drop(df[(df['Delivery Date'].dt.year != nineteen_seventy.tm_year) & (df['Delivery Date'] < six_months)].index)
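
A small sketch of an explicit drop, assuming a made-up dataframe and a fixed cutoff. Two things matter here: the rows to delete are the complement of the "keep" filter (not from 1970 AND older than the cutoff), and DataFrame.drop expects index labels when dropping rows. Inverting the mask with ~ gives the same result without calling drop at all.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({
    'Delivery Date': pd.to_datetime(['1970-01-01', '2019-01-15', '2020-03-01']),
})
six_months = pd.Timestamp('2020-02-19')  # fixed cutoff, for illustration only

# Rows to delete: not from 1970 AND older than the cutoff.
to_drop = (df['Delivery Date'].dt.year != 1970) & (df['Delivery Date'] < six_months)

dropped = df.drop(df[to_drop].index)  # explicit drop by index label
filtered = df[~to_drop]               # same rows via an inverted mask

print(dropped.index.tolist())    # [0, 2]
print(dropped.equals(filtered))  # True
```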