Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original address, and attribute it to the original authors (not me): StackOverflow, original address: http://stackoverflow.com/questions/39893420/

Time: 2020-09-14 02:08:59  Source: igfitidea

python - pandas - check if date exists in dataframe

Tags: python, datetime, pandas, dataframe

Asked by Leo

I have a dataframe like this:

      category  date            number
0      Cat1     2010-03-01      1
1      Cat2     2010-09-01      1
2      Cat3     2010-10-01      1
3      Cat4     2010-12-01      1
4      Cat5     2012-04-01      1
5      Cat2     2013-02-01      1
6      Cat3     2013-07-01      1
7      Cat4     2013-11-01      2
8      Cat5     2014-11-01      5
9      Cat2     2015-01-01      1
10     Cat3     2015-03-01      1

I would like to check whether a date exists in this dataframe, but I am unable to. I tried the approaches below, and none of them worked:

if pandas.Timestamp("2010-03-01 00:00:00", tz=None) in df['date'].values:
    print('date exist')

if datetime.strptime('2010-03-01', '%Y-%m-%d') in df['date'].values:
    print('date exist')

if '2010-03-01' in df['date'].values:
    print('date exist')

'date exist' never got printed. How can I check whether the date exists? I want to insert every missing date, with number equal to 0, for all the categories so that I can plot a continuous line chart (one line per category). Any help is appreciated. Thanks in advance.

The last one gives me this: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison. And 'date exist' does not get printed.
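The goal described in the question (a value of 0 for every date that a category is missing, so each category plots as one continuous line) could be sketched as follows. This is only an illustration under assumptions: the small sample frame is made up, and the dates are assumed to fall on month starts, which is why `freq="MS"` is used.

```python
import pandas as pd

# Made-up sample in the same shape as the question's dataframe.
df = pd.DataFrame({
    "category": ["Cat1", "Cat2", "Cat2"],
    "date": pd.to_datetime(["2010-03-01", "2010-03-01", "2010-05-01"]),
    "number": [1, 1, 2],
})

# Every month start between the earliest and latest date (assumed frequency).
all_dates = pd.date_range(df["date"].min(), df["date"].max(), freq="MS")

# One row per date, one column per category; absent combinations become 0.
wide = (
    df.pivot_table(index="date", columns="category", values="number",
                   aggfunc="sum", fill_value=0)
      .reindex(all_dates, fill_value=0)
)

print(wide)
```

With the data in this wide layout, calling `wide.plot()` would draw one line per category, provided matplotlib is installed.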

Accepted answer by jezrael

I think you need to convert the column to datetime first with to_datetime, and then, if you need to select all matching rows, use boolean indexing:

df.date = pd.to_datetime(df.date)

print (df.date == pd.Timestamp("2010-03-01 00:00:00"))
0      True
1     False
2     False
3     False
4     False
5     False
6     False
7     False
8     False
9     False
10    False
Name: date, dtype: bool

print (df[df.date == pd.Timestamp("2010-03-01 00:00:00")])
  category       date  number
0     Cat1 2010-03-01       1

To get a single True, check the value against the column converted to a numpy array with values:

if ('2010-03-01' in df['date'].values):
    print ('date exist')

Or check for at least one True with any, as EdChum commented:

if (df.date == pd.Timestamp("2010-03-01 00:00:00")).any():
    print ('date exist')  
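As a self-contained sketch of the convert-then-compare approach above: the sample frame below is a shortened copy of the question's data, and `date_exists` is a hypothetical helper name introduced only for illustration.

```python
import pandas as pd

# Shortened sample modelled on the question; dates start out as plain strings.
df = pd.DataFrame({
    "category": ["Cat1", "Cat2", "Cat3"],
    "date": ["2010-03-01", "2010-09-01", "2010-10-01"],
    "number": [1, 1, 1],
})

# Convert once so that comparisons happen between datetime values.
df["date"] = pd.to_datetime(df["date"])


def date_exists(frame, date_like):
    """Return True if date_like equals any value in frame['date'] (hypothetical helper)."""
    target = pd.Timestamp(date_like)
    return bool((frame["date"] == target).any())


print(date_exists(df, "2010-03-01"))  # True
print(date_exists(df, "2011-01-01"))  # False
```

Wrapping both sides in datetime types before comparing avoids relying on how a string is compared against a datetime array.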

Answer by JoseGzz

For example, to confirm that the 4th value of ds is contained within ds itself:

len(set(ds.isin([ds.iloc[3]]))) > 1

Let ds be a pandas Series of the form [index, pandas._libs.tslib.Timestamp] with example values:

0   2018-01-31 19:08:27.465515
1   2018-02-01 19:08:27.465515
2   2018-02-02 19:08:27.465515
3   2018-02-03 19:08:27.465515
4   2018-02-04 19:08:27.465515

Then, we use the Series method isin to get a Series of booleans in which each entry indicates whether that position in ds matches the value passed as the argument (since isin expects a list of values, we need to provide the value inside a list).

Next, we use the built-in set function to get a set with 1 or 2 values, depending on whether there was a match (True and False values) or not (only a False value).

Finally, we check whether the set contains more than 1 value. If it does, we have a match; otherwise there is no match.
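The steps above can be sketched as below, next to a shorter variant that calls `any()` on the result of `isin`. The Series is a made-up stand-in for the example values. One caveat worth noting: if every entry equals the target, the set contains only True, so the length check reports no match even though there is one.

```python
import pandas as pd

# Made-up Series of timestamps, similar in shape to the example values.
ds = pd.Series(pd.date_range("2018-01-31 19:08:27.465515", periods=5, freq="D"))

target = ds.iloc[3]

# Approach from the answer: more than one distinct boolean means a match.
set_check = len(set(ds.isin([target]))) > 1

# Alternative: ask directly whether any entry matched.
any_check = bool(ds.isin([target]).any())

# Edge case: every entry equals the target, so the set is just {True}.
same = pd.Series([target, target])
set_check_all_equal = len(set(same.isin([target]))) > 1
any_check_all_equal = bool(same.isin([target]).any())

print(set_check, any_check)
print(set_check_all_equal, any_check_all_equal)
```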