Original question: http://stackoverflow.com/questions/34586069/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
ValueError: Series lengths must match to compare when matching dates in Pandas
Asked by Monica Heddneck
I apologize in advance for asking such a basic question but I am stumped.
This is a very simple, dummy example. I'm having some issue matching dates in Pandas and I can't figure out why.
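    import pandas as pd

    df = pd.DataFrame([[1, '2016-01-01'],
                       [2, '2016-01-01'],
                       [3, '2016-01-02'],
                       [4, '2016-01-03']],
                      columns=['ID', 'Date'])
    # Parse the date strings into datetimes; recent pandas versions
    # reject the unit-less astype('datetime64') used originally.
    df['Date'] = pd.to_datetime(df['Date'])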
Say I want to match row 1 in the above df. I know beforehand that I want to match ID 1. And I know the date I want as well, and as a matter of fact, I'll extract that date directly from row 1 of the df to make it bulletproof.
some_id = 1
some_date = df.iloc[1:2]['Date'] # gives 2016-01-01
So why doesn't this line work to return me row 1??
df[(df['ID']==some_id) & (df['Date'] == some_date)]
Instead I get
ValueError: Series lengths must match to compare
which I understand, and makes sense...but leaves me wondering...how else can I compare dates in pandas if I can't compare one to many?
Accepted answer by DSM
You say:
some_date = df.iloc[1:2]['Date'] # gives 2016-01-01
but that's not what it gives. It gives a Series with one element, not simply a value. When you use [1:2] as your slice, you don't get a single element, but a container with one element:
>>> some_date
1 2016-01-01
Name: Date, dtype: datetime64[ns]
Instead, do
>>> some_date = df.iloc[1]['Date']
>>> some_date
Timestamp('2016-01-01 00:00:00')
after which
>>> df[(df['ID']==some_id) & (df['Date'] == some_date)]
ID Date
0 1 2016-01-01
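To make the difference concrete, here is a minimal sketch that rebuilds the question's frame and inspects both forms. The names as_slice and as_scalar are illustrative only, and df['Date'].iloc[1] is used as an equivalent way to pull out the scalar. The exact wording of the ValueError may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, "2016-01-01"], [2, "2016-01-01"], [3, "2016-01-02"], [4, "2016-01-03"]],
    columns=["ID", "Date"],
)
df["Date"] = pd.to_datetime(df["Date"])

as_slice = df.iloc[1:2]["Date"]   # slice: a Series holding one element
as_scalar = df["Date"].iloc[1]    # integer position: a single Timestamp

# Comparing the full column against the one-element Series fails,
# because the two Series do not line up.
raised = False
try:
    df["Date"] == as_slice
except ValueError:
    raised = True

# Comparing against the scalar broadcasts over every row.
result = df[(df["ID"] == 1) & (df["Date"] == as_scalar)]
print(type(as_slice).__name__, type(as_scalar).__name__, raised)
print(result)
```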
(Note that there are more efficient patterns if you have a lot of some_id and some_date values to look up, but that's a separate issue.)
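The answer does not say which patterns it has in mind. One possibility, offered here purely as an assumption, is to put the wanted (ID, Date) pairs in a second frame and use an inner merge instead of building one boolean mask per pair. The frame name wanted and its contents are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, "2016-01-01"], [2, "2016-01-01"], [3, "2016-01-02"], [4, "2016-01-03"]],
    columns=["ID", "Date"],
)
df["Date"] = pd.to_datetime(df["Date"])

# Hypothetical lookup table: the (ID, Date) pairs we want to find.
wanted = pd.DataFrame(
    {
        "ID": [1, 3],
        "Date": pd.to_datetime(["2016-01-01", "2016-01-02"]),
    }
)

# An inner merge keeps only the rows of df whose (ID, Date) pair
# also appears in wanted.
matches = df.merge(wanted, on=["ID", "Date"], how="inner")
print(matches)
```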
Answer by Amruth Lakkavaram
As mentioned by DSM, some_date is a Series and not a value. When you use boolean masking to check whether a column's values equal some variable, you have to make sure that the variable is a value, not a container. DSM describes one way of solving the problem; here is another.
df[(df['ID']==some_id) & (df['Date'] == some_date.values[0])]
We have just replaced some_date with some_date.values[0]. some_date.values returns an array with one element. We are interested in the value inside the container, not the container itself, so we index it with [0] to get the value.
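As a small sketch of what this extraction yields: .values[0] returns a NumPy datetime64 scalar, while .iloc[0] on the same one-element Series returns a pandas Timestamp. The .iloc[0] variant is an addition for comparison and is not part of the answer above. Both should work in the mask.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, "2016-01-01"], [2, "2016-01-01"], [3, "2016-01-02"], [4, "2016-01-03"]],
    columns=["ID", "Date"],
)
df["Date"] = pd.to_datetime(df["Date"])

some_id = 1
some_date = df.iloc[1:2]["Date"]   # one-element Series, as in the question

via_values = some_date.values[0]   # NumPy datetime64 scalar
via_iloc = some_date.iloc[0]       # pandas Timestamp

# Either scalar can be compared against the whole Date column.
rows_values = df[(df["ID"] == some_id) & (df["Date"] == via_values)]
rows_iloc = df[(df["ID"] == some_id) & (df["Date"] == via_iloc)]
print(type(via_values).__name__, type(via_iloc).__name__)
print(rows_values)
```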