pandas: cannot convert the series to <class 'int'>

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/51865367/

Time: 2020-09-14 05:56:38  Source: igfitidea

Tags: python, pandas

Asked by Fan Zhao

I have a set of data with an Age column. I want to remove all the rows that are aged more than 90 and less than 1856.

This is the head of df: [image]

This is what I attempted: [image]

Answered by Scott Boston

Your error is on line 2. df['intage'] = int(df['age']) is not valid; you can't pass a pandas Series to the int function.

You need to use astype if df['age'] is object dtype.

df['intage'] = df['age'].astype(int)

Or, since you are subtracting two dates, you need to use the dt accessor with the days attribute to get the number of days as an integer:

df['intage'] = df['age'].dt.days
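
A minimal sketch of why the original line fails and how the two fixes behave. The sample values and the column names age_str, age_td, intage_from_str and intage_from_td are made up for illustration, since the asker's real data was only posted as an image.

```python
import pandas as pd

# Hypothetical data: ages stored as numeric strings and as timedeltas.
df = pd.DataFrame({"age_str": ["83", "108", "63"]})
df["age_td"] = pd.to_timedelta([83, 108, 63], unit="D")

# int() expects a single value, so a multi-row Series raises TypeError.
try:
    int(df["age_td"])
except TypeError as exc:
    print("TypeError:", exc)

# Fix 1: element-wise conversion of a string-valued column with astype.
df["intage_from_str"] = df["age_str"].astype(int)

# Fix 2: whole days from a timedelta column via the dt accessor.
df["intage_from_td"] = df["age_td"].dt.days

print(df[["intage_from_str", "intage_from_td"]])
```

Which fix applies depends on the dtype of the column, so checking df.dtypes first is a reasonable starting point.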

Answered by Cedric Zoppolo

One solution would be to extract the days from the timedelta values in the age column.

The toy example below shows how to achieve that:

import pandas as pd
from datetime import timedelta as td

# Create example DataFrame (td(n) is a timedelta of n days)
df = pd.DataFrame([td(83), td(108), td(83), td(63), td(81)], columns=["age"])
print(df)

# Get days from timedeltas
df.age = df.age.apply(lambda x: x.days)
print(df)

# Filter ages (between includes both endpoints by default)
df = df[df.age.between(91, 1956)]
print(df)

This prints:

>>> 
       age
0  83 days
1 108 days
2  83 days
3  63 days
4  81 days
   age
0   83
1  108
2   83
3   63
4   81
   age
1  108
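
The question's wording leaves it open whether rows inside the range should be kept or dropped. As a sketch only, reusing the toy ages above, the same boolean mask can serve both readings; the names in_range, kept and dropped are illustrative and not from the original answer.

```python
import pandas as pd

# Same toy ages as above, stored as timedeltas measured in days.
df = pd.DataFrame({"age": pd.to_timedelta([83, 108, 83, 63, 81], unit="D")})
days = df["age"].dt.days

# True where the age lies in [91, 1956] days (both endpoints included).
in_range = days.between(91, 1956)

# Keep only the rows inside the range ...
kept = df[in_range]

# ... or drop them instead by negating the mask.
dropped = df[~in_range]

print(kept)
print(dropped)
```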

Answered by ALollz

Since the dtype is timedelta64[ns], you can either use between, specifying two timedeltas as the endpoints, or you can first convert the days to a numeric type using numpy.

Setup

import pandas as pd
import numpy as np

df = pd.DataFrame({'age': [83, 108, 83, 63, 81]})
df['age'] = pd.to_timedelta(df.age, unit='days')

Find those between 82 and 107 days:

df[df.age.between(pd.to_timedelta(82, unit='days'), pd.to_timedelta(107, unit='days'))]
#      age
#0 83 days
#2 83 days

With numpy

df[(df.age/np.timedelta64(1, 'D')).between(82, 107)]
#      age
#0 83 days
#2 83 days
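
One difference between the numpy division and the dt.days attribute is worth a quick sketch: the division returns floats and keeps fractional days, while dt.days returns only the whole-day component. The fractional sample value below is an assumption for illustration and does not come from the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical ages that include a fractional number of days.
age = pd.Series(pd.to_timedelta([82.5, 83, 108], unit="D"))

# Dividing by a one-day timedelta keeps the fractional part (floats).
as_float = age / np.timedelta64(1, "D")

# The dt.days attribute returns only the whole-day component.
as_int = age.dt.days

print(as_float.tolist())
print(as_int.tolist())
```

With day-granular data such as the example above, both conversions select the same rows.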