pandas: How to define a user-defined function in pandas

Question

Asked by Edwin Baby

I have a CSV file that contains information like this:

name    salary  department
a        2500      x
b        5000      y
c       10000      y
d       20000      x

I need to convert this, using pandas, into a form like this:

dept    name    position
x        a       Normal Employee
x        b       Normal Employee
y        c       Experienced Employee
y        d       Experienced Employee

If the salary is <= 8000, the position is Normal Employee.

If the salary is > 8000 and <= 25000, the position is Experienced Employee.

My current code for the group by:

import csv
import pandas
pandas.set_option('display.max_rows', 999)
data_df = pandas.read_csv('employeedetails.csv')
#print(data_df.columns)
t = data_df.groupby(['dept'])
print t

What changes do I need to make to this code to get the output I mentioned above?

Answer 1

Accepted answer by EdChum

You could define two masks and pass them to np.where:

In [91]:
normal = df['salary'] <= 8000
experienced = (df['salary'] > 8000) & (df['salary'] <= 25000)
df['position'] = np.where(normal, 'normal employee', np.where(experienced, 'experienced employee', 'unknown'))
df

Out[91]:
  name  salary department              position
0    a    2500          x       normal employee
1    b    5000          y       normal employee
2    c   10000          y  experienced employee
3    d   20000          x  experienced employee

Or, slightly more readably, pass them to loc:

In [92]:
df.loc[normal, 'position'] = 'normal employee'
df.loc[experienced,'position'] = 'experienced employee'
df

Out[92]:
  name  salary department              position
0    a    2500          x       normal employee
1    b    5000          y       normal employee
2    c   10000          y  experienced employee
3    d   20000          x  experienced employee
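
The snippets above assume that df and np already exist. As a minimal, self-contained sketch of the same mask idea, the following builds the sample data from the question and uses np.select in place of the nested np.where calls. The choice of np.select and the 'unknown' default for salaries outside both bands are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "salary": [2500, 5000, 10000, 20000],
    "department": ["x", "y", "y", "x"],
})

# One boolean mask per salary band.
normal = df["salary"] <= 8000
experienced = (df["salary"] > 8000) & (df["salary"] <= 25000)

# np.select takes the label of the first mask that is True for each row;
# rows matching no mask receive the default (an assumed fallback label).
df["position"] = np.select(
    [normal, experienced],
    ["normal employee", "experienced employee"],
    default="unknown",
)
print(df)
```

With more than two bands, adding a mask and a label to the two lists tends to stay easier to read than nesting further np.where calls.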

Answer 2

Answered by Fabio Lamanna

I would use a simple function like:

def f(x):
    if x <= 8000:
        x = 'Normal Employee'
    elif 8000 < x <= 25000:
        x = 'Experienced Employee'
    return x

and then apply it to the df:

df['position'] = df['salary'].apply(f)
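
For an end-to-end view, here is a hedged sketch that applies such a function and then reshapes the frame into the dept / name / position layout the question asks for. The helper name classify, the 'Unknown' fallback for salaries above 25000, and the rename and sort steps are all assumptions added for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "salary": [2500, 5000, 10000, 20000],
    "department": ["x", "y", "y", "x"],
})


def classify(salary):
    """Map a salary to a position label (hypothetical helper)."""
    if salary <= 8000:
        return "Normal Employee"
    elif salary <= 25000:
        return "Experienced Employee"
    # Assumption: the question does not say what to do above 25000.
    return "Unknown"


df["position"] = df["salary"].apply(classify)

# Rename, order by department, and keep only the requested columns.
result = (
    df.rename(columns={"department": "dept"})
      .sort_values(["dept", "name"])
      [["dept", "name", "position"]]
      .reset_index(drop=True)
)
print(result)
```

Note that with the sample data exactly as given, d belongs to department x and b to department y, so the rows are grouped accordingly.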

Answer 3

Answered by IanS

A useful function is apply:

data_df['position'] = data_df['salary'].apply(lambda salary: 'Normal Employee' if salary <= 8000 else 'Experienced Employee')

This applies the lambda function to every element in the salary column.
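
Note that this lambda labels every salary above 8000 as experienced, including values beyond the 25000 limit stated in the question. If that upper bound matters, one possible alternative (not from the original answers) is pd.cut, which bins a numeric column into labeled intervals. The sketch below is illustrative only; the 30000 entry is a made-up value added to show what happens outside the bins.

```python
import pandas as pd

# Salaries from the question plus one hypothetical out-of-range value.
salaries = pd.Series([2500, 5000, 10000, 20000, 30000], name="salary")

# With the default right=True the intervals are (-inf, 8000] and
# (8000, 25000]; values outside every interval become NaN.
position = pd.cut(
    salaries,
    bins=[float("-inf"), 8000, 25000],
    labels=["Normal Employee", "Experienced Employee"],
)
print(position)
```

The result is a categorical Series, so the out-of-range salary shows up as a missing value rather than being silently assigned a label.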
