pandas: How to create a new column based on multiple conditions from an existing column

Question

Asked by e9e9s

So I have a df column of 9 digit IDs. There are no duplicates and each ID starts with a different number that ranges from 1-6 -- depending on the number each ID starts with I want to create a separate column with the "name" that the first number of the ID represents. (e.g. IDs that start with 1 represent Maine, IDs that start with 2 represent California... and so on)

This works when there are only two conditions:

df['id_label'] = ['name_1' if name.startswith('1') else 'everything_else' for name in df['col_1']]

I couldn't figure out how to write a multi-line list comprehension for what I need, so I thought the loop below would work. However, it fills the id_label column using only the last iteration of the loop (i.e. the id_label column will only contain 'name_5'):

for col in df['col_1']:
    if col.startswith('1'):
        df['id_label'] = 'name_1'
    if col.startswith('2'):
        df['id_label'] = 'name_2'
    if col.startswith('3'):
        df['id_label'] = 'name_3'
    if col.startswith('4'):
        df['id_label'] = 'name_4'
    if col.startswith('5'):
        df['id_label'] = 'name_5'
    if col.startswith('6'):
        df['id_label'] = 'name_5'

My question is: how can I create a new column from an existing column based on multiple conditional statements?

Answer 1

Answered by jezrael

I think you can convert the column to str with astype, select the first character with str[0], and finally map it through a dict:

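import pandas as pd

df = pd.DataFrame({'col_1':[133,255,36,477,55,63]})

# keys are the first digit of each ID, as strings
d = {'1':'Maine', '2': 'California', '3':'a', '4':'f', '5':'r', '6':'r'}
# cast to str, take the first character, then look it up in d
df['id_label'] = df['col_1'].astype(str).str[0].map(d)
print(df)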
   col_1    id_label
0    133       Maine
1    255  California
2     36           a
3    477           f
4     55           r
5     63           r
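
One thing the example above does not show is what happens when an ID starts with a digit that is not in the dict: map leaves NaN for that row. The following is a minimal sketch, assuming a fallback label is wanted; the sample IDs and the 'unknown' label are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data: 977 starts with 9, which has no entry in d.
df = pd.DataFrame({'col_1': [133, 255, 977]})
d = {'1': 'Maine', '2': 'California'}

# First character of each ID as a string Series.
first_digit = df['col_1'].astype(str).str[0]

# map() yields NaN for keys missing from d; fillna() supplies a fallback.
df['id_label'] = first_digit.map(d).fillna('unknown')
print(df)
```

If NaN is an acceptable marker for unmatched IDs, the fillna call can simply be dropped.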

Answer 2

Answered by Bharath

You can use apply in case you have a lot of if/else branches:

import pandas as pd

def ifef(col):
    # work on the string form so startswith can be used
    col = str(col)
    if col.startswith('1'):
        return 'name_1'
    if col.startswith('2'):
        return 'name_2'
    if col.startswith('3'):
        return 'name_3'
    if col.startswith('4'):
        return 'name_4'
    if col.startswith('5'):
        return 'name_5'
    if col.startswith('6'):
        # same label as for '5', mirroring the loop in the question
        return 'name_5'

df = pd.DataFrame({'col_1':[133,255,36,477,55,63]})
df['id_label'] = df['col_1'].apply(ifef)
print(df)

   col_1 id_label
0    133   name_1
1    255   name_2
2     36   name_3
3    477   name_4
4     55   name_5
5     63   name_5

If you have a dictionary, you can use:

df = pd.DataFrame({'col_1':[133,255,36,477,55,63]})
d = {'1':'M', '2': 'C', '3':'a', '4':'f', '5':'r', '6':'s'}
def ifef(col):
    col = str(col)
    return d[col[0]]

df['id_label'] = df['col_1'].apply(ifef)
print(df)

   col_1 id_label
0    133        M
1    255        C
2     36        a
3    477        f
4     55        r
5     63        s
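
Note that d[col[0]] raises a KeyError if an ID starts with a digit that has no entry in the dictionary. A small sketch of one way around that, assuming a default label is acceptable, uses dict.get; the function name label_for, the 'other' default, and the sample IDs are illustrative only.

```python
import pandas as pd

# Hypothetical sample data: 788 starts with 7, which is not a key in d.
df = pd.DataFrame({'col_1': [133, 255, 36, 788]})
d = {'1': 'M', '2': 'C', '3': 'a'}

def label_for(value, default='other'):
    # dict.get returns the default instead of raising KeyError
    # when the first digit has no entry in d.
    return d.get(str(value)[0], default)

df['id_label'] = df['col_1'].apply(label_for)
print(df)
```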

Answer 3

Answered by Preetham

Can you check this and let me know whether it's suitable for your question?

import pandas as pd
import numpy as np

df = pd.DataFrame({'col_1':[133,255,36,477,55,63]})

# helper column holding the first digit of each ID as a string
df['col_2'] = df['col_1'].astype(str).str[0]

# one boolean condition per label; 5 and 6 share a label
condlist = [df['col_2'] == "1",
            df['col_2'] == "2",
            df['col_2'] == "3",
            df['col_2'] == "4",
            ((df['col_2'] == "5") | (df['col_2'] == "6")),
            ]
choicelist = ['Maine','California','India', 'France','5/6']

# a string default keeps the dtype consistent with the string choices;
# recent NumPy versions can raise a TypeError with the implicit default of 0
df['id_label'] = np.select(condlist, choicelist, default='')

print(df)

#### Output ####

   col_1 col_2    id_label
0    133     1       Maine
1    255     2  California
2     36     3       India
3    477     4      France
4     55     5         5/6
5     63     6         5/6
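
As a variation on the np.select approach, the sketch below skips the helper column, combines the 5/6 case with isin, and shows what rows matching no condition receive. It is only an illustration under assumed data: the sample IDs and the 'unknown' default are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 900 matches none of the conditions.
df = pd.DataFrame({'col_1': [133, 55, 63, 900]})
first = df['col_1'].astype(str).str[0]

# Conditions are checked in order; the first True one picks the label.
condlist = [first == '1', first.isin(['5', '6'])]
choicelist = ['Maine', '5/6']

# Rows where every condition is False get the default value.
df['id_label'] = np.select(condlist, choicelist, default='unknown')
print(df)
```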

PS: Thanks to @ALollz, who introduced me to np.select
