pandas np.where multiple return values

Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me) and keep it under the same license. Original: http://stackoverflow.com/questions/35725305/

Date: 2020-09-14 00:47:58  Source: igfitidea

Tags: python, numpy, pandas

Asked by DGraham

Using pandas and numpy, I am trying to process a column in a dataframe and create a new column with values derived from it. So if column x contains the value 1, the new column should contain a; for the value 2 it should contain b, and so on.

I can do this for a single condition, e.g.:

df['new_col'] = np.where(df['col_1'] == 1, 'a', 'n/a')

I can also find examples of multiple conditions, i.e. if x = 3 or x = 4 the value should be a, but not of how to do something like: if x = 3 the value should be a, and if x = 4 the value should be c.

I tried simply running two lines of code such as:

df['new_col'] = np.where(df['col_1'] == 1, 'a', 'n/a')
df['new_col'] = np.where(df['col_1'] == 2, 'b', 'n/a')

But obviously the second line overwrites the result of the first. Am I missing something crucial?

Answered by jezrael

I think you can use loc:

df.loc[df['col_1'] == 1, 'new_col'] = 'a'
df.loc[df['col_1'] == 2, 'new_col'] = 'b'

Or:

# None (not np.nan) as the fallback: np.where needs one common dtype for the labels and the fallback
df['new_col'] = np.where(df['col_1'] == 1, 'a', np.where(df['col_1'] == 2, 'b', None))
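
A minimal, runnable sketch of both approaches from this answer. The sample frame, the column names loc_col and where_col, and the 'n/a' fallback string are assumptions made for illustration; they are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({'col_1': [1, 2, 4, 2]})

# Approach 1: boolean masks with .loc; rows matching neither mask stay missing
df.loc[df['col_1'] == 1, 'loc_col'] = 'a'
df.loc[df['col_1'] == 2, 'loc_col'] = 'b'

# Approach 2: nested np.where; the inner call acts as the "else" branch of the outer one
df['where_col'] = np.where(df['col_1'] == 1, 'a',
                           np.where(df['col_1'] == 2, 'b', 'n/a'))

print(df)
```

Using an all-string fallback in the second approach sidesteps any dtype promotion between labels and a numeric missing-value marker.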

Answered by Stop harming Monica

I think NumPy's choose() is the best option for you.

import numpy as np
choices = 'abcde'
N = 10
np.random.seed(0)
data = np.random.randint(1, len(choices) + 1, size=N)
print(data)
print(np.choose(data - 1, choices))

Output:

[5 1 4 4 4 2 4 3 5 1]
['e' 'a' 'd' 'd' 'd' 'b' 'd' 'c' 'e' 'a']
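
To connect this back to the question, here is a sketch of the same idea applied to a DataFrame column. The sample data and the labels list are assumptions for illustration, and it presumes that col_1 only holds the consecutive integer codes 1 to 3.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; every value must be a valid 1-based position in `labels`
df = pd.DataFrame({'col_1': [1, 2, 3, 2]})
labels = ['a', 'b', 'c']

# Shift to 0-based indices, then pick labels[i] for each row
df['new_col'] = np.choose(df['col_1'].to_numpy() - 1, labels)

print(df)
```

With its default mode='raise', np.choose raises an error for an index outside the range of the choices, so this fits best when the column holds small, consecutive integer codes.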

Answered by SpeedCoder5

Use pandas Series.map instead of where.

import pandas as pd
df = pd.DataFrame({'col_1' : [1,2,4,2]})
print(df)

def ab_ify(v):
    if v == 1:
        return 'a'
    elif v == 2:
        return 'b'
    else:
        return None

df['new_col'] = df['col_1'].map(ab_ify)
print(df)

# output:
#
#    col_1
# 0      1
# 1      2
# 2      4
# 3      2
#    col_1 new_col
# 0      1       a
# 1      2       b
# 2      4    None
# 3      2       b  
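
Series.map also accepts a dict, which may be shorter when the rule is a plain lookup. A small sketch, assuming the same sample column as above; the name mapping is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [1, 2, 4, 2]})

# Hypothetical lookup table; values absent from it (here 4) become NaN
mapping = {1: 'a', 2: 'b'}
df['new_col'] = df['col_1'].map(mapping)

print(df)
```

Unlike the function version, which returns None for unmatched values, the dict version leaves them as missing values (NaN).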

Answered by rde

You could define a dict with your desired transformations, then loop through a DataFrame column and fill the new column.

There may be more elegant ways, but this will work:

import numpy as np
import pandas as pd

# create a dummy DataFrame
df = pd.DataFrame(np.random.randint(2, size=(6, 4)),
                  columns=['col_1', 'col_2', 'col_3', 'col_4'],
                  index=range(6))

# create a dict with your desired substitutions:
swap_dict = {0: 'a',
             1: 'b',
             999: 'zzz'}

# introduce new column and fill with swapped information:
for i in df.index:
    df.loc[i, 'new_col'] = swap_dict[df.loc[i, 'col_1']]

print(df)

This prints something like:

   col_1  col_2  col_3  col_4 new_col
0      1      1      1      1       b
1      1      1      1      1       b
2      0      1      1      0       a
3      0      1      0      0       a
4      0      0      1      1       a
5      0      0      1      0       a
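
Indexing swap_dict directly raises a KeyError as soon as col_1 holds a value that is not one of its keys. One possible variant, sketched here with hypothetical sample data and an assumed 'n/a' fallback, builds the column in a single pass with dict.get:

```python
import pandas as pd

# Hypothetical sample data; 7 is deliberately missing from swap_dict
df = pd.DataFrame({'col_1': [0, 1, 7, 0]})
swap_dict = {0: 'a', 1: 'b', 999: 'zzz'}

# dict.get returns the fallback instead of raising KeyError for unknown values
df['new_col'] = [swap_dict.get(v, 'n/a') for v in df['col_1']]

print(df)
```

This also avoids the row-by-row df.loc assignments, which tend to be slow on large frames.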