Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/30265723/

Date: 2020-08-19 08:08:58  Source: igfitidea

Python: create a new column from existing columns

Tags: python, pandas, missing-data, calculated-columns

Asked by Kexin Xu

I am trying to create a new column based on two existing columns. Say I want to create a new column z that takes the value of y when y is not missing and the value of x when y is missing. So for the data below, I expect z to be [1, 8, 10, 8].

   x   y
0  1 NaN
1  2   8
2  4  10
3  8 NaN
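
For anyone who wants to reproduce the answers below, the sample frame can be rebuilt roughly like this. This is a minimal sketch: the question only shows the printed table, so the construction with np.nan is an assumption.

```python
import numpy as np
import pandas as pd

# Sample data from the question; NaN marks the missing values in y.
# (How the frame was originally built is not stated; this is an assumption.)
df = pd.DataFrame({"x": [1, 2, 4, 8], "y": [np.nan, 8, 10, np.nan]})
print(df)
```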

Accepted answer by Vidhya G

The new column 'z' gets its values from column 'y' using df['z'] = df['y']. This brings over the missing values, so fill them in using fillna with column 'x'. Chain these two actions:

>>> df['z'] = df['y'].fillna(df['x'])
>>> df
   x   y   z
0  1 NaN   1
1  2   8   8
2  4  10  10
3  8 NaN   8
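
A self-contained version of the same idea, as a sketch that rebuilds the sample frame first (the construction is assumed, not part of the answer). Because y contains NaN it is a float column, so z comes out as float64 and may be displayed as 1.0, 8.0, 10.0, 8.0 depending on the pandas version.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data.
df = pd.DataFrame({"x": [1, 2, 4, 8], "y": [np.nan, 8, 10, np.nan]})

# Take y, and wherever y is NaN use the value of x from the same row.
df["z"] = df["y"].fillna(df["x"])

print(df["z"].tolist())
print(df["z"].dtype)
```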

Answer by Red Twoon

I'm not sure if I understand the question, but would this be what you're looking for?

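A bare "if y[i]" only skips None (and 0); NaN is truthy, so test for missing values explicitly:

import pandas as pd

z = []
for i in range(len(x)):
    # NaN is truthy, so "if y[i]" would not skip it; check it explicitly.
    if pd.notna(y[i]):
        z.append(y[i])
    else:
        z.append(x[i])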
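
With a loop like this, it is worth checking how Python's truthiness treats missing values. The sketch below assumes x and y are plain lists (the answer does not say what they are). It shows that None and 0 are falsy while NaN is truthy, so a bare truth test is not a reliable missing-value check.

```python
import math

# Assumed plain-list versions of the question's columns.
x = [1, 2, 4, 8]
y = [float("nan"), 8, 10, float("nan")]

# NaN is truthy, so a bare "if value" does not detect it; None and 0 are falsy.
print(bool(float("nan")), bool(None), bool(0))

# An explicit NaN test picks y when present and falls back to x otherwise.
z = [xi if math.isnan(yi) else yi for xi, yi in zip(x, y)]
print(z)
```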

Answer by Kyler Brown

Let's say the DataFrame is called df. First copy the y column.

df["z"] = df["y"].copy()

Then fill the NaN positions of z with the values of x at those same positions.

import numpy as np
mask = np.isnan(df["z"])  # True where z is missing
df.loc[mask, "z"] = df.loc[mask, "x"]  # .loc avoids chained assignment


>>> df 
   x   y   z
0  1 NaN   1
1  2   8   8
2  4  10  10
3  8 NaN   8

Answer by EdChum

Use np.where:

In [3]:

df['z'] = np.where(df['y'].isnull(), df['x'], df['y'])
df
Out[3]:
   x   y   z
0  1 NaN   1
1  2   8   8
2  4  10  10
3  8 NaN   8

Here np.where tests the boolean condition and returns df['x'] where it is true, otherwise df['y'].
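
To see how the three arguments line up, here is a small sketch on bare arrays (the array names are made up for illustration). np.where picks from the second argument where the condition is True and from the third where it is False.

```python
import numpy as np

# Illustrative arrays mirroring the question's columns.
x = np.array([1, 2, 4, 8])
y = np.array([np.nan, 8, 10, np.nan])

cond = np.isnan(y)        # True where y is missing
z = np.where(cond, x, y)  # x where cond is True, otherwise y

print(cond.tolist())
print(z.tolist())
```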

Answer by Haleemur Ali

You can use apply with the option axis=1. Then your solution is pretty concise.

df['z'] = df.apply(lambda row: row.y if pd.notnull(row.y) else row.x, axis=1)

Answer by ari

The update method does almost exactly this. The only caveat is that update works in place, so you must first create a copy:

z = df['x'].copy()
z.update(df['y'])  # in place; NaN values in y are skipped
df['z'] = z        # assign back instead of updating through df.z

In the above example you start with x and replace each value with the corresponding value from y, as long as the new value is not NaN.
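
A sketch of that behaviour on standalone Series (the values are made up for illustration, and floats are used throughout to keep dtypes simple). update overwrites entries in place wherever the other Series has a non-NaN value and leaves the rest untouched.

```python
import numpy as np
import pandas as pd

# Illustrative Series mirroring the question's columns.
x = pd.Series([1.0, 2.0, 4.0, 8.0])
y = pd.Series([np.nan, 8.0, 10.0, np.nan])

z = x.copy()  # copy first, because update modifies z in place
z.update(y)   # NaN entries in y are ignored

print(z.tolist())
```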