Python: Create a new column from existing columns

Question

Asked by Kexin Xu

I am trying to create a new column based on two existing columns, x and y. I want the new column z to take the value of y where y is not missing, and the value of x where y is missing. In this case, I expect z to be [1, 8, 10, 8].

Answer 1

Accepted answer by Vidhya G

The new column 'z' gets its values from column 'y' using df['z'] = df['y']. This carries over the missing values, so fill them in with fillna using column 'x'. Chain these two actions:

>>> df['z'] = df['y'].fillna(df['x'])
>>> df
   x   y   z
0  1 NaN   1
1  2   8   8
2  4  10  10
3  8 NaN   8
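
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The question's original DataFrame is not shown, so the construction of df below is an assumption reconstructed from the printed output. The combine_first line is an alternative spelling of the same rule, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the printed output above
df = pd.DataFrame({"x": [1, 2, 4, 8], "y": [np.nan, 8, 10, np.nan]})

# Take y where it is present, otherwise fall back to x (aligned on the index)
df["z"] = df["y"].fillna(df["x"])

# combine_first expresses the same "prefer y, else x" rule
z_alt = df["y"].combine_first(df["x"])
```

Because y contains NaN it is a float column, so z is float64 as well; apply astype(int) afterwards if integers are needed.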

Answer 2

Answered by Red Twoon

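I'm not sure I understand the question, but is this what you're looking for?

The check "y[i] is not None" falls through to x[i] when the value is None. A bare "if y[i]" would also fall through for 0, and would not catch a float NaN, which is truthy.

z = []
for i in range(len(x)):
    # Keep y[i] unless it is missing (None); otherwise fall back to x[i]
    if y[i] is not None:
        z.append(y[i])
    else:
        z.append(x[i])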
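
The loop above assumes x and y are plain Python sequences, which the answer does not state. As a sketch under that assumption, the same idea can be written so that both None and a float NaN count as missing. The helper name is_missing and the sample lists are hypothetical, chosen to match the question's expected result.

```python
import math


def is_missing(value):
    """Return True for None or a float NaN (hypothetical helper)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


# Assumed sample data; y mixes both kinds of missing value
x = [1, 2, 4, 8]
y = [float("nan"), 8, 10, None]

# Pair the lists up and keep each y value unless it is missing
z = [xi if is_missing(yi) else yi for xi, yi in zip(x, y)]
print(z)  # [1, 8, 10, 8]
```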

Answer 3

Answered by Kyler Brown

Let's say the DataFrame is called df. First copy the y column.

df["z"] = df["y"].copy()

Then fill the NaN positions of z with the values of x at those same positions.

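import numpy as np
# Use .loc rather than chained indexing (df.z[...] = ...), which may not write back to df
nan_mask = np.isnan(df["z"])
df.loc[nan_mask, "z"] = df.loc[nan_mask, "x"]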


>>> df 
   x   y   z
0  1 NaN   1
1  2   8   8
2  4  10  10
3  8 NaN   8
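
The copy-then-fill steps can be sketched as a runnable whole. The DataFrame construction is an assumption based on the printed output, and the sketch uses a boolean mask with .loc, since assigning through chained indexing may warn or leave df unchanged in recent pandas versions.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the printed output above
df = pd.DataFrame({"x": [1, 2, 4, 8], "y": [np.nan, 8, 10, np.nan]})

# Step 1: start z as a copy of y
df["z"] = df["y"].copy()

# Step 2: overwrite only the rows where z is NaN with the x values
missing = df["z"].isna()
df.loc[missing, "z"] = df.loc[missing, "x"]
```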

Answer 4

Answered by EdChum

Use np.where:

In [3]:

df['z'] = np.where(df['y'].isnull(), df['x'], df['y'])
df
Out[3]:
   x   y   z
0  1 NaN   1
1  2   8   8
2  4  10  10
3  8 NaN   8

The boolean condition df['y'].isnull() selects df['x'] where it is True and df['y'] otherwise.
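
A self-contained version of the np.where approach might look like the sketch below; the input DataFrame is an assumption reconstructed from the output shown. One point worth knowing is that np.where returns a plain NumPy array and works by position, so it ignores the index labels of the two columns.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the printed output above
df = pd.DataFrame({"x": [1, 2, 4, 8], "y": [np.nan, 8, 10, np.nan]})

# Where y is null pick x, otherwise pick y (element by element, by position)
picked = np.where(df["y"].isnull(), df["x"], df["y"])

# picked is a NumPy array; assigning it creates the new column
df["z"] = picked
```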

Answer 5

Answered by Haleemur Ali

You can use apply with the option axis=1, which makes the solution quite concise.

df['z'] = df.apply(lambda row: row.y if pd.notnull(row.y) else row.x, axis=1)
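
As a runnable sketch of the row-wise approach (the DataFrame here is assumed, matching the data shown in the other answers): with axis=1 the lambda receives one row at a time as a Series. This is generally slower than the vectorized alternatives on large frames, but it is easy to extend with more complex per-row logic.

```python
import numpy as np
import pandas as pd

# Assumed input, matching the data used in the other answers
df = pd.DataFrame({"x": [1, 2, 4, 8], "y": [np.nan, 8, 10, np.nan]})

# axis=1 passes each row to the lambda; keep y unless it is null
df["z"] = df.apply(
    lambda row: row["y"] if pd.notnull(row["y"]) else row["x"],
    axis=1,
)
```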

Answer 6

Answered by ari

The update method does almost exactly this. The only caveat is that update works in place, so you must first create a copy:

z = df['x'].copy()
z.update(df['y'])  # overwrite with y wherever y is not NaN
df['z'] = z

In the example above, you start with x and replace each value with the corresponding value from y, as long as the new value is not NaN.
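
The update approach can be sketched end to end as follows. The input DataFrame is assumed, and the cast of x to float is an addition of this sketch, made so that the float values from y fit without any dtype conversion. The update is applied to a standalone Series, which is then assigned to the frame, rather than relying on in-place modification through df.z.

```python
import numpy as np
import pandas as pd

# Assumed input, matching the data used in the other answers
df = pd.DataFrame({"x": [1, 2, 4, 8], "y": [np.nan, 8, 10, np.nan]})

# Start from a float copy of x (the cast is this sketch's choice, see above)
z = df["x"].astype("float64").copy()

# update works in place and skips NaN values in the argument
z.update(df["y"])

df["z"] = z
```

Note that update returns None, so its result should not be assigned.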
