Python: Replace values greater than a number in a pandas DataFrame

Question

Asked by Zanam

I have a large DataFrame that looks like this:

df1['A'].ix[1:3]
2017-01-01 02:00:00    [33, 34, 39]
2017-01-01 03:00:00    [3, 43, 9]

I want to replace each element greater than 9 with 11.

So the desired output for the above example is:

df1['A'].ix[1:3]
2017-01-01 02:00:00    [11, 11, 11]
2017-01-01 03:00:00    [3, 11, 9]

Edit:

My actual DataFrame has about 20,000 rows, and each row holds a list of 2,000 elements.

Is there a way to use the numpy.minimum function on each row? I assume it would be faster than the list comprehension method.

Answer 1

Accepted answer by jezrael

You can use apply with a list comprehension:

df1['A'] = df1['A'].apply(lambda x: [y if y <= 9 else 11 for y in x])
print (df1)
                                A
2017-01-01 02:00:00  [11, 11, 11]
2017-01-01 03:00:00    [3, 11, 9]

A faster solution is to first convert the column to a numpy array and then use numpy.where:

a = np.array(df1['A'].values.tolist())
print (a)
[[33 34 39]
 [ 3 43  9]]

df1['A'] = np.where(a > 9, 11, a).tolist()
print (df1)
                                A
2017-01-01 02:00:00  [11, 11, 11]
2017-01-01 03:00:00    [3, 11, 9]
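
For reference, here is a self-contained sketch of both approaches above, run on the two sample rows from the question. It assumes every list in the column has the same length (as the asker's edit suggests); otherwise np.array would not produce a regular 2-D array. The names via_apply and via_where are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the two sample rows from the question.
index = pd.to_datetime(["2017-01-01 02:00:00", "2017-01-01 03:00:00"])
df1 = pd.DataFrame({"A": [[33, 34, 39], [3, 43, 9]]}, index=index)

# Approach 1: list comprehension applied row by row.
via_apply = df1["A"].apply(lambda row: [v if v <= 9 else 11 for v in row])

# Approach 2: stack the lists into one 2-D array (assumes equal-length lists),
# then replace every value above 9 in a single vectorized call.
a = np.array(df1["A"].tolist())
via_where = pd.Series(np.where(a > 9, 11, a).tolist(), index=df1.index)

print(via_apply.tolist())   # [[11, 11, 11], [3, 11, 9]]
print(via_where.tolist())   # [[11, 11, 11], [3, 11, 9]]
```

Both produce the same lists; the second avoids a Python-level loop over every element, which is where the speedup on 20,000 rows of 2,000 values would be expected to come from.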

Answer 2

Answer by Edouard Cuny

Very simply: df[df > 9] = 11
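
A minimal sketch of this one-liner, assuming the cells hold plain numbers rather than lists. The frame and its column names a and b are hypothetical; the comparison does not reach inside list-valued cells like those in the question.

```python
import pandas as pd

# Hypothetical frame with scalar cells (not the question's list-valued column).
df = pd.DataFrame({"a": [33, 3, 9], "b": [4, 43, 10]})

# Boolean-mask assignment: every cell where the mask is True becomes 11.
df[df > 9] = 11

print(df)
```

After the assignment, column a holds 11, 3, 9 and column b holds 4, 11, 11.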

Answer 3

Answer by D.Griffiths

You can use numpy indexing, accessed through the .values attribute.

df['col'].values[df['col'].values > x] = y

where you are replacing any value greater than x with the value of y.

So for the example in the question:

df1['A'].values[df1['A'].values > 9] = 11
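
Writing through .values relies on the returned array being a writable view of the column, which may not hold in every pandas version or configuration. As a more conservative sketch of the same replacement, the example below swaps in label-based .loc assignment instead. The column name col and the values of x and y are hypothetical, and the column is assumed to hold scalars, not lists.

```python
import pandas as pd

# Hypothetical scalar-valued column.
df = pd.DataFrame({"col": [33, 3, 43, 9]})
x, y = 9, 11

# Select the rows where col exceeds x and overwrite just that column with y.
df.loc[df["col"] > x, "col"] = y

print(df["col"].tolist())  # [11, 3, 11, 9]
```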

Answer 4

Answer by CFW

I came here looking for a way to replace each element greater than h with 1 and every other element with 0, which has a simple solution:

df = (df > h) * 1

(This does not solve the OP's question as all df <= h are replaced by 0.)
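
A small sketch of this thresholding on a hypothetical scalar-valued frame, with h chosen arbitrarily as 9. Multiplying the boolean mask by 1 converts True/False to 1/0; casting with astype(int) is an equivalent spelling that some readers may find more explicit.

```python
import pandas as pd

# Hypothetical frame and threshold.
df = pd.DataFrame({"a": [1, 5, 12], "b": [9, 10, 0]})
h = 9

# Boolean mask times 1 gives 1 where the value exceeds h, else 0.
flags = (df > h) * 1

# Equivalent result via an explicit integer cast.
same_flags = (df > h).astype(int)

print(flags)
```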
