Python: How to replace negative numbers in a Pandas DataFrame with zero

Question

Asked by Hangon

I would like to know if there is some way of replacing all negative numbers in a DataFrame with zeros?

Answer 1

Accepted answer by Lev Levitsky

If all your columns are numeric, you can use boolean indexing:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1]})

In [3]: df
Out[3]: 
   a  b
0  0 -3
1 -1  2
2  2  1

In [4]: df[df < 0] = 0

In [5]: df
Out[5]: 
   a  b
0  0  0
1  0  2
2  2  1

For the more general case, this answer shows the private method _get_numeric_data:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1],
                           'c': ['foo', 'goo', 'bar']})

In [3]: df
Out[3]: 
   a  b    c
0  0 -3  foo
1 -1  2  goo
2  2  1  bar

In [4]: num = df._get_numeric_data()

In [5]: num[num < 0] = 0

In [6]: df
Out[6]: 
   a  b    c
0  0  0  foo
1  0  2  goo
2  2  1  bar
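Because _get_numeric_data is private, it may change between pandas versions. As a hedged alternative, the sketch below uses the public select_dtypes method to pick the numeric columns and then clips them at zero. The variable name num_cols is only illustrative, and the example assumes that unique column names and the 'number' dtype selector cover the columns you care about.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1],
                   'c': ['foo', 'goo', 'bar']})

# Public API: names of the columns with a numeric dtype (ints, floats, ...).
num_cols = df.select_dtypes(include='number').columns

# Replace negatives with 0 only in those columns; 'c' is left untouched.
df[num_cols] = df[num_cols].clip(lower=0)

print(df)
```

Unlike the in-place boolean assignment above, this assigns the clipped columns back explicitly, so it does not rely on num being a view of df.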

With the timedelta type, boolean indexing seems to work on separate columns, but not on the whole DataFrame. So you can do:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], 'd'),
   ...:                    'b': pd.to_timedelta([-3, 2, 1], 'd')})

In [3]: df
Out[3]: 
        a       b
0  0 days -3 days
1 -1 days  2 days
2  2 days  1 days

In [4]: for k in df.columns:
   ...:     df.loc[df[k] < pd.Timedelta(0), k] = pd.Timedelta(0)
   ...:     

In [5]: df
Out[5]: 
       a      b
0 0 days 0 days
1 0 days 2 days
2 2 days 1 days

Update: comparison with a pd.Timedelta works on the whole DataFrame:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], 'd'),
   ...:                    'b': pd.to_timedelta([-3, 2, 1], 'd')})

In [3]: df[df < pd.Timedelta(0)] = pd.Timedelta(0)

In [4]: df
Out[4]: 
       a      b
0 0 days 0 days
1 0 days 2 days
2 2 days 1 days

Answer 2

Answer by aus_lacy

Perhaps you could use pandas.DataFrame.where like so:

# where() keeps values where the condition is True and replaces the rest
data_frame = data_frame.where(data_frame >= 0, 0)
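Note that DataFrame.where keeps the values for which the condition is True and replaces the others, so the condition has to describe the values to keep (the non-negative ones). The sketch below, on a small made-up frame, also shows how the choice of condition affects NaN: comparisons with NaN are always False, so `df >= 0` zeroes NaN as well, while `~(df < 0)` leaves it alone.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [0.0, -1.0, np.nan], 'b': [-3.0, 2.0, 1.0]})

# Keep everything that is NOT negative; NaN is not negative, so it survives.
kept_nan = df.where(~(df < 0), 0)

# Keep only values that compare >= 0; NaN fails the test and becomes 0 too.
zeroed_nan = df.where(df >= 0, 0)

print(kept_nan)
print(zeroed_nan)
```

Which variant is appropriate depends on whether missing values should stay missing.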

Answer 3

Answer by follyroof

Another succinct way of doing this is pandas.DataFrame.clip.

For example:

import pandas as pd

In [20]: df = pd.DataFrame({'a': [-1, 100, -2]})

In [21]: df
Out[21]: 
     a
0   -1
1  100
2   -2

In [22]: df.clip(lower=0)
Out[22]: 
     a
0    0
1  100
2    0

Older pandas versions also had df.clip_lower(0), but it was deprecated in pandas 0.24 and removed in 1.0; use df.clip(lower=0) instead.

Answer 4

Answer by MarKo9

If you are dealing with a large df (40m x 700 in my case), iterating over the columns is much faster and uses less memory, with something like:

for col in df.columns:
    # .loc avoids chained assignment, which may silently fail to modify df
    df.loc[df[col] < 0, col] = 0
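The column-wise loop can be wrapped in a small helper that also skips non-numeric columns. This is only a sketch: the function name zero_negatives_by_column is hypothetical, it assumes unique column names, and whether it is actually faster than a whole-frame operation depends on the data and the pandas version.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def zero_negatives_by_column(df):
    """Set negative values to 0 in place, one numeric column at a time."""
    for col in df.columns:
        # Skip strings, datetimes, etc., where `< 0` is not meaningful.
        if is_numeric_dtype(df[col]):
            # Only one column-sized boolean mask exists at any time.
            df.loc[df[col] < 0, col] = 0
    return df


df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3.5, 2.0, 1.0],
                   'c': ['foo', 'goo', 'bar']})
zero_negatives_by_column(df)
print(df)
```

Working one column at a time avoids building a boolean mask the size of the whole frame, which is where the memory saving would come from.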

Answer 5

Answer by Michael Conlin

Another clean option that I have found useful is pandas.DataFrame.mask, which will "replace values where the condition is true."

Create the DataFrame:

In [2]: import pandas as pd

In [3]: df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1]})

In [4]: df
Out[4]: 
   a  b
0  0 -3
1 -1  2
2  2  1

Replace negative numbers with 0:

In [5]: df.mask(df < 0, 0)
Out[5]: 
   a  b
0  0  0
1  0  2
2  2  1

Or, replace negative numbers with NaN, which I frequently need:

In [7]: df.mask(df < 0)
Out[7]: 
     a    b
0  0.0  NaN
1  NaN  2.0
2  2.0  1.0
