Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/28093365/

Time: 2020-09-13 22:51:53  Source: igfitidea

Is a column in pandas.DF() monotonically increasing?

Tags: python, pandas, data-analysis

Asked by amehta

I can check whether the index of a pandas.DataFrame() is monotonically increasing by using the is_monotonic method. However, I would like to check whether the values (float/integer) in one of the columns are strictly increasing.

In [13]: my_df = pd.DataFrame([1,2,3,5,7,6,9])

In [14]: my_df
Out[14]: 
   0
0  1
1  2
2  3
3  5
4  7
5  6
6  9

In [15]: my_df.index.is_monotonic
Out[15]: True

Answer by OmerB

Pandas 0.19 added a public Series.is_monotonic API (previously, this was available only in the undocumented algos module).

(Updated) Note that despite its name, Series.is_monotonic only indicates whether a series is monotonically increasing (equivalent to using Series.is_monotonic_increasing). For the other direction, use Series.is_monotonic_decreasing. Both are non-strict, but you can combine them with is_unique to get strictness.

e.g.:

my_df = pd.DataFrame([1,2,2,3], columns = ['A'])

my_df['A'].is_monotonic    # non-strict
Out[1]: True

my_df['A'].is_monotonic_increasing    # equivalent to is_monotonic
Out[2]: True

(my_df['A'].is_monotonic_increasing and my_df['A'].is_unique)    # strict  
Out[3]: False

my_df['A'].is_monotonic_decreasing    # Other direction (also non-strict)
Out[4]: False

You can use apply to run this at the DataFrame level:

my_df = pd.DataFrame({'A':[1,2,3],'B':[1,1,1],'C':[3,2,1]})
my_df
Out[32]: 
   A  B  C
0  1  1  3
1  2  1  2
2  3  1  1

my_df.apply(lambda x: x.is_monotonic)
Out[33]: 
A     True
B     True
C    False
dtype: bool
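
Combining the two ideas above (is_unique for strictness, apply for the whole DataFrame) might look like the following. This is only a sketch: the helper name strictly_increasing_columns is made up for illustration, and it uses is_monotonic_increasing rather than is_monotonic because the latter has since been deprecated and removed in recent pandas releases.

```python
import pandas as pd


def strictly_increasing_columns(df: pd.DataFrame) -> pd.Series:
    """Return one boolean per column: True if the column is strictly increasing.

    Hypothetical helper name, not part of pandas.
    """
    # Non-strict monotonicity plus uniqueness rules out repeated values.
    return df.apply(lambda col: col.is_monotonic_increasing and col.is_unique)


my_df = pd.DataFrame({'A': [1, 2, 3], 'B': [1, 1, 1], 'C': [3, 2, 1]})
result = strictly_increasing_columns(my_df)
print(result)
```

Here only column A should come out True: B is non-decreasing but has repeated values, and C is decreasing.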

Answer by Dr. Jan-Philip Gehrcke

Probably the best way is to obtain a dataframe column as a numpy array without copying data around (using the .values property after selecting the column via indexing), and then to use a numpy-based test for checking monotonicity:

import numpy as np

def monotonic(x):
    # True if every consecutive difference is positive (strictly increasing)
    return np.all(np.diff(x) > 0)

monotonic(df[0].values)

A pure Python implementation, borrowed from here: Python - How to check list monotonicity

def strictly_increasing(L):
    return all(x<y for x, y in zip(L, L[1:]))
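
To make the strict vs. non-strict distinction concrete, here is a small sketch of the numpy approach applied to the data from the question. The function names is_strictly_increasing and is_non_decreasing are my own, chosen for illustration:

```python
import numpy as np
import pandas as pd


def is_strictly_increasing(values) -> bool:
    # Every consecutive difference must be positive.
    return bool(np.all(np.diff(values) > 0))


def is_non_decreasing(values) -> bool:
    # Consecutive differences may be zero, i.e. repeats are allowed.
    return bool(np.all(np.diff(values) >= 0))


my_df = pd.DataFrame([1, 2, 3, 5, 7, 6, 9])
col = my_df[0].values              # column 0 as a numpy array

whole_column = is_strictly_increasing(col)      # 7 -> 6 breaks the order
first_five = is_strictly_increasing(col[:5])    # 1, 2, 3, 5, 7
with_repeat_strict = is_strictly_increasing(np.array([1, 2, 2, 3]))
with_repeat_loose = is_non_decreasing(np.array([1, 2, 2, 3]))
print(whole_column, first_five, with_repeat_strict, with_repeat_loose)
```

Switching between `>` and `>=` is all it takes to move between the strict and non-strict check.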

Answer by birone

If two indices are equal, they won't be unique. So you can just use:

my_df.index.is_monotonic and my_df.index.is_unique

These attributes are documented in version 0.15.2; is_unique is only sketchily mentioned in 0.14.1, but it worked for me. See:

http://pandas.pydata.org/pandas-docs/version/0.15.2/api.html#index
http://pandas.pydata.org/pandas-docs/version/0.14.1/generated/pandas.Index.html
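
A minimal sketch of the index check, assuming a reasonably recent pandas (it uses is_monotonic_increasing, which is equivalent to the older is_monotonic spelling). The helper name index_strictly_increasing and the two sample frames are invented for illustration:

```python
import pandas as pd


def index_strictly_increasing(df: pd.DataFrame) -> bool:
    # Non-decreasing index with no duplicate labels means strictly increasing.
    return bool(df.index.is_monotonic_increasing and df.index.is_unique)


unique_sorted = pd.DataFrame({'x': [10, 20, 30]}, index=[1, 2, 3])
with_duplicate = pd.DataFrame({'x': [10, 20, 30]}, index=[1, 2, 2])

print(index_strictly_increasing(unique_sorted))
print(index_strictly_increasing(with_duplicate))
```

The second frame has a non-decreasing index, but the repeated label 2 makes the strict check fail.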

Answer by acushner

You can just do the math on this one:

diff = df[0] - df[0].shift(1)
is_monotonic = (diff < 0).sum() == 0 or (diff > 0).sum() == 0

All you're checking here is that the differences are either all >= 0 or all <= 0.

Edit: since you only want strictly increasing, it's just:

is_monotonic = (diff <= 0).sum() == 0
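
As a runnable sketch of the same idea on the data from the question: Series.diff() computes the same thing as subtracting the shifted series, and the first element is NaN because it has no predecessor. The variable names below are my own, and the "where does it break" lookup is an extra that the shift-based approach makes easy:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 5, 7, 6, 9])

# Equivalent to s - s.shift(1); the first entry is NaN.
diff = s.diff()

# NaN compares False with <=, so the first entry never counts as a violation.
strictly_increasing = bool((diff <= 0).sum() == 0)

# Index labels where the value failed to increase over its predecessor.
violations = diff[diff <= 0].index.tolist()

print(strictly_increasing, violations)
```

For this series the check fails because of the step from 7 down to 6 at index 5.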

Answer by John

I understand that by strictly increasing you mean that the values are integers and that neighbors are separated by exactly 1? As discussed here, this is a simple method for checking that criterion:

def is_coherent(seq):
    # list(...) is needed on Python 3, where a list never compares equal to a range object
    return seq == list(range(seq[0], seq[-1]+1))

Using it with the first column of your my_df might look like so:

is_coherent(my_df[0].tolist())
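
Note that "consecutive integers" is a stronger condition than "strictly increasing". A small pure-Python sketch of the difference, with function names of my own choosing:

```python
def is_strictly_increasing(seq):
    # Each element must be larger than the one before it.
    return all(x < y for x, y in zip(seq, seq[1:]))


def is_consecutive(seq):
    # Each element must be exactly one more than the one before it.
    return all(y - x == 1 for x, y in zip(seq, seq[1:]))


values = [1, 2, 3, 5]
print(is_strictly_increasing(values), is_consecutive(values))
```

The list [1, 2, 3, 5] passes the first test but not the second, because of the jump from 3 to 5.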