Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/38810395/

pandas dataframe sort on column raises keyerror on index

python pandas syntax

Asked by gvoysey

I have the following dataframe, df:

   peaklatency        snr
0        52.99        0.0
1        54.15  62.000000
2        54.12  82.000000
3        54.64  52.000000
4        54.57  42.000000
5        54.13  72.000000

I'm attempting to sort this by snr:

df.sort_values(df.snr)

but this raises

_convert_to_indexer(self, obj, axis, is_setter)
   1208                 mask = check == -1
   1209                 if mask.any():
-> 1210                     raise KeyError('%s not in index' % objarr[mask])
   1211 
   1212                 return _values_from_object(indexer)

KeyError: '[ inf  62.  82.  52.  42.  72.] not in index'

I am not explicitly setting an index on this DataFrame; it is built from a list of dicts collected in a loop:

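    import pandas as pd

    # `runs` is the asker's own collection of results (not shown in the question).
    d = []
    for run in runs:
        # One dict per run; the dict keys become the DataFrame columns.
        d.append({
            'snr': run.periphery.snr.snr,
            'peaklatency': (run.brainstem.wave5.wave5.argmax() / 100e3) * 1e3
        })
    df = pd.DataFrame(d)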
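
For reference, a DataFrame built from a list of dicts gets a default integer index when none is passed, so the missing explicit index is not the problem here. A minimal sketch, assuming made-up values in place of the `runs` objects (which the question does not show); the name `records` is illustrative only:

```python
import pandas as pd

# Hypothetical stand-in values; the real ones come from the asker's `runs`.
records = [
    {"snr": snr, "peaklatency": latency}
    for snr, latency in [(0.0, 52.99), (62.0, 54.15), (82.0, 54.12)]
]
df = pd.DataFrame(records)

# With no index argument, pandas assigns a default RangeIndex (0, 1, 2, ...).
print(df.index)
print(df)
```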

Answered by chrisb

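The by keyword to sort_values expects column names, not the actual Series itself. When given df.snr, pandas treats the values of the Series as column labels to look up, which is why the KeyError lists the snr values. So, you'd want: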

In [23]: df.sort_values('snr')
Out[23]: 
   peaklatency   snr
0        52.99   0.0
4        54.57  42.0
3        54.64  52.0
1        54.15  62.0
5        54.13  72.0
2        54.12  82.0
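
A self-contained sketch of the same fix, rebuilding the frame from the values printed in the question (the variable names `by_snr`, `by_snr_desc`, `by_both`, and `renumbered` are illustrative, not from the original post). It also shows a few common variations on passing column names to `by`:

```python
import pandas as pd

# Values copied from the table in the question.
df = pd.DataFrame({
    "peaklatency": [52.99, 54.15, 54.12, 54.64, 54.57, 54.13],
    "snr": [0.0, 62.0, 82.0, 52.0, 42.0, 72.0],
})

# Pass the column *name*, not the Series df.snr.
by_snr = df.sort_values("snr")

# Descending order, using the keyword form.
by_snr_desc = df.sort_values(by="snr", ascending=False)

# Several column names can be given as a list; later ones break ties.
by_both = df.sort_values(["snr", "peaklatency"])

# sort_values keeps the original row labels; reset them if a fresh 0..n-1 index is wanted.
renumbered = by_snr.reset_index(drop=True)

print(by_snr)
```

Note that `sort_values` returns a new DataFrame and leaves `df` unchanged unless the result is assigned back.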