Python Pandas - Highlighting maximum value in column

Notice: This page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/45606458/

Time: 2020-09-14 04:13:39  Source: igfitidea

Tags: python, pandas

Asked by ScoutEU

I have a dataframe produced by this code:

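import pandas as pd

# hm01 is the raw source data (not shown in the question)
hmdf = pd.DataFrame(hm01)
new_hm02 = hmdf[['FinancialYear','Month']]
new_hm01 = hmdf[['FinancialYear','Month','FirstReceivedDate']]

# non-null FirstReceivedDate count per (FinancialYear, Month)
hm05 = new_hm01.pivot_table(index=['FinancialYear','Month'], aggfunc='count')
# months in financial-year order; the labels are padded to a fixed width
vals1 = ['April    ', 'May      ', 'June     ', 'July     ', 'August   ', 'September', 'October  ', 'November ', 'December ', 'January  ', 'February ', 'March    ']

# row counts per Month x FinancialYear, with the column labels converted to strings
df_hm = new_hm01.groupby(['Month', 'FinancialYear']).size().unstack(fill_value=0).rename(columns=lambda x: '{}'.format(x))
df_hml = df_hm.reindex(vals1)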

And then I have a function to highlight the maximum value in each column:

import numpy as np
import pandas as pd

def highlight_max(data, color='yellow'):
    '''
    highlight the maximum in a Series or DataFrame
    '''
    attr = 'background-color: {}'.format(color)
    if data.ndim == 1:  # Series from .apply(axis=0) or axis=1
        is_max = data == data.max()
        return [attr if v else '' for v in is_max]
    else:  # from .apply(axis=None)
        is_max = data == data.max().max()
        return pd.DataFrame(np.where(is_max, attr, ''),
                            index=data.index, columns=data.columns)

And then this code: dfPercent.style.apply(highlight_max) produces this:

[Image: screenshot of the styled output]

As you can see, only the first and last column have the correct max value highlighted.

Does anyone know what is going wrong?

Thank you

Answered by jezrael

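The problem is that the values are strings, so max compares them as text rather than as numbers: a string starting with 9 sorts above one starting with 1, so '9.7%' beats '10.3%'. You need to strip the % and convert the values to floats to get the correct max: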

import numpy as np
import pandas as pd

def highlight_max(data, color='yellow'):
    '''
    highlight the maximum in a Series or DataFrame
    '''
    attr = 'background-color: {}'.format(color)
    # remove % and cast to float so the comparison is numeric, not string-based
    data = data.replace('%', '', regex=True).astype(float)
    if data.ndim == 1:  # Series from .apply(axis=0) or axis=1
        is_max = data == data.max()
        return [attr if v else '' for v in is_max]
    else:  # DataFrame from .apply(axis=None)
        is_max = data == data.max().max()
        return pd.DataFrame(np.where(is_max, attr, ''),
                            index=data.index, columns=data.columns)
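
If some cells might be empty or hold something that is not a percentage, astype(float) will raise an error. Here is a minimal sketch of a more forgiving conversion, assuming you are happy for such cells to become NaN. percent_to_float and the raw Series are hypothetical names and data for illustration, not part of pandas or of the question:

```python
import pandas as pd

def percent_to_float(series):
    """Hypothetical helper: '10.3%' -> 10.3, anything unparseable -> NaN."""
    # Work on string representations so None and stray numbers do not break .str
    stripped = series.astype(str).str.rstrip('%')
    # errors='coerce' turns values that cannot be parsed into NaN instead of raising
    return pd.to_numeric(stripped, errors='coerce')

# Made-up column with one malformed entry and one missing entry
raw = pd.Series(['10.3%', '9.7%', 'n/a', None])
values = percent_to_float(raw)

# Same comparison as in highlight_max; Series.max() skips NaN by default
is_max = values == values.max()
print(values)
print(is_max)
```

NaN never compares equal to the maximum, so those cells would simply be left unstyled.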

Sample:

dfPercent = pd.DataFrame({'2014/2015':['10.3%','9.7%','9.2%'],
                   '2015/2016':['4.8%','100.8%','9.7%']})
print(dfPercent)
  2014/2015 2015/2016
0     10.3%      4.8%
1      9.7%    100.8%
2      9.2%      9.7%
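
To see the difference on this sample, you can compare the maximum of each column as strings with the maximum after conversion. This is only a sketch for checking the idea; string_max, float_max, numeric and is_max are names made up here:

```python
import pandas as pd

dfPercent = pd.DataFrame({'2014/2015': ['10.3%', '9.7%', '9.2%'],
                          '2015/2016': ['4.8%', '100.8%', '9.7%']})

# As strings, values are compared character by character,
# so anything starting with '9' beats anything starting with '1'.
string_max = {col: dfPercent[col].max() for col in dfPercent.columns}

# After stripping '%' and casting to float the comparison is numeric.
numeric = dfPercent.replace('%', '', regex=True).astype(float)
float_max = {col: numeric[col].max() for col in numeric.columns}

# Boolean mask of the cells that are the maximum of their own column
is_max = numeric.eq(numeric.max())

print(string_max)
print(float_max)
print(is_max)
```

As strings, both columns report '9.7%' as their maximum. As floats, the maxima are 10.3 and 100.8, which are the cells the fixed function highlights.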

[Image: Jupyter screenshot of the result]
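
As an alternative to a custom function, you could keep the numbers as floats and leave the % sign to the display layer. This is a sketch that uses the built-in Styler.format and Styler.highlight_max instead of the function above. style_percent and df_num are names made up for the example, and the one-decimal format is an assumption based on the sample data:

```python
import pandas as pd

dfPercent = pd.DataFrame({'2014/2015': ['10.3%', '9.7%', '9.2%'],
                          '2015/2016': ['4.8%', '100.8%', '9.7%']})

# Store real numbers; the '%' sign becomes a display concern only.
df_num = dfPercent.replace('%', '', regex=True).astype(float)

def style_percent(df, color='yellow'):
    """Hypothetical helper: show values as percentages and highlight each column's max.

    Note: DataFrame.style needs the optional jinja2 dependency.
    """
    return df.style.format('{:.1f}%').highlight_max(axis=0, color=color)

# In a Jupyter cell, evaluating style_percent(df_num) renders the styled table.
print(df_num.idxmax())
```

Because the comparison runs on floats, no string parsing is needed at styling time.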