Original question: http://stackoverflow.com/questions/45606458/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Python Pandas - Highlighting maximum value in column
Asked by ScoutEU
I have a dataframe produced by this code:
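import numpy as np
import pandas as pd

# hm01 is the asker's raw data; its definition is not shown in the question
hmdf = pd.DataFrame(hm01)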
new_hm02 = hmdf[['FinancialYear','Month']]
new_hm01 = hmdf[['FinancialYear','Month','FirstReceivedDate']]
hm05 = new_hm01.pivot_table(index=['FinancialYear','Month'], aggfunc='count')
vals1 = ['April ', 'May ', 'June ', 'July ', 'August ', 'September', 'October ', 'November ', 'December ', 'January ', 'February ', 'March ']
df_hm = new_hm01.groupby(['Month', 'FinancialYear']).size().unstack(fill_value=0).rename(columns=lambda x: '{}'.format(x))
df_hml = df_hm.reindex(vals1)
And then I have a function to highlight the maximum value in each column:
def highlight_max(data, color='yellow'):
    '''
    highlight the maximum in a Series or DataFrame
    '''
    attr = 'background-color: {}'.format(color)
    if data.ndim == 1:  # Series from .apply(axis=0) or axis=1
        is_max = data == data.max()
        return [attr if v else '' for v in is_max]
    else:  # from .apply(axis=None)
        is_max = data == data.max().max()
        return pd.DataFrame(np.where(is_max, attr, ''),
                            index=data.index, columns=data.columns)
And then this code: dfPercent.style.apply(highlight_max)
produces this:
As you can see, only the first and last column have the correct max value highlighted.
Anyone know what is going wrong?
Thank you
Answered by jezrael
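The problem is that the values are strings, so max compares them character by character: a string starting with 9 counts as greater than one starting with 1 (for example, '9.7%' > '10.3%'). You need to convert the values to floats to get the correct max: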
def highlight_max(data, color='yellow'):
    '''
    highlight the maximum in a Series or DataFrame
    '''
    attr = 'background-color: {}'.format(color)
    # remove % and cast to float
    data = data.replace('%', '', regex=True).astype(float)
    if data.ndim == 1:  # Series from .apply(axis=0) or axis=1
        is_max = data == data.max()
        return [attr if v else '' for v in is_max]
    else:  # from .apply(axis=None)
        is_max = data == data.max().max()
        return pd.DataFrame(np.where(is_max, attr, ''),
                            index=data.index, columns=data.columns)
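To see why the cast matters, here is a minimal sketch comparing the string maximum with the numeric maximum. The Series `values` and its contents are illustrative only, not taken from the asker's data.

```python
import pandas as pd

# Illustrative percentage strings (hypothetical values).
values = pd.Series(['10.3%', '9.7%', '9.2%'])

# Strings are compared character by character, so '9' beats '1'
# and '9.7%' is treated as larger than '10.3%'.
string_max = max(values.tolist())

# Strip the trailing % sign and cast to float for a numeric comparison.
numeric = values.str.rstrip('%').astype(float)
numeric_max = numeric.max()

print(string_max)        # the "maximum" when comparing as text
print(numeric_max)       # the maximum when comparing as numbers
print(numeric.idxmax())  # row label of the numeric maximum
```

The string comparison picks the wrong row, which would explain highlighting that looks correct only in columns where the text order and the numeric order happen to agree.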
Sample:
dfPercent = pd.DataFrame({'2014/2015':['10.3%','9.7%','9.2%'],
                          '2015/2016':['4.8%','100.8%','9.7%']})
print (dfPercent)
  2014/2015 2015/2016
0     10.3%      4.8%
1      9.7%    100.8%
2      9.2%      9.7%
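An alternative, not part of the original answer, is to keep the data numeric and let the Styler add the % sign only for display. The sketch below assumes the built-in `Styler.highlight_max` and `Styler.format` are acceptable substitutes for the custom function. The helper names `to_numeric_percent` and `style_percent` are hypothetical.

```python
import pandas as pd

dfPercent = pd.DataFrame({'2014/2015': ['10.3%', '9.7%', '9.2%'],
                          '2015/2016': ['4.8%', '100.8%', '9.7%']})


def to_numeric_percent(df):
    """Strip the % sign from every cell and return a float DataFrame."""
    return df.replace('%', '', regex=True).astype(float)


def style_percent(df_numeric, color='yellow'):
    """Highlight each column's maximum and display values with a % sign.

    Swaps the custom highlight function for the built-in
    Styler.highlight_max; this is a substitution, not the answer's method.
    """
    return (df_numeric.style
            .format('{:.1f}%')
            .highlight_max(color=color, axis=0))


numeric = to_numeric_percent(dfPercent)

# Row label of the numeric maximum in each column.
print(numeric.idxmax())

# In a notebook, render the styled table with: style_percent(numeric)
```

Keeping the values as floats means later calculations do not need to strip the % sign again; the formatting lives only in the presentation layer.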