Normalizing a list of numbers in Python
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/26785354/
Asked by Adam_G
I need to normalize a list of values to fit in a probability distribution, i.e. between 0.0 and 1.0.
I understand how to normalize, but was curious if Python had a function to automate this.
I'd like to go from:
raw = [0.07, 0.14, 0.07]
to
normed = [0.25, 0.50, 0.25]
Accepted answer by Tony Suffolk 66
Use:
norm = [float(i)/sum(raw) for i in raw]
to normalize against the sum, which ensures that the result always adds up to 1.0 (or as close to it as floating point allows).
Use
norm = [float(i)/max(raw) for i in raw]
to normalize against the maximum.
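A minimal sketch of the two variants side by side, assuming Python 3 and the raw list from the question. The names total, peak, by_sum and by_max are illustrative and not part of the original answer; the sum and maximum are computed once up front rather than inside the comprehension.

```python
raw = [0.07, 0.14, 0.07]

total = sum(raw)  # computed once, outside the comprehension
peak = max(raw)

by_sum = [x / total for x in raw]  # entries add up to 1.0 (within float error)
by_max = [x / peak for x in raw]   # the largest entry becomes exactly 1.0

print(by_sum)
print(by_max)
```

Only the sum-based version yields something usable as a probability distribution; dividing by the maximum merely caps the values at 1.0.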
Answer by Anzel
Try:
>>> normed = [i/sum(raw) for i in raw]
>>> normed
[0.25, 0.5, 0.25]
Answer by wnnmaw
There isn't any function in the standard library (to my knowledge) that will do it, but there are certainly third-party modules that have such functions. However, it's easy enough that you can just write your own function:
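def normalize(lst):
    s = sum(lst)
    # a list comprehension returns a list on both Python 2 and Python 3
    # (map() would return a lazy iterator on Python 3)
    return [float(x)/s for x in lst]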
Sample output:
>>> normed = normalize(raw)
>>> normed
[0.25, 0.5, 0.25]
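A function like the one above divides by the sum, so it fails with ZeroDivisionError when the values add up to zero. The following is a sketch of one way to guard against that, assuming Python 3. The name normalize_to_sum and the choice to raise ValueError are assumptions made for illustration, not part of the original answer.

```python
def normalize_to_sum(values):
    """Scale values so that they add up to 1.0 (hypothetical helper)."""
    values = [float(v) for v in values]
    total = sum(values)
    if total == 0:
        # Assumption: an all-zero (or cancelling) input is treated as an error
        raise ValueError("cannot normalize: values sum to zero")
    return [v / total for v in values]

normed = normalize_to_sum([0.07, 0.14, 0.07])
print(normed)
```

Whether to raise, return zeros, or return a uniform distribution in the zero-sum case depends on the application; raising is just the most conservative option.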
Answer by gboffi
How long is the list you're going to normalize?
def psum(it):
    "This function makes explicit how many calls to sum() are done."
    print("Another call!")
    return sum(it)
raw = [0.07,0.14,0.07]
print("How many calls to sum()?")
print([ r/psum(raw) for r in raw])
print("\nAnd now?")
s = psum(raw)
print([ r/s for r in raw])
# if one doesn't want auxiliary variables, it can be done inside
# a list comprehension, but in my opinion it's quite Baroque
print("\nAnd now?")
print([ r/s for s in [psum(raw)] for r in raw])
Output
# How many calls to sum()?
# Another call!
# Another call!
# Another call!
# [0.25, 0.5, 0.25]
#
# And now?
# Another call!
# [0.25, 0.5, 0.25]
#
# And now?
# Another call!
# [0.25, 0.5, 0.25]
Answer by Nurul Akter Towhid
Try this:
from __future__ import division
raw = [0.07, 0.14, 0.07]
def norm(input_list):
    norm_list = list()
    if isinstance(input_list, list):
        sum_list = sum(input_list)
        for value in input_list:
            tmp = value / sum_list
            norm_list.append(tmp)
    return norm_list
print(norm(raw))
This will do what you asked. But I would suggest trying min-max normalization.
Min-max normalization:
def min_max_norm(dataset):
    norm_list = list()
    if isinstance(dataset, list):
        min_value = min(dataset)
        max_value = max(dataset)
        # note: this divides by zero if every value in dataset is equal
        for value in dataset:
            tmp = (value - min_value) / (max_value - min_value)
            norm_list.append(tmp)
    return norm_list
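Note that min-max normalization answers a slightly different question: it maps the smallest value to 0.0 and the largest to 1.0, so the result generally does not add up to 1.0. A small sketch, assuming Python 3 (true division), follows. The name min_max_scale, the sample data, and the decision to return zeros for a constant list are illustrative assumptions.

```python
def min_max_scale(values):
    """Rescale linearly so min -> 0.0 and max -> 1.0 (hypothetical helper)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption: map a constant list to all zeros instead of dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]

scaled = min_max_scale([2, 4, 6, 10])
print(scaled)       # [0.0, 0.25, 0.5, 1.0]
print(sum(scaled))  # 1.75, so this is not a probability distribution
```

If the goal is a probability distribution as in the question, dividing by the sum is the appropriate choice; min-max scaling is better suited to putting features on a common 0-to-1 range.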
Answer by blaylockbk
If your list has negative numbers, this is how you would normalize it:
a = range(-30,31,5)
norm = [(float(i)-min(a))/(max(a)-min(a)) for i in a]
Answer by Tengerye
If you are willing to use numpy, you can get a faster solution.
import random, time
import numpy as np
a = random.sample(range(1, 20000), 10000)
since = time.time(); b = [i/sum(a) for i in a]; print(time.time()-since)
# 0.7956490516662598
since = time.time(); c=np.array(a);d=c/sum(a); print(time.time()-since)
# 0.001413106918334961
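Part of the gap measured above comes from the list comprehension calling sum(a) once per element, while the numpy line calls it only once. The following standard-library sketch, assuming Python 3, isolates that effect. The function names and the smaller list size are illustrative choices, and timings will vary by machine, so no numbers are shown.

```python
import random
import timeit

a = random.sample(range(1, 20000), 2000)

def repeated_sum():
    # sum(a) is re-evaluated for every element: O(n^2) work overall
    return [i / sum(a) for i in a]

def hoisted_sum():
    # sum(a) is evaluated once: O(n) work overall
    total = sum(a)
    return [i / total for i in a]

print(timeit.timeit(repeated_sum, number=5))
print(timeit.timeit(hoisted_sum, number=5))
```

Both functions return the same values; only the amount of work differs. A fair comparison against numpy would use the hoisted version as the pure-Python baseline.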
Answer by vespertine venus
If you are working with data, pandas is often the simplest option.
This particular code puts raw into a single column and then divides each entry by the sum of its column. (We could also put the data into a single row and normalize across it instead. That only requires changing the axis values, where 0 refers to rows and 1 refers to columns.)
import pandas as pd
raw = [0.07, 0.14, 0.07]
raw_df = pd.DataFrame(raw)
normed_df = raw_df.div(raw_df.sum(axis=0), axis=1)
normed_df
where normed_df will display like:
      0
0  0.25
1  0.50
2  0.25
and then you can keep working with the data, too!

