Note: this page is a translated copy of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me), citing the original URL and author information. Original StackOverflow question: http://stackoverflow.com/questions/26785354/

Date: 2020-08-19 01:00:56  Source: igfitidea

Normalizing a list of numbers in Python

Tags: python, probability

Asked by Adam_G

I need to normalize a list of values to fit in a probability distribution, i.e. between 0.0 and 1.0.

I understand how to normalize, but was curious whether Python has a function to automate this.

I'd like to go from:

raw = [0.07, 0.14, 0.07]  

to

normed = [0.25, 0.50, 0.25]

Accepted answer by Tony Suffolk 66

Use:

norm = [float(i)/sum(raw) for i in raw]

to normalize against the sum, which ensures that the result always sums to 1.0 (or as close to it as possible).

Use

norm = [float(i)/max(raw) for i in raw]

to normalize against the maximum.
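
As a minimal sketch of the two variants above, the divisor can be computed once and reused. The helper names `normalize_by_sum` and `normalize_by_max` are illustrative and not part of the original answer, and both assume a non-zero divisor:

```python
def normalize_by_sum(values):
    """Scale values so that they add up to 1.0 (assumes a non-zero sum)."""
    total = float(sum(values))  # computed once, not once per element
    return [v / total for v in values]


def normalize_by_max(values):
    """Scale values so that the largest one becomes 1.0 (assumes a non-zero maximum)."""
    largest = float(max(values))
    return [v / largest for v in values]


raw = [0.07, 0.14, 0.07]
by_sum = normalize_by_sum(raw)
by_max = normalize_by_max(raw)
print(by_sum)
print(by_max)
```

Only the sum-based version yields something usable as a probability distribution. The max-based version maps non-negative input into the range 0.0 to 1.0, but the results generally do not add up to 1.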

Answer by Anzel

Try:

>>> normed = [i/sum(raw) for i in raw]
>>> normed
[0.25, 0.5, 0.25]

Answer by wnnmaw

There isn't any function in the standard library (to my knowledge) that will do it, but there are certainly modules out there that have such functions. However, it's easy enough that you can just write your own function:

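def normalize(lst):
    # Compute the sum once, then divide every element by it.
    s = sum(lst)
    # A list comprehension returns a list on Python 2 and 3 alike (map() is lazy on Python 3).
    return [float(x) / s for x in lst]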

Sample output:

>>> normed = normalize(raw)
>>> normed
[0.25, 0.5, 0.25]
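
One edge case a hand-written helper may want to handle is input whose sum is zero, which would otherwise raise a ZeroDivisionError. The following is only a sketch of one way to deal with it. The name `normalize_checked` and the choice to raise ValueError are assumptions, not part of the original answer:

```python
def normalize_checked(values):
    """Divide each value by the total, refusing input that cannot be normalized."""
    values = list(values)  # accept any iterable, not just lists
    total = float(sum(values))
    if total == 0.0:
        # Covers both an empty input and values that cancel out to zero.
        raise ValueError("cannot normalize: values sum to zero")
    return [v / total for v in values]


print(normalize_checked([0.07, 0.14, 0.07]))

try:
    normalize_checked([])
except ValueError as exc:
    print("rejected:", exc)
```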

Answer by gboffi

How long is the list you're going to normalize? Calling sum() inside the list comprehension recomputes it for every element:

def psum(it):
    "This function makes explicit how many calls to sum() are done."
    print("Another call!")
    return sum(it)

raw = [0.07, 0.14, 0.07]
print("How many calls to sum()?")
print([r/psum(raw) for r in raw])

print("\nAnd now?")
s = psum(raw)
print([r/s for r in raw])

# if one doesn't want auxiliary variables, it can be done inside
# a list comprehension, but in my opinion it's quite Baroque
print("\nAnd now?")
print([r/s for s in [psum(raw)] for r in raw])

Output:

# How many calls to sum()?
# Another call!
# Another call!
# Another call!
# [0.25, 0.5, 0.25]
# 
# And now?
# Another call!
# [0.25, 0.5, 0.25]
# 
# And now?
# Another call!
# [0.25, 0.5, 0.25]
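
The same point can be checked programmatically rather than by reading printed lines. This is a small sketch in which `counted_sum` and the `calls` counter are hypothetical names introduced only for illustration:

```python
calls = {"n": 0}  # mutable counter shared with the helper below


def counted_sum(values):
    """Like sum(), but records how many times it has been called."""
    calls["n"] += 1
    return sum(values)


raw = [0.07, 0.14, 0.07]

calls["n"] = 0
inline = [r / counted_sum(raw) for r in raw]
calls_inline = calls["n"]  # one call per element

calls["n"] = 0
total = counted_sum(raw)
hoisted = [r / total for r in raw]
calls_hoisted = calls["n"]  # a single call

print(calls_inline, calls_hoisted)
```

Because sum() itself walks the whole list, the inline form does work proportional to the square of the list length, which is why the length of the list matters.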

Answer by Nurul Akter Towhid

Try this:

from __future__ import division

raw = [0.07, 0.14, 0.07]

def norm(input_list):
    norm_list = list()

    # Anything that is not a list is ignored and yields an empty result.
    if isinstance(input_list, list):
        sum_list = sum(input_list)

        for value in input_list:
            tmp = value / sum_list
            norm_list.append(tmp)

    return norm_list

print(norm(raw))

This does what you asked, but I would suggest trying min-max normalization.

Min-max normalization:

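def min_max_norm(dataset):
    # Initialize up front so that non-list input returns [] instead of raising NameError.
    norm_list = list()

    if isinstance(dataset, list):
        min_value = min(dataset)
        max_value = max(dataset)

        for value in dataset:
            # Assumes max_value != min_value; otherwise this divides by zero.
            tmp = (value - min_value) / (max_value - min_value)
            norm_list.append(tmp)

    return norm_list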
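
Min-max scaling breaks down when every value is the same, because the denominator becomes zero. Below is a hedged sketch of one way to handle that case. The name `min_max_scaled` and the convention of returning zeros are assumptions made for illustration, not something the answer prescribes:

```python
def min_max_scaled(values):
    """Map values linearly so the smallest becomes 0.0 and the largest 1.0."""
    values = list(values)  # accept any iterable, e.g. a range
    lo, hi = min(values), max(values)
    span = float(hi - lo)
    if span == 0.0:
        # All values are equal, so the usual formula would divide by zero.
        # Returning zeros is one possible convention (an assumption here).
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


print(min_max_scaled([0.07, 0.14, 0.07]))
print(min_max_scaled([5, 5, 5]))
```

Note that min-max scaling does not, in general, make the values add up to 1, so on its own it does not produce a probability distribution.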

Answer by blaylockbk

If your list has negative numbers, this is how you would normalize it:

a = range(-30,31,5)
norm = [(float(i)-min(a))/(max(a)-min(a)) for i in a]

Answer by Tengerye

If you are willing to use numpy, you can get a faster solution.

import random, time
import numpy as np

a = random.sample(range(1, 20000), 10000)
since = time.time(); b = [i/sum(a) for i in a]; print(time.time()-since)
# 0.7956490516662598

since = time.time(); c=np.array(a);d=c/sum(a); print(time.time()-since)
# 0.001413106918334961
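
Part of the gap in the timings above comes from the list comprehension calling sum(a) once per element. Separately from that, the whole computation can be kept inside numpy. A minimal sketch, assuming numpy is installed:

```python
import numpy as np

raw = [0.07, 0.14, 0.07]

arr = np.asarray(raw, dtype=float)  # convert the list once
normed = arr / arr.sum()            # vectorized division by the array's own sum

print(normed)
```

Here `arr.sum()` uses numpy's own reduction instead of Python's built-in sum() over the original list.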

Answer by vespertine venus

When working with data, pandas is often the simplest answer.

This particular code puts raw into a single column, then divides each row's value by the column sum. (We could also put it into a single row and normalize across the row instead; we would just have to swap the axis values, where 0 refers to rows and 1 to columns.)

import pandas as pd


raw = [0.07, 0.14, 0.07]  

raw_df = pd.DataFrame(raw)
normed_df = raw_df.div(raw_df.sum(axis=0), axis=1)
normed_df

where normed_df will display as:

    0
0   0.25
1   0.50
2   0.25

and then you can keep working with the data, too!
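
The row-wise variant mentioned in this answer might look like the following sketch, assuming pandas is installed. The names `row_df`, `row_sums` and `normed_row_df` are illustrative:

```python
import pandas as pd

raw = [0.07, 0.14, 0.07]

row_df = pd.DataFrame([raw])                  # one row, three columns
row_sums = row_df.sum(axis=1)                 # one total per row
normed_row_df = row_df.div(row_sums, axis=0)  # align the totals with the row index

print(normed_row_df)
```

Compared with the column version, the two axis arguments are swapped: the sum runs across the columns of each row, and the division aligns on the row index.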