Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/30327417/

Time: 2020-08-19 08:15:23  Source: igfitidea

Pandas: create new column in df with random integers from range

Tags: python, pandas, random, integer, range

Asked by screechOwl

I have a pandas data frame with 50k rows. I'm trying to add a new column that is a randomly generated integer from 1 to 5.

If I want 50k random numbers I'd use:

import random
df1['randNumCol'] = random.sample(range(50000), len(df1))  # range replaces Python 2's xrange

but for this I'm not sure how to do it.

As a side note, in R I'd do:

sample(1:5, 50000, replace = TRUE)

Any suggestions?

Accepted answer by Matt

One solution is to use numpy.random.randint:

import numpy as np
df1['randNumCol'] = np.random.randint(1, 6, df1.shape[0])
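As a quick check of the bounds, here is a minimal sketch. It assumes a small stand-in frame named df1 (the one in the question has 50k rows) with a made-up column called value. The upper bound of randint is exclusive, which is why 6 is passed to get integers from 1 to 5.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the 50k-row frame from the question
df1 = pd.DataFrame({"value": range(1000)})

# low=1 is inclusive, high=6 is exclusive, so draws fall in 1..5
df1["randNumCol"] = np.random.randint(1, 6, size=len(df1))

# Inspect the observed range of the new column
lowest = df1["randNumCol"].min()
highest = df1["randNumCol"].max()
print(lowest, highest)
```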

Or, if the numbers are non-consecutive, you can use numpy.random.choice (albeit slower):

df1['randNumCol'] = np.random.choice([1, 9, 20], df1.shape[0])
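A small sketch of the same idea on a stand-in frame may help. The frame df1, its value column, and the list name allowed are illustrative assumptions. By default np.random.choice samples with replacement, so the column can be longer than the list of candidates.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in frame and candidate values
df1 = pd.DataFrame({"value": range(1000)})
allowed = [1, 9, 20]

# Sampling is with replacement by default, so values repeat across rows
df1["randNumCol"] = np.random.choice(allowed, size=len(df1))

# Show how often each candidate value was drawn
print(df1["randNumCol"].value_counts())
```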

In order to make the results reproducible, you can set the seed with numpy.random.seed (e.g. np.random.seed(42)).
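To illustrate the reproducibility point, the sketch below resets the seed and checks that the same draws come back. The second half uses NumPy's Generator API (np.random.default_rng), which is not part of the original answer and is shown only as an alternative; the variable names are made up for the example.

```python
import numpy as np

# Legacy global seeding, as in the answer above
np.random.seed(42)
first = np.random.randint(1, 6, size=10)

np.random.seed(42)  # resetting the seed restarts the same sequence
second = np.random.randint(1, 6, size=10)

same_legacy = np.array_equal(first, second)

# Alternative not in the original answer: independent Generator objects
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
same_generator = np.array_equal(
    rng_a.integers(1, 6, size=10),  # high is exclusive here as well
    rng_b.integers(1, 6, size=10),
)

print(same_legacy, same_generator)
```

A Generator keeps its state in an object rather than globally, so seeding one does not affect other code that also draws random numbers.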

Answer by smci

To add a column of random integers, use randint(low, high, size). There's no need to waste memory allocating range(low, high); that could be a lot of memory if high is large.

df1['randNumCol'] = np.random.randint(0, 5, size=len(df1))  # draws 0..4; use (1, 6) for 1..5

(Note also that when we're just adding a single column, size is just an integer. In general, if we want to generate an array or dataframe of random integers, size can be a tuple, as in "Pandas: How to create a data frame of random integers?")
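The tuple form of size can be sketched as follows. The dimensions and the column names a, b, c are illustrative assumptions, not taken from the linked question.

```python
import numpy as np
import pandas as pd

n_rows, n_cols = 5, 3  # illustrative sizes

# A tuple for size yields a 2-D array of shape (n_rows, n_cols)
values = np.random.randint(0, 5, size=(n_rows, n_cols))

# Wrap the array in a DataFrame with hypothetical column names
df = pd.DataFrame(values, columns=["a", "b", "c"])
print(df.shape)
```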
