Note: this content comes from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/59648/

Time: 2020-11-03 19:28:21  Source: igfitidea

Storing multiple arrays in Python

Tags: python, arrays

Asked by andy

I am writing a program to simulate the actual polling data companies like Gallup or Rasmussen publish daily: www.gallup.com and www.rassmussenreports.com

I'm using a brute-force method, where the computer generates some random daily polling data and then calculates three-day averages to see if the average of the random data matches the pollsters' numbers. (Most companies' poll numbers are three-day averages.)

Currently, it works well for one iteration, but my goal is to have it produce the most common simulation that matches the average polling data. I could then change the code to run anywhere from 1 to 1000 iterations.

And this is my problem. At the end of the test I have an array in a single variable that looks something like this:

[40.1, 39.4, 56.7, 60.0, 20.0 ..... 19.0]

The program currently produces one array for each correct simulation. I can store each array in a single variable, but I then have to have a program that could generate 1 to 1000 variables depending on how many iterations I requested!?

How do I avoid this? I know there is an intelligent way of doing this that doesn't require the program to generate variables to store arrays depending on how many simulations I want.

Code testing for McCain:

import random

mctest = []
x = 0
while x < 5:
    test = round(100 * random.random())
    mctest.append(test)
    x = x + 1

mctestavg = (mctest[0] + mctest[1] + mctest[2]) / 3

# mcavg is real data
if mctestavg == mcavg[2]:
    mcwork = mctest

How do I repeat without creating multiple mcwork vars?

Accepted answer by dF.

Would something like this work?

from random import randint

mcworks = []

for n in range(NUM_ITERATIONS):
    mctest = [randint(0, 100) for i in range(5)]
    # mcavg is real data
    if sum(mctest[:3]) / 3 == mcavg[2]:
        mcworks.append(mctest)

In the end, you are left with a list of valid mctest lists.

What I changed:

  • Used a list comprehension to build the data instead of a for loop
  • Used random.randint to get random integers
  • Used slices and sum to calculate the average of the first three items
  • (To answer your actual question :-) ) Put the results in a list, mcworks, instead of creating a new variable for every iteration
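
As a rough, self-contained sketch of the approach above: the values of NUM_ITERATIONS and mcavg here are placeholders (the post does not give real ones), and collect_matches is a hypothetical helper name. It compares integer sums rather than the floating-point average, which is equivalent when the target average is a whole number.

```python
import random

# Placeholder inputs: the original post does not give real values.
NUM_ITERATIONS = 1000
mcavg = [45, 47, 50]  # stand-in for the real three-day averages


def collect_matches(num_iterations, target, seed=None):
    """Return every 5-day simulation whose first three days average to target."""
    rng = random.Random(seed)
    matches = []
    for _ in range(num_iterations):
        mctest = [rng.randint(0, 100) for _ in range(5)]
        # sum / 3 == target is the same as sum == 3 * target for integer targets,
        # and avoids comparing floats.
        if sum(mctest[:3]) == 3 * target:
            matches.append(mctest)
    return matches


# One list holds every matching simulation, however many there are.
mcworks = collect_matches(NUM_ITERATIONS, mcavg[2], seed=0)
print(len(mcworks), "matching simulations")
```

The number of matches depends on the seed and the iteration count, but the program never needs more than the single mcworks variable.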

Answered by Nick Stinemates

Are you talking about doing this?

>>> a = [ ['a', 'b'], ['c', 'd'] ]
>>> a[1]
['c', 'd']
>>> a[1][1]
'd'

Answered by dF.

Lists in Python can contain any type of object. If I understand the question correctly, will a list of lists do the job? Something like this (assuming you have a function generate_poll_data() which creates your data):

data = []

for _ in range(num_iterations):
    data.append(generate_poll_data())

Then, data[n] will be the list of data from the (n+1)th run, since list indices start at 0.
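
To make the indexing concrete, here is a small sketch. The generate_poll_data below is a hypothetical stand-in (the answer only assumes such a function exists), and it takes a seeded generator as an argument purely so the example is repeatable.

```python
import random


def generate_poll_data(rng, days=5):
    # Hypothetical stand-in for the real data generator.
    return [rng.randint(0, 100) for _ in range(days)]


rng = random.Random(42)
num_iterations = 3

# One inner list per run, all held in a single outer list.
data = [generate_poll_data(rng) for _ in range(num_iterations)]

first_run = data[0]            # results of the 1st run
last_run = data[-1]            # results of the final run
day_two_of_first = data[0][1]  # 2nd day of the 1st run
print(first_run, last_run, day_two_of_first)
```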

Answered by Daren Thomas

Since you are thinking in variables, you might prefer a dictionary over a list of lists:

data = {}
data['a'] = [generate_poll_data()]
data['b'] = [generate_poll_data()]

etc.
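
One way this might look in practice, assuming you want one named entry per candidate. The keys and the generate_poll_data stand-in are hypothetical and only illustrate the idea.

```python
import random


def generate_poll_data(rng, days=5):
    # Hypothetical stand-in for the real data generator.
    return [rng.randint(0, 100) for _ in range(days)]


rng = random.Random(7)
candidates = ["mccain", "candidate_b"]  # hypothetical keys

# Each key maps to a list of runs, so names replace numbered variables.
data = {name: [] for name in candidates}
for name in candidates:
    for _ in range(3):
        data[name].append(generate_poll_data(rng))

for name, runs in data.items():
    print(name, len(runs), "runs")
```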

Answered by Vinay

I would strongly consider using NumPy to do this. You get efficient N-dimensional arrays that you can quickly and easily process.
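
A minimal sketch of what that could look like, assuming one row per simulation and one column per day. The target average and iteration count are placeholder values, not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
num_iterations = 1000
target_avg = 50  # placeholder for the real three-day average

# One row per simulation, one column per day; integers from 0 to 100 inclusive.
sims = rng.integers(0, 101, size=(num_iterations, 5))

# Boolean mask: rows whose first three days sum to 3 * target_avg,
# i.e. whose three-day average equals target_avg.
mask = sims[:, :3].sum(axis=1) == 3 * target_avg
matches = sims[mask]

print(matches.shape[0], "matching simulations")
```

All simulations live in one 2-D array, and the matching rows are selected in a single vectorized step instead of a Python loop.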

Answered by Mattias

A neat way to do it is to use a list of lists in combination with Pandas. You can then compute a 3-day rolling average, add the real values as another column, and use loc to find the rows where the two match.

from random import randint
import pandas as pd

rand_vals = [randint(0, 100) for i in range(5)]
df = pd.DataFrame(data=rand_vals, columns=['generated data'])
df['3 day avg'] = df['generated data'].rolling(3).mean()
df['mcavg'] = mcavg  # the list of real data (one entry per row)
# Extract the resulting list of values
res = df.loc[df['3 day avg'] == df['mcavg'], '3 day avg'].values

This is also neat if you intend to use the same random values for different polls or persons: just add another column with their real values and perform the same search on it.