Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/59648/
Storing multiple arrays in Python
Asked by andy
I am writing a program to simulate the actual polling data companies like Gallup or Rasmussen publish daily: www.gallup.com and www.rassmussenreports.com
I'm using a brute-force method, where the computer generates some random daily polling data and then calculates three-day averages to see if the average of the random data matches the pollsters' numbers. (Most companies' poll numbers are three-day averages.)
Currently, it works well for one iteration, but my goal is to have it produce the most common simulation that matches the average polling data. I could then run the code for anywhere from 1 to 1000 iterations.
And this is my problem. At the end of the test I have an array in a single variable that looks something like this:
[40.1, 39.4, 56.7, 60.0, 20.0 ..... 19.0]
The program currently produces one array for each correct simulation. I can store each array in a single variable, but I then have to have a program that could generate 1 to 1000 variables depending on how many iterations I requested!?
How do I avoid this? I know there is an intelligent way of doing this that doesn't require the program to generate variables to store arrays depending on how many simulations I want.
Code testing for McCain:
import random

mctest = []
x = 0
while x < 5:
    test = round(100 * random.random())
    mctest.append(test)
    x = x + 1
mctestavg = (mctest[0] + mctest[1] + mctest[2]) / 3
# mcavg is real data
if mctestavg == mcavg[2]:
    mcwork = mctest
How do I repeat without creating multiple mcwork vars?
Accepted answer by dF.
Would something like this work?
from random import randint

mcworks = []
for n in range(NUM_ITERATIONS):  # use xrange in Python 2
    mctest = [randint(0, 100) for i in range(5)]
    # mcavg is real data
    if sum(mctest[:3]) / 3 == mcavg[2]:
        mcworks.append(mctest)
In the end, you are left with a list of valid mctest lists.
What I changed:
- Used a list comprehension to build the data instead of a for loop
- Used random.randint to get random integers
- Used slices and sum to calculate the average of the first three items
- (To answer your actual question :-) ) Put the results in a list mcworks, instead of creating a new variable for every iteration
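For Python 3, the same idea can be wrapped in a function. This is only a sketch under stated assumptions: the name find_matching_runs, the target, tolerance and seed parameters, and the sample target of 45.0 are hypothetical and not part of the original answer. It also compares the average against the target with a tolerance instead of ==, because a three-value mean is often not exactly representable as a float.

```python
import random


def find_matching_runs(target, num_iterations, days=5, tolerance=0.5, seed=None):
    """Collect every simulated run whose first three-day average is near target.

    All names and defaults here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    matches = []               # one list holds every matching run
    for _ in range(num_iterations):
        run = [rng.randint(0, 100) for _ in range(days)]
        three_day_avg = sum(run[:3]) / 3
        if abs(three_day_avg - target) <= tolerance:
            matches.append(run)
    return matches


matches = find_matching_runs(target=45.0, num_iterations=1000, seed=1)
print(len(matches), "matching runs")
```

However many iterations are requested, there is still only one variable, matches, and each matching run is one of its elements.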
Answer by Nick Stinemates
Are you talking about doing this?
>>> a = [ ['a', 'b'], ['c', 'd'] ]
>>> a[1]
['c', 'd']
>>> a[1][1]
'd'
Answer by dF.
Lists in Python can contain any type of object. If I understand the question correctly, will a list of lists do the job? Something like this (assuming you have a function generate_poll_data() which creates your data):
data = []
for _ in range(num_iterations):  # use xrange in Python 2
    data.append(generate_poll_data())
Then, data[n] will be the list of data from the (n+1)th run, since list indices start at 0.
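A short sketch of how such a list of lists can be read back. The body of generate_poll_data() below is a hypothetical stand-in (the answer only assumes that such a function exists), and the variable names are illustrative. Indexing is zero-based, so data[0] holds the first run.

```python
import random


def generate_poll_data(days=5):
    # Hypothetical stand-in for the function assumed in the answer above.
    return [random.randint(0, 100) for _ in range(days)]


num_iterations = 3
data = [generate_poll_data() for _ in range(num_iterations)]

first_run = data[0]            # results of the first run
last_run = data[-1]            # results of the most recent run
day_two_of_first = data[0][1]  # second day of the first run

# Walk over every stored run, numbering them from 1 for display.
for run_number, run in enumerate(data, start=1):
    print(run_number, run)
```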
Answer by Daren Thomas
Since you are thinking in variables, you might prefer a dictionary over a list of lists:
data = {}
data['a'] = [generate_poll_data()]
data['b'] = [generate_poll_data()]
etc.
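If several runs should be kept under each key, one possible variant is a dictionary of lists filled in a loop. This is a sketch only: the labels, the run count, and the body of generate_poll_data() are assumptions for illustration, and collections.defaultdict is swapped in for the plain dict so that missing keys start out as empty lists.

```python
import random
from collections import defaultdict


def generate_poll_data(days=5):
    # Hypothetical stand-in for the function assumed in the answer above.
    return [random.randint(0, 100) for _ in range(days)]


labels = ["a", "b"]       # hypothetical keys, e.g. one per candidate
data = defaultdict(list)  # missing keys start out as an empty list

for _ in range(3):        # three simulated runs per label
    for label in labels:
        data[label].append(generate_poll_data())

print(len(data["a"]))     # number of runs stored under "a"
```

The keys play the role that separate variable names would have played, but they can be created at run time.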
Answer by Vinay
Answer by Mattias
A neat way to do it is to use a list of lists in combination with Pandas. Then you are able to create a 3-day rolling average. This makes it easy to search through the results: add the real values as another column and use the loc function to find the ones that match.
from random import randint
import pandas as pd

rand_vals = [randint(0, 100) for i in range(5)]
df = pd.DataFrame(data=rand_vals, columns=['generated data'])
df['3 day avg'] = df['generated data'].rolling(3).mean()
df['mcavg'] = mcavg  # the list of real data
# Extract the resulting list of values
res = df.loc[df['3 day avg'] == df['mcavg']]['3 day avg'].values
This is also neat if you intend to use the same random values for different polls or persons: just add another column with their real values and perform the same search on it.
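To make clear what the rolling(3).mean() step computes, here is a plain-Python sketch that needs no Pandas. The helper rolling_mean and the sample numbers are made up for illustration. Positions without a full three-day window get None, where Pandas would give NaN.

```python
def rolling_mean(values, window=3):
    """Mean of each run of `window` consecutive values.

    Positions without a full window get None, where pandas would give NaN.
    """
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            result.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            result.append(sum(chunk) / window)
    return result


generated = [40, 50, 60, 70, 80]       # made-up sample values
averages = rolling_mean(generated)     # [None, None, 50.0, 60.0, 70.0]

real = [None, None, 50.0, 61.0, 70.0]  # made-up stand-in for mcavg
matching_days = [
    i for i, (avg, actual) in enumerate(zip(averages, real))
    if avg is not None and avg == actual
]
print(matching_days)  # [2, 4]
```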