Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/35973828/

Time: 2020-08-19 17:13:30  Source: igfitidea

How to merge multiple arrays in python?

python, arrays, numpy

Asked by MarcelHodan

I'd like to read the content of multiple files, process their data individually (because of performance and hardware resources) and write my results into one 'big' netCDF4 file.

Right now I'm able to read the files, process their data, but I struggle with the resulting multiple arrays. I wasn't able to merge them correctly.

I've got a 3d array (time,long,lat) containing my calculated value for each day. What I'd like to do is merge all the arrays I've got into one big array (all days in one array) before I write it into my netCDF4 file.

Here are two example arrays:

  • day1[19790101][-25][35]=95
  • day2[19790102][-15][25]=93

My expected result is:

  • allDays[19790101][-25][35]=95
  • allDays[19790102][-15][25]=93

How can I achieve that structure?

  • When I use allDays=day1+day2, my data will be aggregated.
  • When I use:

    allDays=[]
    allDays.append(day1)
    allDays.append(day2)
    

    my data will be surrounded by a new array.
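For illustration, the two behaviours described above can be reproduced with a minimal sketch. It assumes day1 and day2 are NumPy arrays; the small shapes and fill values here are made up for the example and are not the asker's real data.

```python
import numpy as np

# Hypothetical stand-ins for two processed days, shaped (time=1, lon=2, lat=3)
day1 = np.full((1, 2, 3), 95)
day2 = np.full((1, 2, 3), 93)

# "+" on NumPy arrays adds element by element, so the two days are summed
added = day1 + day2
print(added.shape)  # (1, 2, 3)

# list.append() keeps each array as a separate item inside an outer list
nested = []
nested.append(day1)
nested.append(day2)
print(type(nested), len(nested), nested[0].shape)
```

Neither result has a time axis of length 2, which is what the expected allDays structure needs.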

FYI: I'm using Ubuntu 14.04 and Python: 3.5 (Anaconda)

Answered by J.J

When you do

allDays=[]
allDays.append(day1)
allDays.append(day2)

You are making a list of pointers to existing data, rather than repackaging the data. You could do:

allDays=[]
allDays.append(day1[:])
allDays.append(day2[:])

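If day1 and day2 are Python lists, this copies the data out of day1 and into the new allDays list (note that for a NumPy array, day1[:] returns a view rather than a copy, so day1.copy() would be needed there). Copying will double your memory usage, so it is perhaps best to issue a del day1 after each addition to allDays.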
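The difference between storing a reference and storing a copy can be checked with a small sketch. The variable names and values below are made up for illustration; the NumPy half is included because slicing behaves differently there.

```python
import numpy as np

# Plain Python list: slicing with [:] builds a new (shallow) list
day1_list = [95, 96, 97]
alias = day1_list          # second name for the same list object
copied = day1_list[:]      # new list holding the same values
day1_list[0] = -1
print(alias[0], copied[0])  # the alias sees the change, the copy does not

# NumPy array: [:] returns a view onto the same buffer, not a copy
day1_arr = np.array([95, 96, 97])
view = day1_arr[:]
real_copy = day1_arr.copy()  # an explicit copy owns its own buffer
day1_arr[0] = -1
print(view[0], real_copy[0])  # the view sees the change, the copy does not
```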

Having said all that, if you use Pandas (usually recommended for time series data) or Numpy, this whole thing would be a lot quicker and use a lot less memory. Numpy arrays cannot hold pointers like python lists can, so the copy there is implied. Hope that clears some things up for you :) I can also highly recommend this video by Ned
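One possible way to keep memory use down with NumPy, sketched here under assumptions, is to allocate the result array once and write each processed day into its slot, so only one day's array exists alongside the result at any time. The function process_day, the grid size (81, 141) and the integer dtype are hypothetical placeholders, not part of the original question.

```python
import numpy as np

def process_day(day_index):
    """Hypothetical placeholder for reading and processing one file."""
    rng = np.random.default_rng(day_index)
    return rng.integers(0, 255, size=(81, 141))

n_days = 3  # assumed number of input files

# Allocate the full (time, lon, lat) result once
all_days = np.empty((n_days, 81, 141), dtype=np.int64)

for i in range(n_days):
    # Assignment copies the values into slot i; the temporary array
    # returned by process_day can then be garbage collected
    all_days[i] = process_day(i)

print(all_days.shape)
```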

Answered by Reti43

Let's start with some random data.

>>> import numpy as np
>>> day1 = np.random.randint(255, size=(1, 81, 141))

Your array has a dimension of size 1, so every time you want to access an element, you'll have to painstakingly type day1[0,x,y]. You can remove that unnecessary dimension with np.squeeze().

>>> day1[0,50,50]
36
>>> day1 = np.squeeze(day1)
>>> day1.shape
(81, 141)
>>> day1[50,50]
36

Now let's make some more of these.

>>> day2 = np.random.randint(255, size=day1.shape)
>>> day3 = np.random.randint(255, size=day1.shape)

You can put all of these in one big list and pass them to np.array(), which will create an array of size (N, 81, 141), where N is the number of days you have.

>>> allDays = np.array([day1, day2, day3])
>>> allDays.shape
(3, 81, 141)

All the data from day1 are in index 0, from day2 in index 1, etc.

>>> allDays[0,50,50]
36
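As a side note, np.stack() does the same job as passing a list to np.array() when every day has the same 2-D shape, and it makes the new axis explicit. This is a sketch with freshly generated random data, not the values shown in the session above.

```python
import numpy as np

# Three hypothetical days, each already squeezed to (81, 141)
day1 = np.random.randint(255, size=(81, 141))
day2 = np.random.randint(255, size=(81, 141))
day3 = np.random.randint(255, size=(81, 141))

# axis=0 puts the new "day" axis first
all_days = np.stack([day1, day2, day3], axis=0)
print(all_days.shape)  # (3, 81, 141)

# Same result as building the array from a list
same = np.array_equal(all_days, np.array([day1, day2, day3]))
print(same)  # True
```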

Answered by Stop harming Monica

Use allDays = np.concatenate((day1, day2)).
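A short sketch of what this does, assuming each day keeps its leading time axis of length 1 as in the question (the (1, 81, 141) shape and the random values are assumptions for the example):

```python
import numpy as np

# Hypothetical daily arrays shaped (time=1, lon=81, lat=141)
day1 = np.random.randint(255, size=(1, 81, 141))
day2 = np.random.randint(255, size=(1, 81, 141))

# np.concatenate joins along an existing axis (axis=0 by default),
# so the time axis grows from 1 to 2 and no extra nesting is added
allDays = np.concatenate((day1, day2))
print(allDays.shape)  # (2, 81, 141)
```

Unlike np.array([day1, day2]), which would give shape (2, 1, 81, 141) for these inputs, concatenation keeps the result three-dimensional.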
