Python: convert list elements into an array

Notice: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/24112274/

Time: 2020-08-19 04:00:39  Source: igfitidea

Convert list elements into array

python arrays list hadoop

Asked by rond

I have a TSV file that I am parsing, and I want to convert its contents into an array.

Here is the file format -

jobname1 queue maphours reducehours
jobname2 queue maphours reducehours

Code:

import numpy as np

with open('file.tsv') as tsv:
    line = [elem.strip().split('\t') for elem in tsv]
    vals = np.asarray(line)
    print(vals[0])
    print(vals[4])

vals currently gives the following output -

['job1', 'queue', '1.0', '0.0\n']
['job2', 'queue', '1.0', '0.0\n']

I want each element of every row in the file to become a separate array element -

vals[0] = job1 vals[1] = queue vals[2] = 1.0 vals[3] = 0.0 

How do I achieve this?

Accepted answer by Marcin

From what I understand, you would like to create a 2D array in numpy where each row of the file corresponds to a row of the array, and each column in the file is a column in the array. If so, you could do this as follows.

For example, if your data file is:

jobname1    queue   1   3
jobname2    queue   2   4
jobname41   queue   1   1
jobname32   queue   2   2
jobname21   queue   3   4
jobname12   queue   1   6

The following code:

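import numpy as np

# file is assumed to hold the path to the TSV file
with open(file) as tsv:
    # split each line on tabs to get one list of fields per row
    line = [elem.strip().split('\t') for elem in tsv]

vals = np.asarray(line)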

will result in the following vals array:

[['jobname1' 'queue' '1' '3']
 ['jobname2' 'queue' '2' '4']
 ['jobname41' 'queue' '1' '1']
 ['jobname32' 'queue' '2' '2']
 ['jobname21' 'queue' '3' '4']
 ['jobname12' 'queue' '1' '6']]
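
Note that every entry in the array above is a string, including the hour columns. If numeric values are needed, as the question's 1.0 and 0.0 suggest, one option is to cast those columns separately. The following is only a sketch: it hard-codes a few of the sample rows, and the name hours is made up for illustration.

```python
import numpy as np

# A few of the sample rows from the array above, hard-coded for illustration.
vals = np.asarray([
    ['jobname1', 'queue', '1', '3'],
    ['jobname2', 'queue', '2', '4'],
    ['jobname41', 'queue', '1', '1'],
])

# Columns 2 and 3 hold the hour values; cast them from str to float.
hours = vals[:, 2:].astype(float)

print(vals.dtype.kind)     # 'U' means the array holds unicode strings
print(hours)
print(hours.sum(axis=0))   # total of each numeric column
```

The string array is left untouched, so the job names and the numeric hours can be used side by side.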

To get the job names you can do:

print(vals[:,0])
# gives ['jobname1' 'jobname2' 'jobname41' 'jobname32' 'jobname21' 'jobname12']

Or if you want rows containing some job, you can do:

print(vals[np.apply_along_axis(lambda row: row[0] == 'jobname1', 1, vals)])
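
The same row selection can probably be written more directly with a boolean mask, since comparing a column to a string is done element by element in numpy. This sketch again uses a few hard-coded sample rows; mask and matching are illustrative names, not part of the original answer.

```python
import numpy as np

# A few of the sample rows, hard-coded for illustration.
vals = np.asarray([
    ['jobname1', 'queue', '1', '3'],
    ['jobname2', 'queue', '2', '4'],
    ['jobname41', 'queue', '1', '1'],
])

# Compare the first column against the job name; this yields a boolean mask.
mask = vals[:, 0] == 'jobname1'

# Indexing with the mask keeps only the rows where it is True.
matching = vals[mask]
print(matching)
```

This avoids calling a Python lambda once per row, which tends to matter on larger files.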

Answer by rofls

Are you sure you need an array? @Marcin's answer is more complete if you want a Numpy array.

Python doesn't have a built-in array data structure (there's a list of Python data structures here). The standard library's array module does provide a "thin wrapper around the C array". To use it, you have to specify the type that the array will hold (here you'll find a list of typecodes at the top, and examples at the bottom).
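
As a rough illustration of that typed wrapper, here is a minimal sketch using the standard library's array module. The values and the name hours are made up for the example; 'd' is the typecode for double-precision floats.

```python
from array import array

# 'd' is the typecode for C doubles, so every element must be a number.
hours = array('d', [1.0, 0.0])
hours.append(2.5)

print(hours[0])
print(hours.tolist())

# A value of another type, such as a job name, is rejected.
try:
    hours.append('queue')
except TypeError as exc:
    print('rejected:', exc)
```

Because each array holds a single type, a row that mixes job names with numbers would not fit in one such array, which is one reason a list or a numpy array may be the more practical choice here.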

If you want to use a numpy array, this should work:

import numpy as np
myarray = np.asarray(yourList)

Adapted from here.
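
Applied to a single parsed row from the question, the call might look like the sketch below. The row string is a made-up line in the question's format, and row and vals are illustrative names.

```python
import numpy as np

# One line in the question's format, split on tabs into a plain list.
row = 'job1\tqueue\t1.0\t0.0\n'.strip().split('\t')

# Convert the list into a one-dimensional numpy array.
vals = np.asarray(row)

print(vals[0], vals[1], vals[2], vals[3])
print(vals.shape)
```

Each element is then reachable as vals[0], vals[1], and so on, although the numeric fields are still strings at this point.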
