Python: loading a CSV into a 2D matrix with numpy for plotting

Question

Asked by dgorissen

Given this CSV file:

"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12

I simply want to load it as a matrix/ndarray with 3 rows and 7 columns. However, for some reason, all I can get out of numpy is an ndarray with 3 rows (one per line) and no columns.

import numpy as np

r = np.genfromtxt(fname, delimiter=',', dtype=None, names=True)
print(r)
print(r.shape)

[ (611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291111964948.0)
 (611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291113113366.0)
 (611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291120650486.0)]
(3,)

I can manually iterate and hack it into the shape I want, but this seems silly. I just want to load it as a proper matrix so I can slice it across different dimensions and plot it, just like in matlab.

Answer 1

Accepted answer by Kaveh_kh

Pure numpy

numpy.loadtxt(open("test.csv", "rb"), delimiter=",", skiprows=1)

Check out the loadtxt documentation.
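
As a minimal sketch of the loadtxt approach, the snippet below reads the question's sample data and checks the resulting shape. The io.StringIO buffer is an assumption that stands in for test.csv so the example is self-contained; with a real file you would pass the filename instead.

```python
import io
import numpy as np

# Sample data from the question; StringIO stands in for the file on disk.
csv_text = """\
"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
"""

# skiprows=1 drops the header line so every remaining field parses as a float.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)
print(data.shape)        # (3, 7)

first_column = data[:, 0]   # column "A" for all rows
timestamps = data[:, -1]    # last column
```

Because the result is an ordinary 2D float array, it can be sliced along either axis, much like a MATLAB matrix.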

You can also use python's csv module:

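import csv
import numpy

# Python 3: open in text mode; newline="" is what the csv module expects
with open("test.csv", newline="") as f:
    reader = csv.reader(f, delimiter=",")
    next(reader)  # skip the header row, which cannot be converted to float
    x = list(reader)
result = numpy.array(x).astype("float")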

You will have to convert it to your favorite numeric type. I guess you can write the whole thing in one line:

result = numpy.array(list(csv.reader(open("test.csv", newline=""), delimiter=","))[1:]).astype("float")

Added Hint:

You could also use pandas.io.parsers.read_csv and get the associated numpy array, which can be faster.
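
A rough sketch of the pandas route, assuming pandas is installed. It uses the top-level pandas.read_csv alias and DataFrame.to_numpy; the io.StringIO buffer is an assumption standing in for test.csv.

```python
import io
import pandas as pd

# Sample data from the question; StringIO stands in for the file on disk.
csv_text = """\
"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
"""

# read_csv uses the first line as column names by default.
df = pd.read_csv(io.StringIO(csv_text))

# All columns are numeric, so this is a plain 2D float array.
matrix = df.to_numpy()
print(matrix.shape)
```

The column names stay available on df.columns, while matrix can be sliced like any other ndarray.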

Answer 2

Answer by mtrw

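I think the name row is what confuses the routine: with names=True, genfromtxt returns a 1-D structured array with one field per column, which is why the shape is (3,). To get a plain 2D float array, skip the header row instead. Try

>>> r = np.genfromtxt(fname, delimiter=',', skip_header=1)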
>>> r
array([[  6.11882430e+02,   9.08956010e+03,   5.13300000e+03,
          8.64075140e+02,   1.71537476e+03,   7.65227770e+02,
          1.29111196e+12],
       [  6.11882430e+02,   9.08956010e+03,   5.13300000e+03,
          8.64075140e+02,   1.71537476e+03,   7.65227770e+02,
          1.29111311e+12],
       [  6.11882430e+02,   9.08956010e+03,   5.13300000e+03,
          8.64075140e+02,   1.71537476e+03,   7.65227770e+02,
          1.29112065e+12]])
>>> r[:,0]    # Slice 0'th column
array([ 611.88243,  611.88243,  611.88243])
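
A self-contained sketch of slicing such an array across both dimensions. It passes skip_header=1 so that genfromtxt returns a 2D float array. The io.StringIO buffer stands in for fname, and treating the last column as milliseconds since the Unix epoch is an assumption not stated in the question.

```python
import io
import numpy as np

# Sample data from the question; StringIO stands in for fname.
csv_text = """\
"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
"""

# Skipping the header leaves only numeric rows, giving shape (3, 7).
r = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

col_e = r[:, 4]        # column "E" for every row
first_row = r[0, :]    # all seven values of the first row

# Assumption: timestamps are milliseconds since the Unix epoch.
times = r[:, 6].astype("int64").astype("datetime64[ms]")
print(times)
```

An array like times can be used directly as the x-axis of a time-series plot.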

Answer 3

Answer by Mike T

You can read a CSV file with headers into a NumPy structured array with np.genfromtxt. For example:

import numpy as np

csv_fname = 'file.csv'
with open(csv_fname, 'w') as fp:
    fp.write("""\
"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
""")

# Read the CSV file into a NumPy structured array
r = np.genfromtxt(csv_fname, delimiter=',', names=True, case_sensitive=True)
print(repr(r))

which looks like this:

array([(611.88243, 9089.5601, 5133., 864.07514, 1715.37476, 765.22777, 1.29111196e+12),
       (611.88243, 9089.5601, 5133., 864.07514, 1715.37476, 765.22777, 1.29111311e+12),
       (611.88243, 9089.5601, 5133., 864.07514, 1715.37476, 765.22777, 1.29112065e+12)],
      dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<f8'), ('D', '<f8'), ('E', '<f8'), ('F', '<f8'), ('timestamp', '<f8')])

You can access a named column by its field name, for example r['E']:

array([1715.37476, 1715.37476, 1715.37476])
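
If a plain 2D matrix is still needed after loading by name, one option (a sketch, not part of the original answer) is numpy.lib.recfunctions.structured_to_unstructured, which turns a structured array whose fields share a dtype into an ordinary ndarray. The io.StringIO buffer below stands in for file.csv.

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

# Sample data from the question; StringIO stands in for file.csv.
csv_text = """\
"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
"""

# Structured array: one record per row, so the shape is (3,).
r = np.genfromtxt(io.StringIO(csv_text), delimiter=",",
                  names=True, case_sensitive=True)

# Every field is float64, so the records convert to a (3, 7) float array.
matrix = rfn.structured_to_unstructured(r)
print(r.shape, matrix.shape)
```

This keeps the named access on r while giving matrix the row and column slicing the question asks for.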

Note: this answer previously used np.recfromcsv to read the data into a NumPy record array. While there was nothing wrong with that method, structured arrays are generally better than record arrays for speed and compatibility.
