Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me), and note the original address: http://stackoverflow.com/questions/4155106/

Date: 2020-08-18 14:31:09  Source: igfitidea

Python: CSV write by column rather than row

Tags: python, csv

Asked by Harpal

I have a python script that generates a bunch of data in a while loop. I need to write this data to a CSV file so that it is written by column rather than by row.

For example in loop 1 of my script I generate:

(1, 2, 3, 4)

I need this to be reflected in my csv file like so:

Result_1    1
Result_2    2
Result_3    3
Result_4    4

On my second loop I generate:

(5, 6, 7, 8)

I need my csv file to then look like so:

Result_1    1    5
Result_2    2    6
Result_3    3    7
Result_4    4    8

and so forth until the while loop finishes. Can anybody help me?



EDIT

The while loop can run for over 100,000 iterations.

Accepted answer by Ignacio Vazquez-Abrams

The reason csv doesn't support that is that variable-length lines are not really supported on most filesystems. What you should do instead is collect all the data in lists, then call zip() on them to transpose them afterwards.

>>> l = [('Result_1', 'Result_2', 'Result_3', 'Result_4'), (1, 2, 3, 4), (5, 6, 7, 8)]
>>> list(zip(*l))  # in Python 3, zip() returns an iterator, so wrap it in list()
[('Result_1', 1, 5), ('Result_2', 2, 6), ('Result_3', 3, 7), ('Result_4', 4, 8)]
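A minimal sketch of the full round trip, assuming Python 3 (the output file name and the tab delimiter are illustrative, not part of the original answer):

```python
import csv

# Columns collected during the loop: the headings first,
# then one tuple of values per loop pass.
columns = [('Result_1', 'Result_2', 'Result_3', 'Result_4'),
           (1, 2, 3, 4),
           (5, 6, 7, 8)]

# zip(*columns) transposes the data: each output row pairs
# a heading with its value from every pass.
rows = list(zip(*columns))

with open('results.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter='\t')
    writer.writerows(rows)
```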

Answered by mouad

What if the Result_* headings are also generated in the loop? (Because I don't think it's possible to append columns to the csv file.)

I would go like this: generate all the data at once, rotate the matrix, then write it to the file:

import csv

A = []
A.append(range(1, 5))  # an example of your first loop
A.append(range(5, 9))  # an example of your second loop
data_to_write = zip(*A)

# then you can write it row by row, for example:
with open('output.csv', 'w', newline='') as f:
    csv.writer(f).writerows(data_to_write)

Answered by lazy1

Updating lines in place in a file is not supported on most file systems (a line in a file is just some data that ends with a newline; the next line starts just after that).

As I see it you have two options:

  1. Have your data generating loops be generators, this way they won't consume a lot of memory - you'll get data for each row "just in time"
  2. Use a database (sqlite?) and update the rows there. When you're done - export to CSV
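A small sketch of the second option, using the standard-library sqlite3 module (the table layout, column names, and file name are made up for illustration): each pass of the while loop inserts one record per value, and the pivot to rows happens only once, at export time.

```python
import csv
import sqlite3

# One record per generated value: (row index, pass index, value).
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE results (row_id INTEGER, pass_id INTEGER, value INTEGER)')

# Stand-in for the real while loop: each pass yields one tuple of values.
for pass_id, data in enumerate([(1, 2, 3, 4), (5, 6, 7, 8)]):
    conn.executemany('INSERT INTO results VALUES (?, ?, ?)',
                     [(row_id, pass_id, v) for row_id, v in enumerate(data)])
conn.commit()

# When the loop is done, pivot and export to CSV.
n_rows = conn.execute('SELECT MAX(row_id) + 1 FROM results').fetchone()[0]
with open('db_export.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter='\t')
    for r in range(n_rows):
        values = [v for (v,) in conn.execute(
            'SELECT value FROM results WHERE row_id = ? ORDER BY pass_id', (r,))]
        writer.writerow(['Result_%d' % (r + 1)] + values)
```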

Small example for the first method:

from itertools import islice, count
print(list(islice(zip(count(1), count(2), count(3)), 10)))

This will print

[(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7), (6, 7, 8), (7, 8, 9), (8, 9, 10), (9, 10, 11), (10, 11, 12)]

even though count generates an infinite sequence of numbers.

Answered by Gregg Lind

As an alternate streaming approach:

  • dump each column into its own file
  • use python or the unix paste command to rejoin on tab, csv, whatever.

Both steps should handle streaming just fine.

Pitfalls:

  • if you have 1000s of columns, you might run into the unix file handle limit!
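A sketch of both steps in Python, assuming a column count small enough to keep one open handle per file (the file names and the tab delimiter are illustrative):

```python
import csv

# Step 1: stream each column into its own file as it is generated.
columns = [(1, 2, 3, 4), (5, 6, 7, 8)]  # stand-in for the while loop's output
paths = []
for i, col in enumerate(columns):
    path = 'col_%d.txt' % i
    with open(path, 'w') as f:
        for v in col:
            f.write('%d\n' % v)
    paths.append(path)

# Step 2: rejoin line by line, a Python stand-in for `paste`.
files = [open(p) for p in paths]
try:
    with open('joined.csv', 'w', newline='') as out:
        writer = csv.writer(out, delimiter='\t')
        for lines in zip(*files):  # one line from each column file
            writer.writerow([line.strip() for line in lines])
finally:
    for f in files:
        f.close()
```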

Answered by John Machin

Let's assume that (1) you don't have a large amount of memory, (2) you have the row headings in a list, and (3) all the data values are floats; if they're all integers that fit in 32 or 64 bits, that's even better.

On a 32-bit Python, storing a float in a list takes 16 bytes for the float object and 4 bytes for a pointer in the list; 20 in total. Storing a float in an array.array('d') takes only 8 bytes. Increasingly spectacular savings are available if all your data are ints (any negatives?) that will fit in 8, 4, 2 or 1 byte(s) -- especially on a recent Python where all ints are longs.

The following pseudocode assumes floats stored in array.array('d'). In case you don't really have a memory problem, you can still use this method; I've put in comments to indicate the changes needed if you want to use a list.

# Preliminary:
import array  # list: delete
hlist = []
dlist = []
for each row:
    hlist.append(some_heading_string)
    dlist.append(array.array('d'))  # list: dlist.append([])
# generate data
col_index = -1
for each column:
    col_index += 1
    for row_index in range(len(hlist)):
        v = calculated_data_value(row_index, col_index)
        dlist[row_index].append(v)
# write to csv file
for row_index in range(len(hlist)):
    row = [hlist[row_index]]
    row.extend(dlist[row_index])
    csv_writer.writerow(row)
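A concrete, runnable version of that pseudocode, assuming 4 rows and 2 generated columns; calculated_data_value here is a made-up stand-in for the real computation, and the file name is illustrative:

```python
import array
import csv

def calculated_data_value(row_index, col_index):
    # Hypothetical stand-in for the real per-cell computation.
    return float(col_index * 4 + row_index + 1)

hlist = ['Result_%d' % i for i in range(1, 5)]  # row headings
dlist = [array.array('d') for _ in hlist]       # one compact array per row

# Generate the data column by column, appending to each row's array.
for col_index in range(2):
    for row_index in range(len(hlist)):
        dlist[row_index].append(calculated_data_value(row_index, col_index))

# Write to csv file: heading first, then that row's values.
with open('memory_friendly.csv', 'w', newline='') as f:
    csv_writer = csv.writer(f, delimiter='\t')
    for heading, values in zip(hlist, dlist):
        csv_writer.writerow([heading] + list(values))
```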

Answered by Anthony Ebert

Read it in by row and then transpose it in the command line. If you're using Unix, install csvtool and follow the directions in: https://unix.stackexchange.com/a/314482/186237

Answered by aybuke

wr.writerow(item)  #column by column
wr.writerows(item) #row by row

This is quite simple if your goal is just to write the output column by column.

If your item is a list:

import csv

yourList = []

with open('yourNewFileName.csv', 'w', newline='') as myfile:
    wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
    for word in yourList:
        wr.writerow([word])