pandas to_csv header vs columns

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/31696012/

Time: 2020-09-13 23:41:45  Source: igfitidea


python pandas

Asked by firelynx

It seems that the pandas DataFrame.to_csv method has two arguments which do the same thing.

Maybe I am missing something.


From the documentation:


columns : sequence, optional

Columns to write

header : boolean or list of string, default True

Write out column names. If a list of string is given it is assumed to be aliases for the column names


When given a list of column names, either one puts the columns into the order I specify.

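import pandas as pd
from io import StringIO  # Python 3 location; Python 2 used `from StringIO import StringIO`

# columns= pins the frame's order to bar, foo, which older pandas produced by sorting the dict keys
df = pd.DataFrame({"foo":[1,2], "bar":[1,2]}, columns=["bar", "foo"])

# Default: header and values in the frame's own order
sio = StringIO()
df.to_csv(sio)
sio.getvalue()
# ',bar,foo\n0,1,1\n1,2,2\n'
sio.close()

# header= given a list of names
sio = StringIO()
df.to_csv(sio, header=['foo', 'bar'])
sio.getvalue()
# ',foo,bar\n0,1,1\n1,2,2\n'
sio.close()

# columns= given the same list of names
sio = StringIO()
df.to_csv(sio, columns=['foo', 'bar'])
sio.getvalue()
# ',foo,bar\n0,1,1\n1,2,2\n'
sio.close()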

If I only want to change the column order, which one is the proper one to use?

The only scenario I see where it makes sense for these two named arguments to differ is if I want to select columns but not write the header into the CSV file.

This would mean that using columns=['foo', 'bar'] is the proper option.
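The scenario described above (selecting columns without writing a header row) can be sketched roughly as follows. This is only an illustration, not part of the original question: it assumes Python 3 with a reasonably recent pandas, and the sample values are made distinct per column (an assumption) so that the resulting order is visible.

```python
import pandas as pd

# Illustrative data: "bar" comes first in the frame, and the values differ
# per column so the written order can be checked by eye.
df = pd.DataFrame({"bar": [1111, 2111], "foo": [1, 2]})

# With no path argument, to_csv returns the CSV text as a string.
# columns= selects and orders the data; header=False suppresses the header row.
csv_text = df.to_csv(columns=["foo", "bar"], header=False)

# splitlines() avoids depending on the platform's line terminator.
rows = csv_text.splitlines()
print(rows)
```

The index is still written as the first field of each row; passing index=False as well would drop it.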

Accepted answer by firelynx

While writing this question, I realized the answer and I thought I would share it right away.


My example data did not show the problem


Using columns, the column order is changed for both the header and the values.

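# columns= pins the frame's order to bar, foo (older pandas got this by sorting the dict keys)
df = pd.DataFrame({"foo":[1,2], "bar":[1111,2111]}, columns=["bar", "foo"])
sio = StringIO()
df.to_csv(sio, columns=['foo', 'bar'])
sio.getvalue()
# ',foo,bar\n0,1,1111\n1,2,2111\n'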

Using header, the header changes, but not the values in the columns: the labels are swapped while the data stays in its original order, so each name ends up over the wrong values.

sio = StringIO()
df.to_csv(sio, header=['foo', 'bar'])
sio.getvalue()
# ',foo,bar\n0,1111,1\n1,2111,2\n'
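One way to keep the two arguments apart is to compare each with an in-memory operation on the frame. The sketch below is an illustration added here rather than something from the original answer; it assumes Python 3 and a recent pandas, and the frame is built with bar first to mirror the order seen in the output above.

```python
import pandas as pd

df = pd.DataFrame({"bar": [1111, 2111], "foo": [1, 2]})
names = ["foo", "bar"]

# columns= behaves like selecting the columns first:
# each column's data moves together with its label.
selected = df[names]
same_as_columns = df.to_csv(columns=names) == selected.to_csv()

# header= behaves like relabeling the existing columns:
# the data stays where it is and only the labels are replaced.
relabeled = df.copy()
relabeled.columns = names
same_as_header = df.to_csv(header=names) == relabeled.to_csv()

print(same_as_columns, same_as_header)
```

Read this way, columns= answers "which data, in what order" and header= answers "what text goes in the first row".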

If you confuse columns= and header=, you're gonna have a bad time.
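When both reordering and renaming are wanted, the two arguments can be combined, with the aliases in header listed in the same order as columns. The helper below, to_csv_with_aliases, is a hypothetical name introduced only for this sketch (it is not a pandas function); it pairs each column with its alias in one mapping so the two lists cannot drift apart. It assumes Python 3.7+ dict ordering and a recent pandas.

```python
import pandas as pd


def to_csv_with_aliases(frame, aliases):
    """Hypothetical helper: `aliases` maps existing column name -> header text.

    The mapping's key order decides the column order, so every alias stays
    attached to the data it describes.
    """
    columns = list(aliases.keys())
    header = list(aliases.values())
    # columns= picks and orders the data; header= supplies the text for
    # those same columns, in that same order.
    return frame.to_csv(columns=columns, header=header)


df = pd.DataFrame({"bar": [1111, 2111], "foo": [1, 2]})
csv_text = to_csv_with_aliases(df, {"foo": "Foo", "bar": "Bar"})

# splitlines() avoids depending on the platform's line terminator.
lines = csv_text.splitlines()
print(lines)
```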