pandas to_csv header vs columns
Original question: http://stackoverflow.com/questions/31696012/
Notice: this page reproduces a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me): Stack Overflow
Asked by firelynx
It seems that the pandas to_csv method has two parameters that do the same thing.
Maybe I am missing something.
From the documentation:
columns : sequence, optional
Columns to write
header : boolean or list of string, default True
Write out column names. If a list of string is given it is assumed to be aliases for the column names
When given a list of column names, either one appears to put the columns into the order I specify.
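import pandas as pd
from io import StringIO  # Python 3; Python 2 used "from StringIO import StringIO"
# Pin the bar, foo order: older pandas sorted dict keys, newer versions keep insertion order.
df = pd.DataFrame({"foo":[1,2], "bar":[1,2]}, columns=["bar", "foo"])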
sio = StringIO()
df.to_csv(sio)
sio.getvalue()
',bar,foo\n0,1,1\n1,2,2\n'
sio = StringIO()
df.to_csv(sio, header=['foo', 'bar'])
sio.getvalue()
',foo,bar\n0,1,1\n1,2,2\n'
sio.close()
sio = StringIO()
df.to_csv(sio, columns=['foo', 'bar'])
sio.getvalue()
',foo,bar\n0,1,1\n1,2,2\n'
sio.close()
If I only want to sort the column order, which one is the proper one to use?
The only scenario I see where it makes sense for these two named parameters to be different is if I want to select columns, but not write the header into the csv file.
This would mean that using columns=['foo', 'bar'] is the proper option.
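The scenario mentioned in the question, selecting columns without writing a header row, could look roughly like the sketch below. It assumes Python 3 and a reasonably recent pandas. The sample values and the use of index=False (to drop the row index) are illustrative choices, not part of the original question.

```python
import pandas as pd

# Illustrative data; the column order is pinned to bar, foo on purpose.
df = pd.DataFrame({"foo": [1, 2], "bar": [1111, 2111]}, columns=["bar", "foo"])

# columns= selects and orders the data, header=False suppresses the header row,
# and index=False leaves out the row index.
text = df.to_csv(columns=["foo", "bar"], header=False, index=False)

# With no path given, to_csv returns the CSV as a string.
lines = text.splitlines()
print(lines)
```

Here columns= decides which data is written and in what order, while header= only controls whether (and under which names) a header row appears.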
Accepted answer by firelynx
While writing this question, I realized the answer and I thought I would share it right away.
My example data did not show the problem.
Using columns, the column order is changed, for both the header and the values.
# Pin the bar, foo order that older pandas produced by sorting dict keys.
df = pd.DataFrame({"foo":[1,2], "bar":[1111,2111]}, columns=["bar", "foo"])
sio = StringIO()
df.to_csv(sio, columns=['foo', 'bar'])
sio.getvalue()
',foo,bar\n0,1,1111\n1,2,2111\n'
Using header, the header labels change, but the values stay in their original column order.
sio = StringIO()
df.to_csv(sio, header=['foo', 'bar'])
sio.getvalue()
',foo,bar\n0,1111,1\n1,2111,2\n'
If you confuse columns= and header=, you're gonna have a bad time.
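As a rough side-by-side sketch of the difference described in this answer, the snippet below writes the same frame three ways. It assumes Python 3 and a reasonably recent pandas. The upper-case aliases, the variable names, and index=False are illustrative additions rather than part of the original answer.

```python
import pandas as pd

# The frame's own column order is bar, foo (pinned explicitly).
df = pd.DataFrame({"foo": [1, 2], "bar": [1111, 2111]}, columns=["bar", "foo"])

# columns= moves the data together with its labels.
reordered = df.to_csv(columns=["foo", "bar"], index=False).splitlines()

# header= only relabels: the aliases are applied by position to the
# existing bar, foo order, so the labels end up over the wrong values.
relabeled = df.to_csv(header=["foo", "bar"], index=False).splitlines()

# Both together: order with columns=, then alias in that same order.
combined = df.to_csv(
    columns=["foo", "bar"], header=["FOO", "BAR"], index=False
).splitlines()

print(reordered)
print(relabeled)
print(combined)
```

In the second call the header row reads foo,bar while the first data column still holds the bar values, which is the mislabeling the answer warns about.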

