Pivoting a DataFrame in Pandas for output to CSV
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/11105728/
Asked by user1467068
This is a simple question for which answers are surprisingly difficult to find online. Here's the situation:
>>> A
[('hey', 'you', 4), ('hey', 'not you', 5), ('not hey', 'you', 2), ('not hey', 'not you', 6)]
>>> A_p = pandas.DataFrame(A)
>>> A_p
         0        1  2
0      hey      you  4
1      hey  not you  5
2  not hey      you  2
3  not hey  not you  6
>>> B_p = A_p.pivot(0,1,2)
>>> B_p
1        not you  you
0
hey            5    4
not hey        6    2
This isn't quite what's suggested in the documentation for pivot; there, the results are shown without the 1 and 0 in the upper-left corner. That is what I'm looking for: a DataFrame object that prints as
         not you  you
hey            5    4
not hey        6    2
The problem is that the normal behavior results in a csv file whose first line is
0,not you,you
when I really want
not you, you
When the normal CSV file (with the preceding "0,") is read into R, the column and row names are not set properly from the frame object, which means painful manual manipulation to get it into the right format. Is there a way to get pivot to give me a DataFrame object without that additional information in the upper-left corner?
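For readers reproducing the setup on a current pandas release, here is a minimal sketch of the same pivot. It assumes a version where pivot expects keyword arguments (the positional call in the transcript above comes from a much older release), and it shows that the 0 and 1 in the corner are simply the names of the row and column axes.

```python
import pandas as pd

# Same records as in the question.
A = [("hey", "you", 4), ("hey", "not you", 5),
     ("not hey", "you", 2), ("not hey", "not you", 6)]
A_p = pd.DataFrame(A)

# Keyword form of the pivot call (assumption: a recent pandas release).
B_p = A_p.pivot(index=0, columns=1, values=2)

# The labels printed in the upper-left corner are the axis names,
# inherited from the source columns used for the pivot.
print(B_p.index.name)    # name of the row axis
print(B_p.columns.name)  # name of the column axis
```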
Answered by Wes McKinney
Well, you have:
In [17]: B_p.to_csv(sys.stdout)
0,not you,you
hey,5.0,4.0
not hey,6.0,2.0
In [18]: B_p.to_csv(sys.stdout, index=False)
not you,you
5.0,4.0
6.0,2.0
But I assume you want the row names. Setting the index name to None (B_p.index.name = None) gives a leading comma:
In [20]: B_p.to_csv(sys.stdout)
,not you,you
hey,5.0,4.0
not hey,6.0,2.0
This roughly matches (ignoring quoted strings) what R's write.csv writes when row.names=TRUE:
"","a","b"
"foo",0.720538259472741,-0.848304940318957
"bar",-0.64266667412325,-0.442441171401282
"baz",-0.419181615269841,-0.658545964124229
"qux",0.881124313748992,0.36383198969179
"bar2",-1.35613767310069,-0.124014006180608
Does any of this help?
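The three header variants discussed above can be compared side by side. This is only a sketch: header_line is a hypothetical helper introduced for illustration, and the frame is rebuilt with the keyword form of pivot on the assumption of a recent pandas release.

```python
import io

import pandas as pd

A = [("hey", "you", 4), ("hey", "not you", 5),
     ("not hey", "you", 2), ("not hey", "not you", 6)]
B_p = pd.DataFrame(A).pivot(index=0, columns=1, values=2)


def header_line(frame, **kwargs):
    """Hypothetical helper: return the first line that to_csv writes."""
    buf = io.StringIO()
    frame.to_csv(buf, **kwargs)
    return buf.getvalue().splitlines()[0]


# Default: the index name is written as the first header field.
default_header = header_line(B_p)

# index=False drops the row labels from the whole file, not just the header.
no_index_header = header_line(B_p, index=False)

# Clearing the index name keeps the row labels but leaves the first field empty.
B_p.index.name = None
unnamed_header = header_line(B_p)

print(default_header)
print(no_index_header)
print(unnamed_header)
```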
EDIT: Added the index_label=False option today, which does what you want:
In [2]: df
Out[2]:
       A  B
one    1  4
two    2  5
three  3  6
In [3]: df.to_csv('foo.csv', index_label=False)
In [4]:
11:24 ~/code/pandas (master)$ R
R version 2.14.0 (2011-10-31)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-unknown-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Previously saved workspace restored]
re> read.csv('foo.csv')
      A B
one   1 4
two   2 5
three 3 6
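The effect of index_label=False can also be checked without R. The sketch below writes the same small frame to an in-memory buffer (an assumption made here to avoid creating foo.csv) and inspects the lines: the header carries only the column names, while each data row still starts with its row label, which is the layout R's read.csv interprets as row names.

```python
import io

import pandas as pd

# Same frame as in the session above.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]},
                  index=["one", "two", "three"])

buf = io.StringIO()
df.to_csv(buf, index_label=False)  # no header field for the index
lines = buf.getvalue().splitlines()

for line in lines:
    print(line)
```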

