pandas to_csv arguments float_format and decimal not working for index column
Source: http://stackoverflow.com/questions/31586162/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverFlow.
Asked by albert
Background
I am running some simulations, i.e. a system analysis with varying parameters (in this case only rpm), and I append the last line of each results dataframe results_df to a summarizing dataframe df, which describes the behaviour of my system as a function of the varied rpm.
In order to get an appropriate index for plotting and data analysis, I converted the varied values (here rpm) from the list into a pandas Series ser and concatenated this series with the summarizing dataframe df containing the results I am interested in.
Since the only result I am interested in from each calculation is its last line, I extract this data from the results dataframe results_df by using .tail(1).
What I have done so far is shown in the following snippet:
import pandas as pd

rpm = [0.25, 0.3, 0.5, 0.75, 1.0, 1.5, 2.0]
ser = pd.Series(rpm, name='rpm')
df_list = []
for i, val in enumerate(rpm):
    # placeholder for the simulation run at this rpm value
    results_df = get_some_data_from_somwhere()
    # keep only the last line of each results dataframe
    df_list.append(results_df.tail(1))
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df = pd.concat(df_list, ignore_index=True)
df = pd.concat([df, ser], axis=1)
df.set_index('rpm', inplace=True)
with open('foo.csv', 'w') as f:
    df.to_csv(f, index=True, header=True, decimal=',', sep=' ', float_format='%.3f')
Problem
The CSV file I get has the following format:
rpm cooling_inner heating_inner cooling_outlet heating_outlet
0.25 303,317 323,372 302,384 324,332
However, I expected three decimal digits and a comma as the decimal sign in my index column, as shown here:
rpm cooling_inner heating_inner cooling_outlet heating_outlet
0,250 303,317 323,372 302,384 324,332
So it seems that the float_format and decimal options are not applied to the index column when exporting dataframes to CSV files using the .to_csv command.
How can I achieve this behaviour, given that the index option is set to True and all values (except those in the index column) have the right format and decimal sign?
Do I have to handle the index column separately somehow?
Accepted answer by firelynx
I would rewrite your two bottom lines:
with open('foo.csv', 'w') as f:
    df.to_csv(f, index=True, header=True, decimal=',', sep=' ', float_format='%.3f')
into:
df.reset_index().to_csv('foo.csv', index=False, header=True, decimal=',', sep=' ', float_format='%.3f')
This is a bit of a workaround, but as you have noticed, the keyword arguments decimal= and float_format= only work on data columns, not on the index.
What I do instead is move the index into the dataframe with reset_index and then pass index=False to to_csv so that it does not write the index to the file (since it is now in the data).
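To make the workaround concrete, here is a minimal sketch on a tiny made-up dataframe. The values, and the choice to return the CSV text as a string instead of writing a file, are assumptions for illustration only; the column names are borrowed from the question.

```python
import pandas as pd

# Made-up sample data with a float index named 'rpm' (illustrative values only)
df = pd.DataFrame(
    {"cooling_inner": [303.317, 303.5], "heating_inner": [323.372, 323.4]},
    index=pd.Index([0.25, 0.3], name="rpm"),
)

# reset_index() turns 'rpm' into an ordinary data column, so float_format and
# decimal are applied to it like to every other column; index=False then keeps
# the new default integer index out of the output.
# Calling to_csv without a path returns the CSV text as a string.
csv_text = df.reset_index().to_csv(
    index=False, header=True, decimal=",", sep=" ", float_format="%.3f"
)
print(csv_text)
```

Because rpm is just another column after reset_index, it ends up first in the file, which matches the layout the question asks for.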
Also, opening a file stream yourself (with open('foo.csv', 'w') as f:) is better left to pandas, which does this by itself when you just give it the string 'foo.csv' as the first argument.
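As a rough illustration of passing a path instead of an open file object, the sketch below writes to a temporary directory and reads the file back. The sample values and the temporary location are assumptions made so the example cleans up after itself; in the question the target would simply be 'foo.csv'.

```python
import os
import tempfile

import pandas as pd

# Made-up sample data; 'rpm' is already a regular column here
df = pd.DataFrame({"rpm": [0.25, 0.3], "cooling_inner": [303.317, 303.5]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "foo.csv")
    # pandas opens and closes the file itself when given a path string
    df.to_csv(path, index=False, header=True, decimal=",", sep=" ", float_format="%.3f")
    # read the file back only to inspect what was written
    with open(path) as f:
        written = f.read()

print(written)
```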