pandas to_csv: suppress scientific notation in the CSV file when writing a pandas DataFrame to CSV

Question

Asked by ansonw

I am writing a pandas DataFrame to a CSV file. When I do, some of the elements in one of the columns are incorrectly converted to scientific notation. For example, col_1 contains strings such as '104D59'. Most of these appear as strings in the CSV file, as they should. However, occasional strings, such as '104E59', are converted into scientific notation (e.g., 1.04E+61) and appear as numbers in the resulting CSV file.

I am trying to export the csv file into a software package (i.e., pandas -> csv -> software_new) and this change in data type is causing problems with that export.

Is there a way to write the df to a csv, ensuring that all elements in df['problem_col'] are represented as string in the resulting csv or not converted to scientific notation?

Here is the code I have used to write the pandas df to a csv: df.to_csv('df.csv', encoding='utf-8')

I also checked the dtype of the problem column: according to df.dtypes, df['problem_column'] has dtype object.
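One way to narrow this down is to look at the raw text that to_csv produces. The sketch below uses made-up data modelled on the description above (the column values and the extra `value` column are assumptions) and writes to an in-memory buffer instead of a file. pandas writes object-dtype strings verbatim, so if '104E59' appears unchanged here, the conversion is probably happening in whatever program opens the file. Passing `quoting=csv.QUOTE_NONNUMERIC` is one option that may help if that program honours quotes.

```python
import csv
import io

import pandas as pd

# Hypothetical data: an object-dtype column of ID-like strings plus a float column.
df = pd.DataFrame({"problem_col": ["104D59", "104E59"], "value": [1.5, 2.5]})

# Default quoting: the strings are written exactly as they are stored.
plain = io.StringIO()
df.to_csv(plain, index=False)
plain_lines = plain.getvalue().splitlines()
print(plain_lines)    # ['problem_col,value', '104D59,1.5', '104E59,2.5']

# QUOTE_NONNUMERIC wraps every non-numeric field in double quotes.
quoted = io.StringIO()
df.to_csv(quoted, index=False, quoting=csv.QUOTE_NONNUMERIC)
quoted_lines = quoted.getvalue().splitlines()
print(quoted_lines)   # ['"problem_col","value"', '"104D59",1.5', '"104E59",2.5']
```

Whether the quotes are enough depends on the importing software; some programs still guess a numeric type for quoted fields.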

Answer 1

Answered by n1tk

Tested with Python 3.7.2 and pandas 0.23.4 (pd.__version__ returns '0.23.4'):

Options and Settings

To control how the DataFrame is displayed, use pandas.set_option:

import numpy as np   # needed for np.random.randn below
import pandas as pd

# display options: these affect only how the float data is shown, not what to_csv writes

pd.set_option('display.html.table_schema', True)  # publish a Table Schema representation for frontends that support it
pd.set_option('display.precision', 5)  # show 5 digits after the decimal point
df = pd.DataFrame(np.random.randn(20, 4) * 10 ** -12)  # create a random DataFrame of very small floats

Output of the data:

df.dtypes # check datatype for columns

[output]:
0    float64
1    float64
2    float64
3    float64
dtype: object

Dataframe:

df # output of the dataframe

[output]:
               0             1             2             3
0   -2.01082e-12   1.25911e-12   1.05556e-12  -5.68623e-13
1   -6.87126e-13   1.91950e-12   5.25925e-13   3.72696e-13
2   -1.48068e-12   6.34885e-14  -1.72694e-12   1.72906e-12
3   -5.78192e-14   2.08755e-13   6.80525e-13   1.49018e-12
4   -9.52408e-13   1.61118e-13   2.09459e-13   2.10940e-13
5   -2.30242e-13  -1.41352e-13   2.32575e-12  -5.08936e-13
6    1.16233e-12   6.17744e-13   1.63237e-12   1.59142e-12
7    1.76679e-13  -1.65943e-12   2.18727e-12  -8.45242e-13
8    7.66469e-13   1.29017e-13  -1.61229e-13  -3.00188e-13
9    9.61518e-13   9.71320e-13   8.36845e-14  -6.46556e-13
10  -6.28390e-13  -1.17645e-12  -3.59564e-13   8.68497e-13
11   3.12497e-13   2.00065e-13  -1.10691e-12  -2.94455e-12
12  -1.08365e-14   5.36770e-13   1.60003e-12   9.19737e-13
13  -1.85586e-13   1.27034e-12  -1.04802e-12  -3.08296e-12
14   1.67438e-12   7.40403e-14   3.28035e-13   5.64615e-14
15  -5.31804e-13  -6.68421e-13   2.68096e-13   8.37085e-13
16  -6.25984e-13   1.81094e-13  -2.68336e-13   1.15757e-12
17   7.38247e-13  -1.76528e-12  -4.72171e-13  -3.04658e-13
18  -1.06099e-12  -1.31789e-12  -2.93676e-13  -2.40465e-13
19   1.38537e-12   9.18101e-13   5.96147e-13  -2.41401e-12

Now write the file with to_csv, using the float_format='%.15f' parameter:

df.to_csv('estc.csv',sep=',', float_format='%.15f') # write with precision .15

file output:

,0,1,2,3
0,-0.000000000002011,0.000000000001259,0.000000000001056,-0.000000000000569
1,-0.000000000000687,0.000000000001919,0.000000000000526,0.000000000000373
2,-0.000000000001481,0.000000000000063,-0.000000000001727,0.000000000001729
3,-0.000000000000058,0.000000000000209,0.000000000000681,0.000000000001490
4,-0.000000000000952,0.000000000000161,0.000000000000209,0.000000000000211
5,-0.000000000000230,-0.000000000000141,0.000000000002326,-0.000000000000509
6,0.000000000001162,0.000000000000618,0.000000000001632,0.000000000001591
7,0.000000000000177,-0.000000000001659,0.000000000002187,-0.000000000000845
8,0.000000000000766,0.000000000000129,-0.000000000000161,-0.000000000000300
9,0.000000000000962,0.000000000000971,0.000000000000084,-0.000000000000647
10,-0.000000000000628,-0.000000000001176,-0.000000000000360,0.000000000000868
11,0.000000000000312,0.000000000000200,-0.000000000001107,-0.000000000002945
12,-0.000000000000011,0.000000000000537,0.000000000001600,0.000000000000920
13,-0.000000000000186,0.000000000001270,-0.000000000001048,-0.000000000003083
14,0.000000000001674,0.000000000000074,0.000000000000328,0.000000000000056
15,-0.000000000000532,-0.000000000000668,0.000000000000268,0.000000000000837
16,-0.000000000000626,0.000000000000181,-0.000000000000268,0.000000000001158
17,0.000000000000738,-0.000000000001765,-0.000000000000472,-0.000000000000305
18,-0.000000000001061,-0.000000000001318,-0.000000000000294,-0.000000000000240
19,0.000000000001385,0.000000000000918,0.000000000000596,-0.000000000002414

Now write the file with to_csv, using the float_format='%f' parameter:

df.to_csv('estc.csv', sep=',', float_format='%f') # '%f' keeps only 6 decimal places, so values this small are written as 0.000000 or -0.000000
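The effect of the two format strings can be compared on a small fixed DataFrame. This is a minimal sketch: the column name `x`, the two values, and the helper `csv_text` are all made up for illustration, and the output goes to an in-memory buffer rather than to estc.csv.

```python
import io

import pandas as pd

# Hypothetical data: two very small floats that pandas would normally write as 1.5e-12 and -2.25e-07.
df = pd.DataFrame({"x": [1.5e-12, -2.25e-7]})


def csv_text(frame, **kwargs):
    """Return what to_csv would write, as a string (illustrative helper)."""
    buf = io.StringIO()
    frame.to_csv(buf, index=False, **kwargs)
    return buf.getvalue()


default_lines = csv_text(df).splitlines()
fixed15_lines = csv_text(df, float_format="%.15f").splitlines()
fixed6_lines = csv_text(df, float_format="%f").splitlines()

print(default_lines)   # scientific notation is kept
print(fixed15_lines)   # ['x', '0.000000000001500', '-0.000000225000000']
print(fixed6_lines)    # ['x', '0.000000', '-0.000000']
```

The number of decimals in the format string has to be large enough for the smallest value that matters; otherwise the value is rounded away, as the '%f' case shows.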

For more details, see the pandas.DataFrame.to_csv documentation.

Answer 2

Answered by Andy Hayden

Use the float_format argument:

In [11]: df = pd.DataFrame(np.random.randn(3, 3) * 10 ** 12)

In [12]: df
Out[12]:
              0             1             2
0  1.757189e+12 -1.083016e+12  5.812695e+11
1  7.889034e+11  5.984651e+11  2.138096e+11
2 -8.291878e+11  1.034696e+12  8.640301e+08

In [13]: print(df.to_string(float_format='{:f}'.format))
                     0                     1                   2
0 1757188536437.788086 -1083016404775.687134 581269533538.170288
1  788903446803.216797   598465111695.240601 213809584103.112457
2 -829187757358.493286  1034695767987.889160    864030095.691202

Which works similarly for to_csv:

df.to_csv('df.csv', float_format='{:f}'.format, encoding='utf-8')
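A reproducible variant with fixed values instead of random ones might look like the sketch below. The column name `x` and the values are made up. For to_csv it passes a printf-style string, on the assumption that this form is accepted by every pandas version; whether to_csv also accepts a callable such as '{:f}'.format may depend on the installed version.

```python
import io

import pandas as pd

# Hypothetical data: large floats that pandas displays in scientific notation by default.
df = pd.DataFrame({"x": [1.5e12, 8.640301e8]})

# to_string accepts a callable formatter, as in the session above.
text = df.to_string(float_format="{:f}".format)
print(text)

# to_csv with a printf-style format string, written to an in-memory buffer.
buf = io.StringIO()
df.to_csv(buf, index=False, float_format="%f")
csv_lines = buf.getvalue().splitlines()
print(csv_lines)   # ['x', '1500000000000.000000', '864030100.000000']
```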

Answer 3

Answered by evil242

If you would like to use the values as formatted strings in a list, say as a row written with csv.writer, the numbers can be formatted before the list is created:

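import csv

# header_row_list and df are assumed to be defined earlier
with open('results_actout_file', 'w', newline='') as csvfile:
    resultwriter = csv.writer(csvfile, delimiter=',')
    resultwriter.writerow(header_row_list)

    # format each float as a fixed-point string first, so no scientific notation reaches the file
    resultwriter.writerow(df['label'].apply(lambda x: '%.17f' % x).values.tolist())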
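A self-contained version of the same idea, as a sketch: the contents of `df` and `header_row_list` are made up here, and an in-memory buffer stands in for the output file.

```python
import csv
import io

import pandas as pd

# Hypothetical stand-ins for the df and header_row_list used above.
df = pd.DataFrame({"label": [1.5e-12, 2.0e10]})
header_row_list = ["first", "second"]

buf = io.StringIO(newline="")
resultwriter = csv.writer(buf, delimiter=",")
resultwriter.writerow(header_row_list)

# Convert every float to a fixed-point string with 17 decimals before writing the row.
formatted = df["label"].apply(lambda x: "%.17f" % x).tolist()
resultwriter.writerow(formatted)

lines = buf.getvalue().splitlines()
print(lines[1])
```

Because the values are already strings when they reach csv.writer, the writer has no opportunity to switch to scientific notation.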
