Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/48903008/

Time: 2020-09-14 05:12:49  Source: igfitidea

How to save a CSV from a dataframe and keep the leading zeros in a column of numeric codes?

Tags: python, pandas, csv

Asked by Reinaldo Chaves

In Python 3 and pandas I have a dataframe with a column cpf that holds codes:

candidatos_2014.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 26245 entries, 0 to 1063
Data columns (total 7 columns):
uf                 26245 non-null object
cargo              26245 non-null object
nome_completo      26245 non-null object
cpf                26245 non-null object
nome_urna          26245 non-null object
partido_eleicao    26245 non-null object
situacao           26245 non-null object
dtypes: object(7)
memory usage: 1.6+ MB

The codes are numbers like these: "00229379273", "84274662268", "09681949153", "53135636534"...

I saved it as CSV:

candidatos_2014.to_csv('candidatos_2014.csv')

I use Ubuntu and LibreOffice. But when I opened the file, the cpf column did not show the leading zeros:

"229379273", "9681949153"

Is there a way to save a CSV that keeps the leading zeros in a column that contains only digits?

Accepted answer by Sociopath

Specify the dtype as string when reading the CSV file, as below:

import pandas as pd

# if you are reading data with leading zeros
candidatos_2014 = pd.read_csv('candidatos_2014.csv', dtype=str)

Or convert the column to string before writing:

# if data is generated in python you can convert column into string first
candidatos_2014['cpf'] = candidatos_2014['cpf'].astype('str')
candidatos_2014.to_csv('candidatos_2014.csv')
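
To see why the dtype matters when reading, here is a minimal round-trip sketch. The small dataframe is made-up sample data reusing codes from the question, and it writes to an in-memory string rather than a real file, so treat it as an illustration and not the asker's actual setup.

```python
import io

import pandas as pd

# Hypothetical sample data reusing cpf codes quoted in the question.
df = pd.DataFrame({"cpf": ["00229379273", "84274662268", "09681949153"]})

# With no path given, to_csv returns the CSV text as a string.
csv_text = df.to_csv(index=False)

# Default parsing infers an integer column, so the leading zeros are dropped.
as_numbers = pd.read_csv(io.StringIO(csv_text))

# Forcing a string dtype keeps every code exactly as it was written.
as_text = pd.read_csv(io.StringIO(csv_text), dtype=str)

print(as_numbers["cpf"].tolist())
print(as_text["cpf"].tolist())
```

If only some columns hold codes, a dict such as dtype={'cpf': str} limits the string conversion to those columns.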

Answer by Ivan S.

First, check whether the zeros are really missing from the CSV file itself, for example by opening it in a plain text editor. If the zeros are there but you open the file in Excel or another spreadsheet, you can still see values without leading zeros. In that case, go to the Data menu, then Import from Text. Excel's import utility will give you options to define each column's data type.
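
One way to do that check without a spreadsheet is to read the file back as plain text. This is only a sketch: the sample data and the file name candidatos_sample.csv are made up for illustration, and the file is written to a temporary directory.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data reusing cpf codes quoted in the question.
df = pd.DataFrame({"cpf": ["00229379273", "09681949153"]})

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "candidatos_sample.csv"  # hypothetical file name
    df.to_csv(path, index=False)
    # Read the raw text, bypassing any spreadsheet or pandas type inference.
    raw_lines = path.read_text().splitlines()

print(raw_lines)
```

If the zeros show up in the raw lines, the file is fine and the spreadsheet's import settings are what strip them.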

I am sure that it should be similar in other apps.

Hope it helps!

Answer by Joe Germuska

TL;DR: you don't have to do anything if your pandas column has dtype object.

I feel like both answers here, but especially the accepted answer, are confusing. The short answer is that if the dtype of your column is object, then pandas will write it with leading zeros. There's nothing to do.
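
A small sketch of that point, using made-up sample data: the same codes are written once from an object column and once from an integer column, and only the object column keeps its zeros. The explicit dtype=object is there only to make the column type unambiguous in the example.

```python
import pandas as pd

# Hypothetical sample data; dtype=object stores the codes as Python strings.
as_object = pd.DataFrame({"cpf": ["00229379273", "09681949153"]}, dtype=object)

# The same codes converted to integers, where the zeros cannot be represented.
as_int = as_object.astype({"cpf": "int64"})

# With no path given, to_csv returns the CSV text as a string.
object_lines = as_object.to_csv(index=False).splitlines()
int_lines = as_int.to_csv(index=False).splitlines()

print(object_lines)
print(int_lines)
```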

If, like me, you came here because you weren't sure of that and the leading zeros were gone when you opened the CSV, then follow Ivan S.'s advice: take a look at the file you wrote to verify. You should see the leading zeros there.

If you do, then both answers give guidance on how to read the data back in while preserving the leading zeros.

If you don't, the datatype wasn't correct in pandas when you saved the CSV. Just changing that column using astype wouldn't restore the zeros. You'd also need to use str.zfill, as described in this SO answer.
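
A minimal sketch of that last case, assuming the column ended up as integers and that every code should be 11 digits wide (the width is an assumption taken from the examples in the question):

```python
import pandas as pd

# Hypothetical column stored as integers, so the zeros are already gone.
df = pd.DataFrame({"cpf": [229379273, 84274662268, 9681949153]})

# astype(str) alone would give '229379273'; zfill pads on the left with zeros.
# The width of 11 is an assumption based on the 11-digit codes in the question.
df["cpf"] = df["cpf"].astype(str).str.zfill(11)

print(df["cpf"].tolist())
```

This only works when the intended width is known, since the original number of zeros cannot be recovered from the integers themselves.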