How to save a CSV from a pandas DataFrame and keep leading zeros in a column of digits?
Source: http://stackoverflow.com/questions/48903008/
This is a Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by Reinaldo Chaves
In Python 3 and pandas I have a dataframe with a column cpf that contains codes:
candidatos_2014.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 26245 entries, 0 to 1063
Data columns (total 7 columns):
uf 26245 non-null object
cargo 26245 non-null object
nome_completo 26245 non-null object
cpf 26245 non-null object
nome_urna 26245 non-null object
partido_eleicao 26245 non-null object
situacao 26245 non-null object
dtypes: object(7)
memory usage: 1.6+ MB
The codes are numbers like these: "00229379273", "84274662268", "09681949153", "53135636534"...
I saved it as CSV:
candidatos_2014.to_csv('candidatos_2014.csv')
I use Ubuntu and LibreOffice. But when I opened the file, the cpf column did not show the leading zeros:
"229379273", "9681949153"
Is there a way to save a CSV that keeps the leading zeros in a column that contains only digits?
Accepted answer by Sociopath
Specify the dtype as string when reading the CSV file, as below:
import pandas as pd
# If you are reading data with leading zeros, read every column as a string
candidatos_2014 = pd.read_csv('candidatos_2014.csv', dtype=str)
Or convert the column to strings before saving:
# if data is generated in python you can convert column into string first
candidatos_2014['cpf'] = candidatos_2014['cpf'].astype('str')
candidatos_2014.to_csv('candidatos_2014.csv')
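A minimal sketch of the round trip, assuming a small made-up frame (the nome values are hypothetical; the cpf values come from the question) and an in-memory buffer instead of a file on disk. It shows that reading back without a dtype drops the zeros, while passing a string dtype for the cpf column keeps them:

```python
import io

import pandas as pd

# Hypothetical sample data; the cpf codes are the ones quoted in the question.
df = pd.DataFrame({"nome": ["A", "B"], "cpf": ["00229379273", "84274662268"]})

# Write to an in-memory buffer rather than a real file.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Default read: pandas infers an integer dtype, so the leading zeros are lost.
buffer.seek(0)
inferred = pd.read_csv(buffer)

# Read with an explicit string dtype for cpf only: the zeros survive.
buffer.seek(0)
as_text = pd.read_csv(buffer, dtype={"cpf": str})

print(inferred["cpf"].tolist())
print(as_text["cpf"].tolist())
```

Passing a dict to dtype limits the string conversion to the named column, so other columns can still be inferred as numbers.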
Answer by Ivan S.
First, check whether the leading zeros are actually missing from your CSV file, for example by opening it in a plain text editor. If the zeros are there but you are opening the file in Excel or another spreadsheet, you may still see the values without leading zeros. In that case, go to the Data menu, then Import from Text. Excel's import utility will give you options to define each column's data type.
I am sure that it should be similar in other apps.
Hope it helps!
Answer by Joe Germuska
TLDR: you don't have to do anything if your pandas columns are type object
I feel like both answers here, but especially the accepted answer, are confusing. The short answer is that, if the dtype of your column is object, then pandas will write it with leading zeros. There's nothing to do.
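One way to check this claim without a spreadsheet is to look at the raw text pandas produces. The sketch below assumes a one-column frame built from the question's codes and forces the object dtype explicitly, since the default string dtype can differ between pandas versions:

```python
import pandas as pd

# Assumed sample: codes from the question, stored as Python strings (object dtype).
df = pd.DataFrame({"cpf": ["00229379273", "09681949153"]}, dtype=object)

# With no path argument, to_csv returns the CSV content as a string.
csv_text = df.to_csv(index=False)

print(df["cpf"].dtype)
print(csv_text)
```

If the printed text shows the zeros, the file is fine and the spreadsheet import is what strips them.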
If, like me, you came here because you weren't sure of that and the leading zeros were gone when you opened the CSV, then follow Ivan S.'s advice and look at the file you wrote to verify. You should see the leading zeros there.
If you do, then both answers give guidance on how to read the data back in while preserving the leading zeros.
If you don't, then the dtype wasn't correct in pandas when you saved the CSV. Just changing that column using astype wouldn't restore the zeros. You'd also need to use str.zfill, as described in this SO answer.
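As a rough illustration of the astype plus str.zfill combination, the sketch below assumes the cpf column was already parsed as integers and that every code should be 11 digits wide (an assumption based on the examples in the question; CPF_WIDTH is a name introduced here):

```python
import pandas as pd

# Hypothetical frame where cpf was parsed as integers, so the zeros are already gone.
df = pd.DataFrame({"cpf": [229379273, 84274662268, 9681949153]})

# Assumption: all codes are 11 digits long, as in the question's examples.
CPF_WIDTH = 11

# astype(str) alone would give "229379273"; zfill pads with zeros on the left.
df["cpf"] = df["cpf"].astype(str).str.zfill(CPF_WIDTH)

print(df["cpf"].tolist())
```

This only works when the intended width is known; zfill cannot recover how many zeros a code had if the codes vary in length.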