Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/22276503/

Time: 2020-08-19 00:39:12  Source: igfitidea

How do I change data-type of pandas data frame to string with a defined format?

Tags: python, string, floating-point, pandas, format

Asked by user1718097

I'm starting to tear my hair out with this - so I hope someone can help. I have a pandas DataFrame that was created from an Excel spreadsheet using openpyxl. The resulting DataFrame looks like:

print image_name_data
     id           image_name
0  1001  1001_mar2014_report
1  1002  1002_mar2014_report
2  1003  1003_mar2014_report

[3 rows x 2 columns]

…with the following datatypes:

print image_name_data.dtypes
id            float64
image_name     object
dtype: object

The issue is that the numbers in the id column are, in fact, identification numbers and I need to treat them as strings. I've tried converting the id column to strings using:

image_name_data['id'] = image_name_data['id'].astype('str')

This seems a bit ugly but it does produce a variable of type 'object' rather than 'float64':

print image_name_data.dtypes
id            object
image_name    object
dtype: object

However, the strings that are created have a decimal point, as shown:

print image_name_data
       id           image_name
0  1001.0  1001_mar2014_report
1  1002.0  1002_mar2014_report
2  1003.0  1003_mar2014_report

[3 rows x 2 columns]

How can I convert a float64 column in a pandas DataFrame to a string with a given format (in this case, for example, '%10.0f')?

Accepted answer by exp1orer

I'm unable to reproduce your problem but have you tried converting it to an integer first?

image_name_data['id'] = image_name_data['id'].astype(int).astype('str')

Then, regarding your more general question, you could use map (as in this answer). In your case:

image_name_data['id'] = image_name_data['id'].map('{:.0f}'.format)
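
To compare the two suggestions side by side, here is a minimal sketch. The DataFrame is a hypothetical stand-in rebuilt from the values shown in the question, and the third option is only my guess at how the asker's '%10.0f' format could be applied. Note that astype(int) raises an error if the column contains NaN, while '{:.0f}'.format turns NaN into the string 'nan'.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame in the question
image_name_data = pd.DataFrame({
    "id": [1001.0, 1002.0, 1003.0],
    "image_name": [
        "1001_mar2014_report",
        "1002_mar2014_report",
        "1003_mar2014_report",
    ],
})

# Option 1: cast to int first, then to str (raises if the column has NaN)
via_int = image_name_data["id"].astype(int).astype(str)

# Option 2: apply a format string to every value
via_format = image_name_data["id"].map("{:.0f}".format)

# Option 3: the '%10.0f' format from the question, right-aligned to width 10
via_percent = image_name_data["id"].map(lambda x: "%10.0f" % x)

print(via_int.tolist())      # ['1001', '1002', '1003']
print(via_format.tolist())   # ['1001', '1002', '1003']
print(via_percent.tolist())  # each value padded with leading spaces to 10 characters
```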

Answer by exp1orer

I'm putting this in a new answer because comments don't allow line breaks or code blocks. I assume you want those NaNs to turn into blank strings? I couldn't find a nice way to do this, only this ugly method:

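import pandas as pd

s = pd.Series([1001., 1002., None])
a = s.loc[s.isnull()].fillna('')                 # missing values become ''
b = s.loc[s.notnull()].astype(int).astype(str)   # the rest: float -> int -> str
result = pd.concat([a, b]).sort_index()          # sort_index restores the original row order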
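
A possible single-pass alternative, offered as a sketch rather than as the answerer's method: map each value through a small helper that returns an empty string for missing values. The helper name format_id is made up for illustration.

```python
import pandas as pd

s = pd.Series([1001.0, 1002.0, None])

def format_id(value):
    """Hypothetical helper: '' for missing values, otherwise no decimal places."""
    return "" if pd.isna(value) else "{:.0f}".format(value)

# map calls format_id on every element and keeps the original index order
result = s.map(format_id)
print(result.tolist())  # ['1001', '1002', '']
```

This keeps the rows in their original order, so no concat or re-sorting is needed.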

Answer by smishra

If you can reload the data, you might be able to use the dtype argument.

pd.read_csv(..., dtype={'COL_NAME':'str'})
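
As a small illustration of that idea, the sketch below reads an in-memory CSV whose contents are invented for the example. With dtype set for the id column, the values arrive as strings and are never converted to floats. As far as I know pd.read_excel accepts a dtype argument as well, which may be closer to the asker's Excel case, but that is not shown here.

```python
import io
import pandas as pd

# Made-up CSV content standing in for the real file
csv_text = (
    "id,image_name\n"
    "1001,1001_mar2014_report\n"
    "1002,1002_mar2014_report\n"
)

# Read the id column as text instead of letting pandas infer a numeric type
df = pd.read_csv(io.StringIO(csv_text), dtype={"id": str})
print(df["id"].tolist())  # ['1001', '1002']
```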