Python: How to change the datatype of a pandas DataFrame column to a string with a defined format?

Question

Asked by user1718097

I'm starting to tear my hair out with this - so I hope someone can help. I have a pandas DataFrame that was created from an Excel spreadsheet using openpyxl. The resulting DataFrame looks like:

print image_name_data
     id           image_name
0  1001  1001_mar2014_report
1  1002  1002_mar2014_report
2  1003  1003_mar2014_report

[3 rows x 2 columns]

…with the following datatypes:

print image_name_data.dtypes
id            float64
image_name     object
dtype: object

The issue is that the numbers in the id column are, in fact, identification numbers and I need to treat them as strings. I've tried converting the id column to strings using:

image_name_data['id'] = image_name_data['id'].astype('str')

This seems a bit ugly but it does produce a variable of type 'object' rather than 'float64':

print image_name_data.dtypes
id            object
image_name    object
dtype: object

However, the strings that are created have a decimal point, as shown:

print image_name_data
       id           image_name
0  1001.0  1001_mar2014_report
1  1002.0  1002_mar2014_report
2  1003.0  1003_mar2014_report

[3 rows x 2 columns]

How can I convert a float64 column in a pandas DataFrame to a string with a given format (in this case, for example, '%10.0f')?

Answer 1

Accepted answer by exp1orer

I'm unable to reproduce your problem but have you tried converting it to an integer first?

image_name_data['id'] = image_name_data['id'].astype(int).astype('str')

Then, regarding your more general question, you could use map (as in this answer). In your case:

image_name_data['id'] = image_name_data['id'].map('{:.0f}'.format)
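As a minimal sketch of the two approaches above, assuming a small hand-built DataFrame in place of the one loaded from Excel (the sample values are copied from the question; the variable names `ids_via_int`, `ids_via_format`, and `ids_padded` are made up for illustration), the following compares the integer cast with `map` and a format string, including the `'%10.0f'` style the question asks about:

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame loaded from Excel in the question.
image_name_data = pd.DataFrame({
    'id': [1001.0, 1002.0, 1003.0],
    'image_name': ['1001_mar2014_report',
                   '1002_mar2014_report',
                   '1003_mar2014_report'],
})

# Option 1: cast to int first, then to str (fails if the column contains NaN).
ids_via_int = image_name_data['id'].astype(int).astype(str)

# Option 2: format each float with zero decimal places.
ids_via_format = image_name_data['id'].map('{:.0f}'.format)

# Old-style '%' formatting: '%10.0f' right-aligns in a field 10 characters wide.
ids_padded = image_name_data['id'].map(lambda v: '%10.0f' % v)

print(ids_via_int.tolist())
print(ids_via_format.tolist())
print(ids_padded.tolist())
```

Note that the format-string variants round while `astype(int)` truncates, which makes no difference for whole-number IDs like these.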

Answer 2

Answer by exp1orer

I'm putting this in a new answer because comments don't allow line breaks or code blocks. I assume you want those NaNs to turn into blank strings? I couldn't find a nice way to do this, only this ugly method:

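import pandas as pd

s = pd.Series([1001., 1002., None])
a = s.loc[s.isnull()].fillna('')                # missing values become ''
b = s.loc[s.notnull()].astype(int).astype(str)  # the rest become '1001', '1002'
result = pd.concat([a, b]).sort_index()         # restore the original row order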
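One possible alternative, offered as a sketch rather than as the answerer's method, is to format the values in a single pass with a small helper that returns an empty string for missing values. The name `format_id` is invented for this example:

```python
import pandas as pd

s = pd.Series([1001.0, 1002.0, None])

def format_id(value):
    """Hypothetical helper: '' for missing values, otherwise no decimals."""
    return '' if pd.isna(value) else '{:.0f}'.format(value)

# map calls the helper on every element, NaN included, and keeps the index.
result = s.map(format_id)
print(result.tolist())
```

Because nothing is split and re-joined, the original row order is preserved without any extra step.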

Answer 3

Answer by smishra

If you can reload the data, you might be able to use the dtype argument:

pd.read_csv(..., dtype={'COL_NAME':'str'})
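As a small illustration of that idea, assuming the data is available as CSV (the inline CSV text below is made up to mirror the question's table), passing `dtype` at read time keeps the IDs as strings, so no float conversion ever happens. `pd.read_excel` also accepts a `dtype` argument in reasonably recent pandas versions, which may be closer to the asker's situation.

```python
import io

import pandas as pd

# Made-up CSV content mirroring the table in the question.
csv_text = """id,image_name
1001,1001_mar2014_report
1002,1002_mar2014_report
1003,1003_mar2014_report
"""

# Read the id column as str so it is never parsed as float64.
df = pd.read_csv(io.StringIO(csv_text), dtype={'id': str})
print(df['id'].tolist())
```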
