Python pandas: changing the data type of a Series to string

Question

Asked by Zhubarb

I am using pandas 0.12.0 with Python 2.7 and have a DataFrame as below:

import pandas as pd

df = pd.DataFrame({'id': [123, 512, 'zhub1', 12354.3, 129, 753, 295, 610],
                   'colour': ['black', 'white', 'white', 'white',
                              'black', 'black', 'white', 'white'],
                   'shape': ['round', 'triangular', 'triangular', 'triangular', 'square',
                             'triangular', 'round', 'triangular']
                   }, columns=['id', 'colour', 'shape'])

The id Series consists of some integers and strings. Its dtype by default is object. I want to convert all contents of id to strings. I tried astype(str), which produces the output below.

df['id'].astype(str)
0    1
1    5
2    z
3    1
4    1
5    7
6    2
7    6

1) How can I convert all elements of id to strings?

2) I will eventually use id for indexing DataFrames. Would having string indices in a DataFrame slow things down, compared to having an integer index?

Answer 1

Accepted answer by Amit Verma

You can convert all elements of id to str using apply:

df.id.apply(str)

0        123
1        512
2      zhub1
3    12354.3
4        129
5        753
6        295
7        610
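As a minimal sketch of the apply(str) approach above (using a shortened version of the question's id column, which is an assumption made for brevity), the following checks that every element really ends up as a Python str:

```python
import pandas as pd

# Shortened version of the question's mixed-type column (assumption for brevity)
df = pd.DataFrame({"id": [123, 512, "zhub1", 12354.3, 129]})

# apply(str) calls the built-in str() on each element individually
id_as_str = df["id"].apply(str)

# Confirm that every element is now a Python str
all_str = all(isinstance(v, str) for v in id_as_str)
values = id_as_str.tolist()
print(all_str, values)
```

Note that apply returns a new Series; df itself is left untouched unless the result is assigned back.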

Edit by OP:

I think the issue was related to the Python version (2.7); this worked:

df['id'].astype(basestring)
0        123
1        512
2      zhub1
3    12354.3
4        129
5        753
6        295
7        610
Name: id, dtype: object

Answer 2

Answered by Rishil Antony

You must assign the result back, like this:

df['id']= df['id'].astype(str)
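The reason the assignment matters is that astype returns a new Series rather than modifying the column in place. A small sketch, assuming a three-row stand-in for the question's data:

```python
import pandas as pd

# Three-row stand-in for the question's DataFrame (assumption for brevity)
df = pd.DataFrame({"id": [123, 512, "zhub1"]})

# Without assignment the converted Series is discarded and df is unchanged
df["id"].astype(str)
type_before = type(df["id"].iloc[0])

# Assigning the result back replaces the column with the converted values
df["id"] = df["id"].astype(str)
type_after = type(df["id"].iloc[0])

print(type_before, type_after)
```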

Answer 3

Answered by manesioz

Personally, none of the above worked for me. What did work:

new_str = [str(x) for x in old_obj][0]

Answer 4

Answered by rocksNwaves

A new answer to reflect the most current practices: as of version 1.0.1, neither astype('str') nor astype(str) produces the dedicated string dtype; both leave the Series with dtype object.

As per the documentation, a Series can be converted to the string datatype in the following ways:

df['id'] = df['id'].astype("string")

df['id'] = pd.Series(df['id'], dtype="string")

df['id'] = pd.Series(df['id'], dtype=pd.StringDtype())
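A hedged sketch of getting the dedicated string dtype for a mixed column like the one in the question, assuming pandas 1.0 or later. Because a direct astype("string") on mixed values may fail on some versions, this sketch first converts each element with astype(str) and only then switches to the string dtype; the two-step chain is a choice made here, not something the documentation prescribes.

```python
import pandas as pd

# Shortened version of the question's mixed-type column (assumption for brevity)
df = pd.DataFrame({"id": [123, 512, "zhub1", 12354.3]})

# Step 1: turn every element into a Python str
# Step 2: switch to the dedicated string extension dtype
df["id"] = df["id"].astype(str).astype("string")

is_string_dtype = isinstance(df["id"].dtype, pd.StringDtype)
print(df["id"].dtype, df["id"].tolist())
```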

Answer 5

Answered by user3423349

You can use:

df.loc[:,'id'] = df.loc[:, 'id'].astype(str)

This is why they recommend this solution: Pandas doc

TL;DR

To reflect some of the answers:

df['id'] = df['id'].astype("string")

This will break on the given example because it will try to convert to StringArray, which cannot handle the non-string values (the numbers) in the column.

df['id']= df['id'].astype(str)

For me this solution raises a warning:

> SettingWithCopyWarning:  
> A value is trying to be set on a copy of a
> slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead
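This warning typically shows up when the DataFrame being modified is itself a slice of another DataFrame. The answer does not say how its df was built, so the filtered subset below is a hypothetical setup. One common way to avoid the ambiguity is to take an explicit copy before assigning:

```python
import pandas as pd

df = pd.DataFrame({"id": [123, 512, "zhub1", 129],
                   "colour": ["black", "white", "white", "black"]})

# Hypothetical subset: .copy() makes it an independent DataFrame,
# so the assignment below is unambiguous and does not touch df
subset = df[df["colour"] == "white"].copy()
subset["id"] = subset["id"].astype(str)

print(subset["id"].tolist())
```

The original df keeps its mixed-type id column; only the copied subset is converted.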

Answer 6

Answered by shekhar chander

Your problem can easily be solved by converting the column to object first. Once it is object, just use astype to convert it to str.

# Convert to object first, then to str, as described above
df['id'] = df['id'].astype(object).astype(str)
