Python: How to convert a column with dtype object to string in a Pandas DataFrame

Question

Asked by user3546523

When I read a CSV file into a pandas DataFrame, each column is cast to its own data type. One of my columns was read as object. I want to perform string operations on this column, such as splitting the values and creating a list, but no such operation seems possible because its dtype is object. Can anyone tell me how to convert all the items of a column to strings instead of objects?

I tried several approaches, but nothing worked. I used astype, str(), to_string, etc.

a=lambda x: str(x).split(',')
df['column'].apply(a)

or

df['column'].astype(str)

Answer 1

Answered by Hypothetical Ninja

Did you try assigning it back to the column?

df['column'] = df['column'].astype('str')
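The point about assigning back can be seen in a small sketch. The sample values below are made up for illustration; the only thing it relies on is that `astype` returns a new Series and leaves the DataFrame untouched unless the result is stored.

```python
import pandas as pd

# Hypothetical sample data: mixed values, so the column is read as object.
df = pd.DataFrame({"column": [1, 2.5, "a,b"]})

# Calling astype without assignment returns a new Series; df is unchanged.
df["column"].astype(str)
unchanged = [type(v).__name__ for v in df["column"]]

# Assigning the result back replaces the column with string values.
df["column"] = df["column"].astype(str)
converted = [type(v).__name__ for v in df["column"]]

print(unchanged)
print(converted)
```

After the assignment every element is a Python `str`, so string operations apply to each value.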

As explained in a related question, a pandas DataFrame stores pointers to the strings, which is why the column has dtype 'object'. As per the docs, you could try:

df['column_new'] = df['column'].str.split(',')
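As a minimal sketch of the split above, assuming a column that already holds comma-separated strings (the sample data is invented), each cell of the new column becomes a Python list:

```python
import pandas as pd

# Hypothetical sample data: comma-separated strings in one column.
df = pd.DataFrame({"column": ["a,b,c", "d,e", "f"]})

# .str.split works element-wise on the column; no prior conversion is needed
# when the values are already strings.
df["column_new"] = df["column"].str.split(",")

parts = df["column_new"].tolist()
print(parts)
```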

Answer 2

Answered by Siraj S.

Since string data has variable length, it is stored as object dtype by default. If you want to store the values as a string type, you can do something like this:

df['column'] = df['column'].astype('|S80')  # the max length is set to 80 bytes

or alternatively

df['column'] = df['column'].astype('|S')  # by default, the length is set to the longest value it encounters
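To show what the `|S80` and `|S` dtype codes mean, here is a small NumPy-only sketch with made-up values. It illustrates the dtype codes themselves; how pandas stores the result of such a cast may differ between versions, so treat it as background rather than a description of pandas behaviour. Note that under Python 3 the elements come out as bytes, not `str`.

```python
import numpy as np

# Hypothetical sample values as a NumPy unicode array.
values = np.array(["abc", "de"])

fixed = values.astype("|S80")  # fixed width: 80 bytes reserved per element
auto = values.astype("|S")     # width taken from the longest element

print(fixed.dtype, fixed.dtype.itemsize)
print(auto.dtype, auto.dtype.itemsize)
print(type(auto[0]), auto[0])
```

Because the values are bytes, text methods that expect `str` would need a decode step first.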

Answer 3

Answered by koshmaster

You could try using df['column'].str. followed by any string function. The pandas documentation lists them, including split.
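A few of the `.str` methods in action, as a rough sketch on invented data. The `None` entry is included to show that missing values pass through rather than raising an error:

```python
import pandas as pd

# Hypothetical sample data with one missing value.
df = pd.DataFrame({"column": ["Apple,Pear", "fig", None]})

upper = df["column"].str.upper()                       # element-wise upper-casing
has_comma = df["column"].str.contains(",", na=False)   # missing values count as False
lengths = df["column"].str.len()                       # length of each string

print(upper.tolist())
print(has_comma.tolist())
print(lengths.tolist())
```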

Answer 4

Answered by zurfyx

This does not answer the question directly, but it might help someone else.

I have a column called Volume that contains both - (invalid/NaN) and numbers formatted with ,

df['Volume'] = df['Volume'].astype('str')
df['Volume'] = df['Volume'].str.replace(',', '')
df['Volume'] = pd.to_numeric(df['Volume'], errors='coerce')

Casting to string is required so that str.replace can be applied.
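The three steps above can be run end to end on a small, made-up Volume column. The integer `10` is an assumption added here to show why the cast matters: on a mixed object column, `.str.replace` typically yields NaN for non-string elements, so casting first keeps such values. `regex=False` is passed so the comma is treated as a literal character.

```python
import pandas as pd

# Hypothetical sample data: comma-formatted numbers, a "-" placeholder,
# and one plain integer mixed in.
df = pd.DataFrame({"Volume": ["1,234", "-", "56,789", 10]})

df["Volume"] = df["Volume"].astype(str)                        # make every value a string
df["Volume"] = df["Volume"].str.replace(",", "", regex=False)  # drop the commas
df["Volume"] = pd.to_numeric(df["Volume"], errors="coerce")    # "-" becomes NaN

values = df["Volume"].tolist()
print(values)
```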

pandas.Series.str.replace
pandas.to_numeric
