Python - Convert all items in a DataFrame to strings

Question

Asked by theprowler

I followed the procedure from "In Python, how do I convert all of the items in a list to floats?", because each column of my Dataframe is a list, but instead of floats I chose to change all the values to strings.

df = [str(i) for i in df]

But this failed. It simply erased all the data except for the first row of column names.

Then, trying df = [str(i) for i in df.values] changed the entire Dataframe into one big list, which messes up the data too much to meet the goal of my script: exporting the Dataframe to my Oracle table.
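For anyone wondering why both attempts lose data, here is a minimal sketch. The frame and its column names are made up for illustration. Iterating over a DataFrame walks its column labels, and iterating over df.values walks whole rows, so neither comprehension ever visits individual cells.

```python
import pandas as pd

# Hypothetical two-column frame, only for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3.5, 4.5]})

# Iterating over a DataFrame yields its column labels, not rows or cells,
# so this comprehension only ever sees "a" and "b".
labels_only = [str(i) for i in df]

# Iterating over df.values yields one NumPy array per row, so each row is
# collapsed into a single string and the column structure is lost.
rows_as_text = [str(i) for i in df.values]

print(labels_only)
print(rows_as_text)
```

In both cases the result is a plain Python list rather than a DataFrame, which is why it can no longer be exported as a table.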

Is there a way to convert all the items that are in my Dataframe that are NOT strings into strings?

Answer 1

Answered by PdevG

You can use this:

df = df.astype(str)
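A small sketch of what this does, using a made-up frame (the column names are assumptions, not from the question). The shape and labels stay the same and every cell becomes a Python str. The dtype label reported for the converted columns may differ between pandas versions, so the check below looks at the values rather than the dtype.

```python
import pandas as pd

# Hypothetical mixed-type frame, only for illustration.
df = pd.DataFrame({"id": [1, 2], "price": [9.99, 12.5], "name": ["x", "y"]})

converted = df.astype(str)

# Every cell should now be a plain Python string.
all_strings = all(isinstance(v, str) for v in converted.to_numpy().ravel())

print(converted.shape, list(converted.columns), all_strings)
```

Because the result is still a DataFrame, it can be passed on to whatever export step comes next.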

Out of curiosity, I decided to see whether there is any difference in efficiency between the accepted solution (df.applymap) and mine (df.astype).

The results are below:

Example df:

import pandas as pd
df = pd.DataFrame([list(range(1000))], index=[0])

Test df.astype:

%timeit df.astype(str)
100 loops, best of 3: 2.18 ms per loop

Test df.applymap:

%timeit df.applymap(str)
1 loops, best of 3: 245 ms per loop

It seems df.astype is quite a lot faster, roughly 100 times on this example (2.18 ms vs 245 ms) :)

Answer 2

Answered by Psidom

You can use the applymap method:

df = df.applymap(str)
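One caveat for newer installs: pandas 2.1 deprecated DataFrame.applymap in favour of DataFrame.map, so the line above may warn or fail depending on the version. Below is a hedged sketch of a version-tolerant wrapper. The helper name stringify_cells and the sample frame are hypothetical, not part of pandas or of the original answer.

```python
import pandas as pd


def stringify_cells(frame: pd.DataFrame) -> pd.DataFrame:
    """Apply str() to every cell with whichever element-wise method exists."""
    # Check the class rather than the instance so that a column that happens
    # to be named "map" cannot be mistaken for the method.
    if hasattr(pd.DataFrame, "map"):
        return frame.map(str)
    return frame.applymap(str)


# Hypothetical frame, only for illustration.
df = pd.DataFrame({"n": [1, 2], "flag": [True, False]})
result = stringify_cells(df)
print(result)
```

Either branch calls str() once per cell in Python, which is the likely reason it is slower than astype on wide frames.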

Answer 3

Answered by Sander van den Oord

With pandas >= 1.0 there is now a dedicated string datatype:

You can convert your column to this pandas string datatype using .astype('string'):

df = df.astype('string')

This is different from using str, which sets the pandas 'object' datatype:

df = df.astype(str)

You can see the difference in datatypes when you look at the info of the dataframe:

df = pd.DataFrame({
    'zipcode_str': [90210, 90211],
    'zipcode_string': [90210, 90211],
})

df['zipcode_str'] = df['zipcode_str'].astype(str)
df['zipcode_string'] = df['zipcode_string'].astype('string')

df.info()

# you can see that the first column has dtype object
# while the second column has the new dtype string
#  #   Column          Non-Null Count  Dtype 
# ---  ------          --------------  ----- 
#  0   zipcode_str     2 non-null      object
#  1   zipcode_string  2 non-null      string
# dtypes: object(1), string(1)

From the docs:

The 'string' extension type solves several issues with object-dtype NumPy arrays:
1) You can accidentally store a mixture of strings and non-strings in an object dtype array. A StringArray can only store strings.
2) object dtype breaks dtype-specific operations like DataFrame.select_dtypes(). There isn't a clear way to select just text while excluding non-text, but still object-dtype columns.
3) When reading code, the contents of an object dtype array is less clear than string.
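To make point 1 concrete, here is a minimal sketch with made-up zipcode values. It assumes that assigning a non-string into a 'string' column raises TypeError or ValueError. The exact exception type and message seem to vary by pandas version, so both are caught.

```python
import pandas as pd

# Two series holding the same text, one per dtype (values are illustrative).
as_object = pd.Series(["90210", "90211"], dtype=object)
as_string = pd.Series(["90210", "90211"], dtype="string")

# An object column silently accepts a non-string value.
as_object[0] = 90210

# The dedicated string dtype is expected to reject it.
try:
    as_string[0] = 90210
    rejected = False
except (TypeError, ValueError):
    rejected = True

print(type(as_object[0]).__name__, rejected)
```

The object series now holds an int next to a str without any warning, which is the kind of accidental mixture the docs describe.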

Information about pandas 1.0 can be found here:
https://pandas.pydata.org/pandas-docs/version/1.0.0/whatsnew/v1.0.0.html

Answer 4

Answered by Sarbari Roy

This worked for me:

dt.applymap(lambda x: x[0] if type(x) is list else None)
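Note that this one does not convert anything to strings. As far as I can tell it unwraps cells that hold lists by keeping the first element and replaces every other cell with None. A hedged sketch on a made-up frame follows. The data and the names first_item and elementwise are assumptions for illustration, and DataFrame.map is used where available because applymap was deprecated in pandas 2.1.

```python
import pandas as pd

# Hypothetical frame whose cells are mostly lists, only for illustration.
dt = pd.DataFrame({"a": [[1, 2], [3]], "b": [["x"], "plain"]})


def first_item(x):
    # Same logic as the one-liner above: first element of a list, else None.
    return x[0] if type(x) is list else None


# Use DataFrame.map on newer pandas and applymap on older versions.
elementwise = dt.map if hasattr(pd.DataFrame, "map") else dt.applymap
unwrapped = elementwise(first_item)
print(unwrapped)
```

An empty list would raise IndexError here, and the non-list cell "plain" is discarded, so a follow-up astype(str) is still needed if the goal is an all-string frame.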
