Python: Convert a Pandas Series containing strings to boolean

Question

Asked by working4coins

I have a DataFrame named df:

  Order Number       Status
1         1668  Undelivered
2        19771  Undelivered
3    100032108  Undelivered
4         2229    Delivered
5        00056  Undelivered

I would like to convert the Status column to boolean (True when Status is Delivered and False when Status is Undelivered), but if Status is neither 'Undelivered' nor 'Delivered' it should be treated as NaN or something similar.

I would like to use a dict:

d = {
  'Delivered': True,
  'Undelivered': False
}

so that I can easily add other strings that should be treated as either True or False.

Answer 1

Accepted answer by joris

You can just use map:

In [7]: df = pd.DataFrame({'Status':['Delivered', 'Delivered', 'Undelivered',
                                     'SomethingElse']})

In [8]: df
Out[8]:
          Status
0      Delivered
1      Delivered
2    Undelivered
3  SomethingElse

In [9]: d = {'Delivered': True, 'Undelivered': False}

In [10]: df['Status'].map(d)
Out[10]:
0     True
1     True
2    False
3      NaN
Name: Status, dtype: object
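
The result above has dtype object because it mixes booleans with NaN. As a minimal sketch, assuming a pandas version that provides the nullable 'boolean' extension dtype (1.0 or later), the mapped column can be cast so that unmapped entries show up as <NA> in a proper boolean column:

```python
import pandas as pd

df = pd.DataFrame({'Status': ['Delivered', 'Delivered', 'Undelivered',
                              'SomethingElse']})
d = {'Delivered': True, 'Undelivered': False}

# map() gives True/False for known keys and NaN for everything else
mapped = df['Status'].map(d)

# Cast to the nullable boolean dtype; the NaN entry becomes <NA>
status_bool = mapped.astype('boolean')
print(status_bool)
```

Whether the nullable dtype or the plain object column is preferable depends on what downstream code expects.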

Answer 2

Answer by Dan Allan

You've got everything you need. You'll be happy to discover replace:

df.replace(d)

Answer 3

Answer by Kappa Leonis

An example of using the replace method to replace values only in the specified column C2 and get the result as a DataFrame.

import pandas as pd
df = pd.DataFrame({'C1':['X', 'Y', 'X', 'Y'], 'C2':['Y', 'Y', 'X', 'X']})

  C1 C2
0  X  Y
1  Y  Y
2  X  X
3  Y  X

df.replace({'C2': {'X': True, 'Y': False}})

  C1     C2
0  X  False
1  Y  False
2  X   True
3  Y   True

Answer 4

Answer by Yaakov Bressler

Expanding on the previous answers:

Map method explained:

Pandas looks up each row's value in the dictionary d and replaces any value found among the keys with the corresponding value from d.
Values that are not keys in d are set to NaN. This can be corrected with fillna().
Works on one column at a time, since map with a dictionary is used here as a pd.Series method.
Documentation: pd.Series.map

d = {'Delivered': True, 'Undelivered': False}
df["Status"].map(d)
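
As a sketch of the fillna() step mentioned above, assuming that unknown statuses should count as False (an assumption made only for illustration; pick whatever default fits your data):

```python
import pandas as pd

df = pd.DataFrame({"Status": ["Delivered", "Undelivered", "SomethingElse"]})
d = {"Delivered": True, "Undelivered": False}

# map() leaves NaN where the status is not a key of d;
# fillna(False) replaces those NaN entries with the assumed default
filled = df["Status"].map(d).fillna(False)

# Once no missing values remain, the column can be cast to a plain bool dtype
as_bool = filled.astype(bool)
print(as_bool.tolist())
```

Some pandas versions may emit a FutureWarning about downcasting at the fillna() step; the values produced are the same either way.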

Replace method explained:

Pandas looks up each row's value in the dictionary d and attempts to replace any value found among the keys with the corresponding value from d.
Values that are not keys in d are retained unchanged.
Works with single and multiple columns (pd.Series or pd.DataFrame objects).
Documentation: pd.DataFrame.replace

d = {'Delivered': True, 'Undelivered': False}
df["Status"].replace(d)
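
To make the difference between the two methods concrete, here is a small side-by-side sketch. The Series name status and the explicit dtype=object are choices made for this illustration, not part of the original answers:

```python
import pandas as pd

# object dtype so the column can hold both booleans and strings after replace()
status = pd.Series(["Delivered", "Undelivered", "SomethingElse"], dtype=object)
d = {"Delivered": True, "Undelivered": False}

mapped = status.map(d)        # the unmatched value becomes NaN
replaced = status.replace(d)  # the unmatched value stays as the original string

print(mapped.tolist())
print(replaced.tolist())
```

If unmatched values must become NaN, as the question asks, map gives that directly, while replace leaves the original string in place.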

Overall, the replace method is more robust and allows finer control over how data is mapped and how missing or NaN values are handled.
