pandas: count rows when a row contains specific text

Question

Asked by F1990

Probably a simple question but I could not find a simple answer. Let's for example take the following column Status within a dataframe df1:

**Status**
Planned
Unplanned
Missing
Corrected

I would like to count the rows where the cell contains either Planned or Missing. I tried the following:

test1 = df1['Status'].str.contains('Planned|Missing').value_counts()

The Status column has dtype object. What's wrong with my line of code?

Answer 1

Answered by EdChum

You can just filter the df with your boolean condition and then call len:

In [155]:
len(df[df['Status'].str.contains('Planned|Missing')])

Out[155]:
2

Or index the result of your value_counts with True:

In [158]:   
df['Status'].str.contains('Planned|Missing').value_counts()[True]

Out[158]:
2
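For a self-contained check, the sketch below rebuilds the example column (the DataFrame construction is an assumption, since the question does not show it) and runs both approaches. Note that "Unplanned" is not counted, because str.contains is case-sensitive by default and the pattern starts with a capital P.

```python
import pandas as pd

# Assumed reconstruction of the example data from the question.
df = pd.DataFrame({"Status": ["Planned", "Unplanned", "Missing", "Corrected"]})

# Boolean mask: True where the cell contains "Planned" or "Missing".
mask = df["Status"].str.contains("Planned|Missing")

count_len = len(df[mask])             # filter the frame, then count rows
count_vc = mask.value_counts()[True]  # look up the count stored under True

print(count_len, count_vc)  # 2 2
```

The value_counts lookup raises a KeyError when no row matches, because the result then has no True entry; the len approach returns 0 in that case.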

Answer 2

Answered by Scotty

Try the following, which counts rows whose value is exactly Planned or Missing:

df["Status"].value_counts()[['Planned','Missing']].sum()
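One point worth checking is that this approach matches whole values, whereas str.contains matches substrings. The sketch below uses hypothetical data (the "Planned later" row is an assumption added for illustration, not part of the question) to show where the two can disagree. The isin variant is a swapped-in alternative, not part of the original answer; it avoids the KeyError that label lookup raises when one of the values never occurs.

```python
import pandas as pd

# Hypothetical data: "Planned later" contains "Planned" but is not equal to it.
df = pd.DataFrame({"Status": ["Planned", "Planned later", "Missing", "Corrected"]})

# Exact-match count, as in the answer above (needs both labels to be present).
exact_vc = df["Status"].value_counts()[["Planned", "Missing"]].sum()

# Exact-match count via isin (alternative; tolerates absent labels).
exact_isin = df["Status"].isin(["Planned", "Missing"]).sum()

# Substring count via str.contains, as in the question.
substring = df["Status"].str.contains("Planned|Missing").sum()

print(exact_vc, exact_isin, substring)  # 2 2 3
```

Which count is wanted depends on whether cells hold a single status label or free text.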

Answer 3

Answered by jpp

pd.Series.str.contains, when coupled with na=False, guarantees you have a Boolean series. Note also that True / False act like 1 / 0 in numeric computations. You can now use pd.Series.sum directly:

count = df['Status'].str.contains('Planned|Missing', na=False).sum()

This avoids unnecessary and expensive dataframe indexing operations.
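As a rough illustration of why na=False matters, the sketch below adds a missing value to the example column. The data is an assumption, and the column is forced to object dtype to match the question. With na=False, the missing cell is treated as a non-match, so the result has bool dtype and sums cleanly.

```python
import numpy as np
import pandas as pd

# Assumed data: the question's column with one missing value, kept as object dtype.
status = pd.Series(["Planned", np.nan, "Missing", "Corrected"], dtype=object)
df = pd.DataFrame({"Status": status})

# na=False maps missing cells to False instead of propagating them.
mask = df["Status"].str.contains("Planned|Missing", na=False)

# True counts as 1 and False as 0, so the sum is the number of matching rows.
count = mask.sum()

print(mask.dtype, count)  # bool 2
```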
