Original question: http://stackoverflow.com/questions/48906828/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Count non-empty cells in pandas dataframe rows and add counts as a column
Asked by Atiqul Islam
Answered by Keith Dowd
To count the number of cells missing data in each row, you probably want to do something like this:
df.apply(lambda x: x.isnull().sum(), axis='columns')
Replace df with the name of your data frame.
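The same per-row count can also be computed without a lambda. The following is a minimal sketch, not the original answer's code; the sample data is hypothetical and only mirrors the column names used in this answer. It also shows `count`, which returns the number of non-missing cells per row, in case that is the figure you actually want.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names mirror the ones used in this answer.
df = pd.DataFrame({
    "Count": [1, 2, 3],
    "A": ["x", "y", np.nan],
    "B": ["p", np.nan, np.nan],
    "C": ["q", "r", "s"],
})

# Missing cells per row, computed on the whole frame at once.
missing_per_row = df.isna().sum(axis="columns")

# Non-missing cells per row.
filled_per_row = df.count(axis="columns")

print(missing_per_row.tolist())
print(filled_per_row.tolist())
```

Both calls treat only NaN/None as missing, so empty strings still count as filled cells.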
You can create a new column and write the count to it using something like:
df['MISSING'] = df.apply(lambda x: x.isnull().sum(), axis='columns')
The column will be created at the end (rightmost) of your data frame.
You can move your columns around like this:
df = df[['Count', 'MISSING', 'A', 'B', 'C']]
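As an alternative to reordering afterwards, the column can be placed at a chosen position when it is created. This is a sketch under assumptions: the sample data is hypothetical, and position 1 is picked only so the new column lands directly after Count.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names mirror the ones used in this answer.
df = pd.DataFrame({
    "Count": [1, 2, 3],
    "A": ["x", "y", np.nan],
    "B": ["p", np.nan, np.nan],
    "C": ["q", "r", "s"],
})

# Count missing cells first, then insert the result as the second column.
# DataFrame.insert modifies df in place and returns None.
missing = df.isna().sum(axis="columns")
df.insert(1, "MISSING", missing)

print(list(df.columns))
print(df["MISSING"].tolist())
```

Computing the counts before inserting keeps the new column itself out of the calculation.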
Update
I'm wondering if your missing cells are actually empty strings as opposed to NaN values. Can you confirm? I copied your screenshot into an Excel workbook. My full code is below:
import pandas as pd

df = pd.read_excel('count.xlsx', na_values=['', ' '])
df.head()  # You should see NaN for empty cells
df['M'] = df.apply(lambda x: x.isnull().sum(), axis='columns')
df.head()  # Column M should report the values: first row: 0, second row: 1, third row: 2
df = df[['Count', 'M', 'A', 'B', 'C']]
df.head()  # Column order should be Count, M, A, B, C
Notice the na_values parameter in the pd.read_excel call.
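If the data frame is already in memory rather than being read from Excel, blank strings can be converted to NaN after the fact. The following is a minimal sketch, assuming the "empty" cells hold either an empty string or whitespace only; the sample data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data where "empty" cells are blank strings, not NaN.
df = pd.DataFrame({
    "Count": [1, 2, 3],
    "A": ["x", "y", ""],
    "B": ["p", " ", ""],
    "C": ["q", "r", "s"],
})

# Before cleaning, isna() finds nothing because blank strings are real values.
before = df.isna().sum(axis="columns")

# Replace cells that are empty or whitespace-only with NaN, then count.
cleaned = df.replace(r"^\s*$", np.nan, regex=True)
cleaned["M"] = cleaned.isna().sum(axis="columns")

print(before.tolist())
print(cleaned["M"].tolist())
```

The regular expression only matches string cells, so the numeric Count column is left untouched.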