Original question: http://stackoverflow.com/questions/14991195/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
How to remove rows with null values from the kth column onward in Python
Asked by user1140126
I need to remove all rows in which the elements from column 3 onwards are all NaN.
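import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(6, 5), index=['a', 'c', 'e', 'f', 'g', 'h'], columns=['one', 'two', 'three', 'four', 'five'])
df2 = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
# Fill the first two columns of the row at position 1 (row 'b').
# .iloc replaces the original .ix indexer, which newer pandas versions no longer provide.
df2.iloc[1, 0] = 111
df2.iloc[1, 1] = 222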
In the example above, my final data frame should not contain rows 'b' and 'd', since those are the rows whose values from column 'three' onwards are all NaN.
How can I use df.dropna() in this case?
Accepted answer by Andy Hayden
You can call dropna with the arguments subset and how:
df2.dropna(subset=['three', 'four', 'five'], how='all')
As the names suggest:
how='all' requires every column (of subset) in the row to be NaN in order for the row to be dropped, as opposed to the default 'any'.
subset is the set of columns to inspect for NaNs.
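To make the difference between how='all' and how='any' concrete, here is a minimal sketch. The small frame and the variable names (cols, all_dropped, any_dropped) are illustrative assumptions, not the data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame (not the question's random data), so the result is reproducible.
df = pd.DataFrame(
    {
        "one": [1.0, 111.0, 3.0, np.nan],
        "two": [1.5, 222.0, 3.5, np.nan],
        "three": [0.1, np.nan, np.nan, np.nan],
        "four": [0.2, np.nan, 0.4, np.nan],
        "five": [0.3, np.nan, 0.5, np.nan],
    },
    index=["a", "b", "c", "d"],
)
cols = ["three", "four", "five"]

# how='all': a row is dropped only if every inspected column is NaN.
all_dropped = df.dropna(subset=cols, how="all")
# how='any' (the default): a row is dropped if at least one inspected column is NaN.
any_dropped = df.dropna(subset=cols, how="any")

print(list(all_dropped.index))  # ['a', 'c']
print(list(any_dropped.index))  # ['a']
```

Row 'c' has a single NaN in the inspected columns, so it survives how='all' but not how='any'. Row 'b' is dropped in both cases even though its first two columns hold values, because those columns are outside subset.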
As @PaulH points out, we can generalise this to inspect every column from the kth onward (counting from zero, so k=2 starts at 'three') with:
subset=df2.columns[k:]
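As a rough sketch, the slice can be wrapped in a small helper. The function name drop_rows_empty_from and the deterministic values are assumptions made for illustration; the frame otherwise mirrors the shape of the one in the question.

```python
import numpy as np
import pandas as pd

def drop_rows_empty_from(frame, k):
    """Drop rows whose values are all NaN in the columns at positions k, k+1, ... (0-based)."""
    return frame.dropna(subset=frame.columns[k:], how="all")

# Same layout as the question's frame, with fixed values instead of random ones.
df = pd.DataFrame(
    np.arange(30, dtype=float).reshape(6, 5),
    index=["a", "c", "e", "f", "g", "h"],
    columns=["one", "two", "three", "four", "five"],
)
df2 = df.reindex(["a", "b", "c", "d", "e", "f", "g", "h"])
df2.iloc[1, 0] = 111  # row 'b', column 'one'
df2.iloc[1, 1] = 222  # row 'b', column 'two'

result = drop_rows_empty_from(df2, 2)  # inspect 'three', 'four', 'five'
print(list(result.index))  # ['a', 'c', 'e', 'f', 'g', 'h']
```

Rows 'b' and 'd' are removed because the reindex left them with NaN in every column from 'three' onward.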
Indeed, we could even do something more complicated if desired:
subset=[col for col in df2.columns if len(col) > 3]

