Original question: http://stackoverflow.com/questions/45111589/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Dropping NaN rows doesn't work in pandas
Asked by user3088202
I have a file with about 7k rows and 4 columns. Many of the cells are empty, and I have tried to drop them using several pandas functions, but nothing seems to work. The functions I tried and my code are below:
What I have tried:
df = df.dropna(thresh=2)
and
df.dropna(axis=0, how='all')
My code:
file = "pc-dirty-data.csv"
path = root + file
name_cols = ['GUID1', 'GUID2', 'Record ID', 'Name', 'Org Name', 'Title']
pull_cols = ['Record ID', 'Name', 'Org Name', 'Title']
df = df.dropna(thresh=2)
df.dropna(axis=0, how='all')
df = pd.read_csv(path, header=None, encoding="ISO-8859-1", names=name_cols, usecols=pull_cols, index_col=False)
df.info()
Dataframe:
RangeIndex: 6599 entries, 0 to 6598
Data columns (total 4 columns):
Record ID 5874 non-null float64
Name 5874 non-null object
Org Name 5852 non-null object
Title 5615 non-null object
dtypes: float64(1), object(3)
Answered by Scott Boston
dropna is not an in-place operation by default: it returns a new DataFrame and leaves the original unchanged. You need to assign the result back to the variable or set the inplace parameter to True.
df = df.dropna(axis=0, how='all')
or
df.dropna(axis=0, how='all', inplace=True)
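To make the difference concrete, here is a minimal sketch on a made-up three-row frame. The column names mirror the question, but the values are invented for illustration. It shows that calling dropna without assigning the result leaves df untouched, while reassigning drops the all-NaN row.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: column names follow the question, values are made up.
df = pd.DataFrame({
    "Record ID": [1.0, np.nan, 3.0],
    "Name": ["Ann", np.nan, "Cy"],
    "Org Name": ["Acme", np.nan, np.nan],
    "Title": ["CEO", np.nan, np.nan],
})

# The return value is discarded here, so df still holds every row.
df.dropna(axis=0, how="all")
rows_before = len(df)

# Assigning the result back keeps only rows with at least one non-NaN value.
df = df.dropna(axis=0, how="all")
rows_after = len(df)

print(rows_before, rows_after)
```

Reassignment tends to read more clearly in a chain of transformations, but either form from the answer should produce the same frame.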
Edit
Jay points out in the comments that you also need to reorder your code so that dropna is called after read_csv. In the question's code, dropna runs before df has been loaded, and read_csv then overwrites df with the unfiltered data.
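A sketch of the reordered logic might look like the following. It assumes a small in-memory CSV (csv_text, invented here) in place of pc-dirty-data.csv, so the path and encoding arguments from the question are omitted. The column lists are taken from the question.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real file: six columns per row, some cells empty.
csv_text = (
    "g1,g2,1,Ann,Acme,CEO\n"
    "g3,g4,,,,\n"
    "g5,g6,3,,,\n"
)

name_cols = ['GUID1', 'GUID2', 'Record ID', 'Name', 'Org Name', 'Title']
pull_cols = ['Record ID', 'Name', 'Org Name', 'Title']

# 1. Load the data first.
df = pd.read_csv(io.StringIO(csv_text), header=None, names=name_cols, usecols=pull_cols)

# 2. Then drop rows where every selected column is NaN, keeping the result.
df = df.dropna(axis=0, how='all')

# Optional stricter filter: keep only rows with at least 2 non-NaN values.
strict = df.dropna(thresh=2)

df.info()
```

With the real file, the same two steps would apply after the read_csv call shown in the question. Whether how='all' or thresh=2 is the right filter depends on how sparse a row may be before it should be discarded.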