pandas python - TypeError: unorderable types: str() > float()
Original question: http://stackoverflow.com/questions/35310710/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by Thoram Mastero
I have a CSV file with a v3 column, but that column has some NaN rows. How can I exclude those rows when encoding the column?
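import pandas as pd
from sklearn.preprocessing import LabelEncoder

dataset = pd.read_csv('mypath')
enc = LabelEncoder()
# This can fail with the TypeError from the title when v3 mixes strings with NaN (a float).
enc.fit(dataset['v3'])
print('fitting')
dataset['v3'] = enc.transform(dataset['v3'])
print('transforming')
print(dataset['v3'])
print('end')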
Edit: The v3 column contains values like A,C,B,A,C,D,,,A,S and I want to convert them to (1,2,3,1,2,4,,,1,7).
Answered by Rob
Mask the NaN values by using ~isnull(), so that only the non-missing rows are passed to the encoder:
mask = ~dataset['v3'].isnull()
# Use .loc rather than chained indexing so the assignment writes to dataset itself.
dataset.loc[mask, 'v3'] = enc.fit_transform(dataset.loc[mask, 'v3'])
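For anyone who wants to try the masking approach without the original CSV, here is a minimal sketch. The sample values are taken from the question's edit; the DataFrame construction and the v3_encoded column name are assumptions made for illustration, not part of the original answer. Note that LabelEncoder numbers classes from 0 in sorted order, so the codes will not match the 1-based numbering in the question exactly.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the v3 column that would come from the CSV.
dataset = pd.DataFrame(
    {'v3': ['A', 'C', 'B', 'A', 'C', 'D', np.nan, np.nan, 'A', 'S']}
)

enc = LabelEncoder()
mask = ~dataset['v3'].isnull()  # True where v3 has a real value

# Start from an all-NaN column and fill in codes only for non-missing rows,
# so the missing entries stay missing after encoding.
encoded = pd.Series(np.nan, index=dataset.index)
encoded.loc[mask] = enc.fit_transform(dataset.loc[mask, 'v3'])
dataset['v3_encoded'] = encoded  # hypothetical column name

print(dataset)
print(list(enc.classes_))
```

Writing the codes to a separate float column keeps the original strings available and avoids mixing integers and NaN in an object column.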
Another way is to use the pandas.factorize function, which handles NaNs automatically (it assigns them -1):
dataset['v3'] = dataset['v3'].factorize()[0]
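To make the -1 behaviour concrete, the following sketch runs factorize on the sample values from the question's edit. The v3 Series and the one_based name are assumptions for illustration. factorize numbers values from 0 in order of first appearance, so adding 1 and blanking out the -1 entries gets close to the numbering the question asks for.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dataset['v3'].
v3 = pd.Series(['A', 'C', 'B', 'A', 'C', 'D', np.nan, np.nan, 'A', 'S'])

# codes: integer label per row, -1 for missing values
# uniques: the distinct values, in order of first appearance
codes, uniques = pd.factorize(v3)

# Optional: shift to 1-based labels and keep missing rows as NaN.
one_based = pd.Series(codes, index=v3.index).where(codes != -1) + 1

print(codes)
print(list(uniques))
print(one_based)
```

If sorted label order is preferred over order of appearance, factorize also accepts sort=True.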