Replacing empty strings with NaN in Pandas
Warning: this page is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me): StackOverFlow
Original source: http://stackoverflow.com/questions/40711900/
Asked by doctorer
I have a pandas DataFrame (created by importing a CSV file). I want to replace blank values with NaN. Some of these blank values are empty strings and some contain a variable number of spaces: '', ' ', '  ', etc.
Using the suggestion from this thread, I have:
df.replace(r'\s+', np.nan, regex=True, inplace = True)
which does replace all the strings that only contain spaces, but also replaces every string that has a space in it, which is not what I want.
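To make the over-matching concrete, here is a minimal sketch. The column name and sample values are hypothetical and only serve to illustrate the behaviour described above: because the pattern is not anchored, any string containing whitespace anywhere is replaced, while the truly empty string is left alone.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: an empty string, a spaces-only string,
# a normal value that contains a space, and a normal value without one.
df = pd.DataFrame({"name": ["", "   ", "New York", "Paris"]})

# Unanchored pattern: matches one or more whitespace characters
# anywhere in the string, so "New York" is caught as well.
result = df.replace(r"\s+", np.nan, regex=True)

# Show which cells ended up as NaN.
print(result["name"].isna().tolist())
```

Note that the empty string is not replaced here, because `\s+` requires at least one whitespace character.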
How do I replace only empty strings and strings that contain just spaces?
Accepted answer by Rajshekar Reddy
If you are reading a CSV file and want to convert all empty strings to NaN while reading the file itself, you can use the option
skipinitialspace=True
Example code
pd.read_csv('Sample.csv', skipinitialspace=True)
This removes any whitespace that appears after the delimiters, so fields that contain only spaces become empty and are read as NaN.
From the documentation http://pandas.pydata.org/pandas-docs/stable/io.html
Note: This option also removes leading whitespace from valid data. If for any reason you want to keep leading whitespace, this option is not a good choice.
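A small sketch of the effect, using an in-memory CSV instead of a file on disk. The CSV contents and column names are made up for illustration. It compares the default behaviour with `skipinitialspace=True`, including the side effect on valid data mentioned in the note.

```python
import io

import pandas as pd

# Hypothetical CSV text: row 1 has a spaces-only "city" field,
# row 2 has an empty "city" field and a "note" with leading spaces.
csv_text = "id,city,note\n1,   ,ok\n2,,  padded\n3,Paris,fine\n"

# Default parsing: the spaces-only field is kept as a string.
plain = pd.read_csv(io.StringIO(csv_text))

# With skipinitialspace=True, spaces after each delimiter are dropped,
# so the spaces-only field becomes empty and is parsed as NaN.
stripped = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(plain["city"].isna().tolist())
print(stripped["city"].isna().tolist())
print(repr(stripped.loc[1, "note"]))
```

The last line illustrates the trade-off: the leading spaces in the "note" value are removed too.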
Answer by Boud
Use ^ and $ to anchor the pattern, so that the whole string must be empty or consist only of whitespace:
df.replace(r'^\s*$', np.nan, regex=True, inplace = True)
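As a quick check of the anchored pattern, here is a minimal sketch on made-up data (the column names and values are assumptions for illustration). It returns a new DataFrame rather than using `inplace=True`, which is only a stylistic choice.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with an empty string, a spaces-only string,
# and valid values (one of which contains a space).
df = pd.DataFrame(
    {
        "name": ["", "   ", "New York", "Paris"],
        "count": [1, 2, 3, 4],
    }
)

# ^ and $ anchor the match to the whole string, and \s* allows zero or
# more whitespace characters, so only blank cells are replaced.
cleaned = df.replace(r"^\s*$", np.nan, regex=True)

print(cleaned["name"].isna().tolist())
```

Because `\s` matches any whitespace character, cells containing only tabs or newlines are treated as blank as well.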