Source: http://stackoverflow.com/questions/16596188/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Pandas Convert 'NA' to NaN
Asked by ericmjl
I just picked up Pandas to do some data analysis work in my biology research. It turns out one of the proteins I'm analyzing is called 'NA'.
I have a pairwise matrix with 'HA, M1, M2, NA, NP...' as the column headers and the same labels as "row headers" (for the biologists who might read this, I'm working with influenza).
When I import the data into Pandas directly from a CSV file, it reads the "row headers" as 'HA, M1, M2...' and then NA gets read as NaN. Is there any way to stop this? The column headers are fine - 'HA, M1, M2, NA, NP etc...'
Answered by Dan Allan
Turn off NaN detection this way: pd.read_csv(filename, keep_default_na=False)
I originally suggested na_filter=False, which gets the job done. But, if I understand Jeff's comments below, this is a cleaner solution.
Example:
In [1]: pd.read_csv('test')
Out[1]: ...
In [4]: pd.read_csv('test', keep_default_na=False)
Out[4]: ...
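To make the effect of keep_default_na=False concrete, here is a minimal sketch. The CSV text, the "protein" column name, and the numeric values are made up for illustration and are not the asker's data; the file is simulated in memory with io.StringIO.

```python
import io

import pandas as pd

# Hypothetical stand-in for the question's file: protein names appear both
# as column headers and as row labels. All values are invented.
csv_text = """protein,HA,M1,NA
HA,1.0,0.2,0.3
M1,0.2,1.0,0.4
NA,0.3,0.4,1.0
"""

# Default parsing: the row label "NA" is treated as a missing value.
default = pd.read_csv(io.StringIO(csv_text))

# keep_default_na=False: "NA" is no longer in the list of missing-value
# markers, so it is kept as an ordinary string.
kept = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

print(default["protein"].isna().tolist())
print(kept["protein"].tolist())
print(list(kept.columns))
```

In both reads the column header 'NA' is left alone, which matches the question's observation that only the row labels were affected.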
Answered by techvslife
Just ran into this issue. I specified a str converter for the affected columns instead, so I could keep NA detection elsewhere:
pd.read_csv(... , converters={ "file name": str, "company name": str})
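The per-column converter approach can be sketched as follows. The column names "protein" and "score" and the data are hypothetical, and the sketch assumes pandas' default parser engine: the converted column keeps the literal string "NA", while the other column still gets the usual missing-value handling.

```python
import io

import pandas as pd

# Hypothetical data: "NA" is a real protein name in the first column,
# but a genuine missing-value marker in the second.
csv_text = """protein,score
HA,1.5
NA,NA
"""

# Only the "protein" column goes through the str converter, so NA detection
# stays active for every other column.
df = pd.read_csv(io.StringIO(csv_text), converters={"protein": str})

print(df["protein"].tolist())
print(df["score"].isna().tolist())
```

This is a narrower fix than keep_default_na=False: it is useful when only a few known columns can legitimately contain the text "NA".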

