pandas converts "NA" to NaN

Question

Asked by ericmjl

I just picked up Pandas to do some data analysis work in my biology research. It turns out that one of the proteins I'm analyzing is called 'NA'.

I have a pairwise matrix with 'HA, M1, M2, NA, NP...' as the column headers and the same labels as "row headers" (for the biologists who might read this, I'm working with influenza).

When I import the data into Pandas directly from a CSV file, it reads the "row headers" as 'HA, M1, M2...' and then NA gets read as NaN. Is there any way to stop this? The column headers are fine - 'HA, M1, M2, NA, NP etc...'

Answer 1

Answered by Dan Allan

Turn off NaN detection this way: pd.read_csv(filename, keep_default_na=False)

I originally suggested na_filter=False, which gets the job done. But, if I understand Jeff's comments below, this is a cleaner solution.

Example:

In [1]: pd.read_csv('test')
Out[1]: (output not preserved in the source)

In [4]: pd.read_csv('test', keep_default_na=False)
Out[4]: (output not preserved in the source)
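
A minimal, self-contained sketch of the same idea. The in-memory CSV, the column name `protein`, and the numeric values are made up for illustration; they are not the asker's data. It compares default parsing with `keep_default_na=False` and with the originally suggested `na_filter=False`.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file: protein names appear both as
# column headers and as row labels in the first column.
csv_text = (
    "protein,HA,NA,NP\n"
    "HA,1.0,0.2,0.3\n"
    "NA,0.2,1.0,0.4\n"
    "NP,0.3,0.4,1.0\n"
)

# Default parsing: the cell "NA" in the first column is treated as missing.
default_df = pd.read_csv(io.StringIO(csv_text))

# keep_default_na=False: the built-in list of NA strings is not used,
# so "NA" is kept as an ordinary string.
literal_df = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

# na_filter=False: NA detection is switched off entirely.
unfiltered_df = pd.read_csv(io.StringIO(csv_text), na_filter=False)

print(default_df["protein"].tolist())
print(literal_df["protein"].tolist())
print(unfiltered_df["protein"].tolist())
```

Both options apply to the whole file, so other missing-value markers such as empty cells are no longer converted to NaN either, unless they are listed explicitly through `na_values`.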

Answer 2

Answered by techvslife

Just ran into this issue. I specified a str converter for the affected columns instead, so NA detection stays on everywhere else: pd.read_csv(... , converters={ "file name": str, "company name": str})
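
A small sketch of the per-column converter approach, assuming the default C parser. The column names `protein` and `score` and the data are hypothetical. The converted column keeps the literal string "NA", while the empty cell in the other column is still read as NaN.

```python
import io

import pandas as pd

# Hypothetical data: "NA" is a real protein name, and its score is missing.
csv_text = (
    "protein,score\n"
    "HA,0.5\n"
    "NA,\n"
    "NP,0.7\n"
)

# The str converter applies only to the "protein" column; the "score"
# column still goes through the normal NA detection.
df = pd.read_csv(io.StringIO(csv_text), converters={"protein": str})

print(df["protein"].tolist())
print(df["score"].isna().tolist())
```

This is the more targeted option when only a few columns contain strings that collide with the default NA markers.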
