Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/36405994/

Date: 2020-09-14 00:59:36  Source: igfitidea

Save pandas dataframe but conserving NA values

Tags: python, csv, pandas, nan, na

Asked by Náthali

I have this code:

import pandas as pd
import numpy as np
import csv
df = pd.DataFrame({'animal': 'cat dog cat fish dog cat cat'.split(),
                   'size': list('SSMMMLL'),
                   'weight': [8, 10, 11, 1, 20, 12, 12],
                   'adult': [False] * 5 + [True] * 2})

Then I replaced the weight column with NA values:

df['weight'] = np.nan

Finally, I saved it:

df.to_csv("ejemplo.csv", sep=";", decimal=",", quoting=csv.QUOTE_NONNUMERIC, index=False)

But when I read the file, the missing values appear as "" (an empty quoted string). I want them to be written as NA instead.

This is the output I want:

adult;animal;size;weight
False;"dog";"S";NA
False;"cat";"M";NA    

Accepted answer by crashMOGWAI

To get that specific output, you'll have to pass the quotes in explicitly.

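import csv
import pandas as pd

# Embed the double quotes in the string values themselves
df = pd.DataFrame({'animal': '"cat" "dog" "cat" "fish" "dog" "cat" "cat"'.split(),
                   'size': '"S" "S" "M" "M" "M" "L" "L"'.split(),
                   'weight': [8, 10, 11, 1, 20, 12, 12],
                   'adult': [False] * 5 + [True] * 2})
df['weight'] = 'NA'  # a literal string, not a missing value
# QUOTE_NONE adds no quoting of its own, so only the embedded quotes appear
df.to_csv("ejemplo.csv", sep=';', decimal=',', quoting=csv.QUOTE_NONE, index=False)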
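As a variation on the approach above, the quotes can be added programmatically while the weights stay as real missing values. This is only a sketch under a few assumptions: the two-row frame and the hand-picked list of text columns are made up for illustration, and it relies on the na_rep parameter of to_csv to fill in the NA marker.

```python
import csv

import numpy as np
import pandas as pd

# Small illustrative frame; weights are genuine NaN values
df = pd.DataFrame({'animal': ['cat', 'dog'],
                   'size': ['S', 'M'],
                   'weight': [np.nan, np.nan],
                   'adult': [False, True]})

out = df.copy()
# Wrap the text columns in literal double quotes (columns chosen by hand here)
for col in ['animal', 'size']:
    out[col] = '"' + out[col] + '"'

# QUOTE_NONE adds no quoting of its own; na_rep fills the missing weights
text = out.to_csv(sep=';', index=False, na_rep='NA', quoting=csv.QUOTE_NONE)
print(text)
```

The original df keeps its unquoted strings and NaN weights, so it remains usable for further analysis; only the copy written to CSV is altered.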

Answer by EdChum

If you want a string to represent NaN values, pass na_rep to to_csv:

In [8]:
df.to_csv(na_rep='NA')

Out[8]:
',adult,animal,size,weight\n0,False,cat,S,NA\n1,False,dog,S,NA\n2,False,cat,M,NA\n3,False,fish,M,NA\n4,False,dog,M,NA\n5,True,cat,L,NA\n6,True,cat,L,NA\n'
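For context, the "" seen in the question comes from the default value of na_rep, which is an empty string; with QUOTE_NONNUMERIC that empty string is written as a quoted field. A minimal sketch of the difference, using a made-up two-row frame and leaving out decimal=',' for brevity:

```python
import csv

import numpy as np
import pandas as pd

df = pd.DataFrame({'animal': ['cat', 'dog'], 'weight': [np.nan, np.nan]})

# Default na_rep is '', which QUOTE_NONNUMERIC writes as a quoted empty string
default_text = df.to_csv(sep=';', index=False, quoting=csv.QUOTE_NONNUMERIC)

# With na_rep set, the same cells carry the chosen marker instead
marked_text = df.to_csv(sep=';', index=False, na_rep='NA')

print(default_text)
print(marked_text)
```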

If you want the NA in quotes, escape the quotes:

In [3]:
df = pd.DataFrame({'animal': 'cat dog cat fish dog cat cat'.split(),
                   'size': list('SSMMMLL'),
                   'weight': [8, 10, 11, 1, 20, 12, 12],
                   'adult': [False] * 5 + [True] * 2})
df['weight'] = np.nan  # np.NaN was removed in NumPy 2.0; np.nan works everywhere
df.to_csv(na_rep='\'NA\'')

Out[3]:
",adult,animal,size,weight\n0,False,cat,S,'NA'\n1,False,dog,S,'NA'\n2,False,cat,M,'NA'\n3,False,fish,M,'NA'\n4,False,dog,M,'NA'\n5,True,cat,L,'NA'\n6,True,cat,L,'NA'\n"

EDIT

To get the desired output use these params:

In [27]:
df.to_csv(na_rep='NA', sep=';', index=False, quoting=3)  # 3 is csv.QUOTE_NONE

Out[27]:
'adult;animal;size;weight\nFalse;cat;S;NA\nFalse;dog;S;NA\nFalse;cat;M;NA\nFalse;fish;M;NA\nFalse;dog;M;NA\nTrue;cat;L;NA\nTrue;cat;L;NA\n'
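The reason quoting=3 matters here can be sketched as follows: under QUOTE_NONNUMERIC the na_rep text is itself a string and gets quoted, whereas QUOTE_NONE leaves it bare. The example also reads the bare output back, relying on read_csv treating the token NA as missing by default. The small frame is an assumption made for illustration.

```python
import csv
import io

import numpy as np
import pandas as pd

df = pd.DataFrame({'animal': ['cat', 'dog'],
                   'size': ['S', 'M'],
                   'weight': [np.nan, np.nan]})

# QUOTE_NONNUMERIC quotes every string field, and the na_rep text is one of them
quoted = df.to_csv(sep=';', index=False, na_rep='NA', quoting=csv.QUOTE_NONNUMERIC)

# QUOTE_NONE (the integer 3) leaves all fields bare, including NA
bare = df.to_csv(sep=';', index=False, na_rep='NA', quoting=csv.QUOTE_NONE)

# read_csv treats the bare token NA as a missing value by default
restored = pd.read_csv(io.StringIO(bare), sep=';')
print(restored)
```

Note that QUOTE_NONE is only safe when no field contains the separator, since nothing will be quoted to protect it.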