Dropping NaN and converting to float32 in Python Pandas

Question

Asked by tsotsi

I am reading data from a CSV file into a data frame, trying to remove all rows that contain NaNs, and then converting it from float64 to float32. I have tried various solutions I've found online, but nothing seems to work. Any thoughts?

Answer 1

Accepted answer by DalekSec

I think this does what you want:

pd.read_csv('Filename.csv').dropna().astype(np.float32)
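
For anyone who wants to try this without a file on disk, here is a minimal sketch. The `csv_text` string is a made-up stand-in for 'Filename.csv', and it assumes every column is numeric; as far as I know, `astype(np.float32)` raises an error if a column holds non-numeric strings, so that is worth checking if the one-liner fails for you.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV contents standing in for 'Filename.csv' (assumption).
csv_text = "a,b\n1.0,2.0\n,3.0\n4.0,5.0\n"

df = pd.read_csv(io.StringIO(csv_text))

# dropna() returns a new frame without the rows that contain any NaN;
# astype() then downcasts the remaining float64 columns to float32.
clean = df.dropna().astype(np.float32)

print(clean.dtypes)
print(clean)
```

Note that both `dropna()` and `astype()` return new objects, so the result has to be assigned to something (here `clean`); calling them without assignment leaves `df` unchanged.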

To keep rows that have only some NaN values (and drop only the rows where every value is NaN), do this:

pd.read_csv('Filename.csv').dropna(how='all').astype(np.float32)
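
The difference between the default `how='any'` and `how='all'` may be easier to see on a tiny frame. This is just an illustrative sketch with made-up data, not taken from the question:

```python
import numpy as np
import pandas as pd

# Row 0 is complete, row 1 has one NaN, row 2 is entirely NaN.
df = pd.DataFrame({"a": [1.0, np.nan, np.nan], "b": [2.0, 3.0, np.nan]})

any_dropped = df.dropna(how="any")  # drops rows 1 and 2
all_dropped = df.dropna(how="all")  # drops only row 2

# NaN is a valid float32 value, so the cast still works with NaNs left in.
partial = all_dropped.astype(np.float32)

print(list(any_dropped.index))
print(list(all_dropped.index))
```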

To replace each NaN with a number instead of dropping rows, do this:

pd.read_csv('Filename.csv').fillna(1e6).astype(np.float32)

(I replaced NaN with 1,000,000 just as an example.)
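
If a single fill value doesn't suit every column, `fillna` also accepts a dict of per-column values. A small sketch under that assumption; the column names and the choice of 0.0 and the column mean are mine, purely for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 7.0]})

# Fill column 'a' with 0.0 and column 'b' with its own mean (NaNs are
# skipped when computing the mean), then downcast to float32.
filled = df.fillna({"a": 0.0, "b": df["b"].mean()}).astype(np.float32)

print(filled)
```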

Answer 2

Answered by Alexander

You can also specify the dtype when you read the csv file:

dtype : Type name or dict of column -> type. Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32}

pd.read_csv(my_file, dtype={col: np.float32 for col in ['col_1', 'col_2']})
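
A self-contained variant of the line above, reading from an in-memory string instead of `my_file` (the CSV contents here are invented for the sketch). Missing fields still come through as NaN, so `dropna()` can be chained afterwards:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV contents; the second data row has an empty 'col_1' field.
csv_text = "col_1,col_2\n0.5,1.5\n,2.5\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={col: np.float32 for col in ["col_1", "col_2"]},
)

# The columns are float32 straight from the parser; NaNs are preserved.
print(df.dtypes)
print(df.dropna())
```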

Example:

import numpy as np
import pandas as pd

df_out = pd.DataFrame(np.random.random([5,5]), columns=list('ABCDE'))
df_out.iat[1,0] = np.nan
df_out.iat[2,1] = np.nan
df_out.to_csv('my_file.csv')

df = pd.read_csv('my_file.csv', dtype={col: np.float32 for col in list('ABCDE')})
>>> df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5 entries, 0 to 4
Data columns (total 6 columns):
Unnamed: 0    5 non-null int64
A             4 non-null float32
B             4 non-null float32
C             5 non-null float32
D             5 non-null float32
E             5 non-null float32
dtypes: float32(5), int64(1)
memory usage: 180.0 bytes

>>> df.dropna(axis=0, how='any')
   Unnamed: 0         A         B         C         D         E
0           0  0.176224  0.943918  0.322430  0.759862  0.028605
3           3  0.723643  0.105813  0.884290  0.589643  0.913065
4           4  0.654378  0.400152  0.763818  0.416423  0.847861
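
The "Unnamed: 0" column in the output above is the index that `to_csv` wrote out by default. If you don't want it, one option (a sketch, not part of the original example) is to write with `index=False`; then every column is a float and a single `dtype=np.float32` covers them all. An in-memory buffer replaces 'my_file.csv' here, and the random generator and seed are my own choices:

```python
import io

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
df_out = pd.DataFrame(rng.random((5, 5)), columns=list("ABCDE"))
df_out.iat[1, 0] = np.nan
df_out.iat[2, 1] = np.nan

# Write without the index so no 'Unnamed: 0' column appears on read.
buffer = io.StringIO()
df_out.to_csv(buffer, index=False)
buffer.seek(0)

# Read every column as float32, then drop the two rows that contain NaN.
df = pd.read_csv(buffer, dtype=np.float32).dropna()

print(df.dtypes)
print(df)
```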
