pandas: Remove rows with a "question mark" value in any column of a pandas DataFrame

Question

Asked by Anonymous

I want to remove all rows that contain a question mark symbol in any column (that is, keep only the rows without one). I also want to convert the elements to float type.

Input:

Output:

X Y Z
1 2 3
4 4 4

Preferably using pandas dataframe operations.

Answer 1

Answered by jezrael

You can first find the string '?' in the columns, build a boolean mask from those comparisons, and then filter the rows with boolean indexing. If you need to convert the columns to float, use astype:

print(~((df['X'] == '?') | (df['Y'] == '?') | (df['Z'] == '?')))
0    False
1     True
2    False
3     True
4    False
dtype: bool


df1 = df[~((df['X'] == '?') | (df['Y'] == '?') | (df['Z'] == '?'))].astype(float)
print(df1)
     X    Y    Z
1  1.0  2.0  3.0
3  4.0  4.0  4.0

print(df1.dtypes)
X    float64
Y    float64
Z    float64
dtype: object
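
The column-by-column mask above can be tried end to end with the sketch below. The sample DataFrame is an assumption: the question's input is not shown, so the values here are reconstructed from the outputs printed in this answer.

```python
import pandas as pd

# Hypothetical sample data (the original input is not shown in the question).
df = pd.DataFrame({
    "X": ["0", "1", "?", "4", "?"],
    "Y": ["1", "2", "?", "4", "2"],
    "Z": ["?", "3", "4", "4", "5"],
})

# True for rows in which none of the three columns equals '?'.
mask = ~((df["X"] == "?") | (df["Y"] == "?") | (df["Z"] == "?"))

# Keep only the clean rows, then convert every column to float.
df1 = df[mask].astype(float)
print(df1)
```

Only rows 1 and 3 survive, because every other row has a '?' in at least one column.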

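Alternatively, convert each column with pd.to_numeric. With errors='coerce', every '?' becomes NaN, so the rows to drop are the ones that contain NaN: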

df['X'] = pd.to_numeric(df['X'], errors='coerce')
df['Y'] = pd.to_numeric(df['Y'], errors='coerce')
df['Z'] = pd.to_numeric(df['Z'], errors='coerce')
print(df)
     X    Y    Z
0  0.0  1.0  NaN
1  1.0  2.0  3.0
2  NaN  NaN  4.0
3  4.0  4.0  4.0
4  NaN  2.0  5.0
print((df['X'].notnull()) & (df['Y'].notnull()) & (df['Z'].notnull()))
0    False
1     True
2    False
3     True
4    False
dtype: bool

print(df[(df['X'].notnull()) & (df['Y'].notnull()) & (df['Z'].notnull())].astype(float))
     X    Y    Z
1  1.0  2.0  3.0
3  4.0  4.0  4.0
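
The same idea can be written without naming each column. This is a minimal sketch, not part of the original answer: it applies pd.to_numeric to every column via DataFrame.apply and drops the NaN rows with dropna. The sample data is again an assumption reconstructed from the output above.

```python
import pandas as pd

# Hypothetical sample data (the original input is not shown in the question).
df = pd.DataFrame({
    "X": ["0", "1", "?", "4", "?"],
    "Y": ["1", "2", "?", "4", "2"],
    "Z": ["?", "3", "4", "4", "5"],
})

# Convert every column; anything that is not a number (such as '?') becomes NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")

# Drop rows containing any NaN. astype(float) covers columns that had
# no '?' at all and would otherwise come back as integers.
clean = numeric.dropna().astype(float)
print(clean)
```

Note that errors='coerce' also turns any other non-numeric text into NaN, so this variant removes more than just '?' rows if the data contains other stray strings.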

A better approach is to compare the whole DataFrame at once:

df = df[(df != '?').all(axis=1)]

Or:

df = df[~(df == '?').any(axis=1)]
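
As a rough illustration, the two whole-DataFrame filters select the same rows, and astype(float) can be chained on for the conversion the question asks about. The sample data is an assumption, since the question's input is not shown.

```python
import pandas as pd

# Hypothetical sample data (the original input is not shown in the question).
df = pd.DataFrame({
    "X": ["0", "1", "?", "4", "?"],
    "Y": ["1", "2", "?", "4", "2"],
    "Z": ["?", "3", "4", "4", "5"],
})

# Keep rows where every cell differs from '?'.
keep_all = df[(df != "?").all(axis=1)]

# Equivalent: drop rows where any cell equals '?'.
keep_any = df[~(df == "?").any(axis=1)]

# Both filters select the same rows; convert the result to float.
result = keep_all.astype(float)
print(result)
```

This form works for any number of columns, since nothing in it refers to X, Y, or Z by name.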

Answer 2

Answered by Naidu Jithendra

You can try replacing '?' with null values (NaN):

import numpy as np

data = df.replace("?", np.nan)

If you want to replace values in one particular column only, try this:

data = df["column name"].replace("?", np.nan)
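
Replacing '?' with NaN only marks the bad cells; the rows still have to be dropped. One possible way to finish the job, sketched here with assumed sample data and with dropna and astype added beyond what this answer shows:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (the original input is not shown in the question).
df = pd.DataFrame({
    "X": ["0", "1", "?", "4", "?"],
    "Y": ["1", "2", "?", "4", "2"],
    "Z": ["?", "3", "4", "4", "5"],
})

# Step 1: mark every '?' as a missing value.
data = df.replace("?", np.nan)

# Step 2: drop rows with any missing value, then convert to float.
data = data.dropna().astype(float)
print(data)
```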
