Python: How to find numeric columns in Pandas?
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/25039626/
How do I find numeric columns in Pandas?
Asked by Hanan Shteingart
Let's say df is a pandas DataFrame.
I would like to find all columns of numeric type.
Something like:
isNumeric = is_numeric(df)
Answered by Hanan Shteingart
import numpy as np
import pandas as pd

def is_type(df, baseType):
    # Flag each column whose dtype is a subclass of baseType
    test = [issubclass(np.dtype(d).type, baseType) for d in df.dtypes]
    return pd.DataFrame(data=test, index=df.columns, columns=["test"])

def is_float(df):
    # np.float was removed in NumPy 1.24; np.floating is the abstract float type
    return is_type(df, np.floating)

def is_number(df):
    return is_type(df, np.number)

def is_integer(df):
    return is_type(df, np.integer)
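As a hedged alternative to the helpers above, pandas also exposes a public dtype check, pandas.api.types.is_numeric_dtype, which can build the same kind of per-column mask. The sample frame and the names numeric_mask and numeric_cols are made up for illustration. Note that this check also treats boolean columns as numeric, which differs from testing against np.number.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({"A": [1, 2], "B": [0.5, 1.5], "C": ["x", "y"]})

# One boolean per column: True where the column's dtype is numeric
numeric_mask = pd.Series({col: is_numeric_dtype(df[col]) for col in df.columns})

# Names of the columns flagged as numeric
numeric_cols = numeric_mask[numeric_mask].index.tolist()
print(numeric_cols)
```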
Answered by Garrett
Adapting this answer, you could do
# .ix was removed in pandas 1.0, so .loc is used here; on pandas >= 2.1, DataFrame.map replaces the deprecated applymap
df.loc[:, df.applymap(np.isreal).all(axis=0)]
Here, df.applymap(np.isreal) shows whether every cell in the data frame is numeric, and .all(axis=0) checks if all values in a column are True and returns a series of Booleans that can be used to index the desired columns.
Answered by Anand
You could use the select_dtypes method of DataFrame. It takes two parameters, include and exclude. So isNumeric would look like:
numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']
newdf = df.select_dtypes(include=numerics)
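One thing to watch with an explicit list like the one above is that any dtype left off the list is skipped. The sketch below, using a made-up frame with an int8 column, compares the list with the generic "number" selector that select_dtypes also accepts.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "small" uses int8, which is not in the list below
df = pd.DataFrame({
    "small": np.array([1, 2, 3], dtype="int8"),
    "big": np.array([1, 2, 3], dtype="int64"),
    "ratio": [0.1, 0.2, 0.3],
    "label": ["a", "b", "c"],
})

numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']

# Only dtypes named in the list are kept
by_list = df.select_dtypes(include=numerics).columns.tolist()

# "number" matches every NumPy numeric dtype, including int8
by_number = df.select_dtypes(include="number").columns.tolist()

print(by_list)
print(by_number)
```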
Answered by Kathirmani Sukumar
You can use the undocumented function _get_numeric_data() to filter only numeric columns:
df._get_numeric_data()
Example:
In [32]: data
Out[32]:
   A  B
0  1  s
1  2  s
2  3  s
3  4  s
In [33]: data._get_numeric_data()
Out[33]:
   A
0  1
1  2
2  3
3  4
Note that this is a "private method" (i.e., an implementation detail) and is subject to change or total removal in the future. Use with caution.
Answered by YOBEN_S
df.select_dtypes(exclude=['object'])
Answered by Anvesh_vs
Here is another simple way to find the numeric columns in a pandas DataFrame:
numeric_clmns = df.dtypes[df.dtypes != "object"].index
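A caveat worth illustrating: filtering out "object" keeps every non-object column, not only numeric ones. The sketch below uses a made-up frame with boolean and datetime columns to show the difference from selecting "number"; the text column is given an explicit object dtype so the example does not depend on pandas' default string dtype.

```python
import pandas as pd

# Hypothetical sample data with a boolean and a datetime column
df = pd.DataFrame({
    "n": [1, 2],
    "flag": [True, False],
    "when": pd.to_datetime(["2020-01-01", "2020-01-02"]),
    "name": pd.Series(["a", "b"], dtype=object),
})

# Everything that is not object dtype: also keeps bool and datetime columns
not_object = df.dtypes[df.dtypes != "object"].index.tolist()

# Only dtypes that derive from np.number
numeric = df.select_dtypes(include="number").columns.tolist()

print(not_object)
print(numeric)
```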
Answered by stackoverflowuser2010
Simple one-line answer to create a new DataFrame with only numeric columns:
df.select_dtypes(include=np.number)
If you want the names of numeric columns:
df.select_dtypes(include=np.number).columns.tolist()
Complete code:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': range(7, 10),
                   'B': np.random.rand(3),
                   'C': ['foo','bar','baz'],
                   'D': ['who','what','when']})
df
#    A         B    C     D
# 0  7  0.704021  foo   who
# 1  8  0.264025  bar  what
# 2  9  0.230671  baz  when
df_numerics_only = df.select_dtypes(include=np.number)
df_numerics_only
#    A         B
# 0  7  0.704021
# 1  8  0.264025
# 2  9  0.230671
colnames_numerics_only = df.select_dtypes(include=np.number).columns.tolist()
colnames_numerics_only
# ['A', 'B']
Answered by mickey
Please see the code below:
# np.object was removed in NumPy 1.24, so the 'object' dtype string is used instead
if dataset.select_dtypes(include=[np.number]).shape[1] > 0:
    display(dataset.select_dtypes(include=[np.number]).describe())
if dataset.select_dtypes(include=['object']).shape[1] > 0:
    display(dataset.select_dtypes(include=['object']).describe())
This way you can check whether the values are numeric (such as float and int) or strings. The second if statement checks for string values, which pandas stores under the object dtype.
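The display call above is only available inside IPython or Jupyter. As a rough sketch of the same idea outside a notebook, the hypothetical helper describe_by_kind below returns the two summaries in a dict instead of displaying them; the helper name and the sample frame are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def describe_by_kind(dataset):
    """Return describe() summaries for numeric and object columns, when present."""
    summaries = {}
    numeric = dataset.select_dtypes(include=[np.number])
    if numeric.shape[1] > 0:
        summaries["numeric"] = numeric.describe()
    text = dataset.select_dtypes(include=[object])
    if text.shape[1] > 0:
        summaries["object"] = text.describe()
    return summaries

# Hypothetical sample data; the text column is given an explicit object dtype
dataset = pd.DataFrame({
    "x": [1.0, 2.0, 3.0],
    "label": pd.Series(["a", "b", "a"], dtype=object),
})

summaries = describe_by_kind(dataset)
for kind, summary in summaries.items():
    print(kind)
    print(summary)
```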
Answered by Hukmaram
The following code returns a list of the names of the numeric columns of a data set.
cnames=list(marketing_train.select_dtypes(exclude=['object']).columns)
Here marketing_train is my data set, select_dtypes() is the function that selects data types using the exclude and include arguments, and columns fetches the column names of the data set.
The output of the above code will be the following:
['custAge',
'campaign',
'pdays',
'previous',
'emp.var.rate',
'cons.price.idx',
'cons.conf.idx',
'euribor3m',
'nr.employed',
'pmonths',
'pastEmail']
Thanks
Answered by nimbous
Simple one-liner:
df.select_dtypes('number').columns

