pandas: How to get numeric column names in a pandas DataFrame

Notice: This page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/51684585/

Time: 2020-09-14 05:53:22  Source: igfitidea

How to get numeric column names in a pandas DataFrame

Tags: python, pandas

Asked by Neil

I have a pandas DataFrame with object, int64, and float64 dtypes. I want to get the column names of the int64 and float64 columns. I am using the following command, but it does not seem to work:

cat_num_prv_app = [num for num in list(df.columns) if isinstance(num, (np.int64,np.float64))]

These are my dtypes:

 df.info()
 <class 'pandas.core.frame.DataFrame'>
 RangeIndex: 1670214 entries, 0 to 1670213
 Data columns (total 37 columns):
 ID               1670214 non-null int64
 NAME             1670214 non-null object
 ANNUITY          1297979 non-null float64
 AMOUNT           1670214 non-null float64
 CREDIT           1670213 non-null float64

I want to store the column names ID, ANNUITY, AMOUNT, and CREDIT in a variable, which I can use later to subset the DataFrame.

Answered by jezrael

Use select_dtypes with np.number to select all numeric columns:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A':list('abcdef'),
                   'B':[4.5,5,4,5,5,4],
                   'C':[7.4,8,9,4,2,3],
                   'D':[1,3,5,7,1,0],
                   'E':list('aaabbb')})

print (df)
   A    B    C  D  E
0  a  4.5  7.4  1  a
1  b  5.0  8.0  3  a
2  c  4.0  9.0  5  a
3  d  5.0  4.0  7  b
4  e  5.0  2.0  1  b
5  f  4.0  3.0  0  b

print (df.dtypes)
A     object
B    float64
C    float64
D      int64
E     object
dtype: object

cols = df.select_dtypes([np.number]).columns
print (cols)
Index(['B', 'C', 'D'], dtype='object')
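
Since the question asks for a variable that can be reused to subset the DataFrame, here is a minimal sketch of that follow-up step. It reuses the sample frame from this answer; the names num_cols and numeric_df are only illustrative.

```python
import numpy as np
import pandas as pd

# Same illustrative frame as in the answer.
df = pd.DataFrame({'A': list('abcdef'),
                   'B': [4.5, 5, 4, 5, 5, 4],
                   'C': [7.4, 8, 9, 4, 2, 3],
                   'D': [1, 3, 5, 7, 1, 0],
                   'E': list('aaabbb')})

# Keep the numeric column names as a plain Python list.
num_cols = df.select_dtypes(include=[np.number]).columns.tolist()
print(num_cols)  # ['B', 'C', 'D']

# Reuse the list later to subset the DataFrame.
numeric_df = df[num_cols]
print(numeric_df.shape)  # (6, 3)
```

Converting the Index to a list is optional; df[cols] also accepts the Index returned by .columns directly.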

It is also possible to specify float64 and int64 explicitly. Note that column D, which is cast to int32 below, is then not selected:

df = pd.DataFrame({'A':list('abcdef'),
                   'B':[4.5,5,4,5,5,4],
                   'C':[7,8,9,4,2,3],
                   'D':[1,3,5,7,1,0],
                   'E':list('aaabbb')})

df['D'] = df['D'].astype(np.int32)
print (df.dtypes)
A     object
B    float64
C      int64
D      int32
E     object
dtype: object

cols = df.select_dtypes([np.int64,np.float64]).columns
print (cols)
Index(['B', 'C'], dtype='object')
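
As a side note on why the list comprehension in the question returns nothing: iterating over df.columns yields the column labels (strings here), so the isinstance check against np.int64 and np.float64 is never true. The sketch below shows a variant that tests each column's dtype instead, using pandas.api.types.is_numeric_dtype. The small frame is made up for illustration and only borrows the column names from the question; also note that, as far as I know, is_numeric_dtype treats boolean columns as numeric too, unlike select_dtypes with np.number.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({'ID': [1, 2, 3],
                   'NAME': ['a', 'b', 'c'],
                   'ANNUITY': [1.5, None, 3.0],
                   'AMOUNT': [10.0, 20.0, 30.0],
                   'CREDIT': [5.0, 6.0, 7.0]})

# The question's approach tests the labels themselves, which are str.
labels_attempt = [c for c in df.columns
                  if isinstance(c, (np.int64, np.float64))]
print(labels_attempt)  # []

# Testing the dtype of each column gives the intended result.
num_cols = [c for c in df.columns if is_numeric_dtype(df[c])]
print(num_cols)  # ['ID', 'ANNUITY', 'AMOUNT', 'CREDIT']
```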

Answered by Teoretic

An alternative solution using np.where (uglier than the accepted answer, though):

df.iloc[:, (np.where((df.dtypes == np.int64) | (df.dtypes == np.float64)))[0]].columns

Sample code:

import pandas as pd
import numpy as np

df = pd.DataFrame({"A": [1, 2, 3], "B": [1.0, 2.0, 3.0], "C": ["a", "b", "c"]})

print(df.iloc[:, (np.where((df.dtypes == np.int64) | 
                 (df.dtypes == np.float64)))[0]].columns)

> Index(['A', 'B'], dtype='object')
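
The same dtype comparison can probably be written a little more compactly by using the boolean mask directly, without np.where or iloc. This is only a sketch of that variation on the sample code; the names mask and cols are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [1.0, 2.0, 3.0], "C": ["a", "b", "c"]})

# df.dtypes is a Series indexed by column name, so a boolean mask
# over it can select the matching labels directly.
mask = (df.dtypes == np.int64) | (df.dtypes == np.float64)
cols = df.dtypes[mask].index.tolist()
print(cols)  # ['A', 'B']
```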