pandas: How to get numeric column names in a pandas DataFrame

Question

Asked by Neil

I have a pandas DataFrame with object, int64, and float64 dtypes. I want to get the column names of the int64 and float64 columns. I am using the following command, but it does not seem to work:

cat_num_prv_app = [num for num in list(df.columns) if isinstance(num, (np.int64,np.float64))]

These are my data types:

 df.info()
 <class 'pandas.core.frame.DataFrame'>
 RangeIndex: 1670214 entries, 0 to 1670213
 Data columns (total 37 columns):
 ID               1670214 non-null int64
 NAME             1670214 non-null object
 ANNUITY          1297979 non-null float64
 AMOUNT           1670214 non-null float64
 CREDIT           1670213 non-null float64

I want to store the column names ID, ANNUITY, AMOUNT, and CREDIT in a variable that I can use later to subset the DataFrame.

Answer 1

Answered by jezrael

Use select_dtypes with np.number to select all numeric columns:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A':list('abcdef'),
                   'B':[4.5,5,4,5,5,4],
                   'C':[7.4,8,9,4,2,3],
                   'D':[1,3,5,7,1,0],
                   'E':list('aaabbb')})

print (df)
   A    B    C  D  E
0  a  4.5  7.4  1  a
1  b  5.0  8.0  3  a
2  c  4.0  9.0  5  a
3  d  5.0  4.0  7  b
4  e  5.0  2.0  1  b
5  f  4.0  3.0  0  b

print (df.dtypes)
A     object
B    float64
C    float64
D      int64
E     object
dtype: object

cols = df.select_dtypes([np.number]).columns
print (cols)
Index(['B', 'C', 'D'], dtype='object')
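
For context, the list comprehension in the question most likely returns an empty list because iterating over df.columns yields the column labels, which are strings, so the isinstance check against np.int64 and np.float64 never matches. The sketch below rebuilds the sample frame from this answer, reproduces that behaviour, and then shows one way to store the numeric column names and reuse them for subsetting. The names attempt, num_cols, and numeric_df are illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': list('abcdef'),
                   'B': [4.5, 5, 4, 5, 5, 4],
                   'C': [7.4, 8, 9, 4, 2, 3],
                   'D': [1, 3, 5, 7, 1, 0],
                   'E': list('aaabbb')})

# The question's attempt: each item is a column label (a str),
# not the column's dtype, so nothing passes the isinstance test.
attempt = [c for c in df.columns if isinstance(c, (np.int64, np.float64))]
print(attempt)  # []

# Store the numeric column names as a plain list for later use.
num_cols = df.select_dtypes(include=[np.number]).columns.tolist()
print(num_cols)  # ['B', 'C', 'D']

# Reuse the stored names to subset the DataFrame.
numeric_df = df[num_cols]
print(numeric_df.shape)  # (6, 3)
```

Converting the Index to a list with tolist() is optional; df[cols] also accepts the Index returned by select_dtypes directly.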

It is also possible to specify float64 and int64 explicitly; in that case column D, which is cast to int32 below, is not selected:

df = pd.DataFrame({'A':list('abcdef'),
                   'B':[4.5,5,4,5,5,4],
                   'C':[7,8,9,4,2,3],
                   'D':[1,3,5,7,1,0],
                   'E':list('aaabbb')})

df['D'] = df['D'].astype(np.int32)
print (df.dtypes)
A     object
B    float64
C      int64
D      int32
E     object
dtype: object

cols = df.select_dtypes([np.int64,np.float64]).columns
print (cols)
Index(['B', 'C'], dtype='object')
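
To make the difference between the two variants concrete, here is a small side-by-side sketch on a reduced frame with one int32 column. The variable names exact and broad are illustrative only. Listing int64 and float64 explicitly skips the int32 column, while np.number covers it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'B': [4.5, 5.0, 4.0],
                   'C': [7, 8, 9],
                   'D': [1, 3, 5]})
# Downcast D so it no longer matches int64.
df['D'] = df['D'].astype(np.int32)

# Only the dtypes listed explicitly are selected.
exact = df.select_dtypes(include=[np.int64, np.float64]).columns.tolist()
print(exact)  # ['B', 'C']

# np.number matches every NumPy numeric dtype, including int32.
broad = df.select_dtypes(include=[np.number]).columns.tolist()
print(broad)  # ['B', 'C', 'D']
```

Which variant fits depends on whether columns of other numeric widths (int32, float32, and so on) should count as numeric for the task at hand.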

Answer 2

Answered by Teoretic

An alternative solution using np.where (uglier than the accepted answer, though):

df.iloc[:, (np.where((df.dtypes == np.int64) | (df.dtypes == np.float64)))[0]].columns

Sample code:

import pandas as pd
import numpy as np

df = pd.DataFrame({"A": [1, 2, 3], "B": [1.0, 2.0, 3.0], "C": ["a", "b", "c"]})

print(df.iloc[:, (np.where((df.dtypes == np.int64) | 
                 (df.dtypes == np.float64)))[0]].columns)

> Index(['A', 'B'], dtype='object')
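
As a possible simplification of the same idea, the boolean mask built from df.dtypes can index df.columns directly, which avoids both np.where and iloc. This is a sketch rather than part of the original answer; mask and cols are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [1.0, 2.0, 3.0], "C": ["a", "b", "c"]})

# True for each column whose dtype is int64 or float64.
mask = (df.dtypes == np.int64) | (df.dtypes == np.float64)

# Index the column labels with the mask (as a NumPy boolean array).
cols = df.columns[mask.to_numpy()]
print(cols.tolist())  # ['A', 'B']
```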
