Python: get a list of pandas DataFrame columns by data type

Question

Asked by yoshiserry

If I have a dataframe with the following columns:

NAME                           object
On_Time                        object
On_Budget                      object
%actual_hr                    float64
Baseline Start Date    datetime64[ns]
Forecast Start Date    datetime64[ns]

I would like to be able to say: here is a dataframe, give me a list of the columns which are of type Object or of type DateTime?

I have a function which converts numbers (Float64) to two decimal places, and I would like to use this list of dataframe columns, of a particular type, and run it through this function to convert them all to 2dp.

Maybe:

col_list = []
for c in df.columns:
    if df[c].dtype == "float64":  # or "object", etc.
        col_list.append(c)

Answer 1

Accepted answer by DSM

If you want a list of columns of a certain type, you can use groupby:

>>> df = pd.DataFrame([[1, 2.3456, 'c', 'd', 78]], columns=list("ABCDE"))
>>> df
   A       B  C  D   E
0  1  2.3456  c  d  78

[1 rows x 5 columns]
>>> df.dtypes
A      int64
B    float64
C     object
D     object
E      int64
dtype: object
>>> g = df.columns.to_series().groupby(df.dtypes).groups
>>> g
{dtype('int64'): ['A', 'E'], dtype('float64'): ['B'], dtype('O'): ['C', 'D']}
>>> {k.name: v for k, v in g.items()}
{'object': ['C', 'D'], 'int64': ['A', 'E'], 'float64': ['B']}
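
The output above comes from an older pandas release; on newer versions the values in .groups may be Index objects rather than plain lists. As a rough alternative (not DSM's method), the same name-to-columns mapping can be built with an ordinary loop over df.dtypes. The variable name cols_by_dtype is made up for this sketch, and the object columns are created with an explicit dtype so the result does not depend on how a given pandas version infers string columns.

```python
import pandas as pd

# Same shape as the example frame above; dtypes are spelled out explicitly.
df = pd.DataFrame({
    "A": pd.Series([1], dtype="int64"),
    "B": pd.Series([2.3456], dtype="float64"),
    "C": pd.Series(["c"], dtype=object),
    "D": pd.Series(["d"], dtype=object),
    "E": pd.Series([78], dtype="int64"),
})

# Map each dtype name to the list of columns that have it.
cols_by_dtype = {}
for name, dtype in df.dtypes.items():
    cols_by_dtype.setdefault(dtype.name, []).append(name)

print(cols_by_dtype)
```

Looking up cols_by_dtype["object"] then gives the object columns as a plain Python list.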

Answer 2

Answered by Andy Hayden

You can use a boolean mask on the dtypes attribute:

In [11]: df = pd.DataFrame([[1, 2.3456, 'c']])

In [12]: df.dtypes
Out[12]: 
0      int64
1    float64
2     object
dtype: object

In [13]: msk = df.dtypes == np.float64  # or object, etc.

In [14]: msk
Out[14]: 
0    False
1     True
2    False
dtype: bool

You can look at just those columns with the desired dtype:

In [15]: df.loc[:, msk]
Out[15]: 
        1
0  2.3456

Now you can use round (or whatever) and assign it back:

In [16]: np.round(df.loc[:, msk], 2)
Out[16]: 
      1
0  2.35

In [17]: df.loc[:, msk] = np.round(df.loc[:, msk], 2)

In [18]: df
Out[18]: 
   0     1  2
0  1  2.35  c
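
For the original goal of rounding every float column to 2dp, the mask-and-assign steps can be condensed. The sketch below is a variation rather than the code from this answer: it picks the float columns with select_dtypes and uses DataFrame.round, and the column names (hours, count, name) are invented for illustration.

```python
import pandas as pd

# Hypothetical frame with one float, one integer and one object column.
df = pd.DataFrame({
    "hours": [1.23456, 7.891],
    "count": [1, 2],
    "name": pd.Series(["a", "b"], dtype=object),
})

# Names of the float64 columns only.
float_cols = df.select_dtypes(include=["float64"]).columns

# Round just those columns and assign the result back.
df[float_cols] = df[float_cols].round(2)

print(df)
```

The integer and object columns are left untouched because they are never selected.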

Answer 3

Answered by qmorgan

As of pandas v0.14.1, you can use select_dtypes() to select columns by dtype:

In [2]: df = pd.DataFrame({'NAME': list('abcdef'),
    'On_Time': [True, False] * 3,
    'On_Budget': [False, True] * 3})

In [3]: df.select_dtypes(include=['bool'])
Out[3]:
  On_Budget On_Time
0     False    True
1      True   False
2     False    True
3      True   False
4     False    True
5      True   False

In [4]: mylist = list(df.select_dtypes(include=['bool']).columns)

In [5]: mylist
Out[5]: ['On_Budget', 'On_Time']
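
The question asked for columns that are either object or datetime. select_dtypes accepts several dtypes in include, so a sketch along these lines should cover that case. The frame below is a cut-down, made-up version of the one in the question, and the NAME column is given an explicit object dtype so the example does not rely on how strings are inferred.

```python
import pandas as pd

# Cut-down stand-in for the frame in the question.
df = pd.DataFrame({
    "NAME": pd.Series(["a", "b"], dtype=object),
    "%actual_hr": [0.5, 0.75],
    "Baseline Start Date": pd.to_datetime(["2014-01-01", "2014-02-01"]),
})

# "datetime" selects datetime64 columns; "object" selects object columns.
wanted = list(df.select_dtypes(include=["object", "datetime"]).columns)

print(wanted)
```

Column order in the result follows the order in the original frame.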

Answer 4

Answered by qmorgan

If you want a list of only the non-numeric columns (here, everything that is not float64 or int64), you could do:

non_numerics = [x for x in df.columns \
                if not (df[x].dtype == np.float64 \
                        or df[x].dtype == np.int64)]

and then if you want to get another list of only the numerics:

numerics = [x for x in df.columns if x not in non_numerics]
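
The comparison above only recognises float64 and int64, so columns stored as int32 or float32 would land in the non-numeric list. A hedged variation, assuming pandas.api.types is available in your version, is to ask pandas whether each column is numeric. The column names here are invented for the sketch.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical frame with narrower numeric types that the == checks would miss.
df = pd.DataFrame({
    "small_int": pd.Series([1, 2], dtype="int32"),
    "ratio": pd.Series([0.5, 1.5], dtype="float32"),
    "label": pd.Series(["x", "y"], dtype=object),
})

# is_numeric_dtype covers all integer and float widths.
numerics = [c for c in df.columns if is_numeric_dtype(df[c])]
non_numerics = [c for c in df.columns if c not in numerics]

print(numerics, non_numerics)
```

Note that is_numeric_dtype also treats boolean columns as numeric, which may or may not be what you want.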

Answer 5

Answered by Ashish Sahu

Using dtype will give you the desired column's data type:

dataframe['column1'].dtype

If you want to know the data types of all the columns at once, use the plural form, dtypes:

dataframe.dtypes
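
To go from inspecting the types to an actual list of column names, the dtypes Series can be filtered like any other Series. A small sketch, with column1 and column2 as placeholder names:

```python
import pandas as pd

dataframe = pd.DataFrame({
    "column1": pd.Series([1.5, 2.5], dtype="float64"),
    "column2": pd.Series([1, 2], dtype="int64"),
})

single = dataframe["column1"].dtype   # dtype of one column
all_types = dataframe.dtypes          # Series: column name -> dtype

# Keep the column names whose dtype compares equal to float64.
matching = all_types[all_types == "float64"].index.tolist()

print(single, matching)
```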

Answer 6

Answered by Tanmoy

list(df.select_dtypes(['object']).columns)

This should do the trick

Answer 7

Answered by Koo

Use df.info(verbose=True), where df is a pandas DataFrame, to print every column with its dtype. If verbose is not given, pandas follows the pandas.options.display.max_info_columns setting.

Answer 8

Answered by MLKing

The most direct way to get a list of columns of a certain dtype, e.g. 'object':

df.select_dtypes(include='object').columns

For example:

>>> df = pd.DataFrame([[1, 2.3456, 'c', 'd', 78]], columns=list("ABCDE"))
>>> df.dtypes

A      int64
B    float64
C     object
D     object
E      int64
dtype: object

To get all 'object' dtype columns:

>>> df.select_dtypes(include='object').columns

Index(['C', 'D'], dtype='object')

For just the list:

>>> list(df.select_dtypes(include='object').columns)

['C', 'D']

Answer 9

Answered by geekidharsh

I came up with this three-liner.

Essentially, here's what it does:

Fetch the column names and their respective data types.
I am optionally outputting it to a csv.

import pandas as pd

inp = pd.read_csv('filename.csv')  # read input; add read_csv arguments as needed
columns = pd.DataFrame({'column_names': inp.columns, 'datatypes': inp.dtypes})
# inp is a DataFrame, so it cannot be concatenated into the output file name
columns.to_csv('columns_list.csv', encoding='utf-8')  # encoding is optional

This made my life much easier when trying to generate schemas on the fly. Hope this helps.

Answer 10

Answered by itthrill

For yoshiserry:

def col_types(x):
    # x is a DataFrame; returns {column name: dtype}
    dtypes = x.dtypes
    dtypes_col = dtypes.index
    dtypes_type = dtypes.values  # .values, not .value
    column_types = dict(zip(dtypes_col, dtypes_type))
    return column_types
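
A column-to-dtype mapping of this kind can also be produced without a helper function. The sketch below is one possible shortcut, not the function from this answer: it converts the dtypes to their string names so the dictionary is easy to print or serialise. The frame and the name column_types are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": pd.Series([1, 2], dtype="int64"),
    "B": pd.Series([0.5, 1.5], dtype="float64"),
})

# Series of dtypes -> Series of dtype names -> plain dict.
column_types = df.dtypes.astype(str).to_dict()

print(column_types)
```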
