Python: how to get the columns of a pandas.DataFrame that contain a specific data type

Question

Asked by Charlie_M

I'm using df.columns.values to make a list of column names which I then iterate over and make charts, etc... but when I set this up I overlooked the non-numeric columns in the df. Now, I'd much rather not simply drop those columns from the df (or a copy of it). Instead, I would like to find a slick way to eliminate them from the list of column names.

Now I have:

names = df.columns.values

what I'd like to get to is something that behaves like:

names = df.columns.values(column_type=float64)

Is there any slick way to do this? I suppose I could make a copy of the df, and drop those non-numeric columns before doing columns.values, but that strikes me as clunky.

Welcome any inputs/suggestions. Thanks.

Answer 1

Accepted answer by Woody Pride

Someone may well give you a better answer than this, but one thing I tend to do, if all my numeric data are int64 or float64 objects, is create a dict of the column data types and then use the values to build the list of columns.

So, for example, in a dataframe where I have columns of type float64, int64 and object, you can first look at the data types like so:

DF.dtypes

If they conform to the standard whereby the non-numeric columns of data are all object types (as they are in my dataframes), then you can do the following to get a list of the numeric columns:

[col for col, dtype in DF.dtypes.items() if dtype in ('float64', 'int64')]

It's just a simple list comprehension. Nothing fancy. Again, though, whether this works for you will depend on how you set up your dataframe...
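If the numeric columns are not limited to int64 and float64 (for example int32 or float32), the same list-comprehension idea can be sketched with a dtype check instead of a fixed list of names. This is only an illustration: the frame and its column names below are made up, and note that `is_numeric_dtype` also treats boolean columns as numeric.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical example frame; the column names are illustrative only.
df = pd.DataFrame({
    "price": [1.5, 2.5, 3.5],    # float column
    "count": [1, 2, 3],          # integer column
    "label": ["a", "b", "c"],    # non-numeric column
})

# Keep a column name only if pandas considers its dtype numeric.
numeric_names = [col for col, dtype in df.dtypes.items() if is_numeric_dtype(dtype)]
print(numeric_names)
```

The dataframe itself is left untouched; only the list of names is filtered.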

Answer 2

Answer by chrisb

There's a new feature in 0.14.1, select_dtypes, to select columns by dtype by providing a list of dtypes to include or exclude.

For example:

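import numpy as np
import pandas as pd

# Example frame with float, integer, string and datetime columns
df = pd.DataFrame({'a': np.random.randn(1000),
                   'b': range(1000),
                   'c': ['a'] * 1000,
                   'd': pd.date_range('2000-1-1', periods=1000)})

# Keep only the float64 and int64 columns
df.select_dtypes(['float64','int64'])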

Out[129]: 
            a    b
0    0.153070    0
1    0.887256    1
2   -1.456037    2
3   -1.147014    3
...
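Since the question asks for a list of column names and not a sub-frame, one way to get there is to take `.columns` of the `select_dtypes` result. The following is a minimal sketch on a smaller version of the frame above (5 rows chosen here for brevity); the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Smaller version of the example frame above (5 rows is an arbitrary choice).
df = pd.DataFrame({
    "a": np.random.randn(5),
    "b": range(5),
    "c": ["a"] * 5,
    "d": pd.date_range("2000-01-01", periods=5),
})

# Names of the float64 columns only.
float_names = df.select_dtypes(include=["float64"]).columns.tolist()

# Names of all numeric columns ("number" covers the integer and float dtypes).
numeric_names = df.select_dtypes(include="number").columns.tolist()

# Names of everything that is not numeric.
non_numeric_names = df.select_dtypes(exclude="number").columns.tolist()

print(float_names, numeric_names, non_numeric_names)
```

`select_dtypes` returns a new DataFrame, but `df` is not modified, so nothing has to be dropped from the original.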

Answer 3

Answer by Arthur Zennig

dtypes is a Pandas Series. That means it contains index & values attributes. If you only need the column names:

headers = df.dtypes.index

It returns a pandas Index (not a plain list) containing the column names of the "df" dataframe; wrap it in list() if you need a list.
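Because `df.dtypes` is a Series indexed by column name, it can also be filtered with a boolean mask before reading the index, which gets at the original question. A minimal sketch, assuming a made-up three-column frame:

```python
import pandas as pd

# Hypothetical frame; the column names are illustrative only.
df = pd.DataFrame({"x": [0.1, 0.2], "y": [1, 2], "z": ["u", "v"]})

dtypes = df.dtypes  # Series: index = column names, values = dtypes

# Keep only the entries whose dtype is float64, then read off the column names.
float_names = dtypes[dtypes == "float64"].index.tolist()
print(float_names)
```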

Answer 4

Answer by J11

To get the column names from a pandas dataframe in Python 3 (here I am creating a data frame from a fileName.csv file):

>>> import pandas as pd
>>> df = pd.read_csv('fileName.csv')
>>> columnNames = list(df.head(0)) 
>>> print(columnNames)

Answer 5

Answer by Ritik Raj Srivastava

You can also get the column names from a pandas data frame in a way that returns the column names as well as the dtype. Here I read a CSV file from https://mlearn.ics.uci.edu/databases/autos/imports-85.data, but you have to define a header that contains the column names.

import pandas as pd

url="https://mlearn.ics.uci.edu/databases/autos/imports-85.data"

df=pd.read_csv(url,header = None)

headers=["symboling","normalized-losses","make","fuel-type","aspiration","num-of-doors","body-style",
         "drive-wheels","engine-location","wheel-base","length","width","height","curb-weight","engine-type",
         "num-of-cylinders","engine-size","fuel-system","bore","stroke","compression-ratio","horsepower","peak-rpm"
         ,"city-mpg","highway-mpg","price"]

df.columns=headers

print(df.columns)
