pandas: How can I know the type of a pandas dataframe cell

Disclaimer: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/49926897/

Date: 2020-09-14 05:29:35 · Source: igfitidea

Tags: python, excel, pandas, dataframe

Asked by John Smith

I have a dataframe, for example:

1
1.3
2,5
4
5

With the following code, I am trying to find out the types of the different cells of my pandas dataframe:

import re

for i in range(len(data.columns)):
    print(" length of columns : " + str(len(data.columns)))
    for j in range(len(data[i])):
        # Swap the decimal dot for a comma; note that re.sub always returns a str
        data[i][j] = re.sub(r'(\d*)\.(\d*)', r'\1,\2', str(data[i][j]))
        print(str(data[i][j]))
        print(" is of type : ", type(data[i][j]))
        if str(data[i][j]).isdigit():
            print(str(data[i][j]) + " contains a number")

The problem is that when a cell of the dataframe contains a dot, pandas thinks it is a string. So I used a regex to change the dot into a comma.

But after that, the types of all my dataframe cells changed to string. My question is: how can I know whether a cell of the dataframe is an int or a float? I already tried isinstance(x, int).

Edit: How can I count the number of ints and floats, for example from the output of df.apply(type)? I want to know how many cells of my column are int or float.

My second question is: why, when I have 2.5, does the dataframe give it the str type?

0      <class 'int'>
1      <class 'str'>
2    <class 'float'>
3    <class 'float'>
4      <class 'int'>
5      <class 'str'>
6      <class 'str'>

Thanks.

Answered by rafaelc

If you have a column with different types, e.g.

>>> import pandas as pd
>>> df = pd.DataFrame(data={"l": [1, "a", 10.43, [1, 3, 4]]})
>>> df
           l
0          1
1          a
2      10.43
3  [1, 3, 4]

Pandas will just state that this Series is of dtype object. However, you can get the type of each entry by simply applying the type function:

>>> df.l.apply(type)
0      <class 'int'>
1      <class 'str'>
2    <class 'float'>
3     <class 'list'>
Name: l, dtype: object
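
To count how many cells are of each type, as the question's edit asks, the per-cell types can be tallied with value_counts. This is only a minimal sketch that reuses the example column above; the variable names type_counts and n_numeric are illustrative and not part of the original answer.

```python
import pandas as pd

# Same mixed-type column as in the example above
df = pd.DataFrame(data={"l": [1, "a", 10.43, [1, 3, 4]]})

# Map each cell to the name of its Python type, then count occurrences
type_counts = df["l"].map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# Number of cells that are either int or float
n_numeric = int(df["l"].map(lambda v: isinstance(v, (int, float))).sum())
print(n_numeric)
```

This relies on the column having dtype object, where cells keep their Python types. In a column with a numeric dtype such as int64, the cells are NumPy scalars, and a numpy.int64 is not an instance of int, which may be why an isinstance(x, int) check appears to fail.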

However, if you have a dataset with very different data types, you should probably reconsider its design.
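
One possible way to act on that advice, sketched here under assumptions rather than taken from the original answer, is to normalize the column to a single numeric dtype. The sample values below mirror the question's data and are assumed to have been read in as text; treating the comma in "2,5" as a decimal separator is also an assumption.

```python
import pandas as pd

# Hypothetical column mirroring the question's sample, read in as text
raw = pd.Series(["1", "1.3", "2,5", "4", "5"])

# Treat a decimal comma as a decimal point, then convert;
# cells that still cannot be parsed become NaN
numeric = pd.to_numeric(raw.str.replace(",", ".", regex=False), errors="coerce")
print(numeric.dtype)

# After conversion, whole-number values can be told apart from fractional ones
is_whole = numeric.dropna() % 1 == 0
print(int(is_whole.sum()), "whole numbers,", int((~is_whole).sum()), "fractional")
```

With a uniform float column, the "int or float" question becomes a check on the values rather than on per-cell Python types.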