pandas: taking the logarithm of a column

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/47229267/

Time: 2020-09-14 04:45:24  Source: igfitidea

Taking logarithm of column

Tags: python, pandas, numpy, dataframe

Asked by Kate

I'm quite new to programming (in Python) and I would like to create a new variable that is the logarithm of a column (from an imported Excel file). I have tried different solutions from this site, but I keep getting an error. My latest error is AttributeError: 'str' object has no attribute 'log'. I have already dropped all the values that are not "numbers", but I still don't know how to convert the values from strings to integers (if that is the problem, because 'int(neighborhood)' doesn't work).

This is the code I have now:

import pandas as pd
import numpy as np

df=pd.read_excel("kwb-2016_del_col_del_row.xls")
df = df[df.m_woz != "."] # drop rows with values "."
neighborhood=df[df.recs=="Neighborhood"]
neighborhood=neighborhood["m_woz"]
print(neighborhood)

np.log(neighborhood)

and this is the error I'm getting:

AttributeError                            Traceback (most recent call last)
<ipython-input-66-46698de51811> in <module>()
     12 print(neighborhood)
     13 
---> 14 np.log(neighborhood)


AttributeError: 'str' object has no attribute 'log'

Could someone help me please?

Answered by akubot

Perhaps you are not removing the data you think you are?
Try printing the data types to see what they are.
In a DataFrame, your column might be filled with objects instead of numbers.

print(df.dtypes)
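
For a column that mixes text and numbers, `df.dtypes` only reports `object`, so it can help to count the Python type of each individual cell as well. A minimal sketch, assuming a small made-up frame (the values are hypothetical, not taken from the asker's spreadsheet):

```python
import pandas as pd

# Hypothetical stand-in for the imported spreadsheet column:
# two strings, one int and one float.
df = pd.DataFrame({"m_woz": ["150", ".", 210, 187.5]})

# Column-level dtype: a mixed column is reported as object.
print(df.dtypes)

# Count the Python type of each individual cell in the column.
type_counts = df["m_woz"].map(type).value_counts()
print(type_counts)
```

Any `str` entries in that count are the cells that a numeric function such as `np.log` cannot handle.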

Also, you might want to look at these two pages:

Select row from a DataFrame based on the type of the object (i.e. str)

Pandas: convert dtype 'object' to int
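
One common way to do this kind of conversion is `pd.to_numeric` with `errors="coerce"`, which turns anything it cannot parse into NaN instead of raising. This is a different technique from the type filter used later in this answer. A sketch on made-up values (the Series contents are assumptions for illustration):

```python
import pandas as pd

# Hypothetical object-dtype column: numeric strings, real numbers
# and placeholder text mixed together.
raw = pd.Series(["150", ".", 210, 187.5, "n/a"], name="m_woz")

# Parse what can be parsed; unparseable entries become NaN.
numeric = pd.to_numeric(raw, errors="coerce")

# Drop the entries that could not be converted.
clean = numeric.dropna()
print(clean)
```

Unlike a check on the Python type, this also keeps numbers that were stored as text, such as "150".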

Here's an example I constructed and ran interactively that correctly gets the logarithms (don't type >>>):

>>> import numpy as np
>>> import pandas as pd
>>> raw_data = {'m_woz': [42, 'def', 1.23, 45.6, '.xyz'],
    'recs': ['Neighborhood', 'Neighborhood',
    'unknown', 'Neighborhood', 'whatever']}
>>> df = pd.DataFrame(raw_data, columns = ['m_woz', 'recs'])
>>> print(df.dtypes)
m_woz    object
recs     object
dtype: object

Note that the type is object, not float or int or str.

Continuing on, here is what df and neighborhood look like:

>>> df
  m_woz          recs
0    42  Neighborhood
1   def  Neighborhood
2  1.23       unknown
3  45.6  Neighborhood
4  .xyz      whatever

>>> neighborhood=df[df.recs=="Neighborhood"]
>>> neighborhood

  m_woz          recs
0    42  Neighborhood
1   def  Neighborhood
3  45.6  Neighborhood

And here are the tricks... This line selects all rows in neighborhood that are int or float (be careful to fix indents if you copy/paste this):

>>> df_num_strings = neighborhood[neighborhood['m_woz'].
        apply(lambda x: type(x) in (int, float))]

>>> df_num_strings
  m_woz          recs
0    42  Neighborhood
3  45.6  Neighborhood
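
Note that the exact-type check above only matches the built-in `int` and `float`. NumPy scalar types such as `np.int64` are different types, so they would be filtered out. If such values might appear in the column, an `isinstance` check against `numbers.Real` is one possible variation. This is a sketch with a hypothetical helper name (`is_real_number`) and made-up values, not part of the original answer:

```python
import numbers

import numpy as np
import pandas as pd


def is_real_number(x):
    """Hypothetical helper: True for real numbers, False for bools and text."""
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(x, numbers.Real) and not isinstance(x, bool)


# Made-up object-dtype column mixing Python numbers, a NumPy scalar,
# a string and a bool.
values = pd.Series([42, np.int64(7), 45.6, "def", True], dtype=object)

# Boolean mask marking the cells that hold real numbers.
mask = values.apply(is_real_number)
print(values[mask])
```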

Almost there... convert the values to floating point, going through strings:

>>> df_float = df_num_strings['m_woz'].astype(str).astype(float)
>>> df_float
0    42.0
3    45.6

Finally, compute logarithms:

>>> np.log(df_float)
0    3.737670
3    3.819908
Name: m_woz, dtype: float64
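
For completeness, the same steps can also be written as a short script that stores the result as a new column, which is what the question asked for. This sketch swaps the type filter for `pd.to_numeric(..., errors="coerce")`, and the column names `m_woz_num` and `log_m_woz` are made up for illustration:

```python
import numpy as np
import pandas as pd

# Same example data as in the interactive session above.
df = pd.DataFrame({
    "m_woz": [42, "def", 1.23, 45.6, ".xyz"],
    "recs": ["Neighborhood", "Neighborhood", "unknown",
             "Neighborhood", "whatever"],
})

# Keep only the neighborhood rows; copy so new columns can be added safely.
neighborhood = df[df["recs"] == "Neighborhood"].copy()

# Convert to numbers; text that cannot be parsed becomes NaN and is dropped.
neighborhood["m_woz_num"] = pd.to_numeric(neighborhood["m_woz"], errors="coerce")
neighborhood = neighborhood.dropna(subset=["m_woz_num"])

# New variable: natural logarithm of the numeric column.
neighborhood["log_m_woz"] = np.log(neighborhood["m_woz_num"])
print(neighborhood)
```

Keep in mind that `np.log` is only defined for positive values, so zeros or negative numbers in the real data would need separate handling.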