pandas: take the logarithm of a column

Question

Asked by Kate

I'm quite new to programming (in Python) and I would like to create a new variable that is the logarithm of a column (from an imported Excel file). I have tried different solutions from this site, but I keep getting an error. My latest error is AttributeError: 'str' object has no attribute 'log'. I have already dropped all the values that are not "numbers", but I still don't know how to convert the values from strings to integers (if that is the problem, because int(neighborhood) doesn't work).

This is the code I have now:

import pandas as pd
import numpy as np

df=pd.read_excel("kwb-2016_del_col_del_row.xls")
df = df[df.m_woz != "."] # drop rows with values "."
neighborhood=df[df.recs=="Neighborhood"]
neighborhood=neighborhood["m_woz"]
print(neighborhood)

np.log(neighborhood)

and this is the error I'm getting:

AttributeError                            Traceback (most recent call last)
<ipython-input-66-46698de51811> in <module>()
     12 print(neighborhood)
     13 
---> 14 np.log(neighborhood)


AttributeError: 'str' object has no attribute 'log'

Could someone help me please?

Answer 1

Answered by akubot

Perhaps you are not removing the data you think you are?
Try printing the data types to see what they are.
In a DataFrame, your column might be filled with objects instead of numbers.

print(df.dtypes)
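
To see why the dtype matters, here is a minimal sketch (not part of the original answer; the names s_obj and s_float are made up for illustration). On an object-dtype column NumPy has no numeric loop to run, so it falls back to looking for a log method on each element. That lookup fails even when every element is a number. Depending on the NumPy version the failure surfaces as an AttributeError or a TypeError, so the sketch catches both.

```python
import numpy as np
import pandas as pd

# Hypothetical data: real numbers, but stored with dtype object.
s_obj = pd.Series([42, 45.6], dtype=object)

try:
    np.log(s_obj)
    failed = False
except (AttributeError, TypeError) as exc:
    # The exact exception type and message depend on the NumPy version.
    failed = True
    print(type(exc).__name__, exc)

# After converting to a real float dtype, the same call works.
s_float = s_obj.astype(float)
result = np.log(s_float)
print(result)
```

So dropping the non-numeric rows is not enough on its own; the column also has to end up with a numeric dtype.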

Also, you might want to look at these two pages:

Select row from a DataFrame based on the type of the object (i.e. str)

Pandas: convert dtype 'object' to int

Here's an example I constructed and ran interactively that correctly gets the logarithms (don't type the >>> or ... prompts):

>>> import numpy as np
>>> import pandas as pd
>>> raw_data = {'m_woz': [42, 'def', 1.23, 45.6, '.xyz'],
...             'recs': ['Neighborhood', 'Neighborhood',
...                      'unknown', 'Neighborhood', 'whatever']}
>>> df = pd.DataFrame(raw_data, columns=['m_woz', 'recs'])
>>> print(df.dtypes)
m_woz    object
recs     object
dtype: object

Note that the type is object, not float or int or str.
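
Since df.dtypes reports only object for a mixed column, it can help to look at the Python type of each individual value. The following is a small sketch under the same example data (the variable name type_counts is an assumption, not from the answer):

```python
import pandas as pd

# Same mixed column as in the example above.
df = pd.DataFrame({"m_woz": [42, "def", 1.23, 45.6, ".xyz"]})

# Count how many values of each Python type the column holds.
type_counts = df["m_woz"].map(lambda x: type(x).__name__).value_counts()
print(type_counts)
```

If str shows up in the counts, those are the entries that np.log cannot handle.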

Continuing on, here is what df and neighborhood look like:

>>> df
  m_woz          recs
0    42  Neighborhood
1   def  Neighborhood
2  1.23       unknown
3  45.6  Neighborhood
4  .xyz      whatever

>>> neighborhood=df[df.recs=="Neighborhood"]
>>> neighborhood
  m_woz          recs
0    42  Neighborhood
1   def  Neighborhood
3  45.6  Neighborhood

And here is the trick: this line keeps only the rows of neighborhood whose m_woz value is an int or a float (be careful to fix the indentation if you copy and paste it):

>>> df_num_strings = neighborhood[
...     neighborhood['m_woz'].apply(lambda x: type(x) in (int, float))]

>>> df_num_strings
  m_woz          recs
0    42  Neighborhood
3  45.6  Neighborhood

Almost there... convert the remaining values to floating point (the column still has dtype object, so the conversion goes through str first):

>>> df_float = df_num_strings['m_woz'].astype(str).astype(float)
>>> df_float
0    42.0
3    45.6
Name: m_woz, dtype: float64

Finally, compute logarithms:

>>> np.log(df_float)
0    3.737670
3    3.819908
Name: m_woz, dtype: float64
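
As a possible shortcut (this is not what the answer above does), the type filter and the two astype calls can be replaced by pd.to_numeric with errors="coerce", which turns anything it cannot parse into NaN. A sketch on the same example data, with numeric and log_values as made-up names:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "m_woz": [42, "def", 1.23, 45.6, ".xyz"],
    "recs": ["Neighborhood", "Neighborhood", "unknown",
             "Neighborhood", "whatever"],
})

# Pick the m_woz values of the Neighborhood rows.
neighborhood = df.loc[df["recs"] == "Neighborhood", "m_woz"]

# Unparseable entries such as 'def' become NaN and are then dropped.
numeric = pd.to_numeric(neighborhood, errors="coerce").dropna()

log_values = np.log(numeric)
print(log_values)
```

One difference to keep in mind: to_numeric also parses strings that look like numbers (for example "42"), whereas the type(x) in (int, float) filter would discard them.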