pandas: logarithm of a Series/DataFrame

Question

Asked by durbachit

In short: how can I get the logarithm of a column of a pandas dataframe? I thought numpy.log() should work on it, but it doesn't. I suspect it's because I have some NaNs in the dataframe?

My whole code is below. It may seem a bit chaotic; basically, my ultimate goal (a little exaggerated) is to plot different rows of different selected columns into several subplots (hence the three nested for loops iterating over different groups... if you can suggest a more elegant solution, I will appreciate it, but it is not the main thing that's pressing me). I need to plot the logarithm of (some values from one dataframe + 1) versus some values of the other dataframe. And here is the problem: on the plotting line with np.log I get this error: AttributeError: 'float' object has no attribute 'log' (and if I use math instead of np, I get this: TypeError: cannot convert the series to <type 'float'>). What can I do about it?

Thank you. Here is the code:

import numpy as np
import math
import pandas as pd
import matplotlib.pyplot as plt

hf = pd.DataFrame({'Z':np.arange(0,100,1),'A':(10*np.random.rand(100)), 'B':(10*np.random.rand(100)),'C':(10*np.random.rand(100)),'D':(10*np.random.rand(100)),'E':(10*np.random.rand(100)),'F':(10*np.random.rand(100))})
df = pd.DataFrame({'Z':np.arange(0,100,1),'A':(10*np.random.rand(100)), 'B':(10*np.random.rand(100)),'C':(10*np.random.rand(100)),'D':(10*np.random.rand(100)),'E':(10*np.random.rand(100)),'F':(10*np.random.rand(100))})
hf.loc[0:5,'A']=np.nan
df.loc[0:5,'A']=np.nan
hf.loc[53:58,'B']=np.nan
df.loc[53:58,'B']=np.nan
hf.loc[90:,'C']=np.nan
df.loc[90:,'C']=np.nan
I = ['A','B']
II = ['C','D']
III = ['E','F']
IV = ['F','A']
runs = [I,II,III,IV]
inds = [10,20,30,40]

fig = plt.figure(figsize=(6,4))
for r in runs:
    data = pd.DataFrame(index=df.index,columns=r)
    HF = pd.DataFrame(index=hf.index,columns=r)
    #pdb.set_trace()
    for i in r:
        data.loc[:,i] = df.loc[:,i]
        HF.loc[:,i] = hf.loc[:,i]
        for c,z in enumerate(inds):
            ax=fig.add_subplot()
            ax = plt.plot(math.log1p(HF.loc[z]),Tdata.loc[z],linestyle=":",marker="o",markersize=5,label=inds[c].__str__())
# or the other version
#plt.plot(np.log(1 + HF.loc[z]),Tdata.loc[z],linestyle=":",marker="o",markersize=5,label=inds[c].__str__())

As @Jason pointed out, this answer did the trick! Thank you!

Answer 1

Answered by juanpa.arrivillaga

The problem isn't that you have NaN values, it's that you don't have NaN values: you have strings "NaN", which the ufunc np.log doesn't know how to deal with. Replace the beginning of your code with:

h = {'Z': np.arange(0,100,1), 'A': 10*np.random.rand(100),
     'B': 10*np.random.rand(100), 'C': 10*np.random.rand(100),
     'D': 10*np.random.rand(100), 'E': 10*np.random.rand(100),
     'F': 10*np.random.rand(100)}
hf = pd.DataFrame(h)
f = {'Z': np.arange(0,100,1), 'A': 10*np.random.rand(100),
     'B': 10*np.random.rand(100), 'C': 10*np.random.rand(100),
     'D': 10*np.random.rand(100), 'E': 10*np.random.rand(100),
     'F': 10*np.random.rand(100)}
df = pd.DataFrame(f)
hf.loc[0:5,'A'] = np.nan
df.loc[0:5,'A'] = np.nan
hf.loc[53:58,'B'] = np.nan
df.loc[53:58,'B'] = np.nan
hf.loc[90:,'C'] = np.nan
df.loc[90:,'C'] = np.nan

And everything should work nicely with np.log.
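
To see the difference between a string "NaN" and a real np.nan in isolation, here is a minimal sketch. It assumes a small made-up Series (s_obj is a hypothetical name, not from the code above) in which a missing value was entered as the string "NaN", which gives the column object dtype. Converting with pd.to_numeric is one possible way to repair such a column if rebuilding the dataframe is not convenient; it is an alternative to the replacement code above, not part of it.

```python
import numpy as np
import pandas as pd

# Hypothetical column: a missing value typed as the string "NaN"
# forces the whole Series to object dtype.
s_obj = pd.Series([1.0, "NaN", 3.0])
print(s_obj.dtype)  # object

# On object dtype, np.log tries to call a .log() method on each element,
# which plain Python floats do not have, so it raises.
# (The exception type differs between NumPy versions, so catch both.)
try:
    np.log(s_obj)
    failed = False
except (AttributeError, TypeError) as exc:
    failed = True
    print(type(exc).__name__, exc)

# Coerce to a numeric dtype; anything unparseable becomes a real float NaN.
s_num = pd.to_numeric(s_obj, errors="coerce")

# With float64 data the ufunc works, and NaN simply propagates.
log_vals = np.log1p(s_num)  # same as np.log(1 + s_num)
print(log_vals)
```

Checking df.dtypes before plotting is a quick way to spot columns that ended up as object instead of float64.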
