Original question: http://stackoverflow.com/questions/40034378/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow (not to this page).
Logarithm of a pandas series/dataframe
Asked by durbachit
In short: how can I get the logarithm of a column of a pandas dataframe? I thought numpy.log() should work on it, but it doesn't. I suspect it's because I have some NaNs in the dataframe?
My whole code is below. It may seem a bit chaotic; my ultimate goal (slightly exaggerated here) is to plot different rows of several selected columns into several subplots, hence the three nested for loops iterating over different groups. If you can suggest a more elegant solution I will appreciate it, but that is not the main thing pressing me. I need to plot the logarithm of (some values from one dataframe + 1) against some values of the other dataframe. Here is the problem: on the plotting line with np.log I get this error: AttributeError: 'float' object has no attribute 'log' (and if I use math instead of np, I get this: TypeError: cannot convert the series to <type 'float'>).
What can I do about it?
Thank you. Here is the code:
import numpy as np
import math
import pandas as pd
import matplotlib.pyplot as plt

hf = pd.DataFrame({'Z':np.arange(0,100,1),'A':(10*np.random.rand(100)), 'B':(10*np.random.rand(100)),'C':(10*np.random.rand(100)),'D':(10*np.random.rand(100)),'E':(10*np.random.rand(100)),'F':(10*np.random.rand(100))})
df = pd.DataFrame({'Z':np.arange(0,100,1),'A':(10*np.random.rand(100)), 'B':(10*np.random.rand(100)),'C':(10*np.random.rand(100)),'D':(10*np.random.rand(100)),'E':(10*np.random.rand(100)),'F':(10*np.random.rand(100))})

hf.loc[0:5,'A']=np.nan
df.loc[0:5,'A']=np.nan
hf.loc[53:58,'B']=np.nan
df.loc[53:58,'B']=np.nan
hf.loc[90:,'C']=np.nan
df.loc[90:,'C']=np.nan

I = ['A','B']
II = ['C','D']
III = ['E','F']
IV = ['F','A']
runs = [I,II,III,IV]
inds = [10,20,30,40]

fig = plt.figure(figsize=(6,4))

for r in runs:
    data = pd.DataFrame(index=df.index,columns=r)
    HF = pd.DataFrame(index=hf.index,columns=r)
    #pdb.set_trace()
    for i in r:
        data.loc[:,i] = df.loc[:,i]
        HF.loc[:,i] = hf.loc[:,i]

    for c,z in enumerate(inds):
        ax=fig.add_subplot()
        ax = plt.plot(math.log1p(HF.loc[z]),Tdata.loc[z],linestyle=":",marker="o",markersize=5,label=inds[c].__str__())
        # or the other version
        #plt.plot(np.log(1 + HF.loc[z]),Tdata.loc[z],linestyle=":",marker="o",markersize=5,label=inds[c].__str__())
As @Jason pointed out, this answer did the trick! Thank you!
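The two error messages quoted in the question come from different places, and a small sketch may help separate them. The snippet below is an illustration only, not the asker's code; the Series `values` and the names `math_error` and `logged` are made up for this example. It shows that `math.log1p` accepts a single number and so rejects a multi-element Series, while `np.log1p` is applied element by element and passes real NaN values through.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical float64 Series containing a real NaN.
values = pd.Series([0.0, 1.0, np.nan])

# math.log1p works on one scalar, so a multi-element Series is rejected.
try:
    math.log1p(values)
    math_error = None
except TypeError as exc:
    math_error = exc
print("math.log1p failed with:", math_error)

# np.log1p is a ufunc: it is applied element by element and propagates NaN.
logged = np.log1p(values)
print(logged)
```

For a numeric column, then, NaN values on their own should not stop `np.log` or `np.log1p`; they simply come out as NaN again.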
Answered by juanpa.arrivillaga
The problem isn't that you have NaN values, it's that you don't have NaN values: you have the strings "NaN", which the ufunc np.log doesn't know how to deal with. Replace the beginning of your code with:
h = {'Z': np.arange(0,100,1), 'A': 10*np.random.rand(100),
     'B': 10*np.random.rand(100), 'C': 10*np.random.rand(100),
     'D': 10*np.random.rand(100), 'E': 10*np.random.rand(100),
     'F': 10*np.random.rand(100)}
hf = pd.DataFrame(h)

f = {'Z': np.arange(0,100,1), 'A': 10*np.random.rand(100),
     'B': 10*np.random.rand(100), 'C': 10*np.random.rand(100),
     'D': 10*np.random.rand(100), 'E': 10*np.random.rand(100),
     'F': 10*np.random.rand(100)}
df = pd.DataFrame(f)

hf.loc[0:5,'A'] = np.nan
df.loc[0:5,'A'] = np.nan
hf.loc[53:58,'B'] = np.nan
df.loc[53:58,'B'] = np.nan
hf.loc[90:,'C'] = np.nan
df.loc[90:,'C'] = np.nan
After that, everything should work nicely with np.log.
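To make the difference between the string "NaN" and a real np.nan concrete, here is a minimal sketch. It assumes a column where missing values were written as text; the names `as_text` and `as_float` are invented for this illustration and do not appear in the question. Converting with `pd.to_numeric` is one possible way to clean such a column after the fact, offered here as an alternative to rebuilding the frames as shown above.

```python
import numpy as np
import pandas as pd

# Hypothetical column where a missing value was written as the text "NaN".
as_text = pd.Series([1.0, "NaN", 100.0])
print(as_text.dtype)  # object, because the column mixes floats and a string

# On an object column np.log has no numeric loop to use, so it fails.
# The exception type depends on the NumPy version, hence both are caught.
try:
    np.log(as_text)
    failed = False
except (AttributeError, TypeError) as exc:
    failed = True
    print(type(exc).__name__, exc)

# Convert to a real float column; unparseable entries become np.nan.
as_float = pd.to_numeric(as_text, errors="coerce")
print(as_float.dtype)
print(np.log10(as_float))
```

Checking `.dtype` before taking the logarithm is a quick way to spot this kind of problem. A frame created from only an index and column names, as in the question's loop, also starts out with object dtype, so the same check may be worth doing there.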