pandas: how to apply a function to a DataFrame in place

Question

Asked by hlin117

Is there a way I could apply a scipy function like norm.cdf in place to a numpy.array (or pandas.DataFrame), using a variant of numpy.apply, numpy.apply_along_axis, etc.?

The background is that I have a table of z-score values that I would like to convert to CDF values of the normal distribution. I'm currently using norm.cdf from scipy for this.

I'm currently manipulating a dataframe that has non-numeric values.

      Name      Val1      Val2      Val3      Val4 
0        A -1.540369 -0.077779  0.979606 -0.667112   
1        B -0.787154  0.048412  0.775444 -0.510904   
2        C -0.477234  0.414388  1.250544 -0.411658   
3        D -1.430851  0.258759  1.247752 -0.883293   
4        E -0.360181  0.485465  1.123589 -0.379157

(Making the Name variable an index is a solution, but in my actual dataset, the names are not alphabetical characters.)

To modify only the numeric data, I'm using df._get_numeric_data(), a private method that returns a dataframe containing only the original dataframe's numeric data. However, there is no corresponding set function. Hence, if I call

norm.cdf(df._get_numeric_data())

this won't change df's original data.

I'm trying to circumvent this by applying norm.cdf to the numeric dataframe in place, so that it changes my original dataset.

Answer 1

Answered by Andy Hayden

I think I would prefer select_dtypes over _get_numeric_data:

In [11]: df.select_dtypes(include=[np.number])
Out[11]:
       Val1      Val2      Val3      Val4
0 -1.540369 -0.077779  0.979606 -0.667112
1 -0.787154  0.048412  0.775444 -0.510904
2 -0.477234  0.414388  1.250544 -0.411658
3 -1.430851  0.258759  1.247752 -0.883293
4 -0.360181  0.485465  1.123589 -0.379157

Although apply doesn't offer an inplace option, you could do something like the following (which I would argue is more explicit anyway):

num_df = df.select_dtypes(include=[np.number])
df[num_df.columns] = norm.cdf(num_df.values)
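
For completeness, here is a self-contained sketch of the same idea that can be run as is. The small frame is only an illustration built from the first two rows of the question's table, and num_cols is a name introduced here, not something from the original code. It also uses to_numpy() where the snippet above uses .values; either should behave the same in this case.

```python
import numpy as np
import pandas as pd
from scipy.stats import norm

# Illustrative frame: the first two rows and two columns of the question's table.
df = pd.DataFrame({
    "Name": ["A", "B"],
    "Val1": [-1.540369, -0.787154],
    "Val2": [-0.077779, 0.048412],
})

# Labels of the numeric columns only; "Name" is excluded.
num_cols = df.select_dtypes(include=[np.number]).columns

# norm.cdf returns a new array rather than modifying its input,
# so the result is assigned back to the same columns of df.
df[num_cols] = norm.cdf(df[num_cols].to_numpy())

print(df)
```

Strictly speaking this is an assignment back into df rather than a true in-place operation, but the effect on df is what the question asks for: the numeric columns now hold CDF values and the Name column is unchanged.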
