pandas: selecting by column and row

Question

Asked by Andrew Spott

I want to sum up all values that I select based on some function of column and row.

Another way of putting it is that I want to use a function of the row index and column index to determine if a value should be included in a sum along an axis.

Is there an easy way of doing this?

Answer 1

Answered by Haleemur Ali

Columns can be selected using the syntax dataframe[<list of columns>]. The row labels are available through the dataframe.index attribute, and a boolean condition on them can be used to filter rows.

import pandas as pd

df = pd.DataFrame({'a': [0.1, 0.2], 'b': [0.2, 0.1]})

odd_a = df['a'][df.index % 2 == 1]
even_b = df['b'][df.index % 2 == 0]
# odd_a: 
# 1    0.2
# Name: a, dtype: float64
# even_b: 
# 0    0.2
# Name: b, dtype: float64
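To get from these selections to the sum the question asks for, one option is to build a boolean mask from the row and column labels and sum along an axis. The following is a minimal sketch, assuming the same df as above; the selection rule (odd rows of 'a', even rows of 'b') and the names row_is_odd, col_is_a, and mask are illustrative choices, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [0.1, 0.2], 'b': [0.2, 0.1]})

# Illustrative rule: count odd rows of column 'a' and even rows of column 'b'.
row_is_odd = df.index.to_numpy() % 2 == 1    # shape (n_rows,)
col_is_a = df.columns.to_numpy() == 'a'      # shape (n_cols,)

# Broadcast to an (n_rows, n_cols) mask: True where a value should be counted.
mask = row_is_odd[:, None] == col_is_a[None, :]

selected = df.where(mask)          # values outside the mask become NaN
col_sums = selected.sum(axis=0)    # sum() skips NaN by default
row_sums = selected.sum(axis=1)
```

Any other function of the row and column labels can be substituted, as long as it produces a boolean array with the same shape as df.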

Answer 2

Answered by fixxxer

If df is your dataframe:

In [477]: df
Out[477]: 
   A   s2  B
0  1    5  5
1  2    3  5
2  4    5  5

You can access the odd rows like this:

In [478]: df.loc[1::2]
Out[478]: 
   A   s2  B
1  2    3  5

and the even ones like this:

In [479]: df.loc[::2]
Out[479]: 
   A   s2  B
0  1    5  5
2  4    5  5

To answer your question, getting the even rows of column B would be:

In [480]: df.loc[::2,'B']
Out[480]: 
0    5
2    5
Name: B, dtype: int64

and the odd rows of column A can be selected with:

In [481]: df.loc[1::2,'A']
Out[481]: 
1    2
Name: A, dtype: int64
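Summing these selections is then a single call to sum(). Note that .loc slices by label, so the examples above rely on the default integer index. The sketch below rebuilds the df shown in this answer and adds a positional variant with .iloc as an assumption about what one might want when the index is not 0, 1, 2, ...; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 4], 's2': [5, 3, 5], 'B': [5, 5, 5]})

even_b_sum = df.loc[::2, 'B'].sum()    # rows labelled 0 and 2 of column B
odd_a_sum = df.loc[1::2, 'A'].sum()    # row labelled 1 of column A

# Positional equivalent that does not depend on the index labels
even_b_sum_pos = df.iloc[::2, df.columns.get_loc('B')].sum()
```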

Answer 3

Answered by Matt

I think this should be fairly general, if not the cleanest implementation. It allows applying separate functions to rows and columns depending on conditions (which I define here in dictionaries).

import numpy as np
import pandas as pd

ran = np.random.randint(0, 10, size=(5, 5))
df = pd.DataFrame(ran, columns=["a", "b", "c", "d", "e"])

# Dictionaries that define which function is applied to each column and row
d_col = {"high": ["a", "c", "e"], "low": ["b", "d"]}
d_row = {"high": [1, 2, 3], "low": [0, 4]}

# Generate a list of pandas boolean Series, one per column
i_col = [df[i].apply(lambda x: x > 5) if i in d_col["high"]
         else df[i].apply(lambda x: x < 5)
         for i in df.columns]

# Pass the Series as a boolean matrix; values that fail the test become NaN
df = df[pd.concat(i_col, axis=1)]

# Now do this again for rows
i_row = [df.T[i].apply(lambda x: x > 5) if i in d_row["high"]
         else df.T[i].apply(lambda x: x < 5)
         for i in df.T.columns]

# Transpose back to return the DataFrame to its original shape
df = df.T[pd.concat(i_row, axis=1)].T

# Perform the final operation, such as a sum, on the returned DataFrame
print(df.sum().sum())
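The same selection can probably be expressed without apply by building the row and column conditions as broadcast NumPy arrays. This is only a sketch of that idea, not the approach used in the answer above: the seeded generator, the names high_cols, high_rows, col_cond, and row_cond, and the thresholds copied from the example are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
df = pd.DataFrame(rng.integers(0, 10, size=(5, 5)), columns=list("abcde"))

high_cols = ["a", "c", "e"]
high_rows = [1, 2, 3]

col_high = df.columns.isin(high_cols)   # boolean array, shape (n_cols,)
row_high = df.index.isin(high_rows)     # boolean array, shape (n_rows,)

values = df.to_numpy()
# Per-axis conditions, broadcast to the full (n_rows, n_cols) shape
col_cond = np.where(col_high[None, :], values > 5, values < 5)
row_cond = np.where(row_high[:, None], values > 5, values < 5)

# A value is kept only if it passes both its column's and its row's condition
total = df.where(col_cond & row_cond).sum().sum()
```

Because both conditions must hold, a value in a "high" column but a "low" row can never be selected with these particular thresholds; a different pair of functions would change that.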
