Source: Stack Overflow, http://stackoverflow.com/questions/30033850/
License: CC BY-SA 4.0. You are free to use and share this content, but you must attribute it to the original authors on Stack Overflow.
Pandas, selecting by column and row
Asked by Andrew Spott
I want to sum up all values that I select based on some function of column and row.
Another way of putting it is that I want to use a function of the row index and column index to determine if a value should be included in a sum along an axis.
Is there an easy way of doing this?
Answered by Haleemur Ali
Columns can be selected using the syntax dataframe[<list of columns>]. The index (row labels) is available through the dataframe.index attribute and can be used to build a row filter.
import pandas as pd
df = pd.DataFrame({'a': [0.1, 0.2], 'b': [0.2, 0.1]})
odd_a = df['a'][df.index % 2 == 1]
even_b = df['b'][df.index % 2 == 0]
# odd_a: 
# 1    0.2
# Name: a, dtype: float64
# even_b: 
# 0    0.2
# Name: b, dtype: float64
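The same idea can be stretched to a rule that looks at both the row label and the column label, which is closer to what the question describes. The following is only a sketch: the function name include and its odd/even rule are made up for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [0.1, 0.2], 'b': [0.2, 0.1]})

def include(row_label, col_label):
    # Hypothetical rule, chosen to match the example above:
    # keep odd rows of column 'a' and even rows of column 'b'.
    if col_label == 'a':
        return bool(row_label % 2 == 1)
    if col_label == 'b':
        return bool(row_label % 2 == 0)
    return False

# Evaluate the rule for every (row, column) pair to build a boolean mask
mask = pd.DataFrame(
    [[include(r, c) for c in df.columns] for r in df.index],
    index=df.index,
    columns=df.columns,
)

# where() turns excluded cells into NaN, which sum() skips by default
selected = df.where(mask)
column_sums = selected.sum(axis=0)  # one sum per column
row_sums = selected.sum(axis=1)     # one sum per row
total = column_sums.sum()
```

Building the mask with a Python loop is slow on large frames, so this is probably best treated as a readable starting point rather than a tuned solution.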
Answered by fixxxer
If df is your dataframe:
In [477]: df
Out[477]: 
   A   s2  B
0  1    5  5
1  2    3  5
2  4    5  5
You can access the odd rows like this:
In [478]: df.loc[1::2]
Out[478]: 
   A   s2  B
1  2    3  5
and the even ones like this:
In [479]: df.loc[::2]
Out[479]: 
   A   s2  B
0  1    5  5
2  4    5  5
To answer your question, getting even rows and column B would be:
In [480]: df.loc[::2,'B']
Out[480]: 
0    5
2    5
Name: B, dtype: int64
and odd rows and column A can be done as:
In [481]: df.loc[1::2,'A']
Out[481]: 
1    2
Name: A, dtype: int64
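One caveat worth noting: .loc slices by label, so df.loc[1::2] lines up with odd positions here only because the frame has the default 0, 1, 2 index. A possible position-based variant, sketched with iloc and with the frame rebuilt from the output shown above (the construction of df is an assumption, since the answer only prints it), also adds the sum the question asks about:

```python
import pandas as pd

# Rebuilt from the printed output above; the original construction is not shown
df = pd.DataFrame({'A': [1, 2, 4], 's2': [5, 3, 5], 'B': [5, 5, 5]})

# iloc selects by position, so it does not depend on the index labels
even_B = df.iloc[::2, df.columns.get_loc('B')]
odd_A = df.iloc[1::2, df.columns.get_loc('A')]

# Sum the two selections separately, then combine
total = even_B.sum() + odd_A.sum()
```

With this data, even_B holds the values at positions 0 and 2 of column B and odd_A holds the value at position 1 of column A.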
Answered by Matt
I think this should be fairly general, if not the cleanest implementation. It allows applying separate functions to rows and columns depending on conditions (which I defined here in dictionaries).
import numpy as np
import pandas as pd
ran = np.random.randint(0,10,size=(5,5))
df = pd.DataFrame(ran,columns = ["a","b","c","d","e"])
# A dictionary to define what function is passed
d_col = {"high":["a","c","e"], "low":["b","d"]}
d_row = {"high":[1,2,3], "low":[0,4]}
# Generate list of Pandas boolean Series
i_col = [df[i].apply(lambda x: x>5) if i in d_col["high"] else df[i].apply(lambda x: x<5) for i in df.columns]
# Pass the series as a matrix
df = df[pd.concat(i_col,axis=1)]
# Now do this again for rows
i_row = [df.T[i].apply(lambda x: x>5) if i in d_row["high"] else df.T[i].apply(lambda x: x<5) for i in df.T.columns]
# Return back the DataFrame in original shape
df = df.T[pd.concat(i_row,axis=1)].T
# Perform the final operation such as sum on the returned DataFrame
print(df.sum().sum())
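For what it's worth, the two masking passes above keep a cell only when it satisfies both its column condition and its row condition, so the same result can probably be had in one step with NumPy broadcasting. This is a sketch rather than the answer's own method; the fixed seed and the names col_is_high, row_is_high, and kept are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed (an assumption) so runs repeat
df = pd.DataFrame(rng.integers(0, 10, size=(5, 5)), columns=list("abcde"))

d_col = {"high": ["a", "c", "e"], "low": ["b", "d"]}
d_row = {"high": [1, 2, 3], "low": [0, 4]}

values = df.to_numpy()
col_is_high = df.columns.isin(d_col["high"])          # shape (5,), one flag per column
row_is_high = df.index.isin(d_row["high"])[:, None]   # shape (5, 1), one flag per row

# "high" columns/rows keep values > 5, the others keep values < 5
col_mask = np.where(col_is_high, values > 5, values < 5)
row_mask = np.where(row_is_high, values > 5, values < 5)

# A cell survives only if both conditions hold; the rest become NaN
kept = df.where(col_mask & row_mask)
total = kept.sum().sum()
```

Note that a cell in a "high" column and a "low" row (or the reverse) would need to be both above and below 5, so it is never kept. That matches the behavior of the two-pass version.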

