Original source: http://stackoverflow.com/questions/21169362/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
Iterating through list of Dataframes Pandas
Asked by user2587593
I currently have a series of 18 DataFrames (each representing a different year), each consisting of 3 columns and a varying number of rows, that hold the normalized mutual information scores for amino acid residue positions, like:
Year1
Pos1 Pos2 MI_Score
40 40 1.00
40 44 0.53
40 70 0.23
44 44 1.00
44 70 0.90
...
I would like to iterate through this list of DataFrames and trim off the rows that have Mutual Information scores less than 0.50 as well as the ones that are mutual information scores for a residue paired with itself. Here is what I've tried so far:
MIs = [MI_95,MI_96,MI_97,MI_98,MI_99,MI_00,MI_01,MI_02,MI_03,MI_04,MI_05,MI_06,MI_07,MI_08,MI_09,MI_10,MI_11,MI_12,MI_13]
for MI in MIs:
    p = []
    for q in range(0, len(MI)):
        if MI[0][q] != MI[1][q]:
            if MI[2][q] > 0.5:
                p.append([MI[0][q],MI[1][q],MI[2][q]])
    MI = pd.DataFrame(p)
Yet this only trims the first item in MIs. Can someone help me find a way to iterate through the whole list and trim each dataframe?
Thanks
Answered by Dan Allan
Avoid loops where possible. They are much slower, and usually harder to read at a glance, than "vectorized" methods that operate on all the data together. Here's one way.
In [17]: self_paired = df['Pos1'] == df['Pos2']
In [18]: low_MI = df['MI_Score'] < 0.50
In [19]: df[~(low_MI | self_paired)]
Out[19]:
Pos1 Pos2 MI_Score
1 40 44 0.53
4 44 70 0.90
[2 rows x 3 columns]
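To apply the same filter to every DataFrame in the list, one option is to wrap it in a function and rebuild the list. This is only a sketch under a few assumptions: the helper name trim and its threshold parameter are made up for illustration, the columns are assumed to be named Pos1, Pos2 and MI_Score as in the sample above, and year2 is invented stand-in data. Note that assigning to the loop variable inside a for loop only rebinds that name and leaves the list itself unchanged, which is why the sketch builds a new list instead.

```python
import pandas as pd


def trim(df, threshold=0.50):
    """Hypothetical helper: drop self-paired rows and rows scoring below threshold."""
    self_paired = df["Pos1"] == df["Pos2"]
    low_mi = df["MI_Score"] < threshold
    return df[~(low_mi | self_paired)]


# Stand-in data: year1 mirrors the sample in the question, year2 is invented.
year1 = pd.DataFrame({
    "Pos1": [40, 40, 40, 44, 44],
    "Pos2": [40, 44, 70, 44, 70],
    "MI_Score": [1.00, 0.53, 0.23, 1.00, 0.90],
})
year2 = pd.DataFrame({
    "Pos1": [10, 10],
    "Pos2": [10, 12],
    "MI_Score": [1.00, 0.75],
})
MIs = [year1, year2]

# Build a new list of trimmed frames and rebind the name MIs to it.
MIs = [trim(df) for df in MIs]
```

The original frames (year1, year2) are left untouched; only the list now refers to the trimmed copies. If the original row labels are not needed, calling reset_index(drop=True) on each result renumbers the rows from 0.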

