Iterating through a list of Pandas DataFrames

Question

Asked by user2587593

I currently have a series of 18 DataFrames (each representing a different year), each consisting of 3 columns and a varying number of rows. The rows hold the normalized mutual information scores for pairs of amino acid residue positions, like:

Year1

Pos1   Pos2   MI_Score
40     40     1.00    
40     44     0.53
40     70     0.23
44     44     1.00    
44     70     0.90
...

I would like to iterate through this list of DataFrames and trim off the rows whose mutual information score is less than 0.50, as well as the rows where a residue is paired with itself. Here is what I've tried so far:

MIs = [MI_95,MI_96,MI_97,MI_98,MI_99,MI_00,MI_01,MI_02,MI_03,MI_04,MI_05,MI_06,MI_07,MI_08,MI_09,MI_10,MI_11,MI_12,MI_13] 
for MI in MIs:    
    p = []
    for q in range(0, len(MI)):
        if MI[0][q] != MI[1][q]:
            if MI[2][q] > 0.5:
                p.append([MI[0][q],MI[1][q],MI[2][q]])
    MI = pd.DataFrame(p)

Yet this only trims the first item in MIs. Can someone help me find a way to iterate through the whole list and trim each dataframe?

Thanks

Answer 1

Answered by Dan Allan

Avoid loops where possible. They are much slower, and usually harder to read at a glance, than "vectorized" methods that operate on all the data together. Here's one way, shown for a single DataFrame df.

In [17]: self_paired = df['Pos1'] == df['Pos2']

In [18]: low_MI = df['MI_Score'] < 0.50

In [19]: df[~(low_MI | self_paired)]
Out[19]:
   Pos1  Pos2  MI_Score
1    40    44      0.53
4    44    70      0.90

[2 rows x 3 columns]
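
To cover the whole list rather than a single DataFrame, the same boolean mask can be wrapped in a function and applied to each item. The following is a minimal sketch, not part of the original answer: the function name trim, its threshold parameter, and the sample data in MI_95 and MI_96 are assumptions made for illustration, and the column names are taken from the example above. Note that assigning to the loop variable (MI = pd.DataFrame(p) in the question) only rebinds that name and leaves the list untouched, so the sketch builds a new list instead.

```python
import pandas as pd


def trim(df, threshold=0.50):
    """Return df without self-paired rows and rows with MI_Score below threshold.

    Hypothetical helper; it wraps the boolean mask shown in the answer above.
    """
    self_paired = df['Pos1'] == df['Pos2']
    low_MI = df['MI_Score'] < threshold
    return df[~(low_MI | self_paired)]


# Made-up stand-ins for two of the yearly DataFrames (illustrative data only).
MI_95 = pd.DataFrame({'Pos1': [40, 40, 40, 44, 44],
                      'Pos2': [40, 44, 70, 44, 70],
                      'MI_Score': [1.00, 0.53, 0.23, 1.00, 0.90]})
MI_96 = pd.DataFrame({'Pos1': [12, 12, 30],
                      'Pos2': [12, 30, 30],
                      'MI_Score': [1.00, 0.75, 1.00]})

MIs = [MI_95, MI_96]

# Build a new list of trimmed DataFrames; the originals are left unchanged.
trimmed = [trim(MI) for MI in MIs]

for MI in trimmed:
    print(MI)
```

If the trimmed frames should replace the originals, assigning the result back with MIs = [trim(MI) for MI in MIs] is one option. The loop over the list remains, but the row filtering inside each DataFrame stays vectorized.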
