pandas: dealing with data labels from a 96-well plate with Python

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/23712009/

Time: 2020-09-13 22:03:53  Source: igfitidea

Dealing with data labels in a 96-well plate with Python

Tags: python, pandas

Asked by Moritz

I have data from a 96-well plate (mostly in Excel):

A 96-well plate (schematic provided by http://www.cellsignet.com):

[Image: 96-well plate schematic (source: cellsignet.com)]

For each well we can run an experiment and read a value from it. The data look like this:

    1    2    3    4    .    . 
A   9.1  8.7  5.6  4.5
B   8.7  8.5  5.4  4.3
C   4.3  4.5  7.6  6.7
D   4.1  6.0  7.0  6.1
.

I also have Excel files with the sample names:

    1    2    3    4    .    . 
A   l1   l2   l3   l4 
B   l1   l2   l3   l4
C   ds1  ds2  ds3  ds4
D   ds1  ds2  ds3  ds4
.

Duplicate entries are two wells loaded with the same sample.

I would like to read in the data (no problem), assign the labels to the data points, and group the data according to the labels. In pandas I can read in the data and group it by the column and row headers. But how can I group by the sample names?

Accepted answer by CT Zhu

I suggest building a single DataFrame with two columns: one stores the names, the other stores the readings.

In [20]:

print(data_df)
print(name_df)
     1    2    3    4
A  9.1  8.7  5.6  4.5
B  8.7  8.5  5.4  4.3
C  4.3  4.5  7.6  6.7
D  4.1  6.0  7.0  6.1

[4 rows x 4 columns]
     1    2    3    4
A   l1   l2   l3   l4
B   l1   l2   l3   l4
C  ds1  ds2  ds3  ds4
D  ds1  ds2  ds3  ds4

[4 rows x 4 columns]
In [21]:

import pandas as pd

# Flatten both tables row by row so each well's name lines up with its reading
final_df = pd.DataFrame({'Name': name_df.values.ravel(),
                         'Reading': data_df.values.ravel()})
# If you have additional readings (e.g. from a different assay or
# a different wavelength), add them as extra columns, for example:
# 'OTHER_Reading': OTHER_data_df.values.ravel()
print(final_df)
   Name  Reading
0    l1      9.1
1    l2      8.7
2    l3      5.6
3    l4      4.5
4    l1      8.7
5    l2      8.5
6    l3      5.4
7    l4      4.3
8   ds1      4.3
9   ds2      4.5
10  ds3      7.6
11  ds4      6.7
12  ds1      4.1
13  ds2      6.0
14  ds3      7.0
15  ds4      6.1

[16 rows x 2 columns]
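For anyone who wants to reproduce the flattening step without the Excel files, here is a minimal self-contained sketch. It rebuilds the two 4x4 tables by hand from the values shown above and additionally keeps the well position. The `Well` column and the `wells` list are my own additions for illustration, not part of the original answer, and the sketch assumes both tables share the same row and column labels.

```python
import pandas as pd

rows = list("ABCD")
cols = [1, 2, 3, 4]

# Readings, typed in from the example plate above
data_df = pd.DataFrame(
    [[9.1, 8.7, 5.6, 4.5],
     [8.7, 8.5, 5.4, 4.3],
     [4.3, 4.5, 7.6, 6.7],
     [4.1, 6.0, 7.0, 6.1]],
    index=rows, columns=cols)

# Sample names, same layout as the readings
name_df = pd.DataFrame(
    [["l1", "l2", "l3", "l4"],
     ["l1", "l2", "l3", "l4"],
     ["ds1", "ds2", "ds3", "ds4"],
     ["ds1", "ds2", "ds3", "ds4"]],
    index=rows, columns=cols)

# The flattening only pairs names with readings correctly
# if both tables have identical labels in identical order.
assert name_df.index.equals(data_df.index)
assert name_df.columns.equals(data_df.columns)

# Hypothetical well identifiers ("A1", "A2", ...), built in row-major
# order to match the default order of ravel().
wells = [f"{r}{c}" for r in data_df.index for c in data_df.columns]

final_df = pd.DataFrame({
    "Well": wells,
    "Name": name_df.to_numpy().ravel(),
    "Reading": data_df.to_numpy().ravel(),
})
print(final_df)
```

Keeping the well position is optional, but it makes it possible to trace an odd reading back to a specific well after grouping.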

This way you can do calculations quite easily, such as the mean reading per sample:

In [22]:

print(final_df.groupby('Name').mean())
      Reading
Name         
ds1      4.20
ds2      5.25
ds3      7.30
ds4      6.40
l1       8.90
l2       8.60
l3       5.50
l4       4.40

[8 rows x 1 columns]
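Since each sample here is loaded in two wells, it may also be useful to report the spread and the number of replicates next to the mean. The sketch below is one possible extension, not something from the original answer. It retypes the 16 name/reading pairs from the output above so that it runs on its own, and `summary` is just an illustrative variable name.

```python
import pandas as pd

# Same 16 name/reading pairs as in the flattened table above
final_df = pd.DataFrame({
    "Name": ["l1", "l2", "l3", "l4", "l1", "l2", "l3", "l4",
             "ds1", "ds2", "ds3", "ds4", "ds1", "ds2", "ds3", "ds4"],
    "Reading": [9.1, 8.7, 5.6, 4.5, 8.7, 8.5, 5.4, 4.3,
                4.3, 4.5, 7.6, 6.7, 4.1, 6.0, 7.0, 6.1],
})

# One row per sample: mean, sample standard deviation, replicate count
summary = final_df.groupby("Name")["Reading"].agg(["mean", "std", "count"])
print(summary)
```

With only two replicates per sample the standard deviation is a rough indicator at best, so treat it as a quick consistency check between duplicate wells.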