pandas: How to create a Manhattan plot with matplotlib in Python?

Question

Asked by Thomas Möbius

Unfortunately, I have not found a solution myself. How do I create a Manhattan plot in Python using, e.g., matplotlib / pandas? The problem is that in these plots the x-axis is discrete.

from pandas import DataFrame
from scipy.stats import uniform
from scipy.stats import randint
import numpy as np

# some sample data
df = DataFrame({'gene' : ['gene-%i' % i for i in np.arange(1000)],
                'pvalue' : uniform.rvs(size=1000),
                'chromosome' : ['ch-%i' % i for i in randint.rvs(0,12,size=1000)]})

# -log_10(pvalue)
df['minuslog10pvalue'] = -np.log10(df.pvalue)
df = df.sort_values('chromosome')

# How to plot gene vs. -log10(pvalue) and colour it by chromosome?

Answer 1

Accepted answer by Elad Joseph

You can use something like this:

from pandas import DataFrame
from scipy.stats import uniform
from scipy.stats import randint
import numpy as np
import matplotlib.pyplot as plt

# some sample data
df = DataFrame({'gene' : ['gene-%i' % i for i in np.arange(10000)],
                'pvalue' : uniform.rvs(size=10000),
                'chromosome' : ['ch-%i' % i for i in randint.rvs(0,12,size=10000)]})

# -log_10(pvalue)
df['minuslog10pvalue'] = -np.log10(df.pvalue)
df.chromosome = df.chromosome.astype('category')
df.chromosome = df.chromosome.cat.set_categories(['ch-%i' % i for i in range(12)], ordered=True)
df = df.sort_values('chromosome')

# Plot -log10(pvalue) per gene, coloured by chromosome.
# 'ind' is a running index over the sorted rows, used as the x position.
df['ind'] = range(len(df))
df_grouped = df.groupby('chromosome', observed=True)

fig = plt.figure()
ax = fig.add_subplot(111)
colors = ['red','green','blue', 'yellow']
x_labels = []
x_labels_pos = []
for num, (name, group) in enumerate(df_grouped):
    group.plot(kind='scatter', x='ind', y='minuslog10pvalue',color=colors[num % len(colors)], ax=ax)
    x_labels.append(name)
    x_labels_pos.append((group['ind'].iloc[-1] - (group['ind'].iloc[-1] - group['ind'].iloc[0])/2))
ax.set_xticks(x_labels_pos)
ax.set_xticklabels(x_labels)
ax.set_xlim([0, len(df)])
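ax.set_ylim([0, df['minuslog10pvalue'].max() * 1.05])  # a fixed upper limit of 3.5 can clip the smallest p-values
ax.set_xlabel('Chromosome')
ax.set_ylabel('-log10(pvalue)')
plt.show()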

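I added an extra column, 'ind', holding a running index over the sorted rows. It gives every gene a unique x position, so each chromosome label can be placed at the midpoint of its block of indices.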
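
The tick placement from the running index can also be computed without a loop. The following is only a sketch: the helper name chromosome_tick_positions and the small demo frame are hypothetical and not part of the answer above, while the column names 'chromosome' and 'ind' follow the answer's code.

```python
import pandas as pd


def chromosome_tick_positions(df, chrom_col="chromosome", index_col="ind"):
    """Return the midpoint of each chromosome's block of running indices."""
    # First and last running index of every chromosome group.
    spans = df.groupby(chrom_col, observed=True)[index_col].agg(["min", "max"])
    # The midpoint of the block is where the x tick label should sit.
    return (spans["min"] + spans["max"]) / 2


# Hypothetical demo data: three chromosomes with 3, 2 and 4 genes.
demo = pd.DataFrame({"chromosome": ["ch-0"] * 3 + ["ch-1"] * 2 + ["ch-2"] * 4})
demo["ind"] = range(len(demo))  # running index over the sorted rows

ticks = chromosome_tick_positions(demo)
print(ticks)
```

The resulting Series can be passed to the axes directly, for example ax.set_xticks(ticks.values) followed by ax.set_xticklabels(ticks.index).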

Answer 2

Answer by Jan Christoph Terasa

import matplotlib.pyplot as plt
from numpy.random import randn, random_sample

g = random_sample(int(1e5))*10 # uniform random values between 0 and 10
p = abs(randn(int(1e5))) # abs of normally distributed data

"""
plot g vs p in groups with different colors
colors are cycled automatically by matplotlib
use another colormap or define own colors for a different cycle
"""
for i in range(1, 11):
    # half-open bin [i-1, i): every point falls into exactly one group
    mask = (g >= i - 1) & (g < i)
    plt.plot(g[mask], p[mask], ls='', marker='.')

plt.show()
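
The comment in the snippet above mentions defining your own colours for the cycle. One possible way to do that, shown here only as a sketch, is the Axes.set_prop_cycle method. The two colour names, the seeded random generator, the smaller sample size and the half-open binning are assumptions made for this illustration and are not taken from the answer.

```python
import matplotlib
matplotlib.use("Agg")  # renders off-screen; remove this line to get an interactive window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)            # seeded so the sketch is reproducible
g = rng.uniform(0, 10, size=10_000)       # uniform positions between 0 and 10
p = np.abs(rng.standard_normal(10_000))   # abs of normally distributed data

fig, ax = plt.subplots()
# Replace the default colour cycle with two alternating colours.
ax.set_prop_cycle(color=["navy", "darkorange"])

for i in range(1, 11):
    mask = (g >= i - 1) & (g < i)         # one group per unit interval
    ax.plot(g[mask], p[mask], ls="", marker=".")

# Colours actually assigned to the ten groups, in plotting order.
line_colors = [line.get_color() for line in ax.get_lines()]
```

Because the cycle has only two entries, neighbouring groups alternate between the two colours, which is the usual look of a Manhattan plot. Call plt.show() or fig.savefig(...) afterwards to display or store the figure.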

You can also check out this script, which seems to offer a finished solution to your problem.
