pandas: distribution probabilities for each data frame column, in one plot
Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/50952133/
Distribution probabilities for each column data frame, in one plot
Asked by Annalix
I am creating probability distributions for each column of my data frame with distplot from the seaborn library, sns.distplot(). For one plot I do
x = df['A']
sns.distplot(x);
I am trying to use FacetGrid and map to draw the plots for all columns at once, in the following way, but it doesn't work at all.
g = sns.FacetGrid(df, col = 'A','B','C','D','E')
g.map(sns.distplot())
Answered by Scott Boston
I think you need to use melt to reshape your dataframe to long format; see this MCVE:
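import numpy as np
import pandas as pd
import seaborn as sns

df = pd.DataFrame(np.random.random((100, 5)), columns=list('ABCDE'))
dfm = df.melt(var_name='columns')  # long format: 'columns' names the source column, 'value' holds the data
g = sns.FacetGrid(dfm, col='columns')
g = g.map(sns.distplot, 'value')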
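To make the reshaping step concrete, here is a minimal sketch of what melt produces. The toy frame and the names wide and long_df are made up for illustration and are not part of the answer above.

```python
import pandas as pd

# Toy wide frame: 3 rows, 2 columns (illustrative data only)
wide = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})

# melt stacks the columns: each row becomes a (column name, value) pair
long_df = wide.melt(var_name="columns", value_name="value")

print(long_df)
```

The result has six rows and two columns. The 'columns' column is what FacetGrid can use for its col argument, and 'value' is what gets plotted in each facet.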
Answered by ImportanceOfBeingErnest
You're getting this wrong on two levels.
Python syntax.
FacetGrid(df, col = 'A','B','C','D','E') is invalid, because col gets set to 'A' and the remaining strings are interpreted as further positional arguments. Positional arguments may not follow a keyword argument, so this is invalid Python syntax.
Seaborn concepts.
Seaborn expects a single column name as input for the col or row argument. This means that the dataframe needs to be in a format that has one column which determines to which column or row of the grid the respective datum belongs.
You do not call the function to be used by map; you pass the function itself. The idea is of course that map itself calls it.
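Both points can be checked without seaborn installed. The following is only a sketch: is_valid_python and apply_each are hypothetical helpers written for this illustration, and apply_each is a toy stand-in for the way map calls the function it is given, not seaborn's implementation.

```python
import ast

def is_valid_python(source):
    """Return True if `source` parses as Python, False on a SyntaxError."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

# The call from the question, kept as a string so this script itself still parses
bad_call = "sns.FacetGrid(df, col='A', 'B', 'C', 'D', 'E')"
print(is_valid_python(bad_call))                             # False
print(is_valid_python("sns.FacetGrid(dfm, col='columns')"))  # True

def apply_each(func, groups):
    """Toy stand-in for map: call `func` once per group."""
    return [func(group) for group in groups]

groups = [[1, 2], [3, 4, 5]]
sizes = apply_each(len, groups)  # pass the function object itself

try:
    apply_each(len(), groups)    # len() is called with no arguments before apply_each runs
    called_early_ok = True
except TypeError:
    called_early_ok = False
```

The first check shows that the original call is rejected at parse time, before seaborn is involved at all. The second shows why g.map(sns.distplot()) cannot work: the function gets called with no data before map ever sees it.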
Solutions:
Loop over columns:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.DataFrame(np.random.randn(14,5), columns=list("ABCDE"))

fig, axes = plt.subplots(ncols=5)
for ax, col in zip(axes, df.columns):
    sns.distplot(df[col], ax=ax)

plt.show()
Melt dataframe
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.DataFrame(np.random.randn(14,5), columns=list("ABCDE"))

g = sns.FacetGrid(df.melt(), col="variable")
g.map(sns.distplot, "value")

plt.show()
Answered by nishant
I think the easiest approach is to just loop over the columns and create a plot for each one.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.random((100,5)), columns = list('ABCDE'))
for col in df.columns:
    hist = df[col].hist(bins=10)
    print("Plotting for column {}".format(col))
    plt.show()
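The loop above shows one figure per column. If the goal is a single figure, as in the question title, one possible variant is to create the axes up front and pass each one to hist. This is a sketch and not the answerer's code; the figsize and sharey choices are assumptions.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.random((100, 5)), columns=list("ABCDE"))

# One row of axes, one per column; sharey makes the bar heights comparable
fig, axes = plt.subplots(ncols=len(df.columns), figsize=(15, 3), sharey=True)
for ax, col in zip(axes, df.columns):
    df[col].hist(bins=10, ax=ax)  # draw this column's histogram on its own axes
    ax.set_title(col)
fig.tight_layout()
```

Call plt.show() afterwards to display the figure.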
Answered by E.Zolduoarrati
You can use the following:
# list the distinct dtypes present in the dataframe
list(set(df.dtypes.tolist()))
# keep only the float and integer columns
df_num = df.select_dtypes(include = ['float64', 'int64'])
# preview what has been selected
df_num.head()
# plot one histogram per numeric column
df_num.hist(figsize=(16, 20), bins=50, xlabelsize=8, ylabelsize=8);
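To see what the dtype filter keeps, here is a small sketch on a made-up mixed frame. The column names and values are hypothetical.

```python
import pandas as pd

# Illustrative frame with one float, one integer and one text column
mixed = pd.DataFrame({
    "height": [1.5, 1.7, 1.6],  # float64
    "count": [3, 5, 4],         # int64
    "label": ["a", "b", "c"],   # text, dropped by the filter
})

numeric = mixed.select_dtypes(include=["float64", "int64"])
print(list(numeric.columns))
```

Only the two numeric columns survive, so DataFrame.hist is never handed text data. If the frame may contain other numeric widths such as float32 or int32, select_dtypes(include="number") is a broader alternative.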