Python: R summary() equivalent in numpy
Note: this page is a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/33889310/
R summary() equivalent in numpy
Asked by iulian
Is there an equivalent of R's summary() function in numpy?
numpy has std, mean, and average functions separately, but does it have a function that summarizes everything at once, like summary does in R?
I found this question, which relates to pandas, and this article with R-to-numpy equivalents, but neither has what I am looking for.
Accepted answer by Eoin
No. You'll need to use pandas.
R is a language for statistics, so much of the basic functionality you need, like summary() and lm(), is loaded when you start it up. Python has many uses, so you need to install and import the appropriate statistical packages. numpy isn't a statistics package; it is for numerical computation more generally. You need packages like pandas, scipy, and statsmodels to let Python do what R can do out of the box.
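For readers who want to stay within numpy, the six numbers that R's summary() reports for a numeric vector can be assembled by hand. The following is a minimal sketch, not a numpy feature: the helper name `summary` and the sample values are assumptions made for illustration. It relies on `np.percentile`, whose default linear interpolation corresponds to the default quantile method in R (type 7).

```python
import numpy as np


def summary(values):
    """Hypothetical helper: six-number summary similar to R's summary()."""
    arr = np.asarray(values, dtype=float)
    # Quartiles with numpy's default (linear) interpolation.
    q1, median, q3 = np.percentile(arr, [25, 50, 75])
    return {
        "Min.": arr.min(),
        "1st Qu.": q1,
        "Median": median,
        "Mean": arr.mean(),
        "3rd Qu.": q3,
        "Max.": arr.max(),
    }


# Illustrative data, not from the original question.
result = summary([1, 2, 3, 4, 10])
for name, value in result.items():
    print(f"{name:>8}: {value:.2f}")
```

This covers only a single numeric vector; for whole tables with mixed column types, pandas remains the more practical route.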
Answer by Thomas Hepner
1. Load Pandas in console and load csv data file
import pandas as pd
data = pd.read_csv("data.csv", sep = ",")
2. Examine first few rows of data
data.head()
3. Calculate summary statistics
summary = data.describe()
4. Transpose the statistics to get a format similar to R's summary() function
summary = summary.transpose()
5. Display the summary statistics in the console
summary.head()
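The steps above can be tried without a CSV file. This is a small sketch under assumptions: the in-memory DataFrame and its column names (`height`, `weight`) are made up for illustration and stand in for data.csv.

```python
import pandas as pd

# Hypothetical stand-in for the contents of data.csv.
data = pd.DataFrame(
    {
        "height": [150.0, 160.0, 170.0, 180.0],
        "weight": [50.0, 60.0, 70.0, 80.0],
    }
)

# describe() puts one statistic per row; transposing gives one row per
# variable, with the statistics as columns.
summary = data.describe().transpose()
print(summary)
```

After the transpose, each row is a column of the original data and the columns are count, mean, std, min, 25%, 50%, 75%, and max.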
Answer by SKB
If you are looking for the kind of detail that summary() gives in R, i.e.
- 5 point summary for numeric variables
- Frequency of occurrence of each class for categorical variables
To achieve the above in Python, you can use df.describe(include='all').
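A short sketch of what this call returns on a mixed DataFrame; the frame and its column names (`score`, `group`) are assumptions made for illustration. For non-numeric columns, describe() reports count, unique, top, and freq (the most common value and how often it occurs) rather than the frequency of every class, so the sketch also uses value_counts() for the full per-class breakdown.

```python
import pandas as pd

# Hypothetical data: one numeric and one categorical column.
df = pd.DataFrame(
    {
        "score": [1.0, 2.0, 3.0, 4.0],
        "group": ["a", "b", "a", "a"],
    }
)

# Numeric statistics and categorical statistics in one table;
# entries that do not apply to a column are NaN.
overview = df.describe(include="all")
print(overview)

# Frequency of every class in the categorical column.
class_counts = df["group"].value_counts()
print(class_counts)
```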

