Original source: http://stackoverflow.com/questions/33889310/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
R summary() equivalent in numpy
Asked by iulian
Is there an equivalent of R's summary() function in numpy?
numpy has separate std, mean and average functions, but does it have a function that summarizes everything, like summary() does in R?
I found this question, which relates to pandas, and this article with R-to-numpy equivalents, but neither has what I am looking for.
Accepted answer by Eoin
No. You'll need to use pandas.
R is a language for statistics, so much of the basic functionality you need, like summary() and lm(), is loaded when you start it up. Python has many uses, so you need to install and import the appropriate statistical packages. numpy isn't a statistics package; it's for numerical computation more generally, so you need to use packages like pandas, scipy and statsmodels to allow Python to do what R can do out of the box.
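For a single numeric vector, the individual numpy functions can still be combined by hand. The following is a minimal sketch, not a numpy feature: the helper name numeric_summary and the labels it returns are made up here to mimic the six values R prints for a numeric vector. It relies on np.percentile using linear interpolation by default.

```python
import numpy as np

def numeric_summary(values):
    """Hypothetical helper: min, quartiles, median, mean and max of a 1-D array."""
    arr = np.asarray(values, dtype=float)
    # One call returns the 25th, 50th and 75th percentiles
    q1, median, q3 = np.percentile(arr, [25, 50, 75])
    return {
        "Min.": arr.min(),
        "1st Qu.": q1,
        "Median": median,
        "Mean": arr.mean(),
        "3rd Qu.": q3,
        "Max.": arr.max(),
    }

stats = numeric_summary([1, 2, 3, 4, 10])
for label, value in stats.items():
    print(f"{label:>8}: {value}")
```

This only covers numeric data; for mixed or categorical columns, pandas is still the more practical route.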
Answer by Thomas Hepner
1. Load Pandas in console and load csv data file
import pandas as pd
data = pd.read_csv("data.csv", sep=",")
2. Examine first few rows of data
data.head()
3. Calculate summary statistics
summary = data.describe()
4. Transpose statistics to get a format similar to R's summary() function
summary = summary.transpose()
5. Display summary statistics in console
summary.head()
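The steps above can be tried without a CSV file by building a small DataFrame by hand. This is only a sketch: the column names and values below are made up and stand in for whatever data.csv contains.

```python
import pandas as pd

# Hypothetical data standing in for the contents of data.csv
data = pd.DataFrame({
    "height": [150.0, 160.0, 170.0, 180.0],
    "weight": [50.0, 60.0, 70.0, 80.0],
})

# describe() puts the statistics (count, mean, std, min, quartiles, max) in rows
summary = data.describe()

# After transposing there is one row per variable and one column per statistic
summary = summary.transpose()
print(summary)
```

Note that describe() also reports count and std, which R's summary() does not show for numeric vectors.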
Answer by SKB
If you are looking for details like those given by summary() in R, i.e.:
- 5 point summary for numeric variables
- Frequency of occurrence of each class for categorical variables
To achieve the above in Python, you can use df.describe(include='all').
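As a rough illustration on a made-up DataFrame (the column names and values are assumptions), the sketch below shows what describe(include='all') returns for mixed columns. For a categorical column it reports count, unique, top and freq, i.e. only the most common class. To get a per-class frequency table closer to what R prints for a factor, value_counts() can be added.

```python
import pandas as pd

# Hypothetical data with one numeric and one categorical column
df = pd.DataFrame({
    "age": [23, 35, 41, 35],
    "city": ["Paris", "Rome", "Paris", "Paris"],
})

# Numeric statistics and categorical statistics in one table;
# cells that do not apply to a column are filled with NaN
full = df.describe(include="all")
print(full)

# Frequency of every class of the categorical column
city_counts = df["city"].value_counts()
print(city_counts)
```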