Python pandas: what is the difference between a Series and a single-column DataFrame?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/26047209/

Time: 2020-08-18 23:57:46  Source: igfitidea

What is the difference between a pandas Series and a single-column DataFrame?

Tags: python, pandas

Asked by saroele

Why does pandas make a distinction between a Series and a single-column DataFrame?
In other words: why does the Series class exist?

I'm mainly using time series with datetime index, maybe that helps to set the context.

Accepted answer by PythonNut

Quoting the pandas docs:

pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)

Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.

So, the Series is the data structure for a single column of a DataFrame, not only conceptually, but literally, i.e. the data in a DataFrame is actually stored in memory as a collection of Series.
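
Whatever the internal storage layout, the relationship is easy to observe from the outside: selecting one column of a DataFrame hands back a Series. A minimal sketch, assuming made-up column names and values that are not part of the original answer:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"price": [1.5, 2.0, 2.5], "volume": [10, 20, 30]})

col = df["price"]                   # one column pulled out of the frame
print(type(col).__name__)           # Series
print(col.ndim, df.ndim)            # 1 2
print(col.index.equals(df.index))   # True: the column shares the frame's row index
```

The column keeps its label as the Series name (here "price") and carries the same row index as the frame it came from.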

Analogously: we need both lists and matrices, because matrices are built from lists. Single-row matrices, while equivalent to lists in functionality, still cannot exist without the list(s) they are composed of.

They both have extremely similar APIs, but you'll find that DataFrame methods always cater to the possibility that you have more than one column. And, of course, you can always add another Series (or equivalent object) to a DataFrame, while adding a Series to another Series involves creating a DataFrame.
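
As a rough illustration of that last point, the sketch below reads "adding" as "attaching as a column". The names s1, s2, both and the values are hypothetical, and pd.concat is only one of several ways to combine Series:

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=["a", "b", "c"], name="x")
s2 = pd.Series([10, 20, 30], index=["a", "b", "c"], name="y")

# Two Series placed side by side become a two-column DataFrame
both = pd.concat([s1, s2], axis=1)
print(type(both).__name__)   # DataFrame
print(list(both.columns))    # ['x', 'y']

# An existing DataFrame simply gains one more column
both["z"] = pd.Series([0.1, 0.2, 0.3], index=["a", "b", "c"])
print(both.shape)            # (3, 3)
```

Note that arithmetic addition is a different operation: s1 + s2 adds the values label by label and returns a Series, not a DataFrame.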

Answer by Umesh Kaushik

From the pandas docs (http://pandas.pydata.org/pandas-docs/stable/dsintro.html): Series is a one-dimensional labeled array capable of holding any data type. To read data in the form of a pandas Series:

import pandas as pd
ds = pd.Series(data, index=index)

DataFrame is a 2-dimensional labeled data structure with columns of potentially different types.

import pandas as pd
df = pd.DataFrame(data, index=index)

In both of the above, index is a list of labels.

For example, I have a CSV file with the following data:

,country,population,area,capital
BR,Brazil,10210,12015,Brasilia
RU,Russia,1025,457,Moscow
IN,India,10458,457787,New Delhi

To read the above data as a Series and as a DataFrame:

import pandas as pd
file_data = pd.read_csv("file_path", index_col=0)
d = pd.Series(file_data.country, index=file_data.index)  # or index=['BR', 'RU', 'IN']

Output:

>>> d
BR           Brazil
RU           Russia
IN            India

df = pd.DataFrame(file_data.area, index=file_data.index)  # or index=['BR', 'RU', 'IN']

Output:

>>> df
      area
BR   12015
RU     457
IN  457787
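
For readers who want to try this without a file on disk, here is a self-contained sketch. It inlines a copy of the sample CSV through io.StringIO (an assumption of this example, the answer itself reads from a file path) and uses plain column selection instead of the constructors:

```python
import io
import pandas as pd

# A copy of the sample CSV, inlined so the sketch runs without a file
csv_text = """,country,population,area,capital
BR,Brazil,10210,12015,Brasilia
RU,Russia,1025,457,Moscow
IN,India,10458,457787,New Delhi
"""
file_data = pd.read_csv(io.StringIO(csv_text), index_col=0)

country = file_data["country"]   # single brackets -> Series
area = file_data[["area"]]       # list with one column -> single-column DataFrame

print(type(country).__name__, country.shape)   # Series (3,)
print(type(area).__name__, area.shape)         # DataFrame (3, 1)
```

The shapes show the difference: the Series has one axis of length 3, while the single-column DataFrame has 3 rows and 1 column.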

Answer by syed irfan

A Series is a one-dimensional object that can hold any data type, such as integers, floats, and strings, e.g.:

import pandas as pd
x = pd.Series(['A', 'B', 'C'])

0 A
1 B
2 C

The first column of the printed Series is the index (0, 1, 2); the second column is your actual data (A, B, C).
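
The two parts can also be inspected separately. A small sketch, assuming string values and a hypothetical second Series y with custom labels:

```python
import pandas as pd

x = pd.Series(["A", "B", "C"])

print(list(x.index))   # [0, 1, 2]  (default integer labels)
print(x.tolist())      # ['A', 'B', 'C']

# Labels do not have to be integers; these ones are made up
y = pd.Series(["A", "B", "C"], index=["first", "second", "third"])
print(y["second"])     # B
```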

A DataFrame is a two-dimensional object that can be built from Series, lists, or dictionaries:

from numpy.random import randn as rd  # assumption: rd was an alias for NumPy's randn
df = pd.DataFrame(rd(5, 4), ['A', 'B', 'C', 'D', 'E'], ['W', 'X', 'Y', 'Z'])

Answer by Yog

Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.). The axis labels are collectively referred to as the index. The basic method to create a Series is to call:

s = pd.Series(data, index=index)

DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table, or a dict of Series objects.

d = {'one': pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
     'two': pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
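
To see what the "dict of Series" view implies, the following sketch builds the same frame and inspects it. The point it illustrates is that the row index is the union of the Series indexes, with missing entries filled by NaN:

```python
import pandas as pd

d = {
    "one": pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"]),
    "two": pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d"]),
}
df = pd.DataFrame(d)

print(list(df.index))             # ['a', 'b', 'c', 'd']
print(df.loc["d", "one"])         # nan: 'one' has no label 'd'
print(type(df["two"]).__name__)   # Series
```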

Answer by abhishek_7081

Import cars data

import pandas as pd

cars = pd.read_csv('cars.csv', index_col = 0)

Here is how the cars.csv file looks.

Print out drives_right column as Series:

print(cars.loc[:,"drives_right"])

    US      True
    AUS    False
    JAP    False
    IN     False
    RU      True
    MOR     True
    EG      True
    Name: drives_right, dtype: bool

The single-bracket version gives a pandas Series; the double-bracket version gives a pandas DataFrame.

Print out the drives_right column as a DataFrame:

print(cars.loc[:,["drives_right"]])

         drives_right
    US           True
    AUS         False
    JAP         False
    IN          False
    RU           True
    MOR          True
    EG           True
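
The two selections can be compared directly. Since cars.csv itself is not available here, this sketch uses a stand-in frame that contains only the drives_right values printed above; everything else about the file is left out:

```python
import pandas as pd

# Stand-in for cars.csv: only the drives_right values from the printed output
cars = pd.DataFrame(
    {"drives_right": [True, False, False, False, True, True, True]},
    index=["US", "AUS", "JAP", "IN", "RU", "MOR", "EG"],
)

as_series = cars.loc[:, "drives_right"]    # single brackets
as_frame = cars.loc[:, ["drives_right"]]   # double brackets

print(type(as_series).__name__, as_series.shape)   # Series (7,)
print(type(as_frame).__name__, as_frame.shape)     # DataFrame (7, 1)

# Converting the Series back into a single-column DataFrame
print(as_series.to_frame().equals(as_frame))       # True
```

Series.to_frame() is a convenient way to move from the one-dimensional form to the single-column form when a method expects a DataFrame.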

Adding a Series to another Series as a second column creates a DataFrame.
