pandas: slicing a Series

Question

Asked by bcf

Consider the Series obj:

In [50]: obj = Series(np.arange(6,10), index = ['a', 'b', 'c', 'd'])

In [51]: obj
Out[51]: 
a    6
b    7
c    8
d    9
dtype: int64

I'd like to take a slice of obj, and I can do that in a couple of ways:

In [52]: obj[1:3]
Out[52]: 
b    7
c    8
dtype: int64

In [53]: obj['b' : 'c']
Out[53]: 
b    7
c    8
dtype: int64

Now consider the DataFrame param_estimates_good:

In [54]: param_estimates_good
Out[54]: 
             a         b     sigma      a_se      b_se  sigma_se  success
1968  0.648508  1.803889  0.498017  0.784340  0.082366  0.529649        1
1972  0.539485  1.733304  0.451311  1.084170  0.174030  0.677134        1
1973  1.205704  2.054114  1.465606  0.095780  0.052851  0.090562        1
1974  1.398968  2.105287  2.029865  0.451929  0.056154  0.428696        1
1975  1.570900  1.877486  2.016978  0.186177  0.052413  0.183577        1
1976  0.688932  1.651232  0.874860  0.065038  0.099080  0.055247        1
1977  0.816918  1.949563  0.691899  0.516742  0.083973  0.385799        1
1980  0.730454  2.569974  2.297921  0.619403  0.157950  0.439383        1
1986  1.053362  1.770256  1.115229  0.235353  0.063867  0.202970        1
1993  2.531327  1.235418  2.005588  0.107785  0.011513  0.060647        1
1994 -0.759318  2.556910  0.175695  0.052099  0.078433  0.044315        1
1998  1.007787  1.548161  0.911332  2.538235  0.040285  2.001148        1
2000 -0.693261  1.518839 -0.290453  3.763934  1.329302  0.868444        1
2001  0.662391  0.650003  0.854752  0.550188  0.547999  0.376354        1
2002  0.652630  0.424864  0.524909  0.413478  0.334703  0.251172        1
2004 -0.169553  1.290054 -0.040504  0.279700  0.093937  0.115120        1
2005  0.146209  1.610219 -0.233461  0.171832  0.083844  0.123676        1
2007 -0.301397  0.822584  0.309423  1.119639  0.860818  0.377673        1
2008  1.334283  0.065856  1.704950  0.462811  0.489639  0.427041        1
2009  2.082782 -0.727128  1.072343  0.464726  0.093574  0.472603        1
2010  2.309353 -1.202509  0.906165  0.037950  0.080356  0.031981        1
2013  3.490101 -2.033734  1.468027  0.251317  0.030869  0.259732        1
2014  1.820431 -1.961015 -0.050831  0.262710  0.176057  0.266525        1
2016  1.818855 -0.580492  0.312369  0.450659  0.065661  0.474896        1

I take a column of this to form a Series g:

In [55]: g = param_estimates_good['a']

In [56]: g
Out[56]: 
1968    0.648508
1972    0.539485
1973    1.205704
1974    1.398968
1975    1.570900
1976    0.688932
1977    0.816918
1980    0.730454
1986    1.053362
1993    2.531327
1994   -0.759318
1998    1.007787
2000   -0.693261
2001    0.662391
2002    0.652630
2004   -0.169553
2005    0.146209
2007   -0.301397
2008    1.334283
2009    2.082782
2010    2.309353
2013    3.490101
2014    1.820431
2016    1.818855
Name: a, dtype: float64

The Index of g holds ints...

In [57]: g.index
Out[57]: 
Int64Index([1968, 1972, 1973, 1974, 1975, 1976, 1977, 1980, 1986, 1993, 1994,
            1998, 2000, 2001, 2002, 2004, 2005, 2007, 2008, 2009, 2010, 2013,
            2014, 2016],
           dtype='int64')

... so I try to slice g in a way analogous to obj:

In [58]: g[0:7]
Out[58]: 
1968    0.648508
1972    0.539485
1973    1.205704
1974    1.398968
1975    1.570900
1976    0.688932
1977    0.816918
Name: a, dtype: float64


In [59]: g[1968 : 1977]
Out[59]: Series([], Name: a, dtype: float64)

Why does the latter method return an empty Series?

Answer 1

Answered by jezrael

I think it tries to find rows from position 1968 to 1977, because slicing with [] selects rows by position (see "Slicing ranges" with [] in the docs):

With Series, the syntax works exactly as with an ndarray, returning a slice of the values and the corresponding labels

This is the same as selection by position with iloc (see "Selection by position" in the docs):

print(g.iloc[1968:1977])
Series([], Name: a, dtype: float64)

With loc it works as expected (see "Selection by label" in the docs):

print(g.loc[1968:1977])
1968    0.648508
1972    0.539485
1973    1.205704
1974    1.398968
1975    1.570900
1976    0.688932
1977    0.816918
Name: a, dtype: float64
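
To make the contrast reproducible without the original DataFrame, here is a minimal sketch. The Series g_small is a hypothetical stand-in for g, built from the first four values shown in the question; it is not the asker's actual data.

```python
import pandas as pd

# Hypothetical stand-in for g: four values from the question,
# indexed by integer year labels.
g_small = pd.Series(
    [0.648508, 0.539485, 1.205704, 1.398968],
    index=[1968, 1972, 1973, 1974],
    name="a",
)

# Position-based: 1968 and 1977 lie far beyond the last position (3),
# so the result is an empty Series, not an error.
by_position = g_small.iloc[1968:1977]

# Label-based: selects the index labels from 1968 through 1973.
by_label = g_small.loc[1968:1973]

print(len(by_position))
print(by_label.index.tolist())
```

Spelling out iloc or loc removes any ambiguity about whether an integer slice means positions or labels when the index itself holds integers.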

Answer 2

Answered by vuvu

You get an empty Series because the slicing operator in g[1968:1977] treats the bounds as positions (row numbers running from 0 to N-1, where N is the length of the Series). You seem to have 24 rows in g, so the last position is 23, and asking for all elements between positions 1968 and 1977 returns nothing.

Use the g.index labels instead, as in g.loc[1968:1977]; that returns all elements between labels 1968 and 1977 (inclusive).
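
The two endpoint conventions are easy to mix up, so a short sketch may help. It assumes a hypothetical 24-row Series s whose integer labels (1990 to 2013) are invented for illustration and deliberately do not coincide with the positions 0 to 23.

```python
import pandas as pd

# Hypothetical Series: 24 rows, labels 1990..2013, values 0..23.
years = list(range(1990, 2014))
s = pd.Series(range(24), index=years)

# Positional slices follow Python rules: the stop is excluded...
first_seven = s.iloc[0:7]          # positions 0 through 6

# ...and out-of-range bounds are clipped instead of raising.
tail = s.iloc[20:100]              # positions 20 through 23 only

# Label slices with .loc include both endpoints.
labelled = s.loc[1990:1996]        # labels 1990 through 1996

print(len(first_seven), len(tail), len(labelled))
```

Note that the positional slice 0:7 and the label slice 1990:1996 both return seven rows, but for different reasons: the first stops before position 7, while the second includes the label 1996.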
