pandas: How does the colon (:) work in Python and pandas?

Question

Asked by Peng He

I create a DataFrame:

import numpy as np
import pandas as pd
data = pd.DataFrame({'a':range(1,11),'b':['m','f','m','m','m','f','m','f','f','f'],'c':np.random.randn(10)})

Which looks like:

    a  b         c
0   1  m  0.495439
1   2  f  1.444694
2   3  m  0.150637
3   4  m -1.078252
4   5  m  0.618045
5   6  f -0.525368
6   7  m  0.188912
7   8  f  0.159014
8   9  f  0.536495
9  10  f  0.874598

When I want to select some rows, I run

data[:2] or data.ix[2]

But when I try:

se = range(2)
data[se]

There's an error:

KeyError: 'No column(s) named: [0 1]'

I know that a DataFrame selects a column by default. What happens when I run data[se]? How does the colon (:) work in Python?

Answer 1

Answered by urban

I have never used pandas, but a good explanation of slicing (the [::] notation) in Python can be found here. Now, from what I read in the manual:

With DataFrame, slicing inside of [] slices the rows. This is provided largely as a convenience since it is such a common operation.

In [32]: df[:3]
Out[32]: 
                   A         B         C         D
2000-01-01 -0.282863  0.469112 -1.509059 -1.135632
2000-01-02 -0.173215  1.212112  0.119209 -1.044236
2000-01-03 -2.104569 -0.861849 -0.494929  1.071804

In [33]: df[::-1]
Out[33]: 
                   A         B         C         D
2000-01-08 -1.157892 -0.370647 -1.344312  0.844885
2000-01-07  0.577046  0.404705 -1.715002 -1.039268
2000-01-06  0.113648 -0.673690 -1.478427  0.524988
2000-01-05  0.567020 -0.424972  0.276232 -1.087401
2000-01-04 -0.706771  0.721555 -1.039575  0.271860
2000-01-03 -2.104569 -0.861849 -0.494929  1.071804
2000-01-02 -0.173215  1.212112  0.119209 -1.044236
2000-01-01 -0.282863  0.469112 -1.509059 -1.135632

In your example, range(2) gives you the values 0 and 1 (as a list in Python 2), which is not a slice. What I think you need is data[0:2] to slice the DataFrame and get rows 0 and 1, since the stop position is excluded; this is the same as data[:2] with the zero omitted. If you wanted, for example, rows 3, 4 and 5, that would be data[3:6].
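A quick way to check which rows a given slice returns is to print the resulting index. This is only a sketch that rebuilds a frame shaped like the one in the question (column c is random, so its values will differ); the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (column c is random).
data = pd.DataFrame({
    'a': range(1, 11),
    'b': ['m', 'f', 'm', 'm', 'm', 'f', 'm', 'f', 'f', 'f'],
    'c': np.random.randn(10),
})

# A slice inside [] selects rows by position; the stop position is excluded.
first_two = data[0:2]     # rows 0 and 1
same_rows = data[:2]      # omitting the start means "from the beginning"
rows_3_to_5 = data[3:6]   # rows 3, 4 and 5

print(first_two.index.tolist())     # [0, 1]
print(rows_3_to_5.index.tolist())   # [3, 4, 5]
```

As with plain Python lists, the number of rows returned is stop minus start.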

Additionally, looking at some examples in the manual, you can use a step, so:

data[::2] gives you every 2nd row
data[::-1] returns all the rows in reverse order
Combining ranges and step: data[0:10:2] will result in rows 0, 2, 4, 6 and 8 (the stop position, 10, is excluded)
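The step behaviour can be checked on a small frame. This sketch assumes a ten-row frame with the default integer index, like the one in the question; only column a is rebuilt here, and the variable names are made up.

```python
import pandas as pd

# Ten rows with the default index 0..9 (stand-in for the question's frame).
data = pd.DataFrame({'a': range(1, 11)})

every_second = data[::2]     # start at row 0, take every 2nd row
reversed_rows = data[::-1]   # negative step walks the rows backwards
stepped = data[0:10:2]       # rows 0 up to (not including) 10, step 2

print(every_second.index.tolist())    # [0, 2, 4, 6, 8]
print(reversed_rows.index.tolist())   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(stepped.index.tolist())         # [0, 2, 4, 6, 8]
```

The last row of a ten-row frame sits at position 9, so a stop of 10 simply means "to the end".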

Hope it helps

Answer 2

Answered by birdypme

The [start:limit:step] syntax is known as slicing. You can easily create an instance of a slice using the slice() function:

class slice(stop)
class slice(start, stop[, step])
Return a slice object representing the set of indices specified by range(start, stop, step). The start and step arguments default to None. Slice objects have read-only data attributes start, stop and step which merely return the argument values (or their default). They have no other explicit functionality; however they are used by Numerical Python and other third party extensions. Slice objects are also generated when extended indexing syntax is used. For example: a[start:stop:step] or a[start:stop, i]. See itertools.islice() for an alternate version that returns an iterator.
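To see the slice objects that the indexing syntax generates, one option is a tiny class whose __getitem__ just returns the key it was given. ShowKey is a hypothetical helper written for this illustration, not part of Python or pandas.

```python
class ShowKey:
    """Hypothetical helper: returns whatever object [] passes to __getitem__."""

    def __getitem__(self, key):
        return key


show = ShowKey()

# The colon syntax is turned into a slice object before __getitem__ is called.
print(show[:2])         # slice(None, 2, None)
print(show[1:10:2])     # slice(1, 10, 2)

# A range is passed through unchanged; it is not a slice.
print(show[range(2)])   # range(0, 2)

# Extended indexing passes a tuple that can contain slices.
print(show[1:3, 0])     # (slice(1, 3, None), 0)
```

This is why data[:2] and data[range(2)] can behave differently: the object receives a slice in the first case and a sequence of values in the second, and it is free to treat them differently.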

In your case, you could write something like this to return the first 2 rows:

se = slice(None, 2)
data[se]
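A slightly fuller sketch of the same idea, assuming Python 3 (where a range exposes start, stop and step) and a ten-row stand-in for the question's frame; the names r and from_range are made up for illustration.

```python
import pandas as pd

# Ten rows with the default index 0..9 (stand-in for the question's frame).
data = pd.DataFrame({'a': range(1, 11)})

# A slice object stored in a variable behaves like the literal colon syntax.
se = slice(None, 2)
print(data[se].equals(data[:2]))   # True

# In Python 3, an existing range can be converted into an equivalent slice.
r = range(2)
from_range = slice(r.start, r.stop, r.step)
print(data[from_range].index.tolist())   # [0, 1]

# slice.indices(length) resolves None into concrete (start, stop, step) values.
print(se.indices(len(data)))                        # (0, 2, 1)
print(slice(None, None, -1).indices(len(data)))     # (9, -1, -1)
```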

Answer 3

Answered by Alexander

>>> data.ix[range(2)]
   a  b         c
0  1  m -0.323834
1  2  f  0.159787
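Note that the .ix indexer was deprecated in pandas 0.20 and removed in pandas 1.0, so the line above only runs on older versions. A rough equivalent on current pandas, sketched here on a stand-in frame without the random column, is .iloc for positions or .loc for index labels.

```python
import pandas as pd

# Stand-in for the question's frame (random column c omitted).
data = pd.DataFrame({'a': range(1, 11), 'b': list('mfmmmfmfff')})

# Select rows by integer position from a list of positions.
by_position = data.iloc[list(range(2))]

# Select rows by index label; with the default index, labels match positions.
by_label = data.loc[[0, 1]]

print(by_position.index.tolist())    # [0, 1]
print(by_position.equals(by_label))  # True
```

The two only coincide because the frame uses the default 0..n-1 index; with a custom index, .iloc and .loc select different things.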
