Python: Get the First Column of a Pandas DataFrame

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/41954759/

Date: 2020-08-20 01:55:29  Source: igfitidea

Get Pandas DataFrame first column

Tags: python, pandas

Asked by Bobesh

This question is odd: I know HOW to do something, but I don't know WHY I can't do it another way.

Suppose a simple data frame:

import pandas as pd
a = pd.DataFrame([[0,1], [2,3]])

I can slice this data frame very easily: the first column is a[[0]], the second is a[[1]]. Simple, isn't it?

Now let's take a more complex data frame. This is part of my code:

var_vec = [i for i in range(100)]
num_of_sites = 100
row_names = ["_".join(["loc", str(i)]) for i in 
             range(1,num_of_sites + 1)]
frame = pd.DataFrame(var_vec, columns = ["Variable"], index = row_names)
spec_ab = [i**3 for i in range(100)]
frame[1] = spec_ab

The data frame frame is also a pandas DataFrame, just like a. I can get the second column very easily as frame[[1]]. But when I try frame[[0]], I get an error:

Traceback (most recent call last):

  File "<ipython-input-55-0c56ffb47d0d>", line 1, in <module>
    frame[[0]]

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\core\frame.py", line 1991, in __getitem__
    return self._getitem_array(key)

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\core\frame.py", line 2035, in _getitem_array
    indexer = self.ix._convert_to_indexer(key, axis=1)

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\core\indexing.py", line 1184, in _convert_to_indexer
    indexer = labels._convert_list_indexer(objarr, kind=self.name)

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\indexes\base.py", line 1112, in _convert_list_indexer
    return maybe_convert_indices(indexer, len(self))

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\core\indexing.py", line 1856, in maybe_convert_indices
    raise IndexError("indices are out-of-bounds")

IndexError: indices are out-of-bounds

I can still use frame.iloc[:,0], but the problem is that I don't understand why I can't use simple slicing with [[]]. I use WinPython with Spyder 3, if that helps.

Answered by epattaro

Using your code:

import pandas as pd

var_vec = [i for i in range(100)]
num_of_sites = 100
row_names = ["_".join(["loc", str(i)]) for i in 
             range(1,num_of_sites + 1)]
frame = pd.DataFrame(var_vec, columns = ["Variable"], index = row_names)
spec_ab = [i**3 for i in range(100)]
frame[1] = spec_ab

If you print frame, you get:

    Variable    1
loc_1   0       0
loc_2   1       1
loc_3   2       8
loc_4   3       27
loc_5   4       64
loc_6   5       125
......

So the cause of your problem becomes obvious: you have no column labelled 0. You first define a list called var_vec, then build a DataFrame from that list while specifying both the index values and the column name (which is usually good practice). The numeric column labels 0, 1, ... from the first example only appear when you don't specify column names. The number inside [[ ]] is a column label, not a column position.
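The label-versus-position distinction can be checked directly. The sketch below is only illustrative: the names unnamed and named are made up here, the frame is shortened to three rows, and it assumes a reasonably recent pandas (older releases raised IndexError for the failed lookup, as in the traceback above, while newer ones raise KeyError, so both are caught).

```python
import pandas as pd

# No column names given: pandas assigns the integer labels 0 and 1.
unnamed = pd.DataFrame([[0, 1], [2, 3]])
print(list(unnamed.columns))

# One named column, plus a column added under the integer label 1.
named = pd.DataFrame({"Variable": [0, 1, 2]},
                     index=["loc_1", "loc_2", "loc_3"])
named[1] = [0, 1, 8]
print(list(named.columns))

# [[0]] looks up the label 0, which exists only in `unnamed`.
has_label_zero = 0 in named.columns
print(has_label_zero)

try:
    named[[0]]
    lookup_failed = False
except (KeyError, IndexError):
    # The exception type depends on the pandas version.
    lookup_failed = True
print(lookup_failed)
```

In other words, a[[0]] worked in the first example only because 0 happened to be a label there.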

If you want to access columns by their position, you can:


df[df.columns[0]]

What happens then is that you take the list of the df's columns, pick the entry at position 0, and pass that label to the df as the key.
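As a small sketch of that lookup, assuming a three-row stand-in for the frame above (the variable names first_label, first_col, and first_col_frame are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({"Variable": [0, 1, 2], 1: [0, 1, 8]})

first_label = df.columns[0]            # the label stored at position 0
first_col = df[first_label]            # single label -> Series
first_col_frame = df[[first_label]]    # list of labels -> one-column DataFrame

print(first_label)
print(type(first_col).__name__, type(first_col_frame).__name__)
```

One caveat with this approach: if the first column's label is duplicated elsewhere in the frame, df[first_label] returns every column carrying that label rather than just the first one.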

Hope that helps you understand.

Edit:

Another (better) way would be:

df.iloc[:,0]

where ":" selects all rows. (Rows are likewise indexed by position, from 0 up to the number of rows minus one.)
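A short sketch of positional selection with iloc, again on a three-row stand-in for the frame above (variable names are illustrative). It shows that the result type depends on whether the column position is passed as a scalar or as a list:

```python
import pandas as pd

df = pd.DataFrame({"Variable": [0, 1, 2], 1: [0, 1, 8]},
                  index=["loc_1", "loc_2", "loc_3"])

as_series = df.iloc[:, 0]     # all rows, first column -> Series
as_frame = df.iloc[:, [0]]    # list of positions -> one-column DataFrame
top_left = df.iloc[0, 0]      # first row, first column -> scalar

print(as_series.name, as_frame.shape, top_left)
```

Because iloc ignores labels entirely, it behaves the same whether the columns are named, numbered, or a mix of both.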