Python Pandas - Get the first row value of a given column

Question

Asked by Ahmed Haque

This seems like a ridiculously easy question... but I'm not seeing the easy answer I was expecting.

So, how do I get the value at the nth row of a given column in Pandas? (I am particularly interested in the first row, but would be interested in a more general practice as well.)

For example, let's say I want to pull the 1.2 value in Btime as a variable.

What's the right way to do this?

df_test =

  ATime   X   Y   Z   Btime  C   D   E
0    1.2  2  15   2    1.2  12  25  12
1    1.4  3  12   1    1.3  13  22  11
2    1.5  1  10   6    1.4  11  20  16
3    1.6  2   9  10    1.7  12  29  12
4    1.9  1   1   9    1.9  11  21  19
5    2.0  0   0   0    2.0   8  10  11
6    2.4  0   0   0    2.4  10  12  15

Answer 1

Accepted answer by unutbu

To select the ith row, use `iloc`:

In [31]: df_test.iloc[0]
Out[31]: 
ATime     1.2
X         2.0
Y        15.0
Z         2.0
Btime     1.2
C        12.0
D        25.0
E        12.0
Name: 0, dtype: float64

To select the ith value in the `Btime` column you could use:

In [30]: df_test['Btime'].iloc[0]
Out[30]: 1.2

There is a difference between `df_test['Btime'].iloc[0]` (recommended) and `df_test.iloc[0]['Btime']`:

DataFrames store data in column-based blocks (where each block has a single dtype). If you select by column first, a view can be returned (which is quicker than returning a copy) and the original dtype is preserved. In contrast, if you select by row first, and if the DataFrame has columns of different dtypes, then Pandas copies the data into a new Series with a single common dtype (object dtype in the general case). So selecting columns is a bit faster than selecting rows. Thus, although `df_test.iloc[0]['Btime']` works, `df_test['Btime'].iloc[0]` is a little bit more efficient.
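The dtype point can be checked with a small sketch. The two-column frame below is a hypothetical cut-down version of `df_test`, not the asker's full data; with one float column and one integer column, the row-first selection has to upcast the integer to float.

```python
import pandas as pd

# Hypothetical cut-down frame: one float column and one integer column.
df_test = pd.DataFrame({"Btime": [1.2, 1.3, 1.4], "C": [12, 13, 11]})

# Column first: the Series keeps the column's own (integer) dtype.
col_dtype = df_test["C"].dtype
col_first = df_test["C"].iloc[0]

# Row first: one Series must hold both columns, so the values share a dtype.
row_dtype = df_test.iloc[0].dtype
row_first = df_test.iloc[0]["C"]

print(col_dtype, row_dtype)
print(type(col_first).__name__, type(row_first).__name__)
```

Both expressions return the same number here; only the dtype of the result differs.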

There is a big difference between the two when it comes to assignment. `df_test['Btime'].iloc[0] = x` affects `df_test`, but `df_test.iloc[0]['Btime']` may not. See below for an explanation of why. Because a subtle difference in the order of indexing makes a big difference in behavior, it is better to use single indexing assignment:

df.iloc[0, df.columns.get_loc('Btime')] = x

`df.iloc[0, df.columns.get_loc('Btime')] = x` (recommended):

The recommended way to assign new values to a DataFrame is to avoid chained indexing, and instead use the method shown by andrew,

df.loc[df.index[n], 'Btime'] = x

or

df.iloc[n, df.columns.get_loc('Btime')] = x

The latter method is a bit faster, because `df.loc` has to convert the row and column labels to positional indices, so there is a little less conversion necessary if you use `df.iloc` instead.
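As a minimal sketch of the two single-indexing assignments, the frame below reuses the non-sorted index from the examples in this answer; the column values, the position `n`, and the assigned numbers are illustrative assumptions.

```python
import pandas as pd

# Illustrative frame with a non-sorted integer index.
df = pd.DataFrame({"foo": list("ABC"), "Btime": [1.2, 1.3, 1.4]}, index=[0, 2, 1])
n = 1  # position of the row to change (hypothetical choice)

# Label-based: turn position n into its row label, then use the column label.
df.loc[df.index[n], "Btime"] = 9.9

# Position-based: turn the column label into its integer position.
df.iloc[n + 1, df.columns.get_loc("Btime")] = 7.7

print(df)
```

Each statement is a single indexing operation on `df` itself, so there is no intermediate object whose view-or-copy status matters.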

`df['Btime'].iloc[0] = x` works, but is not recommended:

Although this works, it takes advantage of the way DataFrames are currently implemented. There is no guarantee that Pandas will work this way in the future. In particular, it takes advantage of the fact that (currently) `df['Btime']` always returns a view (not a copy), so `df['Btime'].iloc[n] = x` can be used to assign a new value at the nth location of the `Btime` column of `df`.

Since Pandas makes no explicit guarantees about when indexers return a view versus a copy, assignments that use chained indexing generally raise a `SettingWithCopyWarning`, even though in this case the assignment succeeds in modifying `df`:

In [22]: df = pd.DataFrame({'foo':list('ABC')}, index=[0,2,1])
In [24]: df['bar'] = 100
In [25]: df['bar'].iloc[0] = 99
/home/unutbu/data/binky/bin/ipython:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self._setitem_with_indexer(indexer, value)

In [26]: df
Out[26]: 
  foo  bar
0   A   99  <-- assignment succeeded
2   B  100
1   C  100

`df.iloc[0]['Btime'] = x` does not work:

In contrast, assignment with `df.iloc[0]['bar'] = 123` does not work because `df.iloc[0]` is returning a copy:

In [66]: df.iloc[0]['bar'] = 123
/home/unutbu/data/binky/bin/ipython:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

In [67]: df
Out[67]: 
  foo  bar
0   A   99  <-- assignment failed
2   B  100
1   C  100

Warning: I had previously suggested `df_test.ix[i, 'Btime']`. But this is not guaranteed to give you the ith value, since `ix` tries to index by label before trying to index by position. So if the DataFrame has an integer index which is not in sorted order starting at 0, then using `ix[i]` will return the row labeled `i` rather than the ith row. For example,

In [1]: df = pd.DataFrame({'foo':list('ABC')}, index=[0,2,1])

In [2]: df
Out[2]: 
  foo
0   A
2   B
1   C

In [4]: df.ix[1, 'foo']
Out[4]: 'C'
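Since `.ix` is not available in recent pandas releases, the same label-versus-position distinction can be sketched with `loc` and `iloc` on the frame from the example above. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"foo": list("ABC")}, index=[0, 2, 1])

by_label = df.loc[1, "foo"]      # the row whose index label is 1
by_position = df["foo"].iloc[1]  # the second row, whatever its label

print(by_label, by_position)
```

Choosing `loc` or `iloc` explicitly removes the ambiguity that `ix` had with integer indexes.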

Answer 2

Answered by andrew

Note that the answer from @unutbu is correct until you want to set the value to something new; at that point it will not work if your DataFrame is a view.

In [4]: df = pd.DataFrame({'foo':list('ABC')}, index=[0,2,1])
In [5]: df['bar'] = 100
In [6]: df['bar'].iloc[0] = 99
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas-0.16.0_19_g8d2818e-py2.7-macosx-10.9-x86_64.egg/pandas/core/indexing.py:118: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self._setitem_with_indexer(indexer, value)

Another approach that will consistently work with both setting and getting is:

In [7]: df.loc[df.index[0], 'foo']
Out[7]: 'A'
In [8]: df.loc[df.index[0], 'bar'] = 99
In [9]: df
Out[9]:
  foo  bar
0   A   99
2   B  100
1   C  100

Answer 3

Answered by nikhil

`df.iloc[0].head(1)` - only the first value of the entire first row.
`df.iloc[0]` - the entire first row, across all columns.

Answer 4

Answered by anis

In general, if you want to pick the first N rows from column J of a pandas DataFrame, the best way to do this is:

data = dataframe.iloc[0:N, J]  # J is the integer position of the column

Answer 5

Answered by Abdulrahman Bres

Another way to do this:

first_value = df['Btime'].values[0]

This way seems to be faster than using `.iloc`:

In [1]: %timeit -n 1000 df['Btime'].values[20]
5.82 μs ± 142 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [2]: %timeit -n 1000 df['Btime'].iloc[20]
29.2 μs ± 1.28 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
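For comparison, here is a small sketch of several positional scalar accessors side by side. The frame is an illustrative assumption, and `to_numpy()` and `iat` do not appear in this answer; no timing is implied, since that depends on the pandas version and the data.

```python
import pandas as pd

df = pd.DataFrame({"Btime": [1.2, 1.3, 1.4]})

via_values = df["Btime"].values[0]     # index into the underlying NumPy array
via_numpy = df["Btime"].to_numpy()[0]  # the same idea through to_numpy()
via_iat = df["Btime"].iat[0]           # positional scalar accessor
via_iloc = df["Btime"].iloc[0]         # general positional indexer

print(via_values, via_numpy, via_iat, via_iloc)
```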

Answer 6

Answered by Hunaphu

Another way of getting the first row and preserving the index (`first` selects by date offset, so it needs a DatetimeIndex):

x = df.first('d') # Returns the first day. '3d' gives first three days.
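When the index is not made of dates, a one-row DataFrame that keeps its index label can be sketched with positional selection instead. This swaps in `iloc` with a list selector and `head(1)` in place of `first`; the frame and its index values are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({"foo": list("ABC")}, index=[5, 2, 1])

first_as_frame = df.iloc[[0]]  # a list selector returns a one-row DataFrame
first_via_head = df.head(1)    # equivalent result for the first row

print(first_as_frame)
```

Both results keep the original label `5` as the index of the single row.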

Answer 7

Answered by Alex Ortner

For example, to get the value from column 'test' in the first row, use:

df[['test']].values[0][0]

because `df[['test']].values[0]` on its own gives back an array.
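The shapes involved can be sketched as follows; the column values are an illustrative assumption. Selecting with a list of column names yields a DataFrame, so `.values` is two-dimensional and needs two indexing steps to reach a scalar.

```python
import pandas as pd

df = pd.DataFrame({"test": [10, 20, 30]})

arr = df[["test"]].values  # 2-D array with shape (3, 1)
first_row = arr[0]         # 1-D array holding the single column
value = arr[0][0]          # scalar

print(arr.shape, first_row.shape, value)
```

Selecting the column as a Series, as in `df['test'].values[0]`, reaches the same scalar with one indexing step.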
