Python 将列按名称移动到熊猫表的前面
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/25122099/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Move column by name to front of table in pandas
提问by Boosted_d16
Here is my df:
这是我的 df:
Net Upper Lower Mid Zsore
Answer option
More than once a day 0% 0.22% -0.12% 2 65
Once a day 0% 0.32% -0.19% 3 45
Several times a week 2% 2.45% 1.10% 4 78
Once a week 1% 1.63% -0.40% 6 65
How can I move a column by name ("Mid") to the front of the table, index 0. This is what the result should look like:
如何按名称 ( "Mid") 将列移动到表的前面,索引 0。结果应该是这样的:
Mid Upper Lower Net Zsore
Answer option
More than once a day 2 0.22% -0.12% 0% 65
Once a day 3 0.32% -0.19% 0% 45
Several times a week 4 2.45% 1.10% 2% 78
Once a week 6 1.63% -0.40% 1% 65
My current code moves the column by index using df.columns.tolist()but I'd like to shift it by name.
我当前的代码使用索引移动列,df.columns.tolist()但我想按名称移动它。
采纳答案by EdChum
We can use ixto reorder by passing a list:
我们可以ix通过传递一个列表来重新排序:
In [27]:
# get a list of columns
cols = list(df)
# move the column to head of list using index, pop and insert
cols.insert(0, cols.pop(cols.index('Mid')))
cols
Out[27]:
['Mid', 'Net', 'Upper', 'Lower', 'Zsore']
In [28]:
# use ix to reorder
df = df.ix[:, cols]
df
Out[28]:
Mid Net Upper Lower Zsore
Answer_option
More_than_once_a_day 2 0% 0.22% -0.12% 65
Once_a_day 3 0% 0.32% -0.19% 45
Several_times_a_week 4 2% 2.45% 1.10% 78
Once_a_week 6 1% 1.63% -0.40% 65
Another method is to take a reference to the column and reinsert it at the front:
另一种方法是引用该列并将其重新插入到前面:
In [39]:
mid = df['Mid']
df.drop(labels=['Mid'], axis=1,inplace = True)
df.insert(0, 'Mid', mid)
df
Out[39]:
Mid Net Upper Lower Zsore
Answer_option
More_than_once_a_day 2 0% 0.22% -0.12% 65
Once_a_day 3 0% 0.32% -0.19% 45
Several_times_a_week 4 2% 2.45% 1.10% 78
Once_a_week 6 1% 1.63% -0.40% 65
You can also use locto achieve the same result as ixwill be deprecated in a future version of pandas from 0.20.0onwards:
您还可以使用loc来实现与ix将来版本的 Pandas 中将被弃用的结果相同的结果0.20.0:
df = df.loc[:, cols]
回答by Sachinmm
You can use the df.reindex() function in pandas. df is
您可以在 Pandas 中使用 df.reindex() 函数。df 是
Net Upper Lower Mid Zsore
Answer option
More than once a day 0% 0.22% -0.12% 2 65
Once a day 0% 0.32% -0.19% 3 45
Several times a week 2% 2.45% 1.10% 4 78
Once a week 1% 1.63% -0.40% 6 65
define an list of column names
定义列名列表
cols = df.columns.tolist()
cols
Out[13]: ['Net', 'Upper', 'Lower', 'Mid', 'Zsore']
move the column name to wherever you want
将列名移动到您想要的任何位置
cols.insert(0, cols.pop(cols.index('Mid')))
cols
Out[16]: ['Mid', 'Net', 'Upper', 'Lower', 'Zsore']
then use df.reindex()function to reorder
然后使用df.reindex()函数重新排序
df = df.reindex(columns= cols)
out put is: df
输出是:df
Mid Upper Lower Net Zsore
Answer option
More than once a day 2 0.22% -0.12% 0% 65
Once a day 3 0.32% -0.19% 0% 45
Several times a week 4 2.45% 1.10% 2% 78
Once a week 6 1.63% -0.40% 1% 65
回答by citynorman
I didn't like how I had to explicitly specify all the other column in the other solutions so this worked best for me. Though it might be slow for large dataframes...?
我不喜欢我必须在其他解决方案中明确指定所有其他列的方式,因此这对我来说效果最好。虽然对于大型数据帧可能会很慢......?
df = df.set_index('Mid').reset_index()
df = df.set_index('Mid').reset_index()
回答by Bhagabat Behera
Here is a generic set of code that I frequently use to rearrange the position of columns. You may find it useful.
这是我经常用来重新排列列位置的一组通用代码。你可能会发现它很有用。
cols = df.columns.tolist()
n = int(cols.index('Mid'))
cols = [cols[n]] + cols[:n] + cols[n+1:]
df = df[cols]
回答by elPastor
Maybe I'm missing something, but a lot of these answers seem overly complicated. You should be able to just set the columns within a single list:
也许我遗漏了一些东西,但很多这些答案似乎过于复杂。您应该能够在单个列表中设置列:
Column to the front:
前列:
df = df[ ['Mid'] + [ col for col in df.columns if col != 'Mid' ] ]
Or if instead, you want to move it to the back:
或者,如果您想将其移到后面:
df = df[ [ col for col in df.columns if col != 'Mid' ] + ['Mid'] ]
Or if you wanted to move more than one column:
或者,如果您想移动多个列:
cols_to_move = ['Mid', 'Zsore']
df = df[ cols_to_move + [ col for col in df.columns if col not in cols_to_move ] ]
回答by Dustin Helliwell
To reorder the rows of a DataFrame just use a list as follows.
要重新排序 DataFrame 的行,只需使用如下列表即可。
df = df[['Mid', 'Net', 'Upper', 'Lower', 'Zsore']]
This makes it very obvious what was done when reading the code later. Also use:
这使得稍后阅读代码时所做的事情非常明显。还可以使用:
df.columns
Out[1]: Index(['Net', 'Upper', 'Lower', 'Mid', 'Zsore'], dtype='object')
Then cut and paste to reorder.
然后剪切并粘贴以重新排序。
For a DataFrame with many columns, store the list of columns in a variable and pop the desired column to the front of the list. Here is an example:
对于具有许多列的 DataFrame,将列列表存储在一个变量中,并将所需的列弹出到列表的前面。下面是一个例子:
cols = [str(col_name) for col_name in range(1001)]
data = np.random.rand(10,1001)
df = pd.DataFrame(data=data, columns=cols)
mv_col = cols.pop(cols.index('77'))
df = df[[mv_col] + cols]
Now df.columnshas.
现在df.columns有。
Index(['77', '0', '1', '2', '3', '4', '5', '6', '7', '8',
...
'991', '992', '993', '994', '995', '996', '997', '998', '999', '1000'],
dtype='object', length=1001)
回答by normanius
I prefer this solution:
我更喜欢这个解决方案:
col = df.pop("Mid")
df.insert(0, col.name, col)
It's simpler to read and faster than other suggested answers.
与其他建议的答案相比,它更易于阅读且速度更快。
def move_column_inplace(df, col, pos):
col = df.pop(col)
df.insert(pos, col.name, col)
Performance assessment:
绩效评估:
For this test, the currently last column is moved to the front in each repetition. In-place methods generally perform better. While citynorman's solution can be made in-place, Ed Chum's method based on .locand sachinnm's method based on reindexcannot.
对于此测试,当前最后一列在每次重复中移至最前面。就地方法通常性能更好。虽然 citynorman 的解决方案可以就地制作,但基于 Ed Chum 的方法.loc和基于sachinnm 的方法reindex不能。
While other methods are generic, citynorman's solution is limited to pos=0. I didn't observe any performance difference between df.loc[cols]and df[cols], which is why I didn't include some other suggestions.
虽然其他方法是通用的,但 citynorman 的解决方案仅限于pos=0. 我没有观察到df.loc[cols]和之间的任何性能差异df[cols],这就是为什么我没有包含其他一些建议。
I tested with python 3.6.8 and pandas 0.24.2 on a MacBook Pro (Mid 2015).
我在 MacBook Pro(2015 年中)上使用 python 3.6.8 和 pandas 0.24.2 进行了测试。
import numpy as np
import pandas as pd
n_cols = 11
df = pd.DataFrame(np.random.randn(200000, n_cols),
columns=range(n_cols))
def move_column_inplace(df, col, pos):
col = df.pop(col)
df.insert(pos, col.name, col)
def move_to_front_normanius_inplace(df, col):
move_column_inplace(df, col, 0)
return df
def move_to_front_chum(df, col):
cols = list(df)
cols.insert(0, cols.pop(cols.index(col)))
return df.loc[:, cols]
def move_to_front_chum_inplace(df, col):
col = df[col]
df.drop(col.name, axis=1, inplace=True)
df.insert(0, col.name, col)
return df
def move_to_front_elpastor(df, col):
cols = [col] + [ c for c in df.columns if c!=col ]
return df[cols] # or df.loc[cols]
def move_to_front_sachinmm(df, col):
cols = df.columns.tolist()
cols.insert(0, cols.pop(cols.index(col)))
df = df.reindex(columns=cols, copy=False)
return df
def move_to_front_citynorman_inplace(df, col):
# This approach exploits that reset_index() moves the index
# at the first position of the data frame.
df.set_index(col, inplace=True)
df.reset_index(inplace=True)
return df
def test(method, df):
col = np.random.randint(0, n_cols)
method(df, col)
col = np.random.randint(0, n_cols)
ret_mine = move_to_front_normanius_inplace(df.copy(), col)
ret_chum1 = move_to_front_chum(df.copy(), col)
ret_chum2 = move_to_front_chum_inplace(df.copy(), col)
ret_elpas = move_to_front_elpastor(df.copy(), col)
ret_sach = move_to_front_sachinmm(df.copy(), col)
ret_city = move_to_front_citynorman_inplace(df.copy(), col)
# Assert equivalence of solutions.
assert(ret_mine.equals(ret_chum1))
assert(ret_mine.equals(ret_chum2))
assert(ret_mine.equals(ret_elpas))
assert(ret_mine.equals(ret_sach))
assert(ret_mine.equals(ret_city))
Results:
结果:
# For n_cols = 11:
%timeit test(move_to_front_normanius_inplace, df)
# 1.05 ms ± 42.4 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 1.68 ms ± 46.1 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit test(move_to_front_sachinmm, df)
# 3.24 ms ± 96.5 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum, df)
#?3.84 ms ± 114 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_elpastor, df)
# 3.85 ms ± 58.3 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 9.67 ms ± 101 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# For n_cols = 31:
%timeit test(move_to_front_normanius_inplace, df)
# 1.26 ms ± 31.6 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 1.95 ms ± 260 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_sachinmm, df)
# 10.7 ms ± 348 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum, df)
#?11.5 ms ± 869 μs per loop (mean ± std. dev. of 7 runs, 100 loops each
%timeit test(move_to_front_elpastor, df)
# 11.4 ms ± 598 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 31.4 ms ± 1.89 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

