Populating a Pandas DataFrame from another DataFrame based on column names
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/22914367/
Asked by c_david
I have a DataFrame of the following form:
   a  b  c
0  1  4  6
1  3  2  4
2  4  1  5
I also have a list of column names, and I need to build a new DataFrame whose columns are the columns of the first DataFrame that correspond to each label, in list order. For example, if my list of columns is ['a', 'b', 'b', 'a', 'c'], the resulting DataFrame should be:
   a  b  b  a  c
0  1  4  4  1  6
1  3  2  2  3  4
2  4  1  1  4  5
I've been trying to figure out a fast way of performing this operation because I'm dealing with extremely large DataFrames, and I don't think looping is a reasonable option.
Answered by EdChum
You can just use the list to select them:
In [44]:
cols = ['a', 'b', 'b', 'a', 'c']
df[cols]
Out[44]:
   a  b  b  a  c
0  1  4  4  1  6
1  3  2  2  3  4
2  4  1  1  4  5
[3 rows x 5 columns]
So there is no need for a loop: once you have created your DataFrame df, indexing it with a list of column names selects those columns and creates the df you want.
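For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch. It assumes the three-column frame from the question; the variable name `result` is only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({"a": [1, 3, 4], "b": [4, 2, 1], "c": [6, 4, 5]})

# Labels may repeat; each one pulls the matching column from df, in list order.
cols = ["a", "b", "b", "a", "c"]
result = df[cols]

print(result)
```

The selection returns a new DataFrame with one column per entry in `cols`, so no explicit Python loop over the labels is needed.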
Answered by Tomás Pica
You can do that directly:
>>> df
   a  b  c
0  1  4  6
1  3  2  4
2  4  1  5
>>> column_names
['a', 'b', 'b', 'a', 'c']
>>> df[column_names]
   a  b  b  a  c
0  1  4  4  1  6
1  3  2  2  3  4
2  4  1  1  4  5
[3 rows x 5 columns]
Answered by Zero
From 0.17 onwards you can use reindex like this:
In [795]: cols = ['a', 'b', 'b', 'a', 'c']
In [796]: df.reindex(columns=cols)
Out[796]:
   a  b  b  a  c
0  1  4  4  1  6
1  3  2  2  3  4
2  4  1  1  4  5
Note: Ideally, you don't want to have duplicate column names.
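To make the note concrete, here is a small sketch, again assuming the frame from the question. It checks that reindex gives the same values as plain list selection, then shows one consequence of duplicate names: a single label selects several columns. The positional renaming at the end is just one hypothetical way to make the names unique, not something the answer prescribes.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 3, 4], "b": [4, 2, 1], "c": [6, 4, 5]})
cols = ["a", "b", "b", "a", "c"]

# reindex yields the same values as df[cols] when every label exists in df.
reindexed = df.reindex(columns=cols)
same = bool((reindexed.to_numpy() == df[cols].to_numpy()).all())

# With duplicated names, a single label now matches two columns,
# so the selection is a DataFrame rather than a Series.
picked = reindexed["a"]
is_frame = isinstance(picked, pd.DataFrame)

# Hypothetical workaround: suffix each name with its position to make it unique.
unique = reindexed.copy()
unique.columns = [f"{name}_{i}" for i, name in enumerate(cols)]

print(same, is_frame, list(unique.columns))
```

Once the names are unique, a lookup such as `unique["a_0"]` returns a single Series again.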

