pandas Python 3.x - Create dataframe with column names from another dataframe

Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/37048570/

Time: 2020-09-14 01:10:29  Source: igfitidea

Python 3.x - Create dataframe with column names from another dataframe

Tags: python, pandas, dataframe

Asked by Rohan Bapat

Dataframe df1 contains a column, 'Column headers', that holds column names. I want to create another dataframe, df2, whose columns are exactly the names stored in the 'Column headers' column of df1.

print(df1['Column headers'])
>>
0                                              % Female
1                                  % Below poverty line
2                                    % Rural population
3                      Decadal Population Growth (in %)
4     Availability of Drinking Water Source Within P...
5                                 Concrete Roofs (in %)
6                        Houses With Electricity (in %)
7                        Houses With Televisions (in %)
8                           With Computer/Laptop (in %)
9        Houses With Phones (Telephone + Mobile) (in %)
10                        Houses With 2 wheelers (in %)
11                              Houses With cars (in %)
12              Households With Banking Services (in %)
13                                 Literacy Rate (in %)
14                         Literacy Rate (Rural) (in %)
15                         Literacy Rate (Urban) (in %)
16                  Decadal Difference In Literacy Rate
17                 Student: Teacher Ratio - All Schools
18                     Student: Teacher Ratio - Primary
19               Student: Teacher Ratio - Upper Primary
20     Under-five Mortality Rate (Per 1000 live Births)
21           No of Dispensaries per 1,00,000 population
22                No of Doctors per 1,00,000 population
23    Total patients registered for tuberculosis tre...
24                   Sex Ratio (Females Per 1000 Males)
25                                        Agri GSDP (%)
26                                    Industry GSDP (%)
27                                     Service GSDP (%)
28                        Unemployment Rate   (2011-12)
29                   Rural Unemployment Rate  (2011-12)
30                   Urban Unemployment Rate  (2011-12)
31                Per Capita Public Expenditure (in Rs)
32               Per Capita Private Expenditure (in Rs)
33                          Infant Mortality Rate (IMR)
34                             Maternal Mortality Rate 
35          Coverage Of National Highways (Total in km)
36             Coverage Of State Highways (Total in km)
37                Coverage Of Rural Roads (Total in km)
38                Coverage Of Urban Roads (Total in km)
39                       Railway Coverage (Total in km)
40    Tele-Density [Total Connections /  Total Popul...
Name: Column headers, dtype: object

I want to create a dataframe df2 that has the 41 columns listed above. The rows of this dataframe will be populated by a different function. I tried to create df2 as follows:

df2 = pd.DataFrame()  #Creating an empty dataframe 
df2.columns = df1['Column headers']
>>
ValueError: Length mismatch: Expected axis has 0 elements, new values have 41 elements

Is it possible to create a blank dataframe in Pandas and specify the column names afterwards?

Answered by MaxU

Try this:

df2 = pd.DataFrame(columns=df1['Column headers'])

However, you shouldn't create an empty DataFrame and then fill it row by row, because that is very slow. Collect your data first, then create the DataFrame from the collected data in a single step.
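
To make the "collect first, build once" advice concrete, here is a minimal sketch. The three header names, the `compute_row` helper, and the row count are hypothetical placeholders for illustration only; the real df1 holds 41 names, and the rows would come from the asker's own function.

```python
import pandas as pd

# Hypothetical stand-in for df1; the real frame has 41 names in 'Column headers'.
df1 = pd.DataFrame(
    {"Column headers": ["% Female", "% Rural population", "Literacy Rate (in %)"]}
)
headers = df1["Column headers"].tolist()


def compute_row(i):
    """Hypothetical placeholder for the function that produces one row of values."""
    return {name: float(i) for name in headers}


# Collect every row in a plain Python list first (appending to a list is cheap) ...
rows = [compute_row(i) for i in range(5)]

# ... then build the DataFrame once, taking the column order from df1.
df2 = pd.DataFrame(rows, columns=headers)
print(df2.shape)
```

Passing `columns=headers` keeps the column order from df1 even if the row dictionaries list their keys in a different order.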

Answered by Ivan86

Here is how to create an empty dataframe with custom columns:

import pandas as pd

# Example dataframe
df1 = pd.DataFrame({"Headers": ["Alpha", "Beta", "Gamma", "Delta"]}, columns=["Headers"], index=range(4))

print(df1)
#      Headers
#   0  Alpha
#   1  Beta
#   2  Gamma
#   3  Delta

print(df1['Headers'].values)
# ['Alpha' 'Beta' 'Gamma' 'Delta']

# Make an empty dataframe by passing the names via columns= (index=None is the default)
df2 = pd.DataFrame({}, columns=df1['Headers'].values, index=None)

print(df2)
# Empty DataFrame
# Columns: [Alpha, Beta, Gamma, Delta]
# Index: []