Notice: this page is based on a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/44774829/

Time: 2020-09-14 03:53:17  Source: igfitidea

python dataframe appending columns horizontally

Tags: python, pandas, dataframe, append, concat

Asked by Bong Kyo Seo

I am trying to make a simple script that concatenates or appends multiple column sets that I pull from xls files within a directory. Each xls file has a format of:

Index    Exp. m/z   Intensity   
1        1000.11    1000
2        2000.14    2000
3        3000.15    3000

Each file has a varying number of indices (rows). Below is my code:

import pandas as pd
import os
import tkinter.filedialog

full_path = tkinter.filedialog.askdirectory(initialdir='.')
os.chdir(full_path)

data = {}
df = pd.DataFrame()

for files in os.listdir(full_path):
    if os.path.isfile(os.path.join(full_path, files)):
        df = pd.read_excel(files, 'Sheet1')[['Exp. m/z', 'Intensity']]
        data = df.concat(df, axis=1)

data.to_excel('test.xls', index=False)

This raises AttributeError: 'DataFrame' object has no attribute 'concat'. I also tried using append, like:

data = df.append(df, axis=1) 

but I know that append has no axis keyword argument. df.append(df) does work, but it places the columns at the bottom. I want something like:

Exp. m/z   Intensity       Exp. m/z   Intensity  
1000.11    1000            1001.43    1000
2000.14    2000            1011.45    2000
3000.15    3000

and so on. So the column sets that I pull from each file should be placed to the right of the previous column sets, with a column space in between.

Answered by jezrael

I think you need to append the DataFrames to a list and then use pd.concat:

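import numpy as np  # needed for np.nan below

# pd, os and full_path come from the code in the question
dfs = []
for files in os.listdir(full_path):
    if os.path.isfile(os.path.join(full_path, files)):
        df = pd.read_excel(files, sheet_name='Sheet1')[['Exp. m/z', 'Intensity']]
        # add an empty column as a spacer after each column set
        df['empty'] = np.nan
        dfs.append(df)
# axis=1 places each DataFrame to the right of the previous one
data = pd.concat(dfs, axis=1)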
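
To see what the approach above produces without any xls files, here is a minimal sketch that uses two small in-memory DataFrames. The names first and second and their values are assumptions made for illustration, standing in for the column sets read from each file. It also shows why df.concat fails: concat is a function in the pandas namespace, not a DataFrame method.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the column sets read from two xls files.
first = pd.DataFrame({'Exp. m/z': [1000.11, 2000.14, 3000.15],
                      'Intensity': [1000, 2000, 3000]})
second = pd.DataFrame({'Exp. m/z': [1001.43, 1011.45],
                       'Intensity': [1000, 2000]})

dfs = []
for df in (first, second):
    df = df.copy()            # leave the input frames untouched
    df['empty'] = np.nan      # spacer column after each column set
    dfs.append(df)

# concat lives in the pandas namespace; DataFrame has no such method.
print(hasattr(pd.DataFrame, 'concat'))

# axis=1 aligns the frames on their row index and places them side by side.
data = pd.concat(dfs, axis=1)
print(data.shape)
print(data)
```

Because pd.concat aligns on the row index with an outer join by default, the shorter frame is padded with NaN in the rows it lacks, which matches the ragged layout asked for in the question. Note that the spacer column keeps the header 'empty' if the result is written out as is.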