pandas: How to append a column from one csv file to a second csv (with different indexes)

Question

Asked by Matt

I am working on concatenating many csv files together and want to take one column, from a multicolumn csv, and append it as a new column in a second csv. The problem is that the columns have different numbers of rows so the new column that I am adding to the existing csv gets cut short once the row index from the existing csv is reached.


I have tried to read in the new column as a second dataframe and then add that dataframe as a new column to the existing csv.


import pandas as pd

df = pd.read_csv("Existing CSV.csv")
df2 = pd.read_csv("New CSV.csv", usecols=['Desired Column'])
df["New CSV"] = df2

"Existing CSV" has 1200 rows of data while "New CSV" has 1500 rows. When I run the code, the "New CSV" column is added to "Existing CSV"; however, only the first 1200 rows of data are included.


Ideally, all 1500 rows from "New CSV" will be included and the 300 rows missing from "Existing CSV" will be left blank.


Answer 1

Answered by Peter Leimbigler

By default, `read_csv` gives the resulting DataFrame an integer index (0, 1, 2, ...), so the two DataFrames can be aligned on that index. I can think of a couple of options to try.
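To see why the original assignment stops at row 1200, here is a minimal sketch on toy data. The 3-row and 5-row frames and the column name `A` are stand-ins invented for illustration, not the asker's files. Assigning a column aligns the new values to the index the DataFrame already has, so any labels beyond it are dropped.

```python
import pandas as pd

# Hypothetical stand-ins for the two files: 3 rows vs. 5 rows.
df = pd.DataFrame({"A": [10, 20, 30]})
df2 = pd.DataFrame({"Desired Column": [1, 2, 3, 4, 5]})

# Column assignment aligns on df's existing index (0, 1, 2).
# Rows 3 and 4 of df2 have no matching label in df, so they are discarded.
truncated = df.copy()
truncated["New CSV"] = df2["Desired Column"]

print(truncated)
print(len(truncated))  # 3
```

Both methods below work by extending the index of `df` first, so that the longer column has somewhere to land.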


Setup


import pandas as pd

df = pd.read_csv("Existing CSV.csv")
df2 = pd.read_csv("New CSV.csv", usecols=['Desired Column'])

Method 1: `join`


df = df.join(df2['Desired Column'], how='right')
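As a rough illustration of the join approach, the sketch below uses the same kind of toy data (the frames and the column name `A` are assumptions for the example). It also shows two optional variations: `how="outer"`, which keeps every row no matter which file is longer, and `Series.rename`, which gives the new column the name `New CSV` used in the question.

```python
import pandas as pd

# Hypothetical stand-ins for the two files: 3 rows vs. 5 rows.
df = pd.DataFrame({"A": [10, 20, 30]})
df2 = pd.DataFrame({"Desired Column": [1, 2, 3, 4, 5]})

# A right join keeps every index label from df2; df's columns are NaN
# for the labels that df does not have.
joined = df.join(df2["Desired Column"], how="right")
print(joined)

# Variation: an outer join keeps the union of both indexes, and rename()
# sets the name the joined column will carry.
renamed = df.join(df2["Desired Column"].rename("New CSV"), how="outer")
print(renamed)
```

The existing column `A` becomes a float column after the join, because NaN is used to mark the missing rows.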

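Method 2: `reindex` and `assign`

`reindex_like(df2)` would conform the columns as well as the index to `df2` and so drop the existing columns of `df`; reindexing on `df2.index` alone keeps them.

df = df.reindex(df2.index).assign(**{'Desired Column': df2['Desired Column']})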
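The sketch below checks this on toy data and then writes the result out, since the question asks for the missing rows to be left blank. The frames, the column name `A`, and the in-memory buffer standing in for an output file are assumptions for the example. It uses `reindex(df2.index)`, which extends only the row index.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the two files: 3 rows vs. 5 rows.
df = pd.DataFrame({"A": [10, 20, 30]})
df2 = pd.DataFrame({"Desired Column": [1, 2, 3, 4, 5]})

# reindex_like conforms every axis, so only df2's columns would survive.
print(list(df.reindex_like(df2).columns))

# Extend df to df2's row index (new rows are NaN), then attach the column.
extended = df.reindex(df2.index).assign(
    **{"Desired Column": df2["Desired Column"]}
)

# to_csv writes NaN as an empty field by default, which leaves the
# cells for the extra rows blank.
buffer = io.StringIO()
extended.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, pass a file path to `to_csv` in place of the buffer.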
