pandas: Create an empty csv file with pandas

Question

Asked by PaulBarr

I am iterating through a number of csv files and want to append the mean temperatures to a blank csv file. How do you create an empty csv file with pandas?

for EachMonth in MonthsInAnalysis:
    TheCurrentMonth = pd.read_csv('MonthlyDataSplit/Day/Day%s.csv' % EachMonth)
    MeanDailyTemperaturesForCurrentMonth = TheCurrentMonth.groupby('Day')['AirTemperature'].mean().reset_index(name='MeanDailyAirTemperature')
    with open('my_csv.csv', 'a') as f:
        # Append this month's daily means without repeating the header
        MeanDailyTemperaturesForCurrentMonth.to_csv(f, header=False)

So in the above code, how do I create my_csv.csv prior to the for loop?

Just a note: I know you can create a data frame and then save the data frame to csv, but I am interested in whether you can skip this step.

In terms of context I have the following csv files:

Each of them has the following structure:

The Day column reads up to 30 days for each file.

I would like to output a csv file that looks like this:

But obviously including all the days for all the months.

My issue is that I don't know in advance which months are included in each analysis. Hence I wanted a for loop that uses a list holding that information to access the relevant csv files, calculate the mean temperature, and then save it all into one csv.

Input as text:

    Unnamed: 0  AirTemperature  AirHumidity  SoilTemperature  SoilMoisture  LightIntensity  WindSpeed  Year  Month  Day  Hour  Minute  Second  TimeStamp  MonthCategorical  TimeOfDay
 6           6              18           84               17            41              40          4  2016      1    1     6       1       1      10106           January        Day
 7           7              20           88               22            92              31          0  2016      1    1     7       1       1      10107           January        Day
 8           8              23            1               22            59               3          0  2016      1    1     8       1       1      10108           January        Day
 9           9              23            3               22            72              41          4  2016      1    1     9       1       1      10109           January        Day
10          10              24           63               23            83              85          0  2016      1    1    10       1       1      10110           January        Day
11          11              29           73               27            50               1          4  2016      1    1    11       1       1      10111           January        Day

Answer 1

Accepted answer by MaxU

I would do it this way: first read all your CSV files (only the columns that you really need) into one DF, then call groupby(['Year','Month','Day']).mean() and save the resulting DF to a CSV file:

import glob
import pandas as pd

fmask = 'MonthlyDataSplit/Day/Day*.csv'
df = pd.concat((pd.read_csv(f, sep=',', usecols=['Year','Month','Day','AirTemperature']) for f in glob.glob(fmask)))
df.groupby(['Year','Month','Day']).mean().to_csv('my_csv.csv')

And if you want to ignore the year:

import glob
import pandas as pd

fmask = 'MonthlyDataSplit/Day/Day*.csv'
df = pd.concat((pd.read_csv(f, sep=',', usecols=['Month','Day','AirTemperature']) for f in glob.glob(fmask)))
df.groupby(['Month','Day']).mean().to_csv('my_csv.csv')

Some details:

(pd.read_csv(f, sep=',', usecols=['Month','Day','AirTemperature']) for f in glob.glob('*.csv'))

is a generator expression (not a tuple) that lazily yields one data frame for each of your CSV files

pd.concat(...)

will concatenate them into a single resulting DF

df.groupby(['Year','Month','Day']).mean()

will produce the wanted report as a data frame, which can then be saved to a new CSV file:

.to_csv('my_csv.csv')
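
The steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the temporary folder, the two tiny monthly files, and their values are made up so the example runs on its own, and the variable names report and out_path are not from the original answer. Note that no empty file has to be created beforehand, because to_csv creates the output file itself.

```python
import glob
import os
import tempfile

import pandas as pd

# Made-up sample data: two small monthly files in a temporary folder.
tmp_dir = tempfile.mkdtemp()
samples = {
    "Day1.csv": pd.DataFrame({"Year": [2016, 2016, 2016], "Month": [1, 1, 1],
                              "Day": [1, 1, 2], "AirTemperature": [18, 20, 23]}),
    "Day2.csv": pd.DataFrame({"Year": [2016, 2016], "Month": [2, 2],
                              "Day": [1, 1], "AirTemperature": [10, 14]}),
}
for name, frame in samples.items():
    frame.to_csv(os.path.join(tmp_dir, name), index=False)

# Read only the needed columns from every matching file and stack the rows.
fmask = os.path.join(tmp_dir, "Day*.csv")
cols = ["Year", "Month", "Day", "AirTemperature"]
df = pd.concat(pd.read_csv(f, usecols=cols) for f in sorted(glob.glob(fmask)))

# One mean per (Year, Month, Day); to_csv creates the output file.
report = df.groupby(["Year", "Month", "Day"]).mean()
out_path = os.path.join(tmp_dir, "my_csv.csv")
report.to_csv(out_path)

print(pd.read_csv(out_path))
```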

Answer 2

Answer by Stop harming Monica

Just open the file in write mode to create it.

with open('my_csv.csv', 'w'):
    pass
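
One thing to keep in mind is that opening a file in 'w' mode also empties it if it already exists. As a hedged side note, the sketch below contrasts that with pathlib.Path.touch() from the standard library, which creates a missing file but leaves existing content alone. The temporary path and the header text are assumptions made up for the example.

```python
import tempfile
from pathlib import Path

# Hypothetical location for the output file.
path = Path(tempfile.mkdtemp()) / "my_csv.csv"

# touch() creates the file if it is missing.
path.touch()
size_after_touch = path.stat().st_size  # the new file is empty

# touch() on an existing file keeps its content.
path.write_text("Day,MeanDailyAirTemperature\n")
path.touch()
kept = path.read_text()

# Opening in 'w' mode creates the file too, but truncates what was in it.
with open(path, "w"):
    pass
size_after_w = path.stat().st_size

print(size_after_touch, repr(kept), size_after_w)
```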

Anyway, I do not think you should be opening and closing the file so many times. It is better to open the file once and write to it several times.

with open('my_csv.csv', 'w') as f:
    for EachMonth in MonthsInAnalysis:
        TheCurrentMonth = pd.read_csv('MonthlyDataSplit/Day/Day%s.csv' % EachMonth)
        MeanDailyTemperaturesForCurrentMonth = TheCurrentMonth.groupby('Day')['AirTemperature'].mean().reset_index(name='MeanDailyAirTemperature')
        # Write this month's daily means to the already open file
        MeanDailyTemperaturesForCurrentMonth.to_csv(f, header=False)
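
A self-contained variant of the open-once idea might look like the sketch below. It is an assumption-laden illustration rather than the answer's exact code: the monthly_means list stands in for the per-month results computed in the loop, the values are made up, and writing the header only for the first frame plus index=False are choices added here so the file reads back cleanly.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-ins for the per-month results computed in the loop.
monthly_means = [
    pd.DataFrame({"Day": [1, 2], "MeanDailyAirTemperature": [19.0, 23.0]}),
    pd.DataFrame({"Day": [1, 2], "MeanDailyAirTemperature": [12.0, 15.5]}),
]

out_path = os.path.join(tempfile.mkdtemp(), "my_csv.csv")

# Open the file once; newline='' avoids extra blank lines on Windows.
with open(out_path, "w", newline="") as f:
    for i, month_df in enumerate(monthly_means):
        # Only the first frame writes the column names.
        month_df.to_csv(f, header=(i == 0), index=False)

print(pd.read_csv(out_path))
```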

Answer 3

Answer by Shinto Joseph

Creating a blank csv file is as simple as this:

import pandas as pd

pd.DataFrame({}).to_csv("filename.csv")
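
A file written from a completely empty data frame is not necessarily zero bytes, since pandas may still emit a line for the unnamed index. If the goal is a file that later rows can be appended to, one possible variation is to write only a header row first. The sketch below assumes the column names Day and MeanDailyAirTemperature from the question and uses a made-up temporary path and a made-up row.

```python
import os
import tempfile

import pandas as pd

out_path = os.path.join(tempfile.mkdtemp(), "my_csv.csv")

# A frame with column names but no rows writes just the header line.
columns = ["Day", "MeanDailyAirTemperature"]
pd.DataFrame(columns=columns).to_csv(out_path, index=False)

# Later rows can be appended without repeating the header.
new_rows = pd.DataFrame({"Day": [1], "MeanDailyAirTemperature": [19.0]})
new_rows.to_csv(out_path, mode="a", header=False, index=False)

result = pd.read_csv(out_path)
print(result)
```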

Answer 4

Answer by Chris

The problem is a little unclear, but assuming you have to iterate month by month and apply the groupby as stated, just use:

 #Before loops
 dflist=[]

Then in each loop do something like:

 dflist.append(MeanDailyTemperaturesForCurrentMonth)

Then at the end:

 # Pass the list itself; rows are stacked by default (axis=0)
 final_df = pd.concat(dflist, ignore_index=True)

and this will join everything into one dataframe.
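
Put together, the collect-then-concatenate pattern could be sketched like this. The monthly dictionary and its values are hypothetical stand-ins for the MeanDailyTemperaturesForCurrentMonth frames, and adding a Month column is an assumption made here so rows from different months stay distinguishable. The sketch passes the list itself to pd.concat and stacks the rows.

```python
import pandas as pd

# Hypothetical per-month results, keyed by month number.
monthly = {
    1: pd.DataFrame({"Day": [1, 2], "MeanDailyAirTemperature": [19.0, 23.0]}),
    2: pd.DataFrame({"Day": [1, 2], "MeanDailyAirTemperature": [12.0, 15.5]}),
}

# Before the loop
dflist = []

# In each iteration, keep the month's result instead of writing it out.
for month, month_df in monthly.items():
    dflist.append(month_df.assign(Month=month))

# At the end, stack all rows into one data frame with a fresh index.
final_df = pd.concat(dflist, ignore_index=True)
print(final_df)
```

From here a single final_df.to_csv('my_csv.csv', index=False) call writes the combined result, so no empty file is needed up front.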

Look at:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html

http://pandas.pydata.org/pandas-docs/stable/merging.html
