Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/51296758/

Date: 2020-09-14 05:47:56  Source: igfitidea

How to write a pandas dataframe to CSV file line by line, one line at a time?

python pandas read-write writetofile

Asked by Kristada673

I have a list of about 1 million addresses, and a function to find their latitudes and longitudes. Since some of the records are improperly formatted (or for whatever other reason), the function sometimes cannot return the latitude and longitude of an address. This would lead to the for loop breaking. So, for each address whose latitude and longitude are successfully retrieved, I want to write it to the output CSV file. Or, instead of writing line by line, writing in small chunks would also work. For this, I am using df.to_csv in "append" mode (mode='a') as shown below:

for i in range(len(df)):
    place = df['ADDRESS'][i]
    try:
        lat, lon, res = gmaps_geoencoder(place)
    except:
        pass

    df['Lat'][i] = lat
    df['Lon'][i] = lon
    df['Result'][i] = res

    df.to_csv(output_csv_file,
          index=False,
          header=False,
          mode='a', #append data to csv file
          chunksize=chunksize) #size of data to append for each loop

But the problem with this is that it writes the whole dataframe on each append. So, for n lines, it would write the whole dataframe n times (n^2 rows in total). How to fix this?
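
To see the duplication concretely, here is a minimal sketch that reproduces the pattern on a toy frame. The three-row frame, the temporary file path, and the name rows_written are assumptions made for illustration; the geocoding step is left out because it does not affect how many rows are written.

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for the real address table (illustrative data only).
df = pd.DataFrame({"ADDRESS": ["a", "b", "c"], "Lat": [0.0, 0.0, 0.0]})

# Fresh temporary directory, so the output file does not exist yet.
out_path = os.path.join(tempfile.mkdtemp(), "out.csv")

# Same pattern as the loop in the question: the whole frame is appended on every pass.
for i in range(len(df)):
    df.to_csv(out_path, index=False, header=False, mode="a")

# Count the lines that ended up in the file.
with open(out_path) as fh:
    rows_written = sum(1 for _ in fh)

print(rows_written)  # 3 rows per pass x 3 passes = 9
```

Each pass appends all len(df) rows, so the file grows quadratically with the number of addresses.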

Accepted answer by Robert Altena

If you really want to write line by line (you should not):

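for i in range(len(df)):
    # df.loc[[i]] selects row i as a one-row DataFrame (assumes the default 0..n-1 index)
    df.loc[[i]].to_csv(output_csv_file,
        index=False,
        header=False,
        mode='a')  # append only this row to the file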