Django Pandas 到 http 响应(下载文件)
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 
原文地址: http://stackoverflow.com/questions/35267585/
Warning: these are provided under cc-by-sa 4.0 license.  You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Django Pandas to http response (download file)
提问by Adrian Z.
Python: 2.7.11
蟒蛇:2.7.11
Django: 1.9
姜戈:1.9
Pandas: 0.17.1
Pandas:0.17.1
How should I go about creating a potentially large xlsx file download? I'm creating a xlsx file with pandas from a list of dictionaries and now need to give the user possibility to download it. The list is in a variable and is not allowed to be saved locally (on server).
我应该如何创建一个潜在的大型 xlsx 文件下载?我正在使用字典列表中的Pandas创建一个 xlsx 文件,现在需要让用户可以下载它。该列表在一个变量中,不允许保存在本地(在服务器上)。
Example:
例子:
df = pandas.DataFrame(self.csvdict)
writer = pandas.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()
This example would just create the file and save it where the executing script is located. What I need is to create it to a http response so that the user would get a download prompt.
此示例将创建文件并将其保存在执行脚本所在的位置。我需要的是将其创建为 http 响应,以便用户获得下载提示。
I have found a few posts about doing this for a xlsxwriter but non for pandas. I also think that I should be using 'StreamingHttpResponse' for this and not a 'HttpResponse'.
我找到了一些关于为 xlsxwriter 执行此操作的帖子,但不为大Pandas执行此操作。我还认为我应该为此使用“StreamingHttpResponse”而不是“HttpResponse”。
采纳答案by MartinH
Jmcnamara is pointing you in the rigth direction. Translated to your question you are looking for the following code:
Jmcnamara 正在为您指明正确的方向。翻译您的问题,您正在寻找以下代码:
sio = StringIO()
PandasDataFrame = pandas.DataFrame(self.csvdict)
PandasWriter = pandas.ExcelWriter(sio, engine='xlsxwriter')
PandasDataFrame.to_excel(PandasWriter, sheet_name=sheetname)
PandasWriter.save()
sio.seek(0)
workbook = sio.getvalue()
response = StreamingHttpResponse(workbook, content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
response['Content-Disposition'] = 'attachment; filename=%s' % filename
Notice the fact that you are saving the data to the StringIO variable and not to a file location. This way you prevent the file being saved before you generate the response.
请注意,您将数据保存到 StringIO 变量而不是文件位置。这样可以防止在生成响应之前保存文件。
回答by PlacidLush
I will elaborate on what @jmcnamara wrote. This if for the latest versions of Excel, Pandas and Django. The import statements would be at the top of your views.py and the remaining code could be in a view:
我将详细说明@jmcnamara 所写的内容。这适用于最新版本的 Excel、Pandas 和 Django。导入语句将在您的 views.py 的顶部,其余代码可能在一个视图中:
import pandas as pd
from django.http import HttpResponse
try:
    from io import BytesIO as IO # for modern python
except ImportError:
    from io import StringIO as IO # for legacy python
# this is my output data a list of lists
output = some_function()
df_output = pd.DataFrame(output)
# my "Excel" file, which is an in-memory output file (buffer) 
# for the new workbook
excel_file = IO()
xlwriter = pd.ExcelWriter(excel_file, engine='xlsxwriter')
df_output.to_excel(xlwriter, 'sheetname')
xlwriter.save()
xlwriter.close()
# important step, rewind the buffer or when it is read() you'll get nothing
# but an error message when you try to open your zero length file in Excel
excel_file.seek(0)
# set the mime type so that the browser knows what to do with the file
response = HttpResponse(excel_file.read(), content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
# set the file name in the Content-Disposition header
response['Content-Disposition'] = 'attachment; filename=myfile.xlsx'
return response
回答by brunostuyts
Just wanted to share a class-based view approach to this, using elements from the answers above. Just override the getmethod of a Django View. My model has a JSON field which contains the results of dumping a dataframe to JSON with the to_jsonmethod.
只是想分享一种基于类的视图方法,使用上述答案中的元素。只需覆盖getDjango的方法View。我的模型有一个 JSON 字段,其中包含使用该to_json方法将数据帧转储到 JSON 的结果。
Python version is 3.6 with Django 1.11.
Python 版本是 3.6,带有 Django 1.11。
# models.py
from django.db import models
from django.contrib.postgres.fields import JSONField
class myModel(models.Model):
    json_field = JSONField(verbose_name="JSON data")
# views.py
import pandas as pd
from io import BytesIO as IO
from django.http import HttpResponse
from django.views import View
from .models import myModel
class ExcelFileDownloadView(View):
    """
    Allows the user to download records in an Excel file
    """
    def get(self, request, *args, **kwargs):
        obj = myModel.objects.get(pk=self.kwargs['pk'])
        excel_file = IO()
        xlwriter = pd.ExcelWriter(excel_file, engine='xlsxwriter')
        pd.read_json(obj.json_field).to_excel(xlwriter, "Summary")
        xlwriter.save()
        xlwriter.close()
        excel_file.seek(0)
        response = HttpResponse(excel_file.read(),
                                content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
        response['Content-Disposition'] = 'attachment; filename="excel_file.xlsx"'
        return response
# urls.py
from django.conf.urls import url
from .views import ExcelFileDownloadView
urlpatterns = [
    url(r'^mymodel/(?P<pk>\d+)/download/$', ExcelFileDownloadView.as_view(), name="excel-download"),]
回答by Zheng Liu
Maybe a bit off-topic, but it's worth pointing out that the to_csvmethod is generally faster than to_excel, since excel contains format information of the sheets. If you only have data and not formatting information, consider to_csv. Microsoft Excel can view and edit csv files with no problem.
也许有点题外话,但值得指出的是,该to_csv方法通常比 快to_excel,因为 excel 包含工作表的格式信息。如果您只有数据而不是格式信息,请考虑to_csv. Microsoft Excel 可以毫无问题地查看和编辑 csv 文件。
One gain by using to_csvis that to_csvfunction can take any file-like object as the first argument, not only a filename string. Since Django response object is file-like, to_csvfunction can directly write to it. Some codes in your view function will look like:
使用的一个好处to_csv是该to_csv函数可以将任何类文件对象作为第一个参数,而不仅仅是文件名字符串。由于 Django 响应对象是类文件,to_csv函数可以直接写入它。视图函数中的一些代码如下所示:
df = <your dataframe to be downloaded>
response = HttpResponse(content_type='text/csv')
response['Content-Disposition'] = 'attachment; filename=<default filename you wanted to give to the downloaded file>'
df.to_csv(response, index=False)
return response
Reference:
参考:
回答by jmcnamara
With Pandas 0.17+ you can use a StringIO/BytesIO object as a filehandle to pd.ExcelWriter. For example:
使用 Pandas 0.17+,您可以使用 StringIO/BytesIO 对象作为pd.ExcelWriter. 例如:
import pandas as pd
import StringIO
output = StringIO.StringIO()
# Use the StringIO object as the filehandle.
writer = pd.ExcelWriter(output, engine='xlsxwriter')
# Write the data frame to the StringIO object.
pd.DataFrame().to_excel(writer, sheet_name='Sheet1')
writer.save()
xlsx_data = output.getvalue()
print len(xlsx_data)
After that follow the XlsxWriter Python 2/3 HTTP examples.
之后,请遵循 XlsxWriter Python 2/3 HTTP示例。
For older versions of Pandas you can use this workaround.
对于旧版本的 Pandas,您可以使用此解决方法。
回答by America
You're mixing two requirements that should be separate:
您混合了两个应该分开的要求:
- Creating a .xlsx file using python or pandas--it looks like you're good on this part. 
- Serving a downloadable file (django); see this postor maybe this one 

