javascript 将大数据查询(60k+ 行)导出到 Excel

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/11370672/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-26 12:56:36  来源:igfitidea点击:

Export a large data query (60k+ rows) to Excel

c#javascriptasp.netexcelgridview

提问by RJB

I created a reporting tool as part of an internal web application. The report displays all results in a GridView, and I used JavaScript to read the contents of the GridView row-by-row into an Excel object. The JavaScript goes on to create a PivotTable on a different worksheet.

我创建了一个报告工具作为内部 Web 应用程序的一部分。该报告将所有结果显示在一个 GridView 中,我使用 JavaScript 将 GridView 的内容逐行读取到 Excel 对象中。JavaScript 继续在不同的工作表上创建数据透视表。

Unfortunately I didn't expect that the size of the GridView would cause overloading problems with the browser if more than a few days are returned. The application has a few thousand records per day, let's say 60k per month, and ideally I'd like to be able to return all results for up to a year. The number of rows is causing the browser to hang or crash.

不幸的是,我没想到如果返回超过几天,GridView 的大小会导致浏览器过载问题。该应用程序每天有几千条记录,假设每月有 6 万条记录,理想情况下我希望能够返回长达一年的所有结果。行数导致浏览器挂起或崩溃。

We're using ASP.NET 3.5 on Visual Studio 2010 with SQL Server and the expected browser is IE8. The report consists of a gridview that gets data from one out of a handful of stored procedures depending on which population the user chooses. The gridview is in an UpdatePanel:

我们在带有 SQL Server 的 Visual Studio 2010 上使用 ASP.NET 3.5,预期的浏览器是 IE8。该报告由一个 gridview 组成,它根据用户选择的人群从少数存储过程中的一个获取数据。网格视图位于 UpdatePanel 中:

<asp:UpdatePanel ID="update_ResultSet" runat="server">
<Triggers>
    <asp:AsyncPostBackTrigger ControlID="btn_Submit" />
</Triggers>
<ContentTemplate>
<asp:Panel ID="pnl_ResultSet" runat="server" Visible="False">
    <div runat="server" id="div_ResultSummary">
        <p>This Summary Section is Automatically Completed from Code-Behind</p>
    </div>
        <asp:GridView ID="gv_Results" runat="server" 
            HeaderStyle-BackColor="LightSkyBlue" 
            AlternatingRowStyle-BackColor="LightCyan"  
            Width="100%">
        </asp:GridView>
    </div>
</asp:Panel>
</ContentTemplate>
</asp:UpdatePanel>

I was relatively new to my team, so I followed their typical practice of returning the sproc to a DataTable and using that as the DataSource in the code behind:

我对我的团队来说相对较新,所以我遵循了他们将 sproc 返回到 DataTable 并将其用作后面代码中的 DataSource 的典型做法:

    List<USP_Report_AreaResult> areaResults = new List<USP_Report_AreaResult>();
    areaResults = db.USP_Report_Area(ddl_Line.Text, ddl_Unit.Text, ddl_Status.Text, ddl_Type.Text, ddl_Subject.Text, minDate, maxDate).ToList();
    dtResults = Common.LINQToDataTable(areaResults);

    if (dtResults.Rows.Count > 0)
    {
        PopulateSummary(ref dtResults);
        gv_Results.DataSource = dtResults;
        gv_Results.DataBind();

(I know what you're thinking! But yes, I have learned much more about parameterization since then.)

(我知道你在想什么!但是是的,从那以后我学到了更多关于参数化的知识。)

The LINQToDataTable function isn't anything special, just converts a list to a datatable.

LINQToDataTable 函数没什么特别的,只是将列表转换为数据表。

With a few thousand records (up to a few days), this works fine. The GridView displays the results, and there's a button for the user to click which launches the JScript exporter. The external JavaScript function reads each row into an Excel sheet, and then uses that to create a PivotTable. The PivotTable is important!

有几千条记录(最多几天),这很好用。GridView 显示结果,并且有一个按钮供用户单击以启动 JScript 导出器。外部 JavaScript 函数将每一行读入 Excel 工作表,然后使用它来创建数据透视表。数据透视表很重要!

function exportToExcel(sMyGridViewName, sTitleOfReport, sHiddenCols) {
//sMyGridViewName = the name of the grid view, supplied as a text
//sTitleOfReport = Will be used as the page header if the spreadsheet is printed
//sHiddenCols = The columns you want hidden when sent to Excel, separated by semicolon (i.e. 1;3;5).
//              Supply an empty string if all columns are visible.

var oMyGridView = document.getElementById(sMyGridViewName);

//If no data is on the GridView, display alert.
if (oMyGridView == null)
    alert('No data for report');
else {
    var oHid = sHiddenCols.split(";");  //Contains an array of columns to hide, based on the sHiddenCols function parameter
    var oExcel = new ActiveXObject("Excel.Application");
    var oBook = oExcel.Workbooks.Add;
    var oSheet = oBook.Worksheets(1);
    var iRow = 0;
    for (var y = 0; y < oMyGridView.rows.length; y++)
    //Export all non-hidden rows of the HTML table to excel.
    {
        if (oMyGridView.rows[y].style.display == '') {
            var iCol = 0;
            for (var x = 0; x < oMyGridView.rows(y).cells.length; x++) {
                var bHid = false;
                for (iHidCol = 0; iHidCol < oHid.length; iHidCol++) {
                    if (oHid[iHidCol].length !=0 && oHid[iHidCol] == x) {
                        bHid = true;
                        break; 
                    } 
                }
                if (!bHid) {
                    oSheet.Cells(iRow + 1, iCol + 1) = oMyGridView.rows(y).cells(x).innerText;
                    iCol++;
                }
            }
            iRow++;
        }
    }

What I'm trying to do:Create a solution (probably client-side) that can handle this data and process it into Excel. Someone might suggest using the HtmlTextWriter, but afaik that doesn't allow for automatically generating a PivotTable and creates an obnoxious pop-up warning....

我想要做的是:创建一个解决方案(可能是客户端),可以处理这些数据并将其处理成 Excel。有人可能会建议使用HtmlTextWriter,但 afaik 不允许自动生成数据透视表并创建一个令人讨厌的弹出警告......

What I've tried:

我试过的:

  • Populating a JSON object -- I still think this has potential but I haven't found a way of making it work.
  • Using a SQLDataSource -- I can't seem to use it to get any data back out.
  • Paginating and looping through the pages -- Mixed progress. Generally ugly though, and I still have the problem that the entire dataset is queried and returned for each page displayed.
  • 填充一个 JSON 对象——我仍然认为这有潜力,但我还没有找到让它工作的方法。
  • 使用 SQLDataSource——我似乎无法使用它来获取任何数据。
  • 分页和循环页面 - 混合进度。虽然通常很难看,但我仍然遇到问题,即为显示的每个页面查询和返回整个数据集。

Update:I'm still very open to alternate solutions, but I've been pursuing the JSON theory. I have a working server-side method that generates the JSON object from a DataTable. I can't figure out how to pass that JSON into the (external) exportToExcel JavaScript function....

更新:我仍然对替代解决方案持开放态度,但我一直在追求 JSON 理论。我有一个有效的服务器端方法,可以从 DataTable 生成 JSON 对象。我不知道如何将该 JSON 传递到(外部)exportToExcel JavaScript 函数中....

    protected static string ConstructReportJSON(ref DataTable dtResults)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append("var sJSON = [");
        for (int r = 0; r < dtResults.Rows.Count; r++)
        {
            sb.Append("{");
            for (int c = 0; c < dtResults.Columns.Count; c++)
            {
                sb.AppendFormat("\"{0}\":\"{1}\",", dtResults.Columns[c].ColumnName, dtResults.Rows[r][c].ToString());
            }
            sb.Remove(sb.Length - 1, 1); //Truncate the trailing comma
            sb.Append("},");
        }
        sb.Remove(sb.Length - 1, 1);
        sb.Append("];");
        return sb.ToString();
    }

Can anybody show an example of how to carry this JSON object into an external JS function? Or any other solution for the export to Excel.

任何人都可以展示如何将此 JSON 对象携带到外部 JS 函数中的示例吗?或任何其他导出到 Excel 的解决方案。

回答by Greg

It's easy and efficient to write CSV files. However, if you need Excel, it can also be donein a reasonably efficient way, that can handle 60,000+ rows by using the Microsoft Open XML SDK's open XML Writer.

编写 CSV 文件既简单又高效。但是,如果您需要 Excel,也可以通过使用 Microsoft Open XML SDK 的 open XML Writer以一种相当有效的方式来完成,即可以处理 60,000 多行。

  1. Install Microsoft Open SDK if you don't have it already (google "download microsoft open xml sdk")
  2. Create a Console App
  3. Add Reference to DocumentFormat.OpenXml
  4. Add Reference to WindowsBase
  5. Try running some test code like below (will need a few using's)
  1. 如果您还没有安装 Microsoft Open SDK,请安装它(谷歌“下载 microsoft open xml sdk”)
  2. 创建控制台应用程序
  3. 添加对 DocumentFormat.OpenXml 的引用
  4. 添加对 WindowsBase 的引用
  5. 尝试运行一些像下面这样的测试代码(需要一些使用)

Just Check out Vincent Tan's solution at http://polymathprogrammer.com/2012/08/06/how-to-properly-use-openxmlwriter-to-write-large-excel-files/( Below, I cleaned up his example slightly to help new users. )

只需在http://polymathprogrammer.com/2012/08/06/how-to-properly-use-openxmlwriter-to-write-large-excel-files/ 上查看 Vincent Tan 的解决方案 (下面,我稍微清理了他的示例帮助新用户。)

In my own use I found this pretty straight forward with regular data, but I did have to strip out "\0" characters from my real data.

在我自己的使用中,我发现使用常规数据非常简单,但我确实必须从我的真实数据中去除“\0”字符。

using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

...

...

        using (var workbook = SpreadsheetDocument.Create("SomeLargeFile.xlsx", SpreadsheetDocumentType.Workbook))
        {
            List<OpenXmlAttribute> attributeList;
            OpenXmlWriter writer;

            workbook.AddWorkbookPart();
            WorksheetPart workSheetPart = workbook.WorkbookPart.AddNewPart<WorksheetPart>();

            writer = OpenXmlWriter.Create(workSheetPart);
            writer.WriteStartElement(new Worksheet());
            writer.WriteStartElement(new SheetData());

            for (int i = 1; i <= 50000; ++i)
            {
                attributeList = new List<OpenXmlAttribute>();
                // this is the row index
                attributeList.Add(new OpenXmlAttribute("r", null, i.ToString()));

                writer.WriteStartElement(new Row(), attributeList);

                for (int j = 1; j <= 100; ++j)
                {
                    attributeList = new List<OpenXmlAttribute>();
                    // this is the data type ("t"), with CellValues.String ("str")
                    attributeList.Add(new OpenXmlAttribute("t", null, "str"));

                    // it's suggested you also have the cell reference, but
                    // you'll have to calculate the correct cell reference yourself.
                    // Here's an example:
                    //attributeList.Add(new OpenXmlAttribute("r", null, "A1"));

                    writer.WriteStartElement(new Cell(), attributeList);

                    writer.WriteElement(new CellValue(string.Format("R{0}C{1}", i, j)));

                    // this is for Cell
                    writer.WriteEndElement();
                }

                // this is for Row
                writer.WriteEndElement();
            }

            // this is for SheetData
            writer.WriteEndElement();
            // this is for Worksheet
            writer.WriteEndElement();
            writer.Close();

            writer = OpenXmlWriter.Create(workbook.WorkbookPart);
            writer.WriteStartElement(new Workbook());
            writer.WriteStartElement(new Sheets());

            // you can use object initialisers like this only when the properties
            // are actual properties. SDK classes sometimes have property-like properties
            // but are actually classes. For example, the Cell class has the CellValue
            // "property" but is actually a child class internally.
            // If the properties correspond to actual XML attributes, then you're fine.
            writer.WriteElement(new Sheet()
            {
                Name = "Sheet1",
                SheetId = 1,
                Id = workbook.WorkbookPart.GetIdOfPart(workSheetPart)
            });

            writer.WriteEndElement(); // Write end for WorkSheet Element
            writer.WriteEndElement(); // Write end for WorkBook Element
            writer.Close();

            workbook.Close();
        }

If you review that code you'll notice two major writes, first the Sheet, and then later the workbook that contains the sheet. The workbook part is the boring part at the end, the earlier sheet part contains all the rows and columns.

如果您查看该代码,您会注意到两个主要写入,首先是工作表,然后是包含工作表的工作簿。工作簿部分是最后的无聊部分,较早的工作表部分包含所有行和列。

In your own adaptation, you could write real string values into the cells from your own data. Instead, above, we're just using the row and column numbering.

在您自己的改编中,您可以将真实的字符串值从您自己的数据写入单元格。相反,在上面,我们只是使用行和列编号。

writer.WriteElement(new CellValue("SomeValue"));

Worth noting, the row numbering in Excel starts at 1 and not 0. Starting rows numbered from an index of zero will lead to "Corrupt file" error messages.

值得注意的是,Excel 中的行编号从 1 开始而不是 0。从索引为零开始编号的行将导致“文件损坏”错误消息。

Lastly, if you're working with very large sets of data, never call ToList(). Use a data readerstyle methodology of streaming the data. For example, you could have an IQueryableand utilize it in a for each. You never really want to have to rely on having all the data in memory at the same time, or you'll hit an out of memory limitation and/or high memory utilization.

最后,如果您正在处理非常大的数据集,切勿调用 ToList()。使用数据阅读器风格的数据流方法。例如,您可以拥有一个IQueryable并在for each 中使用它。您永远不想依赖同时将所有数据保存在内存中,否则会遇到内存不足限制和/或内存利用率高的问题。

回答by Kevin Crowell

I would try to use displaytagto display the results. You could set it up display a certain number per page, which should solve your overloading issue. Then, you can set displaytag to allow for an Excel export.

我会尝试使用displaytag来显示结果。您可以将其设置为每页显示特定数量,这应该可以解决您的超载问题。然后,您可以设置 displaytag 以允许 Excel 导出。

回答by mellodev

We typically handle this with an "Export" command button which is wired up to a server side method to grab the dataset and convert it to CSV. Then we adjust the response headers and the browser will treat it as a download. I know this is a server side solution, but you may want to consider it since you'll continue having timeout and browser issues until you implement server side record paging.

我们通常使用“导出”命令按钮处理此问题,该按钮连接到服务器端方法以获取数据集并将其转换为 CSV。然后我们调整响应头,浏览器会将其视为下载。我知道这是一个服务器端解决方案,但您可能需要考虑它,因为在您实现服务器端记录分页之前,您将继续遇到超时和浏览器问题。

回答by RJB

Almost a week and a half since I began this problem, I've finally managed to get it all working to some extent. I'll wait temporarily from marking an answer to see if anybody else has a more efficient, better 'best practices' method.

自从我开始解决这个问题差不多一个半星期以来,我终于设法在某种程度上让它一切正常。我会暂时等待标记答案,看看其他人是否有更有效、更好的“最佳实践”方法。

By generating a JSON string, I've divorced the JavaScript from the GridView. The JSON is generated in code behind when the data is populated:

通过生成 JSON 字符串,我将 JavaScript 与 GridView 分离。填充数据时,会在后面的代码中生成 JSON:

    protected static string ConstructReportJSON(ref DataTable dtResults)
    {
        StringBuilder sb = new StringBuilder();
        for (int r = 0; r < dtResults.Rows.Count; r++)
        {
            sb.Append("{");
            for (int c = 0; c < dtResults.Columns.Count; c++)
            {
                sb.AppendFormat("\"{0}\":\"{1}\",", dtResults.Columns[c].ColumnName, dtResults.Rows[r][c].ToString());
            }
            sb.Remove(sb.Length - 1, 1); //Truncate the trailing comma
            sb.Append("},");
        }
        sb.Remove(sb.Length - 1, 1);
        return String.Format("[{0}]", sb.ToString());
    }

Returns a string of data such as

返回一串数据,如

[ {"Caller":"John Doe", "Office":"5555","Type":"Incoming", etc},

[ {"Caller":"John Doe", "Office":"5555","Type":"Incoming", etc},

{"Caller":"Jane Doe", "Office":"7777", "Type":"Outgoing", etc}, {etc} ]

{"Caller":"Jane Doe", "Office":"7777", "Type":"Outgoing", etc}, {etc}]

I've hidden this string by assigning the text to a Literal in the UpdatePanel using:

我通过使用以下命令将文本分配给 UpdatePanel 中的 Literal 来隐藏此字符串:

    <div id="div_JSON" style="display: none;">
            <asp:Literal id="lit_JSON" runat="server" /> 
    </div>

And the JavaScript parses that output by reading the contents of the div:

JavaScript 通过读取 div 的内容来解析输出:

function exportToExcel_Pivot(sMyJSON, sTitleOfReport, sReportPop) {
     //sMyJSON = the name, supplied as a text, of the hidden element that houses the JSON array.
     //sTitleOfReport = Will be used as the page header if the spreadsheet is printed.
     //sReportPop = Determines which business logic to create a pivot table for.

var sJSON = document.getElementById(sMyJSON).innerHTML;
var oJSON = eval("(" + sJSON + ")");

 //    DEBUG Example Test Code
 //    for (x = 0; x < oJSON.length; x++) {
 //        for (y in oJSON[x])
 //            alert(oJSON[x][y]); //DEBUG, returns field value
 //            alert(y); //DEBUG, returns column name
 //    }


//If no data is in the JSON object array, display alert.
if (oJSON == null)
    alert('No data for report');
else {
    var oExcel = new ActiveXObject("Excel.Application");
    var oBook = oExcel.Workbooks.Add;
    var oSheet = oBook.Worksheets(1);
    var oSheet2 = oBook.Worksheets(2);
    var iRow = 0;
    var iCol = 0;

        //Take the column names of the JSON object and prepare them in Excel
        for (header in oJSON[0])
        {
            oSheet.Cells(iRow + 1, iCol + 1) = header;
            iCol++;
        }

        iRow++;

        //Export all rows of the JSON object to excel
        for (var r = 0; r < oJSON.length; r++)
        {
            iCol = 0;
            for (c in oJSON[r]) 
                    {
                        oSheet.Cells(iRow + 1, iCol + 1) = oJSON[r][c];
                        iCol++;
                    } //End column loop
            iRow++;
        } //End row

The string output and the JavaScript 'eval' parsing both work surprisingly fast, but looping through the JSON object is a little slower than I'd like.

字符串输出和 JavaScript 'eval' 解析都出奇地快,但循环遍历 JSON 对象比我想要的要慢一些。

I believe that this method would be limited to around 1 billion characters of data -- maybe less depending how memory testing works out. (I've calculated that I'll probably be looking at a maximum of 1 million characters per day, so that should be fine, within one year of reporting.)

我相信这种方法将被限制为大约 10 亿个字符的数据——可能更少,这取决于内存测试的结果。(我已经计算出我每天最多可能会查看 100 万个字符,因此在报告后的一年内应该没问题。)