vb.net: Get the last non-empty column and row index from Excel using Interop

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/43910117/

Date: 2020-09-17 20:16:28  Source: igfitidea

Get Last non empty column and row index from excel using Interop

Tags: c#, excel, vb.net, ssis, etl

Asked by Yahfoufi

I am trying to remove all extra blank rows and columns from an Excel file using the Interop library.

I followed this question, "Fastest method to remove Empty rows and Columns From Excel Files using Interop", and I found it helpful.

But I have Excel files that contain a small set of data and a lot of empty rows and columns (from the last non-empty row or column to the end of the worksheet).

I tried looping over the rows and columns, but the loop takes hours.

I am trying to get the last non-empty row and column index so I can delete the whole empty range in one line:

XlWks.Range("...").EntireRow.Delete(xlShiftUp)

[Image from the original question]

Note: I am trying to get the last row containing data so I can remove all extra blanks (after this row or column).

Any suggestions?

Note: The code must be compatible with the SSIS Script Task environment.

Accepted answer by Hadi

Update 1

If your goal is to import the Excel data using C#, and assuming that you have identified the highest used indexes in your worksheet (in the image you posted they are Col = 10, Row = 16), you can convert the maximum used column index to a letter, which gives J16, and select only the used range using an OleDbCommand:

SELECT * FROM [Sheet1$A1:J16]

Otherwise, I don't think it is easy to find a faster method.

You can refer to these articles to convert indexes into column letters and to connect to Excel using OLEDB:
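As a rough sketch of the index-to-letter step described above (not the code from the referenced articles), the helper below converts a 1-based column index to its Excel letters and builds the range query text. The names ExcelRangeHelper, ColumnIndexToLetters and BuildUsedRangeQuery are hypothetical, and string.Format is used instead of string interpolation on the assumption that an older SSIS Script Task compiler may be in use.

```csharp
using System;

public static class ExcelRangeHelper
{
    // Converts a 1-based column index to Excel column letters (1 -> "A", 27 -> "AA").
    // Hypothetical helper, not part of the Interop library.
    public static string ColumnIndexToLetters(int columnIndex)
    {
        if (columnIndex < 1)
            throw new ArgumentOutOfRangeException("columnIndex");

        string letters = string.Empty;
        while (columnIndex > 0)
        {
            // Column letters are a base-26 system without a zero digit,
            // so shift by one before taking the remainder.
            int remainder = (columnIndex - 1) % 26;
            letters = (char)('A' + remainder) + letters;
            columnIndex = (columnIndex - 1) / 26;
        }
        return letters;
    }

    // Builds the query text that selects A1 through the last used cell.
    public static string BuildUsedRangeQuery(string sheetName, int lastRow, int lastColumn)
    {
        return string.Format("SELECT * FROM [{0}$A1:{1}{2}]",
            sheetName, ColumnIndexToLetters(lastColumn), lastRow);
    }
}
```

With the values from the question (Col = 10, Row = 16), BuildUsedRangeQuery("Sheet1", 16, 10) yields the query shown above; the resulting text would then be passed to an OleDbCommand.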



Initial Answer

As you said, you started from the question "Fastest method to remove Empty rows and Columns From Excel Files using Interop".

And you are trying to "get the last row containing data to remove all extra blanks (after this row or column)".

So, assuming that you are working with the accepted answer (provided by @JohnG), you can add a few lines of code to get the last used row and column.

Empty rows are stored in a list of integers, rowsToDelete.

You can use the following code to get the non-empty rows with an index smaller than the last empty row:

List<int> NonEmptyRows = Enumerable.Range(1, rowsToDelete.Max()).ToList().Except(rowsToDelete).ToList();

If NonEmptyRows.Max() < rowsToDelete.Max(), the last non-empty row is NonEmptyRows.Max(). Otherwise it is worksheet.Rows.Count and there are no empty rows after the last used one.
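To make the rule above concrete, here is a small sketch that works on plain lists, with no Excel involved. The class and method names (TrailingEmptyRows, Find) are hypothetical, and the guard for the case where every row is empty is an addition that the answer's code does not have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TrailingEmptyRows
{
    // Returns (first, last) row index of the empty block that follows the
    // last non-empty row, or null when there are no empty rows at all.
    public static Tuple<int, int> Find(List<int> rowsToDelete)
    {
        if (rowsToDelete == null || rowsToDelete.Count == 0)
            return null;

        int lastEmptyRow = rowsToDelete.Max();

        // Same expression as in the answer: all indexes up to the last
        // empty row, minus the empty ones.
        List<int> nonEmptyRows = Enumerable.Range(1, lastEmptyRow)
            .Except(rowsToDelete)
            .ToList();

        // Added guard: Max() throws on an empty list, so treat
        // "every row is empty" as "last non-empty row is 0".
        int lastNonEmptyRow = nonEmptyRows.Count == 0 ? 0 : nonEmptyRows.Max();

        return Tuple.Create(lastNonEmptyRow + 1, lastEmptyRow);
    }
}
```

For example, with rowsToDelete = { 7, 6, 5, 3 } the non-empty rows are 1, 2 and 4, so the trailing block to delete in one call is rows 5 to 7, and row 3 is left for the per-row loop.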

The same thing can be done to get the last non-empty column.

The code is edited in the DeleteCols and DeleteRows functions:

    private static void DeleteRows(List<int> rowsToDelete, Microsoft.Office.Interop.Excel.Worksheet worksheet)
    {
        // the rows are sorted high to low - so index's wont shift

        List<int> NonEmptyRows = Enumerable.Range(1, rowsToDelete.Max()).ToList().Except(rowsToDelete).ToList();

        if (NonEmptyRows.Max() < rowsToDelete.Max())
        {

            // there are empty rows after the last non empty row

            Microsoft.Office.Interop.Excel.Range cell1 = worksheet.Cells[NonEmptyRows.Max() + 1,1];
            Microsoft.Office.Interop.Excel.Range cell2 = worksheet.Cells[rowsToDelete.Max(), 1];

            //Delete all empty rows after the last used row
            worksheet.Range[cell1, cell2].EntireRow.Delete(Microsoft.Office.Interop.Excel.XlDeleteShiftDirection.xlShiftUp);


        }    //else last non empty row = worksheet.Rows.Count



        foreach (int rowIndex in rowsToDelete.Where(x => x < NonEmptyRows.Max()))
        {
            worksheet.Rows[rowIndex].Delete();
        }
    }

    private static void DeleteCols(List<int> colsToDelete, Microsoft.Office.Interop.Excel.Worksheet worksheet)
    {
        // the cols are sorted high to low - so index's wont shift

        //Get non Empty Cols
        List<int> NonEmptyCols = Enumerable.Range(1, colsToDelete.Max()).ToList().Except(colsToDelete).ToList();

        if (NonEmptyCols.Max() < colsToDelete.Max())
        {

            // there are empty columns after the last non-empty column

            Microsoft.Office.Interop.Excel.Range cell1 = worksheet.Cells[1,NonEmptyCols.Max() + 1];
            Microsoft.Office.Interop.Excel.Range cell2 = worksheet.Cells[1,colsToDelete.Max()];

            //Delete all empty columns after the last used column
            worksheet.Range[cell1, cell2].EntireColumn.Delete(Microsoft.Office.Interop.Excel.XlDeleteShiftDirection.xlShiftToLeft);


        }            //else last non empty column = worksheet.Columns.Count

        foreach (int colIndex in colsToDelete.Where(x => x < NonEmptyCols.Max()))
        {
            worksheet.Columns[colIndex].Delete();
        }
    }

Answer by Karen Payne

Several years ago I created an MSDN code sample that permits a developer to get the last used row and column from a worksheet. I modified it and placed all the needed code into a class library with a Windows Forms front end to demo the operation.

Underlying code uses Microsoft.Office.Interop.Excel.

Location on Microsoft OneDrive: https://1drv.ms/u/s!AtGAgKKpqdWjiEGdBzWDCSCZAMaM

Here I get the first sheet in an Excel file, get the last used row and column, and present them as a valid cell address.

Private Sub cmdAddress1_Click(sender As Object, e As EventArgs) Handles cmdAddress1.Click
    Dim ops As New GetExcelColumnLastRowInformation
    Dim info = New UsedInformation
    ExcelInformationData = info.UsedInformation(FileName, ops.GetSheets(FileName))

    Dim SheetName As String = ExcelInformationData.FirstOrDefault.SheetName

    Dim cellAddress = (
        From item In ExcelInformationData
        Where item.SheetName = ExcelInformationData.FirstOrDefault.SheetName
        Select item.LastCell).FirstOrDefault

    MessageBox.Show($"{SheetName} - {cellAddress}")

End Sub

Within the demo project I also get all sheets for an Excel file and present them in a ListBox. Select a sheet name from the list box to get that sheet's last row and column as a valid cell address.

Private Sub cmdAddress_Click(sender As Object, e As EventArgs) Handles cmdAddress.Click
    Dim cellAddress =
        (
            From item In ExcelInformationData
            Where item.SheetName = ListBox1.Text
            Select item.LastCell).FirstOrDefault

    If cellAddress IsNot Nothing Then
        MessageBox.Show($"{ListBox1.Text} {cellAddress}")
    End If

End Sub

At first glance, when opening the solution from the link above, you will note there is a lot of code. The code is optimal and will release all objects immediately.

Answer by dee

  • To get the last non-empty column/row index, the Excel function Find can be used. See GetLastIndexOfNonEmptyCell.
  • Then the Excel worksheet function CountA is used to determine whether the cells are empty, and the entire rows/columns are combined with Union into one rows/columns range.
  • These ranges are finally deleted at once.


public void Yahfoufi(string excelFile)
{
    var exapp = new Microsoft.Office.Interop.Excel.Application {Visible = true};
    var wrb = exapp.Workbooks.Open(excelFile);
    var sh = wrb.Sheets["Sheet1"];
    var lastRow = GetLastIndexOfNonEmptyCell(exapp, sh, XlSearchOrder.xlByRows);
    var lastCol = GetLastIndexOfNonEmptyCell(exapp, sh, XlSearchOrder.xlByColumns);
    var target = sh.Range[sh.Range["A1"], sh.Cells[lastRow, lastCol]];
    Range deleteRows = GetEmptyRows(exapp, target);
    Range deleteColumns = GetEmptyColumns(exapp, target);
    deleteColumns?.Delete();
    deleteRows?.Delete();
}

private static int GetLastIndexOfNonEmptyCell(
    Microsoft.Office.Interop.Excel.Application app,
    Worksheet sheet,
    XlSearchOrder searchOrder)
{
    Range rng = sheet.Cells.Find(
        What: "*",
        After: sheet.Range["A1"],
        LookIn: XlFindLookIn.xlFormulas,
        LookAt: XlLookAt.xlPart,
        SearchOrder: searchOrder,
        SearchDirection: XlSearchDirection.xlPrevious,
        MatchCase: false);
    if (rng == null)
        return 1;
    return searchOrder == XlSearchOrder.xlByRows
        ? rng.Row
        : rng.Column;
}

private static Range GetEmptyRows(
    Microsoft.Office.Interop.Excel.Application app,
    Range target)
{
    Range result = null;
    foreach (Range r in target.Rows)
    {
        if (app.WorksheetFunction.CountA(r.Cells) >= 1)
            continue;
        result = result == null
            ? r.EntireRow
            : app.Union(result, r.EntireRow);
    }
    return result;
}

private static Range GetEmptyColumns(
    Microsoft.Office.Interop.Excel.Application app,
    Range target)
{
    Range result = null;
    foreach (Range c in target.Columns)
    {
        if (app.WorksheetFunction.CountA(c.Cells) >= 1)
            continue;
        result = result == null
            ? c.EntireColumn
            : app.Union(result, c.EntireColumn);
    }
    return result;
}

The two functions for getting empty ranges of rows/columns could be refactored to one function, something like this:

private static Range GetEntireEmptyRowsOrColumns(
    Microsoft.Office.Interop.Excel.Application app,
    Range target,
    Func<Range, Range> rowsOrColumns,
    Func<Range, Range> entireRowOrColumn)
{
    Range result = null;
    foreach (Range c in rowsOrColumns(target))
    {
        if (app.WorksheetFunction.CountA(c.Cells) >= 1)
            continue;
        result = result == null
            ? entireRowOrColumn(c)
            : app.Union(result, entireRowOrColumn(c));
    }
    return result;
}

And then just call it:

Range deleteColumns = GetEntireEmptyRowsOrColumns(exapp, target, (Func<Range, Range>)(r1 => r1.Columns), (Func<Range, Range>)(r2 => r2.EntireColumn));
Range deleteRows = GetEntireEmptyRowsOrColumns(exapp, target, (Func<Range, Range>)(r1 => r1.Rows), (Func<Range, Range>)(r2 => r2.EntireRow));
deleteColumns?.Delete();
deleteRows?.Delete();


Note: for more information, have a look at, for example, this SO question.

Edit

Try simply clearing the contents of all the cells that come after the last used cell.

public void Yahfoufi(string excelFile)
{
    var exapp = new Microsoft.Office.Interop.Excel.Application {Visible = true};
    var wrb = exapp.Workbooks.Open(excelFile);
    var sh = wrb.Sheets["Sheet1"];
    var lastRow = GetLastIndexOfNonEmptyCell(exapp, sh, XlSearchOrder.xlByRows);
    var lastCol = GetLastIndexOfNonEmptyCell(exapp, sh, XlSearchOrder.xlByColumns);

    // Clear the columns to the right of the last used column
    sh.Range[sh.Cells[1, lastCol + 1], sh.Cells[1, sh.Columns.Count]].EntireColumn.Clear();

    // Clear the remaining cells below the last used row
    sh.Range[sh.Cells[lastRow + 1, 1], sh.Cells[sh.Rows.Count, lastCol]].Clear();

}

Answer by Phil

I'm using ClosedXML, which has useful 'LastRowUsed' and 'LastColumnUsed' methods.

var wb = new XLWorkbook(@"<path>\test.xlsx", XLEventTracking.Disabled);
var sheet = wb.Worksheet("Sheet1");

for (int i = sheet.LastRowUsed().RowNumber() - 1; i >= 1; i--)
{
    var row = sheet.Row(i);
    if (row.IsEmpty())
    {
        row.Delete();
    }
}

wb.Save();

This simple loop deleted 5000 out of 10000 rows in 38 seconds. Not fast, but a lot better than 'hours'. Of course, that depends on how many rows/columns you're dealing with, which you don't say. However, in further tests with 25000 empty rows out of 50000, it took about 30 minutes to delete the empty rows in a loop. Clearly, deleting rows isn't an efficient process.

A better solution is to create a new sheet and then copy the rows you want to keep.

Step 1 - create a sheet with 50000 rows and 20 columns, where every other row and column is empty.

var wb = new XLWorkbook(@"C:\Users\passp\Documents\test.xlsx");
var sheet = wb.Worksheet("Sheet1");
sheet.Clear();

for (int i = 1; i < 50000; i+=2)
{
    var row = sheet.Row(i);

    for (int j = 1; j < 20; j += 2)
    {
        row.Cell(j).Value = i * j;
    }
}

Step 2 - copy the rows with data to a new sheet. This takes 10 seconds.

var wb = new XLWorkbook(@"C:\Users\passp\Documents\test.xlsx", XLEventTracking.Disabled);
var sheet = wb.Worksheet("Sheet1");

var sheet2 = wb.Worksheet("Sheet2");
sheet2.Clear();

sheet.RowsUsed()
    .Where(r => !r.IsEmpty())
    .Select((r, index) => new { Row = r, Index = index + 1} )
    .ForEach(r =>
    {
        var newRow = sheet2.Row(r.Index);

        r.Row.CopyTo(newRow);
    }
);

wb.Save();

Step 3 - do the same operation for the columns.

Answer by MacroMarc

Let's say the last corner cell with data is J16, so there is no data in columns K onwards or in rows 17 downwards. Why are you actually deleting them? What is the scenario and what are you trying to achieve? Is it to clear out formatting? Is it to clear out formulas which show an empty string?

In any case, looping is not the way.

The code below shows a way to use the Clear() method of the Range object to clear all contents, formulas and formatting from a range. Alternatively, if you really want to delete them, you can use the Delete() method to delete a whole rectangular Range in one hit. It will be much faster than looping.

//code uses variables declared appropriately as Excel.Range & Excel.Worksheet Using Interop library
int x;
int y;
// get the row of the last value content row-wise
oRange = oSheet.Cells.Find(What: "*", 
                           After: oSheet.get_Range("A1"),
                           LookIn: XlFindLookIn.xlValues,
                           LookAt: XlLookAt.xlPart, 
                           SearchDirection: XlSearchDirection.xlPrevious,
                           SearchOrder: XlSearchOrder.xlByRows);

if (oRange == null)
{
    return;
}
x = oRange.Row;

// get the column of the last value content column-wise
oRange = oSheet.Cells.Find(What: "*",
                           After: oSheet.get_Range("A1"),
                           LookIn: XlFindLookIn.xlValues, LookAt: XlLookAt.xlPart,
                           SearchDirection: XlSearchDirection.xlPrevious,
                           SearchOrder: XlSearchOrder.xlByColumns);
y = oRange.Column;

// now we have the corner (x, y), we can delete or clear all content to the right and below
// say J16 is the cell, so x = 16 and y = 10

Excel.Range clearRange;

//set clearRange to ("K1:XFD1048576")
clearRange = oSheet.Range[oSheet.Cells[1, y + 1], oSheet.Cells[oSheet.Rows.Count, oSheet.Columns.Count]];
clearRange.Clear(); //clears all content, formulas and formatting
//clearRange.Delete(); if you REALLY want to hard delete the columns

//set clearRange to ("A17:J1048576")
clearRange = oSheet.Range[oSheet.Cells[x + 1, 1], oSheet.Cells[oSheet.Rows.Count, y]];
clearRange.Clear(); //clears all content, formulas and formatting
//clearRange.Delete();  if you REALLY want to hard delete the rows

Answer by garroad_ran

You should be able to find the last non-empty row and column with something similar to this:

With m_XlWrkSheet
    lastRow = .UsedRange.Rows.Count
    lastCol = .UsedRange.Columns.Count
End With

That's VB.NET, but it should more or less work. That will return Row 16 and Column 10 (based on your picture above). Then you can use that to find the range you want to delete all in one line.
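One caveat worth keeping in mind: UsedRange.Rows.Count is the number of rows in the used range, so it equals the last row index only when the used range starts at row 1 (as it does in the picture). A minimal sketch of the adjustment follows, with the hypothetical helper name UsedRangeMath.LastIndex; in Interop the first index would come from UsedRange.Row or UsedRange.Column.

```csharp
public static class UsedRangeMath
{
    // Last 1-based index covered by a block that starts at firstIndex
    // and spans count rows (or columns). Hypothetical helper.
    public static int LastIndex(int firstIndex, int count)
    {
        return firstIndex + count - 1;
    }
}
```

For a used range starting at row 1 with 16 rows this gives 16, matching the answer; for a used range starting at row 3 with 16 rows it gives 18.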

Answer by Maciej Los

It seems that your problem has already been solved by Microsoft. Take a look at the Range.CurrentRegion property, which returns a range bounded by any combination of blank rows and blank columns. There's one inconvenience: this property cannot be used on a protected worksheet.

For further details, please see: How to Find Current Region, Used Range, Last Row and Last Column in Excel with VBA Macro

Some SO members have mentioned the UsedRange property, which might be useful too, but the difference from CurrentRegion is that UsedRange returns a range that includes any cell that has ever been used.
So, if you would like to get the LAST(row) and LAST(column) occupied by data, you have to use the End property with XlDirection: xlToLeft and/or xlUp.

Note #1:
If your data are in a tabular format, you can simply find the last cell by using:

lastCell = yourWorksheet.UsedRange.End(xlUp)
firstEmptyRow = lastCell.Offset(RowOffset:=1).EntireRow

Note #2:
If your data aren't in a tabular format, you need to loop through the collection of rows and columns to find the last non-blank cell.

Good luck!

Answer by Shajin Chandran

I think you can try using the Range.

        Application excel = new Application();
        Workbook workBook = excel.Workbooks.Open("file.xlsx");
        Worksheet excelSheet = workBook.ActiveSheet;
        // First column of the used range
        Range excelRange = excelSheet.UsedRange.Columns[1, Missing.Value] as Range;

        var lastNonEmptyRow = excelRange.Cells.Count;

The above code works for me.
