C#: Read a Word document line by line

Question

Asked by Bat_Programmer

I'm trying to read a Word document using C#. I can get all of the text, but I want to read it line by line, store the lines in a list, and bind that list to a GridView. Currently my code returns a list with only one item containing all of the text, not one item per line as desired. I'm using the Microsoft.Office.Interop.Word library to read the file. Below is my code so far:

    Application word = new Application();
    Document doc = new Document();

    object fileName = path;
    // Define an object to pass to the API for missing parameters
    object missing = System.Type.Missing;
    doc = word.Documents.Open(ref fileName,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing);

    String read = string.Empty;
    List<string> data = new List<string>();
    foreach (Range tmpRange in doc.StoryRanges)
    {
        //read += tmpRange.Text + "<br>";
        data.Add(tmpRange.Text);
    }
    ((_Document)doc).Close();
    ((_Application)word).Quit();

    GridView1.DataSource = data;
    GridView1.DataBind();

Answer 1

Accepted answer by Bat_Programmer

Ok. I found the solution here.

The final code is as follows:

    Application word = new Application();

    object fileName = path;
    // Placeholder passed to the API for every optional parameter we do not set
    object missing = System.Type.Missing;
    Document doc = word.Documents.Open(ref fileName,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing);

    List<string> data = new List<string>();
    // Word collections are 1-based, so iterate from 1 to Count inclusive
    for (int i = 1; i <= doc.Paragraphs.Count; i++)
    {
        string temp = doc.Paragraphs[i].Range.Text.Trim();
        // Skip empty paragraphs
        if (temp != string.Empty)
            data.Add(temp);
    }
    ((_Document)doc).Close();
    ((_Application)word).Quit();

    GridView1.DataSource = data;
    GridView1.DataBind();

Answer 2

Answered by Pratik Anjania

The accepted answer's code is correct, but it is slow. This version enumerates the paragraphs with foreach instead of indexing into doc.Paragraphs on every iteration, and it is much faster:

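    List<string> data = new List<string>();
    Application app = new Application();
    // readFromPath must be declared as object to be passed by ref
    Document doc = app.Documents.Open(ref readFromPath);

    // Enumerate the paragraphs once instead of indexing the collection each time
    foreach (Paragraph objParagraph in doc.Paragraphs)
        data.Add(objParagraph.Range.Text.Trim());

    ((_Document)doc).Close();
    ((_Application)app).Quit();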

Answer 3

Answered by Chris

How about this: get all of the text from the document and split it on the return character, or on whatever delimiter works better for you, then turn the result into a list.

    // Word ends each paragraph with '\r', not '\n'; ToList() needs using System.Linq
    List<string> lines = doc.Content.Text.Split('\r').ToList();
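The splitting step can be tried without Word installed. Below is a minimal sketch, assuming the text returned by doc.Content.Text marks paragraph ends with '\r' and manual line breaks (Shift+Enter) with '\v'. The class name ParagraphSplitter, the method SplitLines, and the sample string are made up for illustration and are not part of the Interop library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParagraphSplitter
{
    // Hypothetical helper: splits raw document text into trimmed, non-empty lines.
    // Assumes '\r' ends a paragraph and '\v' is a manual line break; '\n' is
    // included only as a fallback for text that came from another source.
    public static List<string> SplitLines(string text)
    {
        return text
            .Split(new[] { '\r', '\n', '\v' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Trim())
            .Where(line => line.Length > 0)
            .ToList();
    }

    public static void Main()
    {
        // Stand-in for doc.Content.Text (an assumption, not real document output)
        string sample = "First paragraph\rSecond line\vafter a manual break\r\rLast paragraph\r";

        foreach (string line in SplitLines(sample))
            Console.WriteLine(line);
    }
}
```

Note that this yields paragraphs and manual breaks only. Lines that Word wraps on screen are a layout detail and do not appear as separators in the text, so they cannot be recovered by splitting.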
