Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/3628222/

Time: 2020-09-11 12:03:41  Source: igfitidea

Extract Headings and Pagenumber of Table of Contents of a Word Document with VBA

Tags: vba, ms-word, word-vba

Asked by FTav

Basically, what we have here is the code from "Getting the headings from a Word document":

Public Sub CreateOutline()
    Dim docOutline As Word.Document
    Dim docSource As Word.Document
    Dim rng As Word.Range

    Dim astrHeadings As Variant
    Dim strText As String
    Dim intLevel As Integer
    Dim intItem As Integer

    Set docSource = ActiveDocument
    Set docOutline = Documents.Add

    ' Content returns only the
    ' main body of the document, not
    ' the headers and footer.
    Set rng = docOutline.Content
    astrHeadings = _
     docSource.GetCrossReferenceItems(wdRefTypeHeading)

    For intItem = LBound(astrHeadings) To UBound(astrHeadings)
        ' Get the text and the level.
        strText = Trim$(astrHeadings(intItem))
        intLevel = GetLevel(CStr(astrHeadings(intItem)))

        ' Add the text to the document.
        rng.InsertAfter strText & vbNewLine

        ' Set the style of the selected range and
        ' then collapse the range for the next entry.
        rng.Style = "Heading " & intLevel
        rng.Collapse wdCollapseEnd
    Next intItem
End Sub

Private Function GetLevel(strItem As String) As Integer
    ' Return the heading level of a header from the
    ' array returned by Word.

    ' The number of leading spaces indicates the
    ' outline level (2 spaces per level: H1 has
    ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces).

    Dim strTemp As String
    Dim strOriginal As String
    Dim intDiff As Integer

    ' Get rid of all trailing spaces.
    strOriginal = RTrim$(strItem)

    ' Trim leading spaces, and then compare with
    ' the original.
    strTemp = LTrim$(strOriginal)

    ' Subtract to find the number of
    ' leading spaces in the original string.
    intDiff = Len(strOriginal) - Len(strTemp)
    ' Integer division: two leading spaces per outline level.
    GetLevel = (intDiff \ 2) + 1
End Function

But I also need the page number for each heading.

I tried searching for each heading, selecting the search result, and retrieving wdActiveEndPageNumber.

This didn't work, was slow, and is surely an ugly approach.

I'd like to paste the results into another Word document, like this: rng.InsertAfter "Page: " & pageNum & " Header: " & strText & vbNewLine

Accepted answer by ForEachLoop

I may not understand the question, but this code goes through the document, looks for paragraphs whose style is a heading style, and prints the page each one is on.

Public Sub SeeHeadingPageNumber()
    On Error GoTo MyErrorHandler

    Dim sourceDocument As Document
    Set sourceDocument = ActiveDocument

    Dim myPara As Paragraph
    For Each myPara In sourceDocument.Paragraphs
        myPara.Range.Select 'For debug only
        If InStr(LCase$(myPara.Range.Style.NameLocal), LCase$("heading")) > 0 Then
            Debug.Print myPara.Range.Information(wdActiveEndAdjustedPageNumber)
        End If

        DoEvents
    Next

    Exit Sub

MyErrorHandler:
    MsgBox "SeeHeadingPageNumber" & vbCrLf & vbCrLf & "Err = " & Err.Number & vbCrLf & "Description: " & Err.Description
End Sub
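
To get the "Page: ... Header: ..." output the question asks for, the loop above could write to a new document instead of the Immediate window. The following is only a sketch under assumptions: the Sub name ListHeadingsWithPages and its variable names are made up for illustration, and it assumes the headings use styles whose local name contains "heading" (as in an English-language Word). It collapses each range to its start so the page reported is the one where the heading begins.

```vba
Public Sub ListHeadingsWithPages()
    ' Hypothetical helper, not part of the original answer.
    ' Assumes heading paragraphs use styles whose local name contains "heading".
    Dim docSource As Document
    Dim docOutline As Document
    Dim para As Paragraph
    Dim rngHeading As Range
    Dim strText As String
    Dim lngPage As Long

    Set docSource = ActiveDocument
    Set docOutline = Documents.Add

    For Each para In docSource.Paragraphs
        If InStr(1, para.Range.Style.NameLocal, "heading", vbTextCompare) > 0 Then
            Set rngHeading = para.Range

            ' Drop the trailing paragraph mark from the heading text.
            strText = rngHeading.Text
            If Right$(strText, 1) = vbCr Then
                strText = Left$(strText, Len(strText) - 1)
            End If

            ' Skip empty paragraphs that happen to carry a heading style.
            If Len(Trim$(strText)) > 0 Then
                ' Collapse to the start so the page is the one the heading begins on.
                rngHeading.Collapse wdCollapseStart
                lngPage = rngHeading.Information(wdActiveEndAdjustedPageNumber)

                docOutline.Content.InsertAfter _
                    "Page: " & lngPage & " Header: " & Trim$(strText) & vbCr
            End If
        End If
    Next para
End Sub
```

wdActiveEndAdjustedPageNumber reflects any manual page-number adjustments in the document; use wdActiveEndPageNumber instead if you want the page counted from the start of the document.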

Answer by ForEachLoop

Try using a Table of Contents field. The following code dissects a TOC and gives you each item, its page number and its style. You might have to parse each string to get the exact information or formatting you need.

Public Sub SeeTOCInfo()
    On Error GoTo MyErrorHandler

    Dim sourceDocument As Document
    Set sourceDocument = ActiveDocument

    Dim myField As Field
    For Each myField In sourceDocument.TablesOfContents(1).Range.Fields
        Debug.Print Replace(myField.Result.Text, Chr(13), "-") & " " & " Type: " & myField.Type
        If Not myField.Result.Style Is Nothing Then
            Debug.Print myField.Result.Style
        End If
        DoEvents
    Next

    Exit Sub

MyErrorHandler:
    MsgBox "SeeTOCInfo" & vbCrLf & vbCrLf & "Err = " & Err.Number & vbCrLf & "Description: " & Err.Description
End Sub
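
One possible way to do the parsing mentioned above is to read the TOC paragraph by paragraph and split each entry at its last tab. This is a rough sketch, not a tested solution: ListTOCEntries is a hypothetical name, and it assumes the first TOC in the document is up to date, shows page numbers, and separates the heading text from the page number with a tab (Word's default layout).

```vba
Public Sub ListTOCEntries()
    ' Hypothetical helper, not part of the original answer.
    ' Assumes each TOC entry looks like: heading text <Tab> page number.
    Dim toc As TableOfContents
    Dim para As Paragraph
    Dim strLine As String
    Dim lngTab As Long

    If ActiveDocument.TablesOfContents.Count = 0 Then
        MsgBox "This document has no table of contents."
        Exit Sub
    End If

    Set toc = ActiveDocument.TablesOfContents(1)

    For Each para In toc.Range.Paragraphs
        ' Drop the trailing paragraph mark.
        strLine = para.Range.Text
        If Right$(strLine, 1) = vbCr Then
            strLine = Left$(strLine, Len(strLine) - 1)
        End If

        ' The page number follows the last tab in the entry.
        lngTab = InStrRev(strLine, vbTab)
        If lngTab > 0 Then
            Debug.Print "Page: " & Mid$(strLine, lngTab + 1) & _
                        " Header: " & Trim$(Left$(strLine, lngTab - 1))
        End If
    Next para
End Sub
```

Splitting at the last tab rather than the first keeps numbered headings intact, since those may contain an additional tab between the number and the heading text.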

Answer by Todd Main

This inserts the page number of the referenced heading:

rng.InsertCrossReference ReferenceType:=wdRefTypeHeading, _
            ReferenceKind:=wdPageNumber, ReferenceItem:=intItem

But this only works if you're inserting into the same document. You could insert into the current document and then cut and paste the result into a new document.