从 VBA 中的 powerpoint 文件中提取所有文本
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/4675100/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Extracting all text from a powerpoint file in VBA
提问by elbillaf
I have a huge set of powerpoint files from which I want to extract all the text and just lump it all into one big text file. Each source (PPT) file has multiple pages (slides). I do not care about formatting - only the words.
我有一大堆powerpoint 文件,我想从中提取所有文本,然后将它们全部合并到一个大文本文件中。每个源 (PPT) 文件都有多个页面(幻灯片)。我不关心格式——只关心单词。
I could do this manually with a file by just ^A ^C in PPT, followed by ^V in notepad; then page down in the PPT, and repeat for each slide in the powerpoint. (Too bad I can't just do a ^A that would grab EVERYTHING ... then I could use sendkey to copy / paste)
我可以通过在 PPT 中通过 ^A ^C 和记事本中的 ^V 手动使用文件来完成此操作;然后在 PPT 中向下翻页,并对 PowerPoint 中的每张幻灯片重复此操作。(太糟糕了,我不能只做一个可以抓取所有东西的 ^A ......然后我可以使用 sendkey 复制/粘贴)
But there are many hundreds of these PPTs with different numbers of slides.
但是这些PPT有数百个,幻灯片数量不同。
It seems like this would be a common thing to want to do, but I can't find an example anywhere.
这似乎是一件很常见的事情,但我在任何地方都找不到示例。
Does anyone have sample code to do this?
有没有人有示例代码来做到这一点?
采纳答案by Todd Main
Here's some code to get you started. This dumps all text in slides to the debug window. It doesn't try to format, group or do anything other than just dump.
这里有一些代码可以帮助您入门。这会将幻灯片中的所有文本转储到调试窗口。除了转储之外,它不会尝试格式化、分组或执行任何其他操作。
Sub GetAllText()
Dim p As Presentation: Set p = ActivePresentation
Dim s As Slide
Dim sh As Shape
For Each s In p.Slides
For Each sh In s.Shapes
If sh.HasTextFrame Then
If sh.TextFrame.HasText Then
Debug.Print sh.TextFrame.TextRange.Text
End If
End If
Next
Next
End Sub
回答by elbillaf
The following example shows code to loop through a list of files based on Otaku's code given above:
以下示例显示了根据上面给出的 Otaku 代码循环遍历文件列表的代码:
Sub test_click2()
Dim thePath As String
Dim src As String
Dim dst As String
Dim PPT As PowerPoint.Application
Dim p As PowerPoint.Presentation
Dim s As Slide
Dim sh As PowerPoint.Shape
Dim i As Integer
Dim f(10) As String
f(1) = "abc.pptx"
f(2) = "def.pptx"
f(3) = "ghi.pptx"
thePath = "C:\Work\Text parsing PPT\"
For i = 1 To 3
src = thePath & f(i)
dst = thePath & f(i) & ".txt"
On Error Resume Next
Kill dst
Open dst For Output As #1
Set PPT = CreateObject("PowerPoint.Application")
PPT.Activate
PPT.Visible = True
'PPT.WindowState = ppWindowMinimized
PPT.Presentations.Open filename:=src, ReadOnly:=True
For Each s In PPT.ActivePresentation.Slides
For Each sh In s.Shapes
If sh.HasTextFrame Then
If sh.TextFrame.HasText Then
Debug.Print sh.TextFrame.TextRange.Text
End If
End If
Next
Next
PPT.ActivePresentation.Close
Close #1
Next i
Set PPT = Nothing
End Sub