Original source: http://stackoverflow.com/questions/8888145/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Convert text to image in Microsoft Word
Asked by Abdullah Jibaly
I have a large book written in Microsoft Word and want to create a macro that will find all text using a predefined style and convert that text to an inline image. This text will be in Arabic and generally no longer than 4-5 lines. Is this possible?
UPDATE: Here's an example to show what I'm referring to:
I want to replace that entire line in Arabic with an image (as if I cropped this attached image to only include the Arabic and then replaced the line in Arabic with the image).
I want a macro or script to do this because there are hundreds of such lines; updating them one by one is cumbersome and would make later modifications difficult.
UPDATE2: I found an interesting option here: http://windowssecrets.com/forums/showthread.php/31344-Convert-Text-to-an-Image-of-Text-in-VBA-(Office-2000-Sr1a)
It looks like you can cut a piece of text and then "Paste Special" as an image. So if there's a way to automate that, it might work.
Accepted answer by Tony Dallimore
This is not an answer although I hope it will grow into a community answer. At the moment it is an exploration of what is required to solve the problem.
I know from the discussion when this question was posted on Super User that Abdullah wishes to publish his book on Kindle. So the question is really about how to get a document in English and Arabic ready for publication as an e-Book.
The Kindle does not support Arabic. The number of languages it does support is slowly increasing but there is no evidence I can find that Amazon has plans to add Arabic in the foreseeable future.
The format behind an Amazon e-Book is a cut-down version of HTML. If a Word document containing Arabic letters is exported to HTML, the Arabic letters are included as character entities; for example: “&#64336; &#64337; &#64338; &#64339;”. Importing either the original Word document or the HTML version to Kindle results in the leading bits being discarded, so these characters are displayed as P, Q, R and S instead of “ﭐ ﭑ ﭒ ﭓ” (Alef Wasla isolated form, Alef Wasla final form, Beeh isolated form and Beeh final form).
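As a rough illustration of why P, Q, R and S appear, here is a minimal VBA sketch that keeps only the low 8 bits of each code point. It assumes the Kindle import drops everything above the low byte, which is only my reading of the symptom; LowByteChar and DemoLowByte are names invented for this example.

```vba
Option Explicit

' Returns the character that remains when only the low 8 bits
' of a code point survive (assumed behaviour, not confirmed).
Function LowByteChar(ByVal codePoint As Long) As String
    LowByteChar = Chr$(codePoint And &HFF&)
End Function

Sub DemoLowByte()
    Dim cp As Long
    ' 64336 to 64339 are U+FB50 to U+FB53; their low bytes are
    ' &H50 to &H53, the ASCII codes of P, Q, R and S.
    For cp = 64336 To 64339
        Debug.Print "&#" & cp & "; -> " & LowByteChar(cp)
    Next cp
End Sub
```

Running DemoLowByte prints one line per entity in the Immediate window, pairing 64336 with P through 64339 with S.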
I have tried Abdullah's idea of saving some Arabic letters in a PNG file and creating an HTML file containing <p> … </p> <img src="Arabic.png"> <p> … </p>. The appearance of this file on my Kindle 2 is perfectly acceptable, so this has the potential to be a solution. The question is: how can the necessary conversions be performed?
We need to extract each Arabic string from either the Word document or its HTML equivalent and import it into a program that can convert them to PNG files.
The only way that I know of automating this would be to copy each string to a slide within PowerPoint. With PowerPoint's SaveAs option it is possible to save each slide as a separate PNG file. The slides are named: SLIDE1.PNG, SLIDE2.PNG, SLIDE3.PNG and so on in sequence which would allow a macro to relate the results to the original strings. It would then be possible to replace the Arabic strings in the HTML file with the image elements. None of this would be too difficult to automate but there is a problem with the slides all being the size of the PowerPoint page. The page could be made smallish but what we need is for each slide to be cropped to just bigger than that slide's text. I cannot think of any way of automating this cropping.
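The PowerPoint route described above might look something like the following sketch, written to run from Excel with late binding. The procedure name PhrasesToPng, the text box position and size, and the output path are all assumptions for illustration. The numeric constants are the documented values of ppLayoutBlank, ppSaveAsPNG and msoTextOrientationHorizontal, renamed here so they do not clash with any referenced library.

```vba
Option Explicit

' Late-bound copies of PowerPoint/Office enum values.
Private Const PP_LAYOUT_BLANK As Long = 12      ' ppLayoutBlank
Private Const PP_SAVE_AS_PNG As Long = 18       ' ppSaveAsPNG
Private Const MSO_TEXT_HORIZONTAL As Long = 1   ' msoTextOrientationHorizontal

' Creates one slide per phrase and saves the presentation as PNG,
' which writes one image file per slide under outPath.
Sub PhrasesToPng(phrases As Variant, ByVal outPath As String)
    Dim ppApp As Object, pres As Object, sld As Object, shp As Object
    Dim i As Long, n As Long

    Set ppApp = CreateObject("PowerPoint.Application")
    ppApp.Visible = True
    Set pres = ppApp.Presentations.Add

    n = 0
    For i = LBound(phrases) To UBound(phrases)
        n = n + 1
        Set sld = pres.Slides.Add(n, PP_LAYOUT_BLANK)
        ' Left, top, width and height are in points; 184 points is
        ' roughly 6.5 cm (an assumed width, adjust to taste).
        Set shp = sld.Shapes.AddTextbox(MSO_TEXT_HORIZONTAL, 10, 10, 184, 50)
        shp.TextFrame.TextRange.Text = phrases(i)
    Next i

    pres.SaveAs outPath, PP_SAVE_AS_PNG
    pres.Close
    ppApp.Quit
End Sub
```

A call such as PhrasesToPng Array("first phrase", "second phrase"), "C:\Temp\Slides" would be one way to try it. Each image is still the full slide size, so the cropping problem remains.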
Does anyone have a better approach than converting each Arabic phrase to a PNG file?
I have been looking for PNG editors with some sort of command line interface but can find nothing that would be easier than using PowerPoint. Does anyone know of an alternative to PowerPoint?
Does anyone have any suggestions for automating the cropping of each image? When a string is placed in a PowerPoint slide it is possible to set its width to, say, 6.5cm (which looks good on my Kindle) and get the height determined by PowerPoint. This could be saved for later use if anyone knows how to use it.
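If the text box width and height were recorded in points, one possible use would be to convert them to pixel dimensions for whatever tool ends up doing the crop. The sketch below is only that arithmetic. The 96 dpi export resolution is an assumption (the real value depends on how PowerPoint is configured), and the function names are invented for this example.

```vba
Option Explicit

' Converts centimetres to points (1 point = 1/72 inch).
Function CmToPoints(ByVal cm As Double) As Double
    CmToPoints = cm / 2.54 * 72#
End Function

' Converts a length in points to whole pixels at the given resolution.
Function PointsToPixels(ByVal pts As Double, ByVal dpi As Double) As Long
    PointsToPixels = CLng(pts * dpi / 72#)
End Function

Sub DemoCropWidth()
    ' 6.5 cm is about 184.25 points, which is 246 pixels at 96 dpi.
    Debug.Print PointsToPixels(CmToPoints(6.5), 96)
End Sub
```

The same conversion applied to the height PowerPoint chose for the text box would give the other side of the crop rectangle.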
Implementing solution
Pending any suggestions for improving the approach described above, the following outlines how I would implement it.
I would not attempt to process the Word document. I would save it as a Web Page, Filtered HTML file, which is a required step on the way to creating a Kindle eBook, and process that.
Within the HTML file created from my test document, the Arabic phrase comes out as:
<p class="MsoNormal"></p>
<p class="MsoNormal" align="center" style="text-align:center"><span dir="RTL"
style="font-size:24.0pt;font-family:Arial">
&#64336;&#64337;&#64338;&#64339;&#64340;&#64341;
&#64342;&#64343;&#65153;&#65154;&#65276;&#65275;
&#65274;&#65273;&#65246;&#65226;&#65227;&#65228;
</span><span style="font-size:24.0pt"></span></p>
<p class="MsoNormal"></p>
<p class="MsoNormal"></p>
I assume Abdullah's document will result in something similar. Note 1: the above is a random collection of Arabic letters. Note 2: they are held left-to-right in reading sequence even though, when displayed or printed, they are read right-to-left.
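For what it is worth, pulling such phrases out of the HTML could be sketched with the VBScript regular expression object, as below. The pattern assumes every Arabic phrase sits in a span with dir="RTL", as in my test document; other documents may need a different pattern. ExtractRtlSpans and DemoExtract are names invented for this example.

```vba
Option Explicit

' Returns a Collection holding the inner text of every
' <span dir="RTL" ...> element found in html.
Function ExtractRtlSpans(ByVal html As String) As Collection
    Dim re As Object, m As Object
    Dim result As New Collection
    Dim inner As String

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True
    ' [\s\S] is used because "." does not match line breaks here.
    re.Pattern = "<span\s+dir=""?RTL""?[^>]*>([\s\S]*?)</span>"

    For Each m In re.Execute(html)
        ' Drop carriage returns and turn line feeds into spaces,
        ' since Word wraps long lines in the HTML it writes.
        inner = Replace(Replace(m.SubMatches(0), vbCr, ""), vbLf, " ")
        result.Add Trim$(inner)
    Next m
    Set ExtractRtlSpans = result
End Function

Sub DemoExtract()
    Dim html As String, item As Variant
    html = "<p><span dir=""RTL"" style=""font-size:24.0pt"">" & _
           "&#64336;&#64337;</span></p>"
    For Each item In ExtractRtlSpans(html)
        Debug.Print item
    Next item
End Sub
```

DemoExtract prints the two entities from the sample string. Turning line feeds into spaces follows HTML whitespace rules, but it means the stored phrase may differ from the raw file text wherever Word wrapped a line.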
The whole of this block will have to be replaced with something like:
<br><img src="xxxx.png"><br>
where the file xxxx.png holds an image of the Arabic text.
The file names, such as xxxx.png, could be systematic (A001.png, A002.png, ...) but I would have thought that transliterating the first ten or twenty characters of the phrase from the Arabic to English alphabets and using the result, with a numeric suffix, as the file name would be more convenient.
I would hold the records necessary to manage the process in an Excel worksheet. I would place the VBA code in the same workbook.
The steps in the conversion process that I envisage are:
1. VBA macro to extract Arabic strings from the latest HTML file and add new strings to the Excel worksheet. (More about the Excel worksheet later.)
2. VBA macro to create a PowerPoint file, with one slide per new string, and use SaveAs in PNG format to create one PNG file per slide before discarding the PowerPoint file.
3. Human to crop each PNG file. (There appears to be no way of automating the cropping so this task will be minimised by use of data in the Excel worksheet.)
4. VBA macro to rename each slide from SLIDEnnn.PNG to its permanent name and to record the permanent name in the Excel worksheet.
5. VBA macro to update the latest HTML file by replacing the block containing the Arabic phrase with the appropriate HTML IMG element.
The Excel worksheet needs two columns: Arabic phrase and PNG file name. If there is any risk of the worksheet being sorted between steps 2 and 4, we may need a sequence number as well.
Macro 1 will extract an Arabic phrase from the HTML file, look down the list in the worksheet for this phrase and add the phrase at the bottom if it is not already present.
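The worksheet side of Macro 1 might be sketched as follows for Excel VBA. It assumes row 1 is a header row and that phrases live in column A; neither detail is fixed by anything above, and FindOrAddPhrase is a name invented for this example.

```vba
Option Explicit

' Returns the row that holds phrase in column A, appending the
' phrase below the last used row if it is not already present.
Function FindOrAddPhrase(ws As Worksheet, ByVal phrase As String) As Long
    Dim lastRow As Long, r As Long

    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
    For r = 2 To lastRow                 ' row 1 is assumed to be a header
        If ws.Cells(r, 1).Value = phrase Then
            FindOrAddPhrase = r
            Exit Function
        End If
    Next r

    ws.Cells(lastRow + 1, 1).Value = phrase
    FindOrAddPhrase = lastRow + 1
End Function
```

A linear scan is slow for thousands of rows, but with a few hundred phrases it should be adequate.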
Macro 2 will look for phrases in the worksheet that do not have a PNG file name. These new phrases are the ones to be written to the PowerPoint presentation. That is, a phrase only goes into this process once.
Task 3, cropping each PNG file, will be a pain. All I can say is that it will only be once per phrase.
Macro 4 will assume that the SLIDE001.PNG, SLIDE002.PNG, … are in the sequence of phrases without PNG files in the worksheet. If this might not be true (because the worksheet has been sorted) we will either need a sequence number or to retain the PowerPoint file. The macro will assign a unique name to each new phrase, record this name in the worksheet and rename the PNG file.
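A minimal sketch of Macro 4, using the systematic naming option, could look like this. It assumes the worksheet has not been sorted, that column B holds the PNG file name, and that PowerPoint wrote files named Slide1.PNG, Slide2.PNG and so on into one folder; check the actual names before relying on it. SystematicName and RenameSlides are names invented for this example.

```vba
Option Explicit

' Builds a systematic name such as A001.png from a sequence number.
Function SystematicName(ByVal seq As Long) As String
    SystematicName = "A" & Format$(seq, "000") & ".png"
End Function

' Walks the phrases that have no PNG name yet, in worksheet order,
' and pairs them with Slide1.PNG, Slide2.PNG, ... in folder.
Sub RenameSlides(ws As Worksheet, ByVal folder As String)
    Dim lastRow As Long, r As Long, slideNo As Long
    Dim oldPath As String, newName As String

    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
    slideNo = 0
    For r = 2 To lastRow                      ' row 1 assumed to be a header
        If Len(ws.Cells(r, 2).Value) = 0 Then
            slideNo = slideNo + 1
            oldPath = folder & "\Slide" & slideNo & ".PNG"
            newName = SystematicName(r - 1)   ' unique while rows keep their order
            If Len(Dir$(oldPath)) > 0 Then
                ' Name raises an error if the target file already exists.
                Name oldPath As folder & "\" & newName
                ws.Cells(r, 2).Value = newName
            End If
        End If
    Next r
End Sub
```

Deriving the name from the row number only stays unique if rows are never reordered or deleted; a stored sequence number would be safer.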
Macro 5 creates a new copy of the latest HTML file using the contents of the worksheet to determine which phrase to replace with which PNG file.
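The replacement step in Macro 5 could be sketched with plain string functions, as below. It assumes the phrase is stored exactly as it appears in the HTML file and that the enclosing block is the nearest p element, which matches my test document but may not hold in general. ReplaceBlockWithImage and DemoReplace are names invented for this example.

```vba
Option Explicit

' Replaces the <p>...</p> block that contains phrase with an IMG
' element. Returns html unchanged if the phrase or block is not found.
Function ReplaceBlockWithImage(ByVal html As String, ByVal phrase As String, _
                               ByVal pngName As String) As String
    Dim hit As Long, blockStart As Long, blockEnd As Long

    ReplaceBlockWithImage = html
    hit = InStr(1, html, phrase, vbBinaryCompare)
    If hit = 0 Then Exit Function

    ' Nearest "<p" before the phrase and nearest "</p>" after it.
    ' Note: "<p" would also match tags such as <pre>.
    blockStart = InStrRev(html, "<p", hit, vbTextCompare)
    blockEnd = InStr(hit, html, "</p>", vbTextCompare)
    If blockStart = 0 Or blockEnd = 0 Then Exit Function

    ReplaceBlockWithImage = Left$(html, blockStart - 1) & _
        "<br><img src=""" & pngName & """><br>" & _
        Mid$(html, blockEnd + Len("</p>"))
End Function

Sub DemoReplace()
    Dim html As String
    html = "<p>one</p><p class=""MsoNormal""><span dir=""RTL"">" & _
           "&#64336;</span></p><p>two</p>"
    ' Prints: <p>one</p><br><img src="A001.png"><br><p>two</p>
    Debug.Print ReplaceBlockWithImage(html, "&#64336;", "A001.png")
End Sub
```

Only the first occurrence of a phrase is replaced per call, so a phrase that appears several times would need the function called in a loop until the result stops changing.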
This process is not ideal but it will achieve the desired result and has no obvious complications. Any suggestions for improving it?
Answer by MacGyver
Before you begin these instructions, press record in the Microsoft Word macro editor, so you can see what the VBA code is.
I'm wondering if this would be easier if you converted the docx file to .rtf (rich text format) and replaced that line with an image. Go to File > Save As..., name it "old.rtf", then replace the line with an image, Save As... again, name it "new.rtf", and then download Beyond Compare or your favorite diff program to see what changed. It should be easy to do this programmatically if you choose to. I think working in text would be easier than working in Microsoft's binary format unless you can find a good library to modify their doc or docx formats.
Answer by Trygve R. Lerwick
Sub CopySelPasteAsPicture()
    ' Take a picture of a selection and paste it at the
    ' document end
    With Selection
        .CopyAsPicture
    End With
    ActiveDocument.Content.Select
    With Selection
        .Collapse Direction:=wdCollapseEnd
        .TypeParagraph
        .TypeParagraph
        .PasteSpecial DataType:=wdPasteMetafilePicture
    End With
End Sub