Get docx file contents using JavaScript/jQuery
Original question: http://stackoverflow.com/questions/28440170/
Notice: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Asked by Abdul Ali
I wish to open and read a docx file using client-side technologies (HTML/JS).
Please assist if this is possible. I have found a JavaScript library named docx.js, but I cannot locate any documentation for it (http://blog.innovatejs.com/?p=184).
The goal is to make a browser-based search tool for docx files and txt files.
Any help appreciated.
Accepted answer by edi9999
With docxtemplater, you can easily get the full text of a Word document (docx only) by using the doc.getFullText() method.
HTML code:
<script src="build/docxgen.js"></script>
<script src="vendor/FileSaver.min.js"></script>
<script src="vendor/jszip-utils.js"></script>
<script>
    // Fetch the .docx file as binary content
    var loadFile = function (url, callback) {
        JSZipUtils.getBinaryContent(url, callback);
    };
    loadFile("examples/tagExample.docx", function (err, content) {
        if (err) { throw err; }
        var doc = new Docxgen(content);
        var text = doc.getFullText(); // plain text of the whole document
        console.log(text);
    });
</script>
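Since the stated goal is a search tool, the string returned by getFullText() still has to be searched. The following is a minimal sketch of that step and is not part of the original answer: the helper name findMatches and the contextChars parameter are assumptions made for illustration, and it does plain case-insensitive substring matching only.

```javascript
// Hypothetical helper (not part of docxtemplater): case-insensitive search
// over a plain-text string, returning each hit with a little context.
function findMatches(text, term, contextChars = 20) {
  if (!term) return [];
  // Escape regex metacharacters so the term is matched literally
  const escaped = term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(escaped, "gi");
  return Array.from(text.matchAll(pattern), (m) => ({
    index: m.index, // position of the hit in the original text
    snippet: text.slice(
      Math.max(0, m.index - contextChars),
      m.index + m[0].length + contextChars
    ),
  }));
}
```

In the callback above, this could be called as findMatches(text, "invoice") to list every occurrence with its surrounding text. The same function works unchanged on the contents of a txt file.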
Getting the source code:
git clone https://github.com/edi9999/docxtemplater.git && cd docxtemplater
# git checkout v1.0.4 # Optional
npm install -g gulp jasmine-node uglify-js browserify
npm install
gulp allCoffee
mkdir -p build
browserify -r ./js/docxgen.js -s Docxgen > build/docxgen.js
uglifyjs build/docxgen.js > build/docxgen.min.js # Optional
Answer by Brian Dobby
I know this is an old post, but docxtemplater has moved on and the accepted answer no longer works. This worked for me:
// Dependencies (Node.js): npm install adm-zip cheerio
const AdmZip = require("adm-zip");
const cheerio = require("cheerio");

function loadDocx(filename) {
    // Read word/document.xml from the docx (zip) archive
    const zip = new AdmZip(filename);
    const xml = zip.readAsText("word/document.xml");
    // Load the XML into a DOM
    const $ = cheerio.load(xml, {
        normalizeWhitespace: true,
        xmlMode: true
    });
    // Extract the text of every <w:t> element; the colon must be escaped in the selector
    const out = [];
    $("w\\:t").each((i, el) => {
        out.push($(el).text());
    });
    return out;
}
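One thing to keep in mind for searching: each w:t element is a text run, and Word often splits a single word or sentence across several runs, so the array above may contain fragments. A rough sketch of grouping runs by paragraph is shown below. It works directly on the XML string of word/document.xml with no extra libraries; the function name paragraphsFromDocumentXml is an assumption for illustration, and the regex-based parsing only decodes the five named XML entities and does not treat nested content such as text boxes separately.

```javascript
// Hypothetical helper: turns the raw XML of word/document.xml into
// one string per paragraph by concatenating the <w:t> runs inside it.
function paragraphsFromDocumentXml(xml) {
  const decode = (s) =>
    s.replace(/&lt;/g, "<")
      .replace(/&gt;/g, ">")
      .replace(/&quot;/g, '"')
      .replace(/&apos;/g, "'")
      .replace(/&amp;/g, "&"); // decode &amp; last so "&amp;lt;" becomes "&lt;"

  // Matches <w:t> and <w:t attr="...">, but not <w:tab/>, <w:tbl>, <w:tr>, <w:tc>
  const runPattern = /<w:t(?:\s[^>]*)?>([\s\S]*?)<\/w:t>/g;

  return xml
    .split("</w:p>") // every chunk ends where a paragraph closes
    .map((chunk) => {
      let text = "";
      for (const match of chunk.matchAll(runPattern)) {
        text += decode(match[1]);
      }
      return text;
    })
    .filter((text) => text.length > 0); // drop empty paragraphs and trailing markup
}
```

With the code above, this could be fed the xml string read by zip.readAsText("word/document.xml"), giving one searchable string per paragraph instead of one per run.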
Answer by JasonPlutext
If you want to be able to display the docx files in a web browser, you might be interested in Native Documents' recently released commercial Word File Editor; try it at https://nativedocuments.com/test_drive.html
You'll get much better layout fidelity this way than if you try to convert to (X)HTML and view the result.
It is designed specifically for embedding in a webapp, so there is an API for loading documents, and it will sit happily within the security context of your webapp.
Disclosure: I have a commercial interest in Native Documents.

