Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/122752/

Time: 2020-09-13 16:45:30  Source: igfitidea

What is the recommended toolchain for formatting XML DocBook?

Tags: xml, apache, pdf, apache-fop, docbook

Asked by Jonathan Leffler

I've seen "Best tools for working with DocBook XML documents", but my question is slightly different. Which is the currently recommended formatting toolchain - as opposed to editing tool - for XML DocBook?

In Eric Raymond's 'The Art of Unix Programming' from 2003 (an excellent book!), the suggestion is XML-FO (XML Formatting Objects), but I've since seen suggestions here indicating that XML-FO is no longer under development (though I can no longer find that question on StackOverflow, so maybe it was erroneous).

Assume I'm primarily interested in Unix/Linux (including MacOS X), but I wouldn't automatically ignore Windows-only solutions.

Is Apache's FOP the best way to go? Are there any alternatives?

Accepted answer by Gustavo Carreno

I've been writing some manuals with DocBook, under Cygwin, to produce single-page HTML, multi-page HTML, CHM and PDF.

I installed the following:

  1. The DocBook stylesheets (XSL) repository.
  2. xmllint, to test whether the XML is correct.
  3. xsltproc, to process the XML with the stylesheets.
  4. Apache's FOP, to produce PDFs. I make sure to add the installed folder to the PATH.
  5. Microsoft's HTML Help Workshop, to produce CHMs. I make sure to add the installed folder to the PATH.
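
As a rough illustration of how the tools in this list chain together, here is a minimal shell sketch. The file names (manual.xml, build/html.xsl, build/pdf.xsl) mirror those used in the answer's Makefile.in; the `run` helper, the `DRY_RUN` switch and the bare `fop` launcher name are assumptions added for illustration (the launcher may be called fop.sh on some installs). With the default `DRY_RUN=1` the script only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch of the validate -> transform -> render chain.
# DRY_RUN=1 (the default, an illustrative assumption) prints each command
# instead of running it; set DRY_RUN=0 once the tools are on the PATH.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_docs() {
    src="$1"
    # 1. Validate the source before transforming it.
    run xmllint --valid --noout --postvalid "$src" || return 1
    # 2. XML -> HTML through the DocBook XSL customization layer.
    run xsltproc --output html/index.html build/html.xsl "$src" || return 1
    # 3. XML -> XSL-FO, then FO -> PDF with Apache FOP.
    run xsltproc --output pdf/manual.fo build/pdf.xsl "$src" || return 1
    run fop pdf/manual.fo pdf/manual.pdf
}

build_docs "${1:-manual.xml}"
```

Each step stops the chain on failure, so an invalid document never reaches the formatter.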

Edit: The code below relies on more than the two files shown. If someone wants a cleaned-up version of the scripts and the folder structure, please contact me: guscarreno (squiggly/at) googlemail (period/dot) com

I then use a configure.in:

AC_INIT(Makefile.in)

FOP=fop.sh
HHC=hhc
XSLTPROC=xsltproc

AC_ARG_WITH(fop, [  --with-fop  Where to find Apache FOP],
[
    if test "x$withval" != "xno"; then
        FOP="$withval"
    fi
]
)
AC_PATH_PROG(FOP,  $FOP)

AC_ARG_WITH(hhc, [  --with-hhc  Where to find Microsoft Help Compiler],
[
    if test "x$withval" != "xno"; then
        HHC="$withval"
    fi
]
)
AC_PATH_PROG(HHC,  $HHC)

AC_ARG_WITH(xsltproc, [  --with-xsltproc  Where to find xsltproc],
[
    if test "x$withval" != "xno"; then
        XSLTPROC="$withval"
    fi
]
)
AC_PATH_PROG(XSLTPROC,  $XSLTPROC)

AC_SUBST(FOP)
AC_SUBST(HHC)
AC_SUBST(XSLTPROC)

HERE=`pwd`
AC_SUBST(HERE)
AC_OUTPUT(Makefile)

cat > config.nice <<EOT
#!/bin/sh
./configure \
    --with-fop='$FOP' \
    --with-hhc='$HHC' \
    --with-xsltproc='$XSLTPROC' \

EOT
chmod +x config.nice

and a Makefile.in:

FOP=@FOP@
HHC=@HHC@
XSLTPROC=@XSLTPROC@
HERE=@HERE@

# Subdirs that contain docs
DOCS=appendixes chapters reference

XML_CATALOG_FILES=./build/docbook-xsl-1.71.0/catalog.xml
export XML_CATALOG_FILES

all:    entities.ent manual.xml html

clean:
	@echo -e "\n=== Cleaning\n"
	@-rm -f html/*.html html/HTML.manifest pdf/* chm/*.html chm/*.hhp chm/*.hhc chm/*.chm entities.ent .ent
	@echo -e "Done.\n"

dist-clean:
	@echo -e "\n=== Restoring defaults\n"
	@-rm -rf .ent autom4te.cache config.* configure Makefile html/*.html html/HTML.manifest pdf/* chm/*.html chm/*.hhp chm/*.hhc chm/*.chm build/docbook-xsl-1.71.0
	@echo -e "Done.\n"

# Regenerate entities.ent only when its content actually changes,
# so dependent targets are not rebuilt needlessly
entities.ent: ./build/mkentities.sh $(DOCS)
	@echo -e "\n=== Creating entities\n"
	@./build/mkentities.sh $(DOCS) > .ent
	@if [ ! -f entities.ent ] || ! cmp -s entities.ent .ent; then mv .ent entities.ent ; fi
	@echo -e "Done.\n"

# Build the docs in chm format

chm:    chm/htmlhelp.hhp
	@echo -e "\n=== Creating CHM\n"
	@echo logo.png >> chm/htmlhelp.hhp
	@echo arrow.gif >> chm/htmlhelp.hhp
	@-cd chm && "$(HHC)" htmlhelp.hhp
	@echo -e "Done.\n"

chm/htmlhelp.hhp: entities.ent build/docbook-xsl manual.xml build/chm.xsl
	@echo -e "\n=== Creating input for CHM\n"
	@"$(XSLTPROC)" --output ./chm/index.html ./build/chm.xsl manual.xml

# Build the docs in HTML format

html: html/index.html

html/index.html: entities.ent build/docbook-xsl manual.xml build/html.xsl
	@echo -e "\n=== Creating HTML\n"
	@"$(XSLTPROC)" --output ./html/index.html ./build/html.xsl manual.xml
	@echo -e "Done.\n"

# Build the docs in PDF format

pdf:    pdf/manual.fo
	@echo -e "\n=== Creating PDF\n"
	@"$(FOP)" ./pdf/manual.fo ./pdf/manual.pdf
	@echo -e "Done.\n"

pdf/manual.fo: entities.ent build/docbook-xsl manual.xml build/pdf.xsl
	@echo -e "\n=== Creating input for PDF\n"
	@"$(XSLTPROC)" --output ./pdf/manual.fo ./build/pdf.xsl manual.xml

# Validate the manual against its DTD without producing output
check: manual.xml
	@echo -e "\n=== Checking correctness of manual\n"
	@xmllint --valid --noout --postvalid manual.xml
	@echo -e "Done.\n"

# need to touch the dir because the timestamp in the tarball
# is older than that of the tarball :)
build/docbook-xsl: build/docbook-xsl-1.71.0.tar.gz
	@echo -e "\n=== Un-taring docbook-xsl\n"
	@cd build && tar xzf docbook-xsl-1.71.0.tar.gz && touch docbook-xsl-1.71.0

to automate the production of the above mentioned file outputs.
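
For readers who have not used this style of build before, a typical session with the two files above might look like the following sketch. The FOP location (/opt/fop/fop.sh), the `run` helper and the `DRY_RUN` switch are hypothetical and only there for illustration; it also assumes the installed autoconf still accepts a file named configure.in. With the default `DRY_RUN=1` the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: generate configure, point it at FOP, then validate and build.
# DRY_RUN=1 (default, an illustrative assumption) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

bootstrap_and_build() {
    fop_path="$1"
    shift
    # configure.in -> configure
    run autoconf || return 1
    # configure substitutes @FOP@, @HHC@ and @XSLTPROC@ into Makefile
    run ./configure --with-fop="$fop_path" || return 1
    # Validate first, then build each requested output format
    run make check || return 1
    for target in "$@"; do
        run make "$target" || return 1
    done
}

# /opt/fop/fop.sh is a hypothetical install location
bootstrap_and_build "${FOP_PATH:-/opt/fop/fop.sh}" html pdf
```

Because configure.in also writes config.nice, the same configuration can later be repeated by running ./config.nice instead of retyping the options.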

I prefer a *nix approach to the scripting simply because the toolset is easier to find and use, not to mention easier to chain.

Answer by Oliver Drotbohm

We use XMLmind XmlEdit for editing and Maven's docbkx plugin to create output during our builds. For a set of good templates, take a look at the ones Hibernate or Spring provide.

Answer by bortzmeyer

For HTML output, I use the DocBook XSL stylesheets with the XSLT processor xsltproc.

For PDF output, I use dblatex, which translates to LaTeX and then uses pdflatex to compile it to PDF. (I used Jade, the DSSSL stylesheets and jadetex before.)
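
A minimal sketch of this two-tool setup could look like the following. The stylesheet URL is the conventional DocBook XSL address and is assumed to resolve locally through an XML catalog; the `run` helper, the `DRY_RUN` switch and the output file naming are illustrative assumptions, not part of the answer. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch: HTML via xsltproc + DocBook XSL, PDF via dblatex.
# DRY_RUN=1 (default, an illustrative assumption) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Conventional stylesheet URL; assumed to be mapped to a local copy by a catalog
HTML_XSL="${HTML_XSL:-http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

render() {
    src="$1"
    base="${src%.xml}"
    # DocBook XML -> single HTML file
    run xsltproc --output "$base.html" "$HTML_XSL" "$src" || return 1
    # DocBook XML -> LaTeX -> PDF (dblatex drives pdflatex itself)
    run dblatex -o "$base.pdf" "$src"
}

render "${1:-manual.xml}"
```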

Answer by Verhagen

We use

  • Serna XML Editor
  • Eclipse (plain xml editing, mostly used by the technical people)
  • own specific Eclipse plug-in (just for our release-notes)
  • Maven docbkx plug-in
  • Maven jar with specific corporate style sheet, based on the standard docbook style-sheets
  • Maven plug-in for converting csv to DocBook table
  • Maven plug-in for extracting BugZilla data and creating a DocBook section from it
  • Hudson (to generate the PDF document(s))
  • Nexus to deploy the created PDF documents

Some ideas we have:

Deploy with each product version not only the PDF but also the original, complete DocBook document (we partly write the documents and partly generate them). Saving the full DocBook document makes it independent of future changes in the system setup. If the systems from which the content was extracted change (or are replaced by different systems), we would no longer be able to regenerate the exact content. That could cause a problem if we ever needed to re-release the whole product range of manuals with a different style sheet. It is the same as with the jars: the compiled Java classes are placed in Nexus (you do not want to store them in your SCM), and we would do the same with the generated DocBook documents.

Update:

We have just created a Maven HTML Cleaner Plug-in, which makes it possible to add DocBook content to a Maven project site (beta version available). Feedback is welcome through the Open Discussion Forum.

Answer by Dick

The DocBook stylesheets, plus FOP, work well, but I finally decided to spring for RenderX, which covers the standard more thoroughly and has some nice extensions that the DocBook stylesheets take advantage of.

Bob Stayton's book, DocBook XSL: The Complete Guide, describes several alternate tool chains, including ones that work on Linux or Windows (almost surely MacOS, too, though I have not personally used a Mac).

Answer by Upendra

The article called "The DocBook toolchain" might be useful as well. It is a section of a HOWTO on DocBook written by Eric Raymond.

Answer by Ismael Olea

I've been using two CLI utils for simplifying my docbook toolchain: xmlto and publican.

Publican looks elegant to me, but it is fitted rather closely to the Fedora and Red Hat publication needs.
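
To give an idea of how little typing these two wrappers need, here is a hedged sketch. The output directory name `out`, the `en-US` language code, the `run` helper and the `DRY_RUN` switch are assumptions for illustration; the publican step is assumed to run inside a book directory that Publican itself set up. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch: one-command builds with xmlto and publican.
# DRY_RUN=1 (default, an illustrative assumption) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_with_xmlto() {
    src="$1"
    outdir="$2"
    # Chunked HTML goes into its own directory, the PDF next to it
    run xmlto -o "$outdir/html" html "$src" || return 1
    run xmlto -o "$outdir" pdf "$src"
}

build_with_publican() {
    lang="$1"
    # Run from inside a Publican book directory
    run publican build --formats=html,pdf --langs="$lang"
}

build_with_xmlto "${1:-manual.xml}" out
build_with_publican en-US
```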

Answer by Liz Fraley

With FOP you get the features that someone decided they wanted badly enough to implement. I'd say that no one who's serious about publishing uses it in production. You're far better off with RenderX, Antenna House, or Arbortext. (I've used them all over the last decade's worth of implementation projects.) It also depends on your business requirements, how much you want to automate, and what your team's skills, time, and resources are like. It's not just a technology question.

Answer by uman

If you're on Red Hat, Ubuntu, or Windows, you could take a look at Publican, which is supposed to be a fairly complete command line toolchain. Red Hat uses it extensively.

Answer by Palmin

Regarding the question about Apache's FOP: when we established our toolchain (similar to what Gustavo has suggested), we had very good results using the RenderX XEP engine. XEP's output looks a little more polished and, as far as I recall, FOP had some problems with tables (this was a few years ago, though, so it might have changed).
