C# iTextSharp parsing HTML with images in it: it parses correctly but won't show images

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/9611535/

Time: 2020-08-09 08:09:29  Source: igfitidea

c# asp.net itextsharp

Asked by sazr

I am trying to generate a PDF from HTML using the iTextSharp library. I am able to create the PDF with the HTML text converted to PDF text/paragraphs.

My problem: the PDF does not show my images (the img elements from the HTML). None of the img elements in my HTML get displayed in the PDF. Is it possible for iTextSharp to parse HTML and display images? I really hope so, otherwise I am stuffed :(

I am linking to the correct directory where the images are (using IMG_BASEURL), but they are just not showing.

My code:

// mainContents variable is a string containing my HTML
var document = new Document(PageSize.A4, 50, 50, 80, 100);
var output = new MemoryStream();
var writer = PdfWriter.GetInstance(document, output);
document.Open();

Hashtable providers = new Hashtable();
providers.Add("img_baseurl","C:/users/xx/VisualStudio/Projects/myproject/");
var parsedHtmlElements = HTMLWorker.ParseToList(new StringReader(mainContents), null, providers);
foreach (var htmlElement in parsedHtmlElements)
   document.Add(htmlElement as IElement);

document.Close();

Answered by Chris Haas

Every time I've encountered this, the problem was that the image was too large for the canvas. More specifically, even a naked IMG tag will internally get wrapped in a Chunk, which in turn gets wrapped in a Paragraph, and I think the image is overflowing the Paragraph, but I'm not 100% sure.

The two easy fixes are to either enlarge the canvas or specify image dimensions on the HTML IMG tag. The third, more complex route is to use an additional provider, IMG_PROVIDER. To do this you need to implement the IImageProvider interface. Below is a very simple version of one:

    public class ImageThing : IImageProvider {
        //Store a reference to the main document so that we can access the page size and margins
        private Document MainDoc;
        //Constructor
        public  ImageThing(Document doc) {
            this.MainDoc = doc;
        }
        Image IImageProvider.GetImage(string src, IDictionary<string, string> attrs, ChainedProperties chain, IDocListener doc) {
            //Prepend the src tag with our path. NOTE, when using HTMLWorker.IMG_PROVIDER, HTMLWorker.IMG_BASEURL gets ignored unless you choose to implement it on your own
            src = Environment.GetFolderPath(Environment.SpecialFolder.Desktop) + @"\" + src;
            //Get the image. NOTE, this will attempt to download/copy the image, you'd really want to sanity check here
            Image img = Image.GetInstance(src);
            //Make sure we got something
            if (img == null) return null;
            //Determine the usable area of the canvas. NOTE, this doesn't take into account the current "cursor" position so this might create a new blank page just for the image
            float usableW = this.MainDoc.PageSize.Width - (this.MainDoc.LeftMargin + this.MainDoc.RightMargin);
            float usableH = this.MainDoc.PageSize.Height - (this.MainDoc.TopMargin + this.MainDoc.BottomMargin);
            //If the downloaded image is bigger than either width and/or height then shrink it
            if (img.Width > usableW || img.Height > usableH) {
                img.ScaleToFit(usableW, usableH);
            }
            //return our image
            return img;
        }
    }
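
The usable-area arithmetic in the provider above can also serve the simpler fix of putting explicit dimensions on the IMG tag. The following is only a minimal sketch, not part of the original answer: it assumes an A4 page of 595 x 842 points with the margins from the question (50, 50, 80, 100), and the class name `ImgTagSizing`, the helper `FitFactor`, and the 1200 x 900 image size are hypothetical. How HTMLWorker interprets the width and height units is not covered here, so treat the numbers as illustrative.

```csharp
using System;
using System.Globalization;

public static class ImgTagSizing
{
    // Hypothetical helper: returns the factor (at most 1) by which a w x h image
    // must be scaled so that it fits inside maxW x maxH with its aspect ratio kept.
    public static float FitFactor(float w, float h, float maxW, float maxH)
    {
        return Math.Min(1f, Math.Min(maxW / w, maxH / h));
    }

    public static void Main()
    {
        // Assumed page geometry: A4 (595 x 842 points) minus the question's margins.
        float usableW = 595f - (50f + 50f);   // 495
        float usableH = 842f - (80f + 100f);  // 662

        // Hypothetical source image size.
        float imgW = 1200f, imgH = 900f;
        float scale = FitFactor(imgW, imgH, usableW, usableH);

        // Build an IMG tag that carries the scaled dimensions explicitly.
        string html = string.Format(CultureInfo.InvariantCulture,
            "<img src=\"Untitled-1.png\" width=\"{0:0}\" height=\"{1:0}\" />",
            imgW * scale, imgH * scale);
        Console.WriteLine(html);
    }
}
```

Because the factor is capped at 1, images that already fit are left at their original size, which mirrors the `if` guard around `ScaleToFit` in the provider.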

To use this provider just add it to the provider collection like you did with HTMLWorker.IMG_BASEURL:

providers.Add(HTMLWorker.IMG_PROVIDER, new ImageThing(doc));

Note that if you use HTMLWorker.IMG_PROVIDER, you are responsible for figuring out everything about the image. The code above assumes that all image paths need to be prepended with a constant string; you'll probably want to update this and check for HTTP at the start. Also, because we're saying that we want to handle the image processing pipeline completely, the HTMLWorker.IMG_BASEURL provider is no longer needed.
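
As a rough illustration of the "check for HTTP at the start" suggestion, the path handling could be pulled into a small helper. This is only a sketch under assumptions: `ImageSrcResolver`, `Resolve`, and the example URL are hypothetical names introduced here, not part of iTextSharp or the original answer.

```csharp
using System;
using System.IO;

public static class ImageSrcResolver
{
    // Hypothetical helper: leaves absolute http/https URLs untouched and treats
    // everything else as a path relative to baseFolder.
    public static string Resolve(string src, string baseFolder)
    {
        if (src.StartsWith("http://", StringComparison.OrdinalIgnoreCase) ||
            src.StartsWith("https://", StringComparison.OrdinalIgnoreCase))
        {
            return src;
        }
        // Path.Combine returns its second argument unchanged when that argument
        // is already a rooted path, so rooted file paths also pass through as-is.
        return Path.Combine(baseFolder, src);
    }

    public static void Main()
    {
        string baseFolder = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
        Console.WriteLine(Resolve("Untitled-1.png", baseFolder));
        Console.WriteLine(Resolve("http://example.com/logo.png", baseFolder));
    }
}
```

Inside `GetImage`, a call such as `src = ImageSrcResolver.Resolve(src, baseFolder);` could then stand in for the hard-coded string concatenation, with `baseFolder` chosen by you.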

The main code loop would now look something like this:

        string html = @"<img src=""Untitled-1.png"" />";
        string outputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "HtmlTest.pdf");
        using (FileStream fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
            using (Document doc = new Document(PageSize.A4, 50, 50, 80, 100)) {
                using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
                    doc.Open();
                    using (StringReader sr = new StringReader(html)) {
                        System.Collections.Generic.Dictionary<string, object> providers = new System.Collections.Generic.Dictionary<string, object>();
                        providers.Add(HTMLWorker.IMG_PROVIDER, new ImageThing(doc));

                        var parsedHtmlElements = HTMLWorker.ParseToList(sr, null,  providers);
                        foreach (var htmlElement in parsedHtmlElements) {
                            doc.Add(htmlElement as IElement);
                        }
                    }
                    doc.Close();
                }
            }
        }

One last thing: make sure to specify which version of iTextSharp you are targeting when posting here. The code above targets iTextSharp 5.1.2.0, but I think you might be using the 4.X series.

Answered by Guru Raja

string siteUrl = HttpContext.Current.Server.MapPath("/images/image/ticket/Ticket.jpg");
string HTML = "<table><tr><td><u>asdasdsadasdsa <img src='" + siteUrl + "' alt='tt' /> </u></td></tr></table>";

Answered by Fourat

I faced the same problem and tried the following proposed solutions: string-replacing the tag, encoding the image in base64, and embedding the image in a .NET class library, but none of them worked! So I fell back on the old-fashioned solution: adding the logo manually with doc.Add().
Here's your code updated:

        string html = @"<img src=""Untitled-1.png"" />";
        string outputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "HtmlTest.pdf");
        using (FileStream fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
            using (Document doc = new Document(PageSize.A4, 50, 50, 80, 100)) {
                using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
                    doc.Open();
                    using (StringReader sr = new StringReader(html)) {
                        System.Collections.Generic.Dictionary<string, object> providers = new System.Collections.Generic.Dictionary<string, object>();
                        providers.Add(HTMLWorker.IMG_PROVIDER, new ImageThing(doc));

                        var parsedHtmlElements = HTMLWorker.ParseToList(sr, null,  providers);
                        foreach (var htmlElement in parsedHtmlElements) {
                            doc.Add(htmlElement as IElement);
                        }
                        // here's the magic: add the logo to the document by hand
                        // (Server.MapPath needs an ASP.NET page or HttpContext to resolve the path)
                        var logo = iTextSharp.text.Image.GetInstance(Server.MapPath("~/HTMLTemplate/logo.png"));
                        logo.SetAbsolutePosition(440, 800);
                        doc.Add(logo);
                        // end
                    }
                    doc.Close();
                }
            }
        }