Converting HTML to PDF with iTextSharp



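Edit: A perfect solution was posted below (I was closing the streams in the wrong order). In the end I went with the open-source alternative of Premailer.Net, HtmlAgilityPack, and wkhtmltopdf, because it suits my needs better.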

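I am trying to plug iTextSharp into C# to convert HTML to PDF files, including resolving relative URIs for links and images. I have a very basic implementation of "Changing the default configuration" (http://demo.itextsupport.com/xmlworker/xmlworker/itextdoc/flatsite.html), converted from Java to C#, just to try things out. However, when I feed the script my sample HTML (which I have tested), the resulting PDF contains only the following when I open it in a text editor: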

%PDF-1.4
%âãÏÓ

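This seems wrong. The memory stream also has only a few bytes in it. Is my implementation of iTextSharp wrong, or is it my use of streams or some other C# construct?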

using System.IO;
using System.Text;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml.html;
using iTextSharp.tool.xml.pipeline.html;
using iTextSharp.tool.xml;
using iTextSharp.tool.xml.parser;
using iTextSharp.tool.xml.pipeline.css;
using iTextSharp.tool.xml.pipeline.end; 
class Program
{
    static void Main(string[] args)
    {
        FontFactory.RegisterDirectories();
        var document = new Document();
        var memoryStream = new MemoryStream();
        var pdfWriter = PdfWriter.GetInstance(document, memoryStream );
        document.Open();
        var htmlContext = new HtmlPipelineContext(null);
        htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
        htmlContext.SetImageProvider(new ImageProvider());
        htmlContext.SetLinkProvider(new LinkProvider());
        htmlContext.CharSet(Encoding.UTF8);
        var cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(true);
        var pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(document, pdfWriter)));
        var xmlWorker = new XMLWorker(pipeline, true);
        var xmlParser = new XMLParser(xmlWorker);
        var inputFileStream = new FileStream("testHTML.html", FileMode.Open);
        xmlParser.Parse(inputFileStream);
        inputFileStream.Close();
        memoryStream.Position = 0;
        pdfWriter.CloseStream = false;
        var outputFileStream = new FileStream("testOutput.pdf", FileMode.Create, FileAccess.Write);
        memoryStream.WriteTo(outputFileStream);
        outputFileStream.Close();
        document.Close();
    }
}
class ImageProvider : AbstractImageProvider
{
    public override string GetImageRootPath()
    {
        return "testDir/";
    }
}
class LinkProvider : ILinkProvider
{
    public string GetLinkRoot()
    {
        return "http://www.examplesite.com/testdir/";
    }
}

Thank you very much for your time and help!

You grab the contents of the memory stream before closing the iText document:

    memoryStream.WriteTo(outputFileStream);
    outputFileStream.Close();
    document.Close();

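but iText only finishes the output PDF when the document is closed; in particular, that is when it flushes the content of the current page and adds the cross-reference table and so on.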

Therefore, change your code from

    memoryStream.Position = 0;
    pdfWriter.CloseStream = false;
    var outputFileStream = new FileStream("testOutput.pdf", FileMode.Create, FileAccess.Write);
    memoryStream.WriteTo(outputFileStream);
    outputFileStream.Close();
    document.Close();

to this:

    pdfWriter.CloseStream = false;
    document.Close();
    var outputFileStream = new FileStream("testOutput.pdf", FileMode.Create, FileAccess.Write);
    memoryStream.Position = 0;
    memoryStream.WriteTo(outputFileStream);
    outputFileStream.Close();
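
As a variation on the same ordering fix, here is a minimal sketch that closes the document first and only then reads the bytes out of the memory stream. It is not the code from the question: the class name `PdfOrderingSketch`, the method `BuildPdf`, and the single `Paragraph` (a stand-in for the `xmlParser.Parse(...)` step) are assumptions made for illustration. It relies on `MemoryStream.ToArray()` still working after the stream has been closed, so `CloseStream` can stay at its default.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

static class PdfOrderingSketch
{
    // Builds a small PDF in memory and returns the finished bytes.
    public static byte[] BuildPdf()
    {
        using (var memoryStream = new MemoryStream())
        {
            var document = new Document();
            PdfWriter.GetInstance(document, memoryStream);
            document.Open();

            // Stand-in for the xmlParser.Parse(...) step from the question.
            document.Add(new Paragraph("Hello, PDF"));

            // Close() is what completes the PDF (page content, cross-reference
            // table). By default it also closes memoryStream.
            document.Close();

            // ToArray() returns the full buffer even though the stream is closed.
            return memoryStream.ToArray();
        }
    }

    static void Main()
    {
        // Write the bytes only after the document has been closed.
        File.WriteAllBytes("testOutput.pdf", BuildPdf());
    }
}
```

The key point is unchanged from the fix above: nothing is copied out of the memory stream until `document.Close()` has returned.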
