docx4java's SectionWrapper.getHeaderFooterPolicy -- can I use this to remove headers & footers

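Rewritten to look more like a programming question

OK, so I've done some more research, and it looks like the Java package I need is docx4j. Unfortunately, because I'm unfamiliar with both the package and the underpinnings of the PDF format, I'm having trouble working out how to use the headers and footers returned by SectionWrapper.getHeaderFooterPolicy(). It isn't entirely clear whether the returned HeaderPart and FooterPart objects are writable, nor how to modify them.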

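This code provides an example of how to create a header part, but it creates a new HeaderPart and adds it to the document.

I want to find the existing header/footer parts and remove them if possible, or otherwise empty them. Ideally, they would be removed from the file entirely.

This code is similar, and lets you set the text of a header part using setJaxbElement, but much of the terminology is unfamiliar, and I'm worried the end result would be that I create headers (albeit empty ones) in every document rather than removing them.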

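Original question below

I'm working with a set of MS Word documents that vary widely. I'm compiling them into a single PDF file, and before doing so I want to make sure that none of them has headers or footers.

Ideally, I'd also like to override their default font if it isn't Times New Roman.

Is there any way to do this programmatically, or with some kind of batch process?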

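I'll be running this on a Windows server that currently has neither Office nor Word installed (though I think it may have OpenOffice, and of course it would be easy to add an install).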

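Right now I'm using some version of iText (Java) to convert the files to PDF. I know that iText apparently can't do things like remove headers/footers, but since the underlying structure of modern Word files (.docx) is XML, I'm wondering whether there's an API (even an XML parsing/editing API or, if all else fails, RegEx [the horror]) that can remove the headers and footers and set some default styles.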

Here is some hot-off-the-press code to do what you want:

public class HeaderFooterRemove {

    public static void main(String[] args) throws Exception {
        // A docx file, or a directory containing docx files
        String inputpath = System.getProperty("user.dir") + "/testHF.docx";
        StringBuilder sb = new StringBuilder();
        File dir = new File(inputpath);
        if (dir.isDirectory()) {
            String[] files = dir.list();
            for (int i = 0; i < files.length; i++) {
                if (files[i].endsWith("docx")) {
                    sb.append("\n\n" + files[i] + "\n");
                    removeHFFromFile(new java.io.File(inputpath + "/" + files[i]));
                }
            }
        } else if (inputpath.endsWith("docx")) {
            sb.append("\n\n" + inputpath + "\n");
            removeHFFromFile(new java.io.File(inputpath));
        }
        // Report which files were processed
        System.out.println(sb.toString());
    }

    // Note: the input file is overwritten in place
    public static void removeHFFromFile(File f) throws Exception {

        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(f);
        MainDocumentPart mdp = wordMLPackage.getMainDocumentPart();

        // Remove the header/footer references from every sectPr
        SectPrFinder finder = new SectPrFinder(mdp);
        new TraversalUtil(mdp.getContent(), finder);
        for (SectPr sectPr : finder.getSectPrList()) {
            sectPr.getEGHdrFtrReferences().clear();
        }

        // Collect the header/footer rels first, then remove them,
        // so the list is not modified while it is being iterated
        List<Relationship> hfRels = new ArrayList<Relationship>();
        for (Relationship rel : mdp.getRelationshipsPart().getRelationships().getRelationship()) {
            if (rel.getType().equals(Namespaces.HEADER)
                    || rel.getType().equals(Namespaces.FOOTER)) {
                hfRels.add(rel);
            }
        }
        for (Relationship rel : hfRels) {
            mdp.getRelationshipsPart().removeRelationship(rel);
        }

        wordMLPackage.save(f);
    }
}

The code above relies on SectPrFinder, so copy that somewhere.

For brevity, I've left out the imports, but you can copy those from GitHub.
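In XML terms, clearing getEGHdrFtrReferences() on each SectPr amounts, as far as I can tell, to dropping the w:headerReference and w:footerReference children of every w:sectPr in word/document.xml. For anyone who wants to see that step without docx4j, here is a minimal sketch using only the JDK's DOM parser. The class name SectPrReferenceStripper and its method names are made up for illustration, and writing the modified XML back into the package is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of docx4j.
class SectPrReferenceStripper {
    // WordprocessingML main namespace (transitional), as used by word/document.xml.
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Parses the text of document.xml into a namespace-aware DOM tree. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Removes every w:headerReference and w:footerReference; returns how many were removed. */
    static int stripReferences(Document doc) {
        // NodeList is live, so collect the nodes first and remove them afterwards.
        List<Node> doomed = new ArrayList<>();
        for (String name : new String[] {"headerReference", "footerReference"}) {
            NodeList hits = doc.getElementsByTagNameNS(W_NS, name);
            for (int i = 0; i < hits.getLength(); i++) {
                doomed.add(hits.item(i));
            }
        }
        for (Node n : doomed) {
            n.getParentNode().removeChild(n);
        }
        return doomed.size();
    }
}
```

Everything else inside the sectPr (page size, margins and so on) is left untouched, which matches what the docx4j code does.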

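When it comes to making one PDF out of a set of docx files, you can obviously either merge them into a single docx and convert that to PDF, or convert them all to PDF and then merge those PDFs. If you prefer the former approach (for example, because end users want to be able to edit the document package), you may want to consider MergeDocx, our commercial extension for docx4j.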

To remove the headers/footers, there is a very simple solution:

Open the docx as a zip and delete the files named header*.xml / footer*.xml (located in the word folder).
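A rough sketch of that step with java.util.zip, in case it helps. It copies the package entry by entry and skips anything that looks like word/header*.xml or word/footer*.xml. The class name HeaderFooterPartFilter and its methods are invented for this example, and the pattern assumes the conventional part names; a producer could name its header and footer parts differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration only.
class HeaderFooterPartFilter {
    // Assumed naming convention: word/header1.xml, word/footer2.xml, ...
    private static final Pattern HF_PART = Pattern.compile("word/(header|footer)\\d*\\.xml");

    static boolean isHeaderOrFooterPart(String entryName) {
        return HF_PART.matcher(entryName).matches();
    }

    /** Returns a copy of the zipped package with the header/footer parts left out. */
    static byte[] withoutHeaderFooterParts(byte[] docx) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(docx));
             ZipOutputStream zout = new ZipOutputStream(out)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (isHeaderOrFooterPart(entry.getName())) {
                    continue; // drop this part
                }
                zout.putNextEntry(new ZipEntry(entry.getName()));
                zin.transferTo(zout); // copies the current entry only (Java 9+)
                zout.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

With a real file you would read the bytes with java.nio.file.Files.readAllBytes and write the result back with Files.write. On its own this leaves the relationships pointing at parts that no longer exist.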

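Structure of an unzipped docx: https://stackoverflow.com/tags/docx/info

To really remove the links (if you don't, the document will probably be corrupt):

You need to edit the document.xml.rels file and remove all Relationships that refer to a footer or header. This is the kind of Relationship you should remove: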

<Relationship Id="rId13" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer" Target="footer2.xml"/>

More generally, remove every Relationship whose Type ends in footer or header.
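The relationship cleanup could be sketched like this with the JDK's DOM parser. It takes the text of document.xml.rels, removes the Relationship elements whose Type is the header or footer relationship type, and reports the removed Ids. The class name HeaderFooterRelsFilter is hypothetical, the Type URIs are the transitional ones shown in the example above, and serializing the result back into the zip is not shown.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only.
class HeaderFooterRelsFilter {
    // Namespace of the elements in a .rels part.
    static final String PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships";
    // Prefix shared by the header and footer relationship types.
    static final String REL_BASE = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/";

    /** Parses the text of document.xml.rels into a namespace-aware DOM tree. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Removes header/footer Relationship elements and returns their Ids in document order. */
    static List<String> removeHeaderFooterRels(Document rels) {
        // NodeList is live, so collect matches before removing anything.
        List<Element> doomed = new ArrayList<>();
        NodeList all = rels.getElementsByTagNameNS(PKG_REL_NS, "Relationship");
        for (int i = 0; i < all.getLength(); i++) {
            Element rel = (Element) all.item(i);
            String type = rel.getAttribute("Type");
            if (type.equals(REL_BASE + "header") || type.equals(REL_BASE + "footer")) {
                doomed.add(rel);
            }
        }
        List<String> removedIds = new ArrayList<>();
        for (Element rel : doomed) {
            removedIds.add(rel.getAttribute("Id"));
            rel.getParentNode().removeChild(rel);
        }
        return removedIds;
    }
}
```

The returned Ids are the ones that header and footer references in document.xml point to via r:id, so those references need to go as well; the docx4j code earlier on this page does that with sectPr.getEGHdrFtrReferences().clear().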
