Identifying specific PDF field types with iText 7



Using this sample PDF form: http://foersom.com/net/HowTo/data/OoPdfFormExample.pdf

this code:

public String getPdfFieldNames() throws IOException {
    if (pdf == null || pdf.isClosed()) {
        throwPdfNotOpenException();
    }
    if (getPdfFormType().equals("XFA")) {
        throwXfaNotSupportedException();
    }
    String s = "";
    Map<String, PdfFormField> map = form.getFormFields();
    for (String key : map.keySet()) {
        String simpleFieldType = getSimpleFieldType(form.getField(key));
        s += "[Field name: " + key + ", Field type: " + simpleFieldType + "]\n";
    }
    s = (s.substring(0, s.length() - 1));
    return s;
}

private String getSimpleFieldType(PdfFormField field) {
    if (field.getFormType() == PdfName.Tx) {
        return "text box";
    } else if (field.getFormType() == PdfName.Ch) {
        return "check box";
    } else if (field.getFormType() == PdfName.Btn) {
        return "button";
    } else {
        return field.getFormType().toString();
    }
    // also do radio button
}

produces the following result:

[Field name: Given Name Text Box, Field type: text box]
[Field name: Family Name Text Box, Field type: text box]
[Field name: Address 1 Text Box, Field type: text box]
[Field name: House nr Text Box, Field type: text box]
[Field name: Address 2 Text Box, Field type: text box]
[Field name: Postcode Text Box, Field type: text box]
[Field name: City Text Box, Field type: text box]
[Field name: Country Combo Box, Field type: check box]
[Field name: Gender List Box, Field type: check box]
[Field name: Height Formatted Field, Field type: text box]
[Field name: Driving License Check Box, Field type: button]
[Field name: Language 1 Check Box, Field type: button]
[Field name: Language 2 Check Box, Field type: button]
[Field name: Language 3 Check Box, Field type: button]
[Field name: Language 4 Check Box, Field type: button]
[Field name: Language 5 Check Box, Field type: button]
[Field name: Favourite Colour List Box, Field type: check box]

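As you can see, the text boxes are correct, but the drop-down lists are reported as check boxes and the check boxes are reported as buttons.

I found out how to identify the specific field types.

Updated methods: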

import com.itextpdf.forms.fields.PdfButtonFormField;
import com.itextpdf.forms.fields.PdfFormField;
import com.itextpdf.kernel.pdf.PdfName;
import java.io.IOException;
import java.util.Map;

// Both methods are members of the same wrapper class as before;
// pdf, form and the throw...Exception() helpers are defined there.
public String getPdfFieldNames() throws IOException {
    if (pdf == null || pdf.isClosed()) {
        throwPdfNotOpenException();
    }
    if (getPdfFormType().equals("XFA")) {
        throwXfaNotSupportedException();
    }
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, PdfFormField> entry : form.getFormFields().entrySet()) {
        if (sb.length() > 0) {
            // Separator between entries, so nothing has to be trimmed afterwards
            // (the old substring call failed on a form without fields).
            sb.append('\n');
        }
        sb.append("[Field name: ").append(entry.getKey())
          .append(", Field type: ").append(getSimpleFieldType(entry.getValue()))
          .append(']');
    }
    return sb.toString();
}

private String getSimpleFieldType(PdfFormField field) {
    PdfName type = field.getFormType();
    if (PdfName.Btn.equals(type)) {
        // Btn covers push buttons, radio buttons and check boxes.
        PdfButtonFormField button = (PdfButtonFormField) field;
        if (button.isPushButton()) {
            return "Push Button";
        } else if (button.isRadio()) {
            return "Radio Button";
        } else {
            return "Check Box";
        }
    } else if (PdfName.Ch.equals(type)) {
        // Ch covers both list boxes and combo boxes.
        return "List Box";
    } else if (PdfName.Sig.equals(type)) {
        return "Signature";
    } else if (PdfName.Tx.equals(type)) {
        return "Text Box";
    } else {
        return "Unknown type";
    }
}

The results now show as:

[Field name: Given Name Text Box, Field type: Text Box]
[Field name: Family Name Text Box, Field type: Text Box]
[Field name: Address 1 Text Box, Field type: Text Box]
[Field name: House nr Text Box, Field type: Text Box]
[Field name: Address 2 Text Box, Field type: Text Box]
[Field name: Postcode Text Box, Field type: Text Box]
[Field name: City Text Box, Field type: Text Box]
[Field name: Country Combo Box, Field type: List Box]
[Field name: Gender List Box, Field type: List Box]
[Field name: Height Formatted Field, Field type: Text Box]
[Field name: Driving License Check Box, Field type: Check Box]
[Field name: Language 1 Check Box, Field type: Check Box]
[Field name: Language 2 Check Box, Field type: Check Box]
[Field name: Language 3 Check Box, Field type: Check Box]
[Field name: Language 4 Check Box, Field type: Check Box]
[Field name: Language 5 Check Box, Field type: Check Box]
[Field name: Favourite Colour List Box, Field type: List Box]
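
One thing this output still glosses over: combo boxes and list boxes are both Ch fields, so "Country Combo Box" is labelled "List Box". According to the PDF specification, a choice field is a combo box when bit 18 (Combo) of its Ff field flags is set, and a list box otherwise. Below is a minimal, stdlib-only sketch of that check. The class and method names are made up for illustration, and the flag value is passed in as a plain int instead of being read from a real PDF. I believe iText 7 exposes the same information through PdfChoiceFormField.isCombo(), but please verify that against the version you use.

```java
/** Hypothetical helper: tells combo boxes from list boxes via the Ff field flags. */
final class ChoiceFieldKind {
    // Bit position 18 (counted from 1) of Ff is the Combo flag for choice fields.
    static final int FF_COMBO = 1 << 17;

    /** Returns a label for a Ch field, given the integer value of its Ff entry. */
    static String describe(int fieldFlags) {
        return (fieldFlags & FF_COMBO) != 0 ? "Combo Box" : "List Box";
    }

    public static void main(String[] args) {
        System.out.println(describe(FF_COMBO)); // Combo flag set
        System.out.println(describe(0));        // no flags set
    }
}
```

In the method above, the same idea would go into the Ch branch, so that the two kinds of choice field get different labels.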

iText 7 returns the correct types; it is just that your code

private String getSimpleFieldType(PdfFormField field) {
    if (field.getFormType() == PdfName.Tx) {
        return "text box";
    } else if (field.getFormType() == PdfName.Ch) {
        return "check box";
    } else if (field.getFormType() == PdfName.Btn) {
        return "button";
    } else {
        return field.getFormType().toString();
    }
    // also do radio button
}

misinterprets the information.

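getFormType returns the form field type name as defined by the PDF specification. ISO 32000-2 describes the field types in Table 226, "Entries common to all field dictionaries":

The type of field that this dictionary describes:

Btn Button (see 12.7.5.2, "Button fields")

Tx Text (see 12.7.5.3, "Text fields")

Ch Choice (see 12.7.5.4, "Choice fields")

Sig (PDF 1.3) Signature (see 12.7.5.5, "Signature fields")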

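So PdfName.Ch indicates a choice field (a list box or a combo box, not a check box), and PdfName.Btn indicates a button field of any flavor. Again according to ISO 32000-2, this time section 12.7.5.2, "Button fields":

A button field (field type Btn) represents an interactive control on the screen that the user can manipulate with the mouse. There are three types of button fields:

  • A push-button is a purely interactive control that responds immediately to user input without retaining a permanent value (see 12.7.5.2.2, "Push-buttons").
  • A check box toggles between two states, on and off (see 12.7.5.2.3, "Check boxes").
  • Radio button fields contain a set of related buttons that can each be on or off. Typically, at most one radio button in a set may be on at any given time, and selecting any one of the buttons automatically deselects all the others. (There are exceptions to this rule, as noted in 12.7.5.2.4, "Radio buttons".)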
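
To make the three flavors concrete: the specification tells them apart through the Ff field flags of the Btn field. Bit 17 (Pushbutton) marks a push-button, bit 16 (Radio) marks a set of radio buttons, and with both clear the field is a check box. The following is only a sketch of that decision under those assumptions. ButtonFieldKind and classify are hypothetical names, and the flags are supplied as a plain int rather than read from a PDF.

```java
/** Hypothetical helper: classifies a Btn field from the integer value of its Ff entry. */
final class ButtonFieldKind {
    // Bit positions are counted from 1 in the specification.
    static final int FF_RADIO = 1 << 15;       // bit 16: Radio
    static final int FF_PUSH_BUTTON = 1 << 16; // bit 17: Pushbutton

    static String classify(int fieldFlags) {
        if ((fieldFlags & FF_PUSH_BUTTON) != 0) {
            return "Push Button";
        }
        if ((fieldFlags & FF_RADIO) != 0) {
            return "Radio Button";
        }
        // Neither flag set: the button field is a check box.
        return "Check Box";
    }

    public static void main(String[] args) {
        System.out.println(classify(FF_PUSH_BUTTON));
        System.out.println(classify(FF_RADIO));
        System.out.println(classify(0));
    }
}
```

With iText 7 there should be no need to decode the flags by hand, since PdfButtonFormField offers isPushButton() and isRadio() for exactly this purpose.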
