java 表单导出xml文件,如何将pdf表单字段自动导出到xml

I have a pdf file including form fields and need to export the data into a xml file AUTOMATICALLY. Here is a screen of a sample form I created for testing:

Note: It works great exporting it MANUALLY using Acrobat Professional by clicking on Tools > Form > Export Form Data and finally chose xml extension for file output. This is the result I'm getting when I export it manually:

John

Doe

However, I need to automate it, e.g. with a python script, Java implementation or some command line tools. Any ideas which libraries or tools I could use to export form field data to xml? The tool or library should be open source, that I can integrate it in my workflow.

I already tried python pdfminer library, which helped me to export static parts (like Static form header, First name: and Last name:) of the pdf file: But how to export form field data (in my case the content of the form fields first_name and last_name)??

EDIT: Feel free to download the sample.pdf file here.

解决方案

How about Apache PDFBox? It is open source and could fit your needs, since the website says "Extract forms data from PDF forms or prefill a PDF form."

EDIT: Check out the PrintFields example.

你可能感兴趣的:(java,表单导出xml文件)