PDF

Purpose

Generate PDFs from HTML, and merge, parse, and split existing PDF files.

Methods

Binding name: p6.pdf

Method:

String fromHTML(String html, String targetUri = null)

Generates a PDF from an HTML string and writes it to the location specified by targetUri. The HTML may only use CSS up to version 2.1. The targetUri must point to a local file (i.e. the file: protocol only). If targetUri is null, a temporary file is created. Returns the URI written to (String).
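
A minimal sketch of a fromHTML call follows. The `buildInvoiceHtml` helper, the HTML content and the `/tmp/invoice.pdf` path are illustrative assumptions, not part of the binding. The `binding.hasVariable('p6')` guard is only there so the snippet also runs in a plain Groovy shell where the `p6` binding is not available.

```groovy
// Hypothetical helper: builds a small HTML document that sticks to CSS 2.1
// (no flexbox, grid, or other CSS3 features).
String buildInvoiceHtml(String customer, BigDecimal total) {
    """<html>
  <head>
    <style>
      body { font-family: sans-serif; margin: 20px; }
      h1   { color: #333333; border-bottom: 1px solid #999999; }
    </style>
  </head>
  <body>
    <h1>Invoice</h1>
    <p>Customer: ${customer}</p>
    <p>Total: ${total}</p>
  </body>
</html>"""
}

def html = buildInvoiceHtml('ACME', 125.50)

// The p6 binding only exists inside the hosting runtime, hence the guard.
if (binding.hasVariable('p6')) {
    // Explicit local target (illustrative path).
    def written = p6.pdf.fromHTML(html, 'file:/tmp/invoice.pdf')
    println "PDF written to ${written}"

    // No targetUri: a temporary file is created and its URI is returned.
    def tmpUri = p6.pdf.fromHTML(html)
    println "Temporary PDF at ${tmpUri}"
}
```

Keeping the returned URI is the only way to locate the file when no targetUri is passed.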

Method:

String merge(List<String> sourceUris, String targetUri = null)

Merges the PDFs listed in sourceUris and writes the result to targetUri. The targetUri must point to a local file (i.e. the file: protocol only). If targetUri is null, a temporary file is created. Returns the URI written to (String).
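
A call to merge might look like the sketch below. All file paths are invented for illustration, and the early `assert` on the URI scheme is an optional convenience rather than something the binding requires. As above, the guard lets the snippet run where the `p6` binding is absent.

```groovy
// Illustrative source files; every path here is an assumption.
List<String> sourceUris = [
    'file:/tmp/cover.pdf',
    'file:/tmp/invoice.pdf',
    'file:/tmp/terms.pdf'
]

// The target must be a local file, so fail early on any other scheme.
String targetUri = 'file:/tmp/bundle.pdf'
assert targetUri.startsWith('file:') : 'targetUri must use the file: protocol'

// The p6 binding only exists inside the hosting runtime, hence the guard.
if (binding.hasVariable('p6')) {
    def merged = p6.pdf.merge(sourceUris, targetUri)
    println "Merged ${sourceUris.size()} files into ${merged}"
}
```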

Method:

void parse(Map configuration, Closure rowNotify)

Parses the PDF file specified in the configuration map and calls the given closure once for each row processed, passing the page number and the row. If the closure returns false, iteration stops.

Method:

List<Tuple> parseToList(Map configuration)

Parses the PDF file specified in the configuration map and returns the processed values as a List of Tuples (pageNumber, row).

Configuration Map

Configuration Name	Description
`password`	(Optional) Password used to decrypt the PDF.
`spreadsheetDisabled`	(Optional) When true, spreadsheet-style extraction is not used (that mode suits pages with ruling lines separating each cell, as in a PDF of an Excel spreadsheet). The default is true.
`areaFail`	(Optional) When true, a P6Exception is thrown if a configured area selects no text on a page. Set to false to suppress the exception. The default is true.
`areaN`	(Optional) `N` is a zero-based index (`area0`, `area1`, and so on). If no areas are given, the whole of each page is used as the bounding area. Every area defined is applied to each page specified. Areas are expressed in points as a comma-separated string: `'{top},{left},{width},{height}'`. The values can be read off in Preview on macOS using ‘Rectangular Selection’ mode.
`columnN`	(Optional) `N` is a zero-based index. A comma-separated list of X coordinates of column boundaries.
`uri`	(Mandatory) The URI of the source PDF file to parse.
`pages`	(Optional) A comma-separated string of page numbers to process (e.g. `'1,2'`). If not specified, all pages in the source file are processed.
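
Because the `areaN` and `pages` values are comma-separated strings, it can be convenient to build the map from numbers. The sketch below assumes two hypothetical helpers, `area` and `buildParseConfig`, and an invented file path; neither helper is part of the binding.

```groovy
// Hypothetical helper: formats one area as '{top},{left},{width},{height}' (values in points).
String area(Number top, Number left, Number width, Number height) {
    [top, left, width, height].join(',')
}

// Hypothetical helper: builds a parse configuration map.
//   areas - area strings, stored under the keys area0, area1, ...
//   pages - page numbers, joined into the comma-separated 'pages' string
Map buildParseConfig(String uri, List<String> areas = [], List<Integer> pages = []) {
    def cnf = [uri: uri]
    areas.eachWithIndex { a, i -> cnf["area${i}".toString()] = a }
    if (pages) cnf.pages = pages.join(',')   // omitted entirely when no pages are given
    cnf
}

def parseCnf = buildParseConfig(
    'file:/tmp/sample.pdf',
    [area(402.89, 17.24, 550.29, 64.89), area(30.6, 346.29, 195.95, 150.07)],
    [1, 2]
)
println parseCnf
```

The resulting map can be passed unchanged to parse or parseToList.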

Method:

void split(Map configuration)

Copies a range of pages from a source PDF file to a destination PDF file.

Configuration Map

Configuration Name	Description
`password`	(Optional) Password used to decrypt the PDF.
`keepAnnotations`	(Optional) Set to true to retain any annotations in the destination. The default is false.
`startPage`	(Mandatory) The one-based number of the first page to copy to the destination.
`endPage`	(Mandatory) The one-based number of the last page to copy to the destination. All pages from `startPage` to `endPage` are copied.
`sourceUri`	(Mandatory) The URI of the source PDF file.
`destinationUri`	(Mandatory) The URI of the destination PDF file. An existing destination file is always overwritten.
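
Since each split call copies one page range, breaking a document into several parts means one call per range. The sketch below assumes the total page count is already known to the caller (nothing on this page describes a way to obtain it), and the `pageRanges` helper and file paths are hypothetical.

```groovy
// Hypothetical helper: returns [startPage, endPage] pairs (one-based, inclusive)
// covering pages 1..totalPages in chunks of at most chunkSize pages.
List<List<Integer>> pageRanges(int totalPages, int chunkSize) {
    if (totalPages < 1) return []
    (1..totalPages).collate(chunkSize).collect { [it.first(), it.last()] }
}

// Assumption: the source has 10 pages and is split into parts of up to 4 pages.
def ranges = pageRanges(10, 4)
println ranges

ranges.eachWithIndex { range, i ->
    def splitCnf = [
        startPage: range[0],
        endPage: range[1],
        sourceUri: 'file:/tmp/report.pdf',
        destinationUri: "file:/tmp/report-part${i + 1}.pdf".toString()
    ]
    // The p6 binding only exists inside the hosting runtime, hence the guard.
    if (binding.hasVariable('p6')) {
        p6.pdf.split(splitCnf)
    }
}
```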

Examples

Parse two areas on pages 1 and 2, printing each row and stopping at the first row of page 2:

def cnf = [
    area0: '402.89,17.24,550.29,64.89',
    area1: '30.6,346.29,195.95,150.07',
    pages: '1,2',
    uri: 'file:${P6_DATA}/00140_Facture Alfa.pdf'
]

p6.pdf.parse(cnf) { pageNumber, row ->
    println( pageNumber + ": " + row )
    // Returning false will halt page iteration; the last expression is the closure's return value
    pageNumber != 2
}

Parse one area on every page into a list, without failing on pages where the area selects no text:

def cnf = [
    area0: '402.89,17.24,550.29,64.89',
    areaFail: false,
    uri: 'file:${P6_DATA}/00140_Facture Alfa.pdf'
]

def lstTuples = p6.pdf.parseToList(cnf)

lstTuples.each { tup ->
    println( tup.get(0) + ": " + tup.get(1) )
}
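
The returned tuples can also be regrouped by page. The sketch below uses stand-in data shaped like the documented (pageNumber, row) tuples instead of a real parseToList result; the row values are invented and the actual row type may differ.

```groovy
// Stand-in for the result of p6.pdf.parseToList(cnf); values are illustrative only.
def lstTuples = [
    new Tuple(1, 'Invoice 00140'),
    new Tuple(1, 'Total 125.50'),
    new Tuple(2, 'Terms and conditions')
]

// Group the tuples by page number (element 0), then keep only the rows (element 1).
def tuplesByPage = lstTuples.groupBy { it.get(0) }
Map rowsByPage = tuplesByPage.collectEntries { page, tuples ->
    [(page): tuples.collect { it.get(1) }]
}

rowsByPage.each { page, rows ->
    println "Page ${page}: ${rows.size()} row(s)"
}
```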

Copy pages 3 to 4 of the source file into a new PDF:

def cnf = [
    startPage: 3,
    endPage: 4,
    sourceUri: 'file:${P6_DATA}/00140_Facture Alfa.pdf',
    destinationUri: 'file:/tmp/pages3-4.pdf'
]

p6.pdf.split(cnf)