CSV

Purpose

A general-purpose parser for comma-separated values (CSV) files, with built-in validation and output templating.

Methods

Binding name: csv


Method: void parse( Map configuration, Closure rowNotify )

Parses the CSV file specified in the configuration map and calls the given closure with each row processed.


Method: List parseToList( Map configuration )

Parses the CSV file specified in the configuration map and returns the processed rows as a List of Maps.


Method: int parseToXml( Map configuration, Closure docNotify )

Parses the CSV file specified in the configuration map and calls the given docNotify closure with each XML document generated. This method returns the number of XML documents generated.

Configuration map

Configuration Name Description
separator Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html
quoteChar Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html
escape Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html
skipLines Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html
strictQuotes Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html
ignoreLeadingWhiteSpace Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html
useFirstLineHeaders Optional. Set to true to read the column names from the first line of the CSV. These column names are then used as keys in every row Map returned. Otherwise the column names are col0..colN. The default is false.
validation Optional. A Map of column numbers to the name of a valid content type. Type names are: date, byte, short, int, long, double, float, email, url and creditcard. See: http://commons.apache.org/proper/commons-validator/apidocs/org/apache/commons/validator/GenericValidator.html. If a value is null or empty, validation is not applied. If validation is applied and fails, a B2boxException is thrown.
groovyTemplate A template created by one of the five Groovy template engines. Mandatory for parseToXml. See: http://docs.groovy-lang.org/next/html/documentation/template-engines.html
uri Mandatory. A file or JCR URI to open and use as the CSV input source (expressed as a string).
funcNext Optional closure. Parameters ( lastrow, currentrow ). Used by parseToXml. Return true to generate an XML document, or false to continue accumulating data.
funcValidate Optional closure. Parameters ( row, column, value ). Called each time a value in a row is validated. Return null or an empty string to signify that the value is valid, or return a message detailing why the validation failed.
funcException Optional closure. Parameters ( row, column, value, message ). Called each time the validation of a value in a row fails. Return true to continue processing with the next row. Return false to exit the processor with a B2boxException.

Closure

A single parameter containing an array of rows is passed to the closure. Each row is a Map keyed by the column names if useFirstLineHeaders is set to true, and by col0..colN otherwise.
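
The row keying described above can be illustrated with a minimal sketch. The helper toRowMap and the sample values are hypothetical and are not part of the csv binding; the sketch only shows what a row Map would look like with and without header names.

```groovy
// Hypothetical helper (not part of the csv binding): builds a row Map
// keyed by header names when they are given, otherwise by col0..colN.
Map toRowMap(List values, List headers = null) {
    def row = [:]
    values.eachWithIndex { value, i ->
        def key = headers ? headers[i] : 'col' + i
        row[key] = value
    }
    row
}

// Keys taken from a header line (as with useFirstLineHeaders: true)
def withHeaders = toRowMap(['Acme', 'A-001'], ['Name', 'UniqueName'])

// Generated keys (as with useFirstLineHeaders: false)
def withoutHeaders = toRowMap(['Acme', 'A-001'])

println withHeaders
println withoutHeaders
```

With headers the values are reachable as row.Name and row.UniqueName; without them, as row.col0 and row.col1.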

Examples

Simple parse of CSV file with notification per row.

def cnf = [
    skipLines: 1,
    separator: ',',
    useFirstLineHeaders: true,
    uri:"file:./src/test/resources/test1.csv"
]

csv.parse(cnf) { row ->
        println( row )
        // Could validate the row content in here and return false to halt the parse
        true
}

In-memory parse of CSV file. Results are returned as a List of Maps.

def cnf = [
    skipLines: 1,
    separator: ',',
    useFirstLineHeaders: true,
    uri:"file:./src/test/resources/test1.csv"
]
// Each entry is a Map holding the values of one row
csv.parseToList(cnf).each { entry ->
    println( entry )
}

Parse CSV file to XML with document stepping control via the script; notifications per document generated.

def tpl = '''
     <response version-api="2.0" xmlns:gsp="http://groovy.codehaus.org/2005/gsp">
         <value>
             <addresses>
                 <gsp:scriptlet>rows.eachWithIndex{row,index-></gsp:scriptlet>
                     <address id="${index}">           
                         <!-- You can use GString expressions -->
                         <uniqueid>${row.UniqueName}</uniqueid>   
                         <name id="${index}">         
                             <!-- Or you can use expression tags as well -->
                             <gsp:expression>row.Name</gsp:expression>
                         </name>
                     </address>
                 <gsp:scriptlet>}</gsp:scriptlet>
             </addresses>
         </value>
     </response>
'''

def cnf = [
    skipLines: 1,
    useFirstLineHeaders: true,
    uri:"file:./src/test/resources/test1.csv",
    groovyTemplate: new groovy.text.XmlTemplateEngine().createTemplate( tpl ),
    funcNext: { lastrow, currentrow ->

        if( null != lastrow ){
            if( lastrow.Name != currentrow.Name ){
                // Change of name so build xml
                return true
            }
        }
        false
    }
]

// docs holds the number of XML documents generated
def docs = csv.parseToXml(cnf){ gpath ->
    println( groovy.xml.XmlUtil.serialize( gpath ))
}
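
The document stepping that funcNext controls can be sketched in isolation. The groupRows driver below is hypothetical and is not the binding's implementation; it assumes that the row which triggers a break starts the next group, which the description of funcNext does not confirm.

```groovy
// Hypothetical driver (not the binding's implementation): accumulates rows
// and closes the current group whenever funcNext returns true.
// Assumption: the row that triggers the break starts the next group.
List groupRows(List rows, Closure funcNext) {
    def groups = []
    def current = []
    def lastrow = null
    rows.each { currentrow ->
        if (funcNext(lastrow, currentrow) && current) {
            groups << current
            current = []
        }
        current << currentrow
        lastrow = currentrow
    }
    if (current) {
        groups << current   // flush the final group
    }
    groups
}

// Same rule as the example above: break when the Name changes
def funcNext = { lastrow, currentrow ->
    lastrow != null && lastrow.Name != currentrow.Name
}

def rows = [
    [Name: 'Acme', UniqueName: 'A-001'],
    [Name: 'Acme', UniqueName: 'A-002'],
    [Name: 'Bolt', UniqueName: 'B-001']
]

def groups = groupRows(rows, funcNext)
println groups.size()
```

Under that assumption, each group corresponds to the rows a template would receive for one generated document.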

Alternative Groovy template engines

Groovy currently provides five template engines. Each engine supports a different template syntax and is suited to a different task:

  • SimpleTemplateEngine
  • StreamingTemplateEngine
  • XmlTemplateEngine
  • GStringTemplateEngine
  • MarkupTemplateEngine

For further details see: http://docs.groovy-lang.org/next/html/documentation/template-engines.html

Here is an example of using the SimpleTemplateEngine.

def tpl2 = '''
     <response>
         <value>
             <addresses>
                 <% rows.eachWithIndex{row,index-> %>
                     <address id="${index}"><uniqueid>${row.UniqueName}</uniqueid><name id="${index}">${row.Name}</name></address>
                 <% } %>
             </addresses>
         </value>
     </response>
'''

def cnf2 = [
    skipLines: 1,
    useFirstLineHeaders: true,
    uri:"file:/Users/simont/Documents/temp/test.csv",
    groovyTemplate: new groovy.text.SimpleTemplateEngine().createTemplate( tpl2 ),
    funcNext: { lastrow, currentrow ->

        if( null != lastrow ){
            if( lastrow.Name != currentrow.Name ){
                // Change of name so build xml
                return true
            }
        }
        false
    }
]

def docs = csv.parseToXml(cnf2){ gpath ->
    println( groovy.xml.XmlUtil.serialize( gpath ))
}
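
A template can also be rendered on its own, which may help when checking its syntax before handing it to parseToXml. This is a minimal sketch using sample data in place of parsed CSV rows; it assumes, based on the templates above, that the accumulated rows are exposed to the template as a variable named rows.

```groovy
import groovy.text.SimpleTemplateEngine

// Sample data standing in for parsed CSV rows (hypothetical values)
def rows = [
    [Name: 'Acme', UniqueName: 'A-001'],
    [Name: 'Acme', UniqueName: 'A-002']
]

// Single-quoted strings so that ${...} is left for the template engine
def text = '<addresses><% rows.eachWithIndex { row, index -> %>' +
           '<address id="${index}">${row.Name}</address>' +
           '<% } %></addresses>'

// Assumption: the template sees the rows under the binding name "rows"
def template = new SimpleTemplateEngine().createTemplate(text)
def xml = template.make([rows: rows]).toString()

println xml
```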

Simple parse of CSV file with notification per row and validation of the values in columns 1 and 17.

def cnf = [
    skipLines: 1,
    separator: ',',
    useFirstLineHeaders: true,
    uri:"file:./src/test/resources/test1.csv",
    validation: [1: 'int', 17: 'date']
]

csv.parse(cnf) { row ->
        println( row )
        true
}

Simple parse of CSV file with notification per row, a custom validation rule and an exception handler.

def cnf = [
    skipLines: 1,
    separator: ',',
    useFirstLineHeaders: true,
    uri:"file:./src/test/resources/test1.csv",
    funcValidate: { row, column, value ->
        if( column == 8 && value == "1946437"){
            "This is not the droid you are looking for! row:" + row  
        } else {
            ""
        }
    },
    funcException: { row, column, value, message ->
        println("Ooops! " + message)
        // true to continue processing
        true
    }
]

csv.parse(cnf) { row ->
        println( row )
        true
}
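
The contract between funcValidate and funcException can be sketched without the binding. The checkRow driver below is hypothetical and is not the binding's implementation. It assumes that row is a row number and that columns are counted from zero, and it throws an IllegalStateException as a stand-in for B2boxException.

```groovy
// Hypothetical driver (not the binding's implementation).
// funcValidate returns null or '' for a valid value, otherwise a message.
// funcException returns true to carry on with the next row, false to abort.
boolean checkRow(int rowNumber, List values, Closure funcValidate, Closure funcException) {
    for (int column = 0; column < values.size(); column++) {
        def message = funcValidate(rowNumber, column, values[column])
        if (message) {
            if (!funcException(rowNumber, column, values[column], message)) {
                // Stand-in for the B2boxException thrown by the binding
                throw new IllegalStateException(message.toString())
            }
            return false   // skip the rest of this row
        }
    }
    true
}

def messages = []
def funcValidate = { row, column, value ->
    value == 'bad' ? "Invalid value in row ${row}" : ''
}
def funcException = { row, column, value, message ->
    messages << message
    true   // continue processing
}

def ok = checkRow(0, ['a', 'b'], funcValidate, funcException)
def failed = checkRow(1, ['a', 'bad'], funcValidate, funcException)
println "ok=${ok} failed=${failed} messages=${messages.size()}"
```

Returning false from funcException in this sketch would raise the stand-in exception instead of skipping to the next row.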