CSV
Purpose
A general-purpose comma-separated values (CSV) parser with built-in validation and output format templating.
Methods
Binding name: csv
Method: void parse(Map configuration, Closure rowNotify)
Parses the CSV file specified in the configuration map and calls the given closure with each row processed.
Method: List parseToList(Map configuration)
Parses the CSV file specified in the configuration map and returns the processed values as a List of Maps.
Method: int parseToXml(Map configuration, Closure docNotify)
Parses the CSV file specified in the configuration map and calls the given docNotify closure with each XML document generated. This method returns the number of XML documents generated.
Configuration map
Configuration Name | Description |
---|---|
separator | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
quoteChar | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
escape | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
skipLines | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
strictQuotes | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
ignoreLeadingWhiteSpace | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
useFirstLineHeaders | Optional. Set to true to read column names from the first line of the CSV. These column names are used as keys in all row Maps returned. Otherwise the column names are col0..colN. The default is false. |
validation | Optional. A Map from column number to the name of a valid content type. Type names are: date, byte, short, int, long, double, float, email, url and creditcard. See: http://commons.apache.org/proper/commons-validator/apidocs/org/apache/commons/validator/GenericValidator.html. If a value is null or empty, validation is not applied. If validation is applied and fails, a B2boxException is thrown. |
groovyTemplate | One of the five Groovy template types. Mandatory for parseToXml. See: http://docs.groovy-lang.org/next/html/documentation/template-engines.html |
uri | Mandatory. A file or jcr URI to open and use as the CSV input source (expressed as a string). |
funcNext | Optional closure. Parameters (lastrow, currentrow). Used by parseToXml. Return true to generate an XML document, or false to continue accumulating data. |
funcValidate | Optional closure. Parameters (row, column, value). Called each time a value in a row is validated. Return null or an empty string to signify that the value is valid, or return a message detailing why the validation has failed. |
funcException | Optional closure. Parameters (row, column, value, message). Called each time the validation of a value in a row fails. Return true to continue processing on the next row. Return false to exit the processor with a B2boxException. |
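The contract between funcValidate and funcException can be exercised without the csv binding by calling the closures directly. The following is a minimal sketch under stated assumptions: the digits-only rule, the sample values and the driver loop are hypothetical, `row` is treated as a row number (the table does not say whether it is a number or the row Map), and the loop only imitates the call sequence described above rather than reproducing the parser's code.

```groovy
// Hypothetical rule: the value in column 1 must consist of digits only.
def funcValidate = { row, column, value ->
    if (column == 1 && !(value ==~ /\d+/)) {
        return "Row ${row}, column ${column}: '${value}' is not numeric"
    }
    null   // null (or an empty string) signals a valid value
}

// Collect the messages instead of printing them, then ask to carry on.
def failures = []
def funcException = { row, column, value, message ->
    failures << message.toString()
    true   // true: continue with the next row; false: stop with an exception
}

// Imitation of the described call sequence; NOT the parser's actual code.
def sample = [[0, 1, '42'], [1, 1, 'abc']]   // [row, column, value] triples
sample.each { r, c, v ->
    def message = funcValidate(r, c, v)
    if (message) {
        funcException(r, c, v, message)
    }
}
println(failures)
```

Keeping the two closures as named variables like this makes it possible to unit test a validation rule before placing it in the configuration map.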
Closure
A single parameter containing an array of rows is passed to the closure.
Each row is a map that uses the column names as keys if useFirstLineHeaders is set to true.
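To make the key naming concrete, here is a small stand-alone sketch of how a row Map could be keyed in the two modes. The `toRowMap` helper is hypothetical and only illustrates the col0..colN versus header naming described in the configuration table; it is not part of the csv binding.

```groovy
// Hypothetical helper: builds a row Map from raw values.
// With headers, keys are the header names; without, keys are col0..colN.
def toRowMap = { List<String> values, List<String> headers = null ->
    def row = [:]
    values.eachWithIndex { v, i ->
        def key = headers ? headers[i] : "col${i}".toString()
        row[key] = v
    }
    row
}

println(toRowMap(['a1', 'Alpha']))
println(toRowMap(['a1', 'Alpha'], ['UniqueName', 'Name']))
```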
Examples
Simple parse of CSV file with notification per row.
    def cnf = [
        skipLines: 1,
        separator: ',',
        useFirstLineHeaders: true,
        uri: "file:./src/test/resources/test1.csv"
    ]
    csv.parse(cnf) { row ->
        println(row)
        // Could validate the row content in here and return false to halt the parse
        true
    }
In-memory parse of CSV file. Results are returned as a List of Maps.
    def cnf = [
        skipLines: 1,
        separator: ',',
        useFirstLineHeaders: true,
        uri: "file:./src/test/resources/test1.csv"
    ]
    // Each entry is one row Map
    csv.parseToList(cnf).each { entry ->
        println(entry)
    }
Parse CSV file to XML with document stepping control via the script; notifications per document generated.
    def tpl = '''
    <response version-api="2.0" xmlns:gsp="http://groovy.codehaus.org/2005/gsp">
      <value>
        <addresses>
          <gsp:scriptlet>rows.eachWithIndex{row,index-></gsp:scriptlet>
            <address id="${index}">
              <!-- You can use GString expressions -->
              <uniqueid>${row.UniqueName}</uniqueid>
              <name id="${index}">
                <!-- Or you can use expression tags as well -->
                <gsp:expression>row.Name</gsp:expression>
              </name>
            </address>
          <gsp:scriptlet>}</gsp:scriptlet>
        </addresses>
      </value>
    </response>
    '''
    def cnf = [
        skipLines: 1,
        useFirstLineHeaders: true,
        uri: "file:./src/test/resources/test1.csv",
        groovyTemplate: new groovy.text.XmlTemplateEngine().createTemplate(tpl),
        funcNext: { lastrow, currentrow ->
            if (null != lastrow) {
                if (lastrow.Name != currentrow.Name) {
                    // Change of name so build xml
                    return true
                }
            }
            false
        }
    ]
    // docs receives the number of XML documents generated
    def docs = csv.parseToXml(cnf) { gpath ->
        println(groovy.xml.XmlUtil.serialize(gpath))
    }
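One way to reason about a funcNext closure is to replay it over a few rows in plain Groovy. The sketch below assumes that when funcNext returns true the rows accumulated so far form one document and the current row starts the next batch. The source does not spell this out, so the grouping loop and the sample rows are hypothetical and illustrate the idea rather than the parser's actual behaviour.

```groovy
// Same decision as the funcNext above, written as a single expression.
def funcNext = { lastrow, currentrow ->
    lastrow != null && lastrow.Name != currentrow.Name
}

// Hypothetical rows, shaped like the Maps produced with useFirstLineHeaders: true.
def rows = [
    [UniqueName: 'a1', Name: 'Alpha'],
    [UniqueName: 'a2', Name: 'Alpha'],
    [UniqueName: 'b1', Name: 'Beta']
]

// Assumed stepping logic: close the current batch whenever funcNext says so.
def batches = []
def current = []
def last = null
rows.each { row ->
    if (funcNext(last, row)) {
        batches << current
        current = []
    }
    current << row
    last = row
}
if (current) { batches << current }   // flush the trailing batch

println(batches*.size())
```

Under these assumptions each batch would correspond to one XML document passed to the docNotify closure.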
Alternative Groovy template engines
Groovy currently provides five template engines. Each engine supports a different template syntax and is suited to a different task:
- SimpleTemplateEngine
- StreamingTemplateEngine
- XmlTemplateEngine
- GStringTemplateEngine
- MarkupTemplateEngine
For further details see: http://docs.groovy-lang.org/next/html/documentation/template-engines.html
Here is an example of using the SimpleTemplateEngine.
    def tpl2 = '''
    <response>
      <value>
        <addresses>
          <% rows.eachWithIndex{row,index-> %>
            <address id="${index}"><uniqueid>${row.UniqueName}</uniqueid><name id="${index}">${row.Name}</name></address>
          <% } %>
        </addresses>
      </value>
    </response>
    '''
    def cnf2 = [
        skipLines: 1,
        useFirstLineHeaders: true,
        uri: "file:/Users/simont/Documents/temp/test.csv",
        groovyTemplate: new groovy.text.SimpleTemplateEngine().createTemplate(tpl2),
        funcNext: { lastrow, currentrow ->
            if (null != lastrow) {
                if (lastrow.Name != currentrow.Name) {
                    // Change of name so build xml
                    return true
                }
            }
            false
        }
    ]
    def docs = csv.parseToXml(cnf2) { gpath ->
        println(groovy.xml.XmlUtil.serialize(gpath))
    }
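A template like the ones above can be checked in isolation before it is handed to parseToXml. This is a minimal sketch using SimpleTemplateEngine. It assumes, based on the templates in this section, that the accumulated rows are exposed to the template under the binding name `rows`; the shortened template and the sample rows are hypothetical.

```groovy
import groovy.text.SimpleTemplateEngine

// Single-quoted so that ${...} is left for the template engine to evaluate.
def tpl = '''<addresses><% rows.eachWithIndex { row, index -> %><address id="${index}">${row.Name}</address><% } %></addresses>'''

def template = new SimpleTemplateEngine().createTemplate(tpl)

// Hypothetical rows standing in for parsed CSV data.
def rows = [[Name: 'Alpha'], [Name: 'Beta']]

// Render with an explicit binding, as a stand-in for what the parser would supply.
def xml = template.make([rows: rows]).toString()
println(xml)
```

Rendering by hand like this surfaces template syntax errors early, without needing a CSV file or the csv binding.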
Simple parse of CSV file with notification per row and validation of column values 1 and 17.
    def cnf = [
        skipLines: 1,
        separator: ',',
        useFirstLineHeaders: true,
        uri: "file:./src/test/resources/test1.csv",
        // Column 1 must be an int, column 17 must be a date
        validation: [1: 'int', 17: 'date']
    ]
    csv.parse(cnf) { row ->
        println(row)
        true
    }
Simple parse of CSV file with notification per row and complex validation rule and exception handler.
    def cnf = [
        skipLines: 1,
        separator: ',',
        useFirstLineHeaders: true,
        uri: "file:./src/test/resources/test1.csv",
        funcValidate: { row, column, value ->
            if (column == 8 && value == "1946437") {
                "This is not the droid you are looking for! row:" + row
            }
            else { "" }
        },
        funcException: { row, column, value, message ->
            println("Ooops! " + message)
            // true to continue processing
            true
        }
    ]
    csv.parse(cnf) { row ->
        println(row)
        true
    }