Reference Guide

Copies of this document may be made for your own use and for distribution to others, provided that you do not charge any fee for such copies and further provided that each copy contains this Copyright Notice, whether distributed in print or electronically.

1.0. Introduction

BeanIO is an open source Java framework for reading and writing Java objects from a flat file, stream, or any String input. BeanIO is well suited for batch processing, and currently supports XML, CSV, delimited and fixed length file formats. BeanIO is licensed under the Apache 2.0 License.

1.1. What's new in 3.0?

BeanIO 3.0 is the continuation of the BeanIO 2.x series. Because the original maintainer wasn't working on BeanIO anymore and seemed to be unreachable, we decided to fork the project and host it on GitHub Pages instead of beanio.org. Version 3.0 brings new exciting features like:

support for java.time types such as LocalDateTime and ZonedDateTime
BeanWriter now implements AutoCloseable to be used in try-with-resources
BeanReader now implements Closeable to be used in try-with-resources
an Automatic-Module-Name was added to MANIFEST.MF for BeanIO to be compatible with the Java Platform Module System

1.2. Migrating from 2.x to 3.0

Release 3.0 is almost backwards compatible with prior 2.x releases, with the following exceptions:

a JDK 1.8+ is now required
the Maven groupId changed from org.beanio to com.github.beanio

2.0. Getting Started

To get started with BeanIO, add beanio.jar to your application's classpath.

In Maven projects:

<dependency>
    <groupId>com.github.beanio</groupId>
    <artifactId>beanio</artifactId>
    <version>3.2.0</version>
</dependency>

In Gradle projects:

implementation 'com.github.beanio:beanio:3.2.0'

BeanIO requires a version 1.8 JDK or higher. In order to process XML formatted streams, BeanIO also requires an XML parser based on the Streaming API for XML (StAX), as specified by JSR 173. JDK 1.6 and higher includes a StAX implementation and therefore does not require any additional libraries.

2.1. My First Stream

This section explores a simple example that uses BeanIO to read and write a flat file containing employee data. Let's suppose the file is in CSV format and has the following record layout:

Position	Field	Format
0	First Name	Text
1	Last Name	Text
2	Job Title	Text
3	Salary	Number
4	Hire Date	Date (MMDDYYYY)

A sample file is shown below.

Joe,Smith,Developer,75000,10012009
Jane,Doe,Architect,80000,01152008
Jon,Anderson,Manager,85000,03182007

Next, let's suppose we want to read records into the following Java bean for further processing. Remember that a Java bean must have a default no-argument constructor and public getters and setters for all exposed properties.

package example;
import java.util.Date;

public class Employee {
    String firstName;
    String lastName;
    String title;
    int salary;
    Date hireDate;
    
    // getters and setters not shown...
}

BeanIO uses an XML configuration file, called a mapping file, to define how bean objects are bound to records. Below is a mapping file, named mapping.xml, that could be used to read the sample employee file and unmarshall records into Employee objects. The same mapping file can be used to write, or marshall, Employee objects to a file or output stream.

<beanio xmlns="http://www.beanio.org/2012/03" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.beanio.org/2012/03 http://www.beanio.org/2012/03/mapping.xsd">

  <stream name="employeeFile" format="csv">
    <record name="employee" class="example.Employee">
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" format="MMddyyyy" />
    </record>
  </stream>
</beanio>

To read the employee CSV file, a StreamFactory is used to load our mapping file and create a new BeanReader instance. The BeanReader is used to unmarshall Employee objects from the file employee.csv. (For the sake of brevity, proper exception handling is not shown.)

package example;

import org.beanio.*;
import java.io.*;

public class BeanReaderExample {
    public static void main(String[] args) throws Exception {
        // create a StreamFactory
        StreamFactory factory = StreamFactory.newInstance();
        // load the mapping file
        factory.load("mapping.xml");
        
        // use a StreamFactory to create a BeanReader
        try (BeanReader in = factory.createReader("employeeFile", new File("employee.csv"))) {
            Employee employee;
            while ((employee = (Employee) in.read()) != null) {
                // process the employee...
            }
        }
    }
}

To write an employee CSV file, the same StreamFactory class is used to create a BeanWriter for marshalling Employee bean objects to the file employee.csv. In this example, the same mapping configuration file is used for both reading and writing an employee file.

package example;

import org.beanio.*;
import java.io.*;
import java.util.*;

public class BeanWriterExample {
    public static void main(String[] args) throws Exception {
        // create a StreamFactory
        StreamFactory factory = StreamFactory.newInstance();
        // load the mapping file
        factory.load("mapping.xml");
        
        Employee employee = new Employee();
        employee.setFirstName("Jennifer");
        employee.setLastName("Jones");
        employee.setTitle("Marketing")
        employee.setSalary(60000);
        employee.setHireDate(new Date());
        
        // use a StreamFactory to create a BeanWriter
        try (BeanWriter out = factory.createWriter("employeeFile", new File("employee.csv"))) {
            // write an Employee object directly to the BeanWriter
            out.write(employee);
            out.flush();
        }
    }
}

Running BeanWriterExample produces the following CSV file.

Jennifer,Jones,Marketing,60000,01012011

3.0. Core Concepts

3.1. BeanReader

The org.beanio.BeanReader interface, shown below, is used to read bean objects from an input stream. The read() method returns an unmarshalled bean object for the next record or group of records read from the input stream. When the end of the stream is reached, null is returned.

The method setErrorHandler(...) can be used to register a custom error handler. If an error handler is not configured, read() simply throws the unhandled exception.

The method getRecordName() returns the name of the record (or group) mapped to the most recent bean object read from the input stream, as declared in the mapping file. And getLineNumber() returns the line number of the first record mapped to the most recent bean object read from the input stream. Additional information is available about records read from the stream by calling getRecordCount and getRecordContext. Please consult the API documentation for further information.

Before discarding a BeanReader, close() should be invoked to close the underlying input stream.

package org.beanio;

public interface BeanReader {

    public Object read() throws BeanReaderException;
    
    public int getLineNumber();
    
    public String getRecordName();
    
    public int getRecordCount();
    
    public RecordContext getRecordContext(int index); 
    
    public int skip(int count) throws BeanReaderException;
    
    public void close() throws BeanReaderIOException;
    
    public void setErrorHandler(BeanReaderErrorHandler errorHandler);
}

3.2. BeanWriter

The org.beanio.BeanWriter interface, shown below, is used to write bean objects to an output stream. Calling the write(Object) method marshals a bean object to the output stream. In some cases where multiple record types are not discernible by class type or record identifying fields, the write(String,Object) method can be used to explicitly name the record type to marshal.

Before discarding a BeanWriter, close() should be invoked to close the underlying output stream.

package org.beanio;

public interface BeanWriter {

    public void write(Object bean) throws BeanWriterException;
    
    public void write(String recordName, Object bean) throws BeanWriterException;
    
    public void flush() throws BeanWriterIOException;
    
    public void close() throws BeanWriterIOException;
}

3.3. Unmarshaller

The org.beanio.Unmarshaller interface, shown below, is used to unmarshal a bean object from a String record.

package org.beanio;

public interface Unmarshaller {

    // For all stream formats
    public Object unmarshal(String record) throws BeanReaderException;
    
    // For CSV and delimited formatted streams
    public Object unmarshal(List<String> fields) throws BeanReaderException;
    public Object unmarshal(String[] fields) throws BeanReaderException;

    // For XML formatted streams
    public Object unmarshal(Node node) throws BeanReaderException;
    
    public String getRecordName();
    
    public RecordContext getRecordContext();
}

3.4. Marshaller

The org.beanio.Marshaller interface, shown below, is used to marshal a bean object into a String record.

package org.beanio;

public interface Marshaller {

    public Marshaller marshal(Object bean) throws BeanWriterException;
    
    public Marshaller marshal(String recordName, Object bean) throws BeanWriterException;
    
    // For all stream formats
    public String toString();
    
    // For CSV and delimited formatted streams
    public String[] toArray() throws BeanWriterException;
    public List<String> toList() throws BeanWriterException;
    
    // For XML formatted streams
    public Document toDocument() throws BeanWriterException;
}

Marshalling a single bean object to record text is now as simple as:

String recordText = marshaller.marshal(object).toString();

3.5. Mapping Files

BeanIO uses XML configuration files, called mapping files, to bind a stream layout to bean objects. Multiple layouts can be configured in a single mapping file using stream elements. Each stream is assigned a unique name for referencing the layout. In addition to its name, every stream must declare its format using the format attribute. Supported stream formats include csv, delimited, fixedlength, and xml. Mapping files are fully explained in the next section (4.0. The Mapping File).

<beanio xmlns="http://www.beanio.org/2012/03" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.beanio.org/2012/03 http://www.beanio.org/2012/03/mapping.xsd">

  <stream name="stream1" format="csv"... >
    <!-- record layout... -->
  </stream>
  
  <stream name="stream2" format="fixedlength"... >
    <!-- record layout... -->
  </stream>
    
</beanio>

3.6. StreamFactory

The org.beanio.StreamFactory class is used to load mapping files and create BeanReader, BeanWriter, Marshaller and Unmarshaller instances. The following code snippet shows how to instantiate a StreamFactory, load a mapping file and create the various BeanIO parsers. The load(...) method loads mapping files from the file system (relative to the current working directory), while the method loadResource(...) loads mapping files from the classpath.

// create a StreamFactory
StreamFactory factory = StreamFactory.newInstance();
// load 'mapping-1.xml' from the current working directory
factory.load("mapping-1.xml");
// load 'mapping-2.xml' from the classpath
factory.loadResource("mapping-2.xml");'

// create a BeanReader to read from 'in.txt'
Reader in = new BufferedReader(new FileReader("in.txt"));
BeanReader beanReader = factory.createReader("streamName", in);

// create a BeanWriter to write to 'out.txt'
Writer out = new BufferedWriter(new FileWriter("out.txt"));
BeanWriter beanWriter = factory.createWriter("streamName", out);

// create an Unmarshaller to unmarshal bean objects from record text
Unmarshaller unmarshaller = factory.createUnmarshaller("streamName");

// create a Marshaller to marshal bean objects to record text
Marshaller marshaller = factory.createMarshaller("streamName");

3.7. Exception Handling

All BeanIO exceptions extend from BeanIOException, which extends from RuntimeException so that exceptions do not need to be explicitly caught unless desired. BeanReaderException and BeanWriterException extend from BeanIOException and may be thrown by a BeanReader or BeanWriter respectively.

A BeanReaderException is further broken down into the following subclasses thrown by the read() method.

Exception	Description
`BeanReaderIOException`	Thrown when the underlying input stream throws an `IOException`.
`MalformedRecordException`	Thrown when the underlying input stream is malformed based on the configured stream format, and therefore a record could not be accurately read from the stream. In many cases, further reads from the input stream will be unsuccessful.
`UnidentifiedRecordException`	Thrown when a record does not match any record definition configured in the mapping file. If the stream layout does not strictly enforce record sequencing, further reads from the input stream are likely to be successful.
`UnexpectedRecordException`	Thrown when a record is read out of order. Once record sequencing is violated, further reads from the input stream are likely to be unsuccessful.
`InvalidRecordException`	Thrown when a record is matched, but the record is invalid for one of the following reasons: A record level validation failed One or more field level validations failed Field type conversion failed This exception has no effect on the state of the `BeanReader` and further reads from the input stream can be safely performed.
`InvalidRecordGroupException`	Extends from `InvalidRecordException` and is thrown when one or more records in a group (that are mapped to a single bean object) are invalid. This exception has no effect on the state of the `BeanReader` and further reads from the input stream can be safely performed.
`BeanReaderException`	Thrown directly in a few rare unrecoverable scenarios.

When a BeanReaderException is thrown, information about the failed record(s) can be accessed by calling exception.getRecordContext() to obtain a org.beanio.RecordContext. Please refer to the API javadocs for more information.

package org.beanio;

public interface RecordContext {
    public int getLineNumber();
    public String getRecordText();
    public String getRecordName();
    public boolean hasErrors();
    public boolean hasRecordErrors();
    public Collection<String> getRecordErrors();
    public String getFieldText(String fieldName);
    public String getFieldText(String fieldName, int index);
    public boolean hasFieldErrors();
    public Map<String, Collection<String>> getFieldErrors();
    public Collection<String> getFieldErrors(String fieldName);
}

3.7.1. BeanReaderErrorHandler

If you need to handle an exception and continue processing, it may be simpler to register a BeanReaderErrorHandler using the beanReader.setErrorHandler() method. The BeanReaderErrorHandler interface is shown below. Any exception thrown by the error handler will be rethrown by the BeanReader.

package org.beanio;

public interface BeanReaderErrorHandler {
    public void handleError(BeanReaderException ex) throws Exception;
}

The following example shows how invalid records could be written to a reject file by registering an error handler extending BeanReaderErrorHandlerSupport, a subclass of BeanReaderErrorHandler. All other exceptions are left uncaught and will bubble up to the calling method.

    BeanReader input;
    BufferedWriter rejects;
    try {
        input.setErrorHandler(new BeanReaderErrorHandlerSupport() {
            public void invalidRecord(InvalidRecordException ex) throws Exception {
                // if a bean object is mapped to a record group,
                // the exception may contain more than one record
            	for (int i=0, j=ex.getRecordCount(); i<j; i++) {
                    rejects.write(ex.getRecordContext(i).getRecordText());
                    rejects.newLine();
                }
            }
        });
        
        Object record = null;
        while ((record = input.read()) != null) {
            // process a valid record
        }
        
        rejects.flush();
    }
    finally {
        input.close();
        rejects.close();
    }

4.0. Stream Components

This section covers the basic components used by BeanIO to map an input stream or String to Java objects. All examples are shown using a mapping file, but the concepts (and most attributes) are the same whether using the stream builder API, mapping file, Java annotations, or any combination thereof.

4.1. Streams

A typical mapping file contains one or more stream layouts. A stream must have a name and format attribute configured. The name of the stream is used to reference the layout when creating a parser using a StreamFactory. And the format instructs BeanIO how to interpret the stream. Supported formats include xml, csv, delimited and fixedlength.

<beanio xmlns="http://www.beanio.org/2012/03" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.beanio.org/2012/03 http://www.beanio.org/2012/03/mapping.xsd">

  <stream name="stream1" format="csv"... >
    <!-- record layout... -->
  </stream>
  
  <stream name="stream2" format="fixedlength"... >
    <!-- record layout... -->
  </stream>
    
</beanio>

BeanIO parses (and formats) a record from a stream or text using a record parser generated by a RecordParserFactory. BeanIO allows you to create and customize your own RecordParserFactory, but in most cases you can simply configure BeanIO's default record parser factory using a stream's parser element. The parser element allows you to set format specific properties on a RecordParserFactory. For example, the following stream layout changes the delimiter to a pipe for the delimited stream 's1':

<beanio xmlns="http://www.beanio.org/2012/03" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.beanio.org/2012/03 http://www.beanio.org/2012/03/mapping.xsd">

  <stream name="s1" format="delimited">
    <parser>
      <property name="delimiter" value="|" />
    </parser>
    <!-- record layout... -->
  </stream>
  
</beanio>

The next few sections list available parser properties for each stream format.

4.1.1. CSV Streams

CSV formatted streams are parsed according to RFC 4180 with one exception: multi-line records are disabled (but this can be overridden).

The following properties can be used to customize default CSV parsers:

Property Name	Type	Description	Affects
`delimiter`	char	The field delimiter. Defaults to a comma.	*
`quote`	char	The quotation mark character used to wrap fields containing a delimiter character, a quotation mark, or new lines. Defaults to the double quotation mark, ".	*
`escape`	Character	The character used to escape a quotation mark in a quoted field. Defaults to the double quotation mark, ".	*
`comments`	String[]	A comma separated list of values for identifying commented lines. If a line read from an input stream begins with any of the configured values, the line is ignored. A backslash may be used to escape a comma and itself. All whitespace is preserved. Enabling comments require the input reader passed to `StreamFactory` to support marking. Among others, Java's `BufferedReader` and `StringReader` support marking.	`BeanReader`
`multilineEnabled`	boolean	If set to `true`, quoted fields may contain new line characters. Defaults to `false`.	`BeanReader`
`whitespaceAllowed`	boolean	If set to `true`, whitespace is ignored and allowed before and after quoted values. For example, the following is allowed: Jennifer, "Jones" ,24 Defaults to `false`.	`BeanReader, Unmarshaller`
`unquotedQuotesAllowed`	boolean	If set to `true`, field text containing quotation marks do not need to be quoted unless the field text starts with a quotation mark. For example, the following is allowed: Jennifer,She said "OK" Defaults to `false`.	`BeanReader, Unmarshaller`
`recordTerminator`	String	The character used to signify the end of a record. By default, any new line character (line feed (LF), carriage return (CR), or CRLF combination) is accepted when reading an input stream, and `System.getProperty("line.separator")` is used when writing to a stream.	`BeanWriter`
`alwaysQuote`	boolean	If set to `true`, field text is always quoted. By default, a field is only quoted if it contains a delimeter, a quotation mark or new line characters.	`BeanWriter, Marshaller`

4.1.2. Delimited Streams

The default delimited parsers can be customized using the following properties:

Property Name	Type	Description	Affects
`delimiter`	char	The field delimiter. Defaults to the tab character.	*
`escape`	Character	The escape character allowed to escape a delimiter or itself. By default, escaping is disabled.	*
`lineContinuationCharacter`	Character	If this character is the last character before a new line or carriage return is read, the record will continue reading from the next line. By default, line continuation is disabled.	`BeanReader`
`recordTerminator`	Character	The character used to signify the end of a record. By default, any new line character (line feed (LF), carriage return (CR), or CRLF combination) is accepted when reading an input stream, and `System.getProperty("line.separator")` is used when writing to a stream.	`BeanReader, BeanWriter`
`comments`	String[]	A comma separated list of values for identifying commented lines. If a line read from an input stream begins with any of the configured values, the line is ignored. A backslash may be used to escape a comma and itself. All whitespace is preserved. Enabling comments require the input reader passed to `StreamFactory` to support marking. Among others, Java's `BufferedReader` and `StringReader` support marking.	`BeanReader`

4.1.3. Fixed Length Streams

The default fixed length parsers can be customized using the following properties:

Property Name	Type	Description	Affects
`lineContinuationCharacter`	Character	If this character is the last character before a new line or carriage return is read, the record will continue reading from the next line. By default, line continuation is disabled.	`BeanReader`
`recordTerminator`	Character	The character used to signify the end of a record. By default, any new line character (line feed (LF), carriage return (CR), or CRLF combination) is accepted when reading an input stream, and `System.getProperty("line.separator")` is used when writing to a stream.	`BeanReader, BeanWriter`
`comments`	String[]	A comma separated list of values for identifying commented lines. If a line read from an input stream begins with any of the configured values, the line is ignored. A backslash may be used to escape a comma and itself. All whitespace is preserved. Enabling comments require the input reader passed to `StreamFactory` to support marking. Among others, Java's `BufferedReader` and `StringReader` support marking.	`BeanReader`

4.1.4. XML Streams

The default XML parsers can be customized using the following properties:

Property Name	Type	Description	Affects
`suppressHeader`	boolean	If set to `true`, the XML header is suppressed in the marshalled document. Defaults to `false`.	`BeanWriter, Marshaller`
`version`	String	The XML header version. Defaults to `1.0`.	`BeanWriter, Marshaller`
`encoding`	String	The XML header encoding. Defaults to `utf-8`. Note that this setting has no bearing on the actual encoding of the output stream. If set to "", an encoding attribute is not included in the header.	`BeanWriter, Marshaller`
`namespaces`	String	A space delimited list of XML prefixes and namespaces to declare on the root element of a marshalled document. The property value should be formatted as prefix1 namespace1 prefix2 namespace2...	`BeanWriter, Marshaller`
`indentation`	Integer	The number of spaces to indent each level of XML. By default, indentation is disabled using a value of -1.	`BeanWriter, Marshaller`
`lineSeparator`	String	The character(s) used to separate lines when indentation is enabled. By default, `System.getProperty("line.separator")` is used.	`BeanWriter, Marshaller`

4.2. Records

Each record type read from an input stream or written to an output stream must be mapped using a record element. A stream mapping must include at least one record. A record mapping is used to validate the record and bind field values to a bean object. A simple record configuration is shown below.

<beanio>

  <stream name="stream1" format="csv">
    <record name="record1" class="example.Record">
      <field name="firstName" />
      <field name="lastName" />
      <field name="age" />
    </record>
  </stream>
  
</beanio>

In this example, a CSV formatted stream is mapped to a single record composed of three fields: first name, last name and age. When a record is read from a stream using a BeanReader, the class example.Record is instantiated and its firstName, lastName and age attributes are set using standard Java bean setter naming conventions ( e.g. setFirstName(String)).

Similarly, when a example.Record bean object is written to an output stream using a BeanWriter, its firstName , lastName and age attributes are retrieved from the bean object using standard Java bean getter naming conventions (e.g. getFirstName()).

BeanIO also supports Map based records by setting a record's class attribute to map, or to the fully qualified class name of any class assignable to java.util.Map. Note that if you plan to use Map based records, field types may need be explicitly configured using the type attribute, or BeanIO will assume the field is of type java.lang.String The type attribute is further explained in section 4.6. Field Type Conversion.

<beanio>

  <stream name="stream1" format="csv">
    <record name="record1" class="map">
      <field name="firstName" />
      <field name="lastName" />
      <field name="age" type="int"/>
    </record>
  </stream>
  
</beanio>

4.2.1. Record Identification

Oftentimes, a stream is made up of multiple record types. A typical batch file may include one header, one trailer, and zero to many detail records. BeanIO allows a record to be identified by one or more of its fields using expected literal values or regular expressions. If desired, BeanIO can be used to validate the order of all records in the input stream.

To see how a stream can be configured to handle multiple record types, let's modify our Employee file to include a header and trailer record as shown below. Each record now includes a record type field that identifies the type of record.

Header,01012011
Detail,Joe,Smith,Developer,75000,10012009
Detail,Jane,Doe,Architect,80000,01152008
Detail,Jon,Anderson,Manager,85000,03182007
Trailer,3

The mapping file can now be updated as follows:

<beanio>

  <stream name="employeeFile" format="csv">
    <record name="header" minOccurs="1" maxOccurs="1" class="example.Header">
      <field name="recordType" rid="true" literal="Header" />
      <field name="fileDate" format="MMddyyyy" />
    </record>
    <record name="employee" minOccurs="0" maxOccurs="unbounded" class="example.Employee">
      <field name="recordType" rid="true" literal="Detail" />
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" format="MMddyyyy" />
    </record>
    <record name="trailer" minOccurs="1" maxOccurs="1" class="example.Trailer">
      <field name="recordType" rid="true" literal="Trailer" />
      <field name="recordCount" />
    </record>  
  </stream>
  
</beanio>

There are several new record and field attributes introduced in this mapping file, so we'll explain each new attribute in turn.

First, a field used to identify a record must be configured as a record identifier using rid="true". There is no limitation to the number of fields that can be used to identify a record, but all fields where rid="true" must be satisfied before a record is identified. If there is no field configured as a record identifier, by default the record will always match.

    <record name="header" minOccurs="1" maxOccurs="1" class="example.Header">
      <field name="recordType" rid="true" literal="Header" />
      <field name="fileDate" />
    </record>

Second, all record identifying fields must have a matching validation rule configured. In our example, the literal value Header in the record type field is used to identify the header record. Literal values must match exactly and can be configured using the literal field attribute. Alternatively, record identifying fields may use a regular expression to match field text using the regex field attribute.

    <record name="header" minOccurs="1" maxOccurs="1" class="example.Header">
      <field name="recordType" rid="true" literal="Header" />
      <field name="fileDate" />
    </record>

Third, each record defines the minimum and maximum number of times it may repeat using the attributes minOccurs and maxOccurs. Based on our configuration, exactly one header and trailer record is expected, while the number of detail records is unbounded.

    <record name="header" minOccurs="1" maxOccurs="1" class="example.Header">
      <field name="recordType" rid="true" literal="Header" />
      <field name="fileDate" />
    </record>

If minOccurs and/or maxOccurs are not set, the minimum occurrences of a record defaults to 0 and maximum occurrences is unbounded.

Its also possible to identify delimited and fixed length records based on their length. The ridLength record attribute can be used to specify a range of lengths to identify the record.

4.2.2. Record Ordering

As explained in the previous section, a stream can support multiple record types. By default, a BeanReader will read records in any order. But if desired, BeanIO can enforce record ordering using an order attribute on each record. The order attribute can be assigned any positive integer value greater than 0. Records that are assigned the same number may be read from the stream in any order. If order is set for one record, it must be set for all other records (and groups) that share the same parent.

In our previous example, if we want enforce that the header record is the first record in the file, the trailer is the last, and all detail records appear in the middle, the mapping file could be changed as follows. Using this configuration, if a detail record were to appear before the header record, the BeanReader will throw an UnexpectedRecordException when the detail record is read out of order.

<beanio>

  <stream name="employeeFile" format="csv">
    <record name="header" order="1" minOccurs="1" maxOccurs="1" class="example.Header">
      <field name="recordType" rid="true" literal="Header" />
      <field name="fileDate" format="MMddyyyy" />
    </record>
    <record name="employee" order="2" minOccurs="0" maxOccurs="unbounded" class="example.Employee">
      <field name="recordType" rid="true" literal="Detail" />
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" format="MMddyyyy" />
    </record>
    <record name="trailer" order="3" minOccurs="1" maxOccurs="1" class="example.Trailer">
      <field name="recordType" rid="true" literal="Trailer" />
      <field name="recordCount" />
    </record>  
  </stream>
  
</beanio>

4.2.3. Record Grouping

In some cases, a stream may be further divided into batches or groups of records. Continuing with our employee file, lets suppose employee detail records are batched by department, where each group of employees has a department header and a department trailer record. Thus an input file may look something like this:

Header,01012011
DeptHeader,Development
Detail,Joe,Smith,Developer,75000,10012009
Detail,Jane,Doe,Architect,80000,01152008
DeptTrailer,2
DeptHeader,Product Management
Detail,Jon,Anderson,Manager,85000,03182007
DeptTrailer,1
Trailer,2

BeanIO allows you to define groups of records using a group element to wrap the record types that belong to the group. Groups support the same order, minOccurs, and maxOccurs attributes, although there meaning is applied to the entire group. Once a record type is matched that belongs to a group, all other records in that group where minOccurs is greater that 1, must be read from the stream before the group may repeat or a different record can be read. Our mapping file would now look like this:

<beanio>

  <stream name="employeeFile" format="csv">
    <record name="header" order="1" minOccurs="1" maxOccurs="1" class="example.Header">
      <field name="recordType" rid="true" literal="Header" />
      <field name="fileDate" format="MMddyyyy" />
    </record>
    <group name="departmentGroup" order="2" minOccurs="0" maxOccurs"unbounded">
      <record name="deptHeader" order="1" minOccurs="1" maxOccurs="1" class="example.DeptHeader">
        <field name="recordType" rid="true" literal="DeptHeader" />
        <field name="departmentName" />
      </record>
      <record name="employee" order="2" minOccurs="0" maxOccurs="unbounded" class="example.Employee">
        <field name="recordType" rid="true" literal="Detail" />
        <field name="firstName" />
        <field name="lastName" />
        <field name="title" />
        <field name="salary" />
        <field name="hireDate" format="MMddyyyy" />
      </record>
      <record name="deptTrailer" order="3" minOccurs="1" maxOccurs="1" class="example.DeptTrailer">
        <field name="recordType" rid="true" literal="DeptTrailer" />
        <field name="employeeCount" />
      </record>  
    </group>
    <record name="trailer" order="3" minOccurs="1" maxOccurs="1" class="example.Trailer">
      <field name="recordType" rid="true" literal="Trailer" />
      <field name="departmentCount" />
    </record>  
  </stream>
  
</beanio>

The stream definition itself is a record group with defaults minOccurs="0" and maxOccurs="1". If you want your BeanReader to throw an exception if the stream is empty, simply change minOccurs to 1, or if you want to allow the entire stream to repeat indefinitely, simply change maxOccurs to unbounded as shown below.

<beanio>

  <stream name="employeeFile" format="csv" minOccurs="1" maxOccurs="unbounded">
    <!-- Record layout... -->
  </stream>
  
</beanio>

4.3. Fields

A record is made up of one or more fields, which are validated and bound to bean properties using the field element. All fields must specify a name attribute, which by default, is used to get and set the field value from the bean object.

Default getter and setter methods can be overridden using getter and setter attributes as shown below. If a field is a constructor argument, setter can be set to '#N' where N is the position of the argument in the constructor starting at 1 (not shown).

<beanio>

  <stream name="stream1" format="csv">
    <record name="record1" class="example.Record">
      <field name="firstName" />
      <field name="lastName" setter="setSurname" getter="getSurname"/>
      <field name="age" />
    </record>
  </stream>
  
</beanio>

Fields found in a stream that do not map to a bean property can be declared using the ignore field attribute. Note that any configured validation rules are still applied to ignored fields (not shown).

<beanio>

  <stream name="stream1" format="csv">
    <record name="record1" class="example.Record">
      <field name="firstName" />
      <field name="lastName" />
      <field name="age" />
      <field name="filler" ignore="true" />
    </record>
  </stream>
  
</beanio>

By default, BeanIO expects fields to appear in a CSV, delimited or fixed length stream in the same order they are declared in the mapping file. If this is not the case, a position field attribute can be configured for each field. If a position is declared for one field, a position must be declared for all other fields in the same record. For delimited (and CSV) formatted streams, position should be set to the index of the first occurrence of the field in the record, beginning at 0. For fixed length formatted streams, position should be set to the index of the first character of the first occurrence of the field in the record, beginning at 0. A negative position can be used to specify a field location relative to the end of the record. For example, a position of -2 indicates the second to last field in a delimited record.

The following example shows how the position attribute can be used. Although the fields are declared in a different order, the record definition is identical to the previous example. When positions are explicitly configured for an input stream, there is no need to declare all fields in a record, unless desired for validation purposes.

<beanio>

  <stream name="stream1" format="csv">
    <record name="record1" class="example.Record">
      <field name="filler" position="3" ignore="true" />
      <field name="lastName" position="1" />
      <field name="age" position="2"/>
      <field name="firstName" position="0" />
    </record>
  </stream>
  
</beanio>

4.3.1. Field Type Conversion

The property type of a field is determined by introspecting the bean object the field belongs to. If the bean class is of type java.util.Map or java.util.Collection, BeanIO will assume the field is of type java.lang.String, unless a field type is explicitly declared using a field's type attribute.

The type attribute may be set to any fully qualified class name or to one of the supported type aliases below. Type aliases are not case sensitive, and the same alias may be used for primitive types. For example, int and java.lang.Integer bean properties will use the same type handler registered for the type java.lang.Integer, or alias integer or int.

Class Name	Primitive	Alias(es)
java.lang.String	-	`string`
java.lang.Boolean	boolean	`boolean`
java.lang.Byte	byte	`byte`
java.lang.Character	char	`character` `char`
java.lang.Short	short	`short`
java.lang.Integer	int	`integer` `int`
java.lang.Long	long	`long`
java.lang.Float	float	`float`
java.lang.Double	double	`double`
java.math.BigInteger	-	`biginteger`
java.math.BigDecimal	-	`bigdecimal` `decimal`
java.util.Date¹	-	`datetime` `date` `time`
java.util.Calendar²	-	`calendar` `calendar-datetime` `calendar-date` `calendar-time`
java.util.UUID	-	`uuid`
java.net.URL	-	`url`
java.lang.Enum³	-	`-`

¹ By default, the date alias is used for java.util.Date types that contain date information only, and the time alias is used for java.util.Date types that contain only time information. Only the datetime alias can be used to replace the default type handler for the java.util.Date class.

² By default, the calendar-date alias is used for java.util.Calendar types that contain date information only, and the calendar-time alias is used for java.util.Date types that contain only time information. Only the calendar-datetime and calendar aliases can be used to replace the default type handler for the java.util.Calendar class.

³ By default, enums are converted using Enum.valueOf(Class, String). If format="toString", the enum will be converted using values computed by calling toString() for each enum value. In either case, conversion is case sensitive. As with other types, a custom type handler can also be used for enums.

Optionally, a format attribute can be used to pass a decimal format for java.lang.Number types, and for passing a date format for java.util.Date types. In the example below, the hireDate field uses the SimpleDateFormat pattern " yyyy-MM-dd", and the salary field uses the DecimalFormat pattern "#,##0". For more information about supported patterns, please reference the API documentation for Java's java.text.DecimalFormat and java.text.SimpleDateFormat classes.

<beanio>

  <stream name="employeeFile" format="csv">
    <record name="header" minOccurs="1" maxOccurs="1" class="map">
      <field name="recordType" rid="true" literal="Header" />
      <field name="fileDate" type="java.util.Date" />
    </record>
    <record name="employee" minOccurs="0" maxOccurs="unbounded" class="map">
      <field name="recordType" rid="true" literal="Detail" />
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" type="int" format="#,##0" />
      <field name="hireDate" type="date" format="yyyy-MM-dd" />
    </record>
    <record name="trailer" minOccurs="1" maxOccurs="1" class="map">
      <field name="recordType" rid="true" literal="Trailer" />
      <field name="recordCount" type="int" />
    </record>
  </stream>

</beanio>

4.3.2. Custom Type Handlers

Field type conversion is performed by a type handler. BeanIO includes type handlers for common Java types, or you can create your own type handler by implementing the org.beanio.types.TypeHandler interface shown below. When writing a custom type handler, make sure to handle null values and empty strings. Only one instance of your type handler is created, so if you plan to concurrently read or write multiple streams, make sure your type handler is also thread safe.

package org.beanio.types;

public interface TypeHandler {
    public Object parse(String text) throws TypeConversionException;
    public String format(Object value);
    public Class<?> getType();
}

The following example shows a custom type handler for the java.lang.Boolean class and boolean primitive based on "Y" or "N" indicators.

import org.beanio.types.TypeHandler;

public class YNTypeHandler implements TypeHandler {
    public Object parse(String text) throws TypeConversionException {
        return "Y".equals(text);
    }
    public String format(Object value) {
        return value != null && ((Boolean)value).booleanValue() ? "Y" : "N";
    }
    public Class<?> getType() {
        return Boolean.class;
    }
}

A type handler may be explicitly named using the name attribute, and/or registered for all fields of a particular type by setting the type attribute. The type attribute can be set to the fully qualified class name or type alias of the class supported by the type handler. To reference a named type handler, use the typeHandler field attribute when configuring the field.

Many default type handlers included with BeanIO support customization through the use of one or more property elements, where the name attribute is a bean property of the type handler, and the value attribute is the property value.

Type handlers can be declared globally (for all streams in the mapping file) or for a specific stream. Globally declared type handlers may optionally use a format attribute to narrow the type handler scope to a specific stream format.

In the example below, the first DateTypeHandler is declared globally for all stream formats. The second DateTypeHandler overrides the first for java.util.Date types in an XML formatted stream, and the YNTypeHandler is declared only for the 'employeeFile' stream. Stream specific type handlers override global type handlers when declared with the same name or for the same type.

<beanio>

  <typeHandler type="java.util.Date" class="org.beanio.types.DateTypeHandler">
    <property name="pattern" value="MMddyyyy" />
    <property name="lenient" value="true" />
  </typeHandler>
  <typeHandler type="java.util.Date" format="xml" class="org.beanio.types.DateTypeHandler">
    <property name="pattern" value="yyyy-MM-dd" />
  </typeHandler>

  <stream name="employeeFile" format="csv">
    <typeHandler name="ynHandler" class="example.YNTypeHandler" />

    <record name="employee" minOccurs="0" maxOccurs="unbounded" class="map">
      <field name="recordType" rid="true" literal="Detail" />
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" />
      <field name="exempt" typeHandler="ynHandler" />
    </record>
  </stream>

</beanio>

4.3.3. Repeating Fields

Repeating fields are also supported by BeanIO. For example, lets assume our Employee bean object contains a list of account numbers.

package example;
import java.util.Date;

public class Employee {
    String firstName;
    String lastName;
    String title;
    int salary;
    Date hireDate;
    List<Integer> accounts;

    // getters and setters not shown...
}

And lets assume our input file now looks like this:

Joe,Smith,Developer,75000,10012009
Chris,Johnson,Sales,80000,05292006,100012,200034,200045
Jane,Doe,Architect,80000,01152008
Jon,Anderson,Manager,85000,03182007,333001

In this example, the accounts bean property can be defined in the mapping file using a collection field attribute. The collection attribute can be set to the fully qualified class name of a java.util.Collection subclass, or to one of the collection type aliases below.

Class	Alias	Default Implementation
java.util.Collection	`collection`	java.util.ArrayList
java.util.List	`list`	java.util.ArrayList
java.util.Set	`set`	java.util.HashSet
(Java Array)	`array`	N/A

Repeating fields can declare the number of occurrences of the field using the minOccurs and maxOccurs field attributes. If not declared, minOccurs will default to 1, and maxOccurs will default to the minOccurs value or 1, whichever is greater.

<beanio>

  <stream name="employeeFile" format="csv">
    <record name="employee" class="example.Employee">
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" format="MMddyyyy" />
      <field name="accounts" type="int" collection="list" minOccurs="0" maxOccurs="unbounded" />
    </record>
  </stream>

</beanio>

Flat file formats (CSV, delimited and fixed length) may only contain one field or segment of indeterminate length (i.e. where maxOccurs is greater than minOccurs). The position of components that follow are assumed to be relative to the end of the record.

If a field repeats a fixed number of times based on a preceding field in the same record, the occursRef attribute can be used to identify the name of the controlling field. If the controlling field is not bound to a separate property of its parent bean object, be sure to specify ignore="true". The following mapping file shows how to configure the _ accounts_ field occurrences to be dependent on the numberOfAccounts field. If desired, minOccurs and maxOccurs may still be specified to validate the referenced field occurrences value.

<beanio>

  <stream name="employeeFile" format="csv">
    <record name="employee" class="example.Employee">
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" format="MMddyyyy" />
      <field name="numberOfAccounts" ignore="true" />
      <field name="accounts" type="int" collection="list" occursRef="numberOfAccounts" />
    </record>
  </stream>

</beanio>

Note that a repeating field can not be used for record identification.

4.3.4. Fixed Length Fields

Fixed length fields require a little extra configuration than their delimited counterparts. Let's redefine our employee file example using the fixed length format below.

Position	Field	Format	Length
0	First Name	Text	10
10	Last Name	Text	10
20	Job Title	Text	10
30	Salary	Number	6
36	Hire Date	Date (MMDDYYYY)	8

A fixed length version of the employee file might look like the following:

Joe       Smith    Developer 07500010012009
Jane      Doe      Architect 08000001152008
Jon       Anderson Manager   08500003182007

The length of a fixed length field must be configured using the length field attribute. By default, fixed length fields are left justified and padded with spaces, but these settings can be overridden using the padding and justify field attributes. Field padding can be set to any single character, and field justification can be set to left or right. Using these attributes, our mapping file can now be updated as follows:

<beanio>

  <stream name="employeeFile" format="csv">
    <record name="employee" class="example.Employee">
      <field name="firstName" length="10" />
      <field name="lastName" length="10" />
      <field name="title" length="10" />
      <field name="salary" length="6" padding="0" justify="right" />
      <field name="hireDate" length="8" format="MMddyyyy" />
    </record>
  </stream>

</beanio>

The configured padding character is removed from the beginning of the field if right justified, or from the end of the field if left justified, until a character is found that does not match the padding character. If the entire field is padded, Number property types default to the padding character if it is a digit, and the padding character is ignored for Character types. To illustrate this, some examples are shown in the table below.

Justify	Type	Padding	Padded Text	Unpadded Text
`left`	`String`	`" "`	`"George "`	`"George"`
	`String`	`" "`	`" "`	`""`
	`Character`	`" "`	`"A"`	`"A"`
	`Character`	`" "`	`" "`	`" "`
`right`	`Number`	`"0"`	`"00123"`	`"123"`
		`"0"`	`"00000"`	`"0"`
		`"9"`	`"00000"`	`"00000"`
		`"9"`	`"99999"`	`"9"`
		`"X"`	`"XXXXX"`	`""`

The marshalling and unmarshalling behavior of null field values for a padded field is further controlled using the required attribute. If required is set to true, null field values are marshalled by filling the field with the padding character. If required is set to false, a null field value is marshalled as spaces for fixed length streams and an empty string for non-fixed length streams. Similarly, if required is set to false, spaces are unmarshalled to a null field value regardless of the padding character. To illustrate this, the following table shows the field text for a right justified zero padded 3 digit number.

Required	Field Value	Field Text (Fixed Length)	Field Text (CSV, Delimited, XML)
`true`	0	"`000`"	"`000`"
`true`	null	"`000`"¹	"`000`"¹
`false`	0	"`000`"	"`000`"
`false`	null	" "	""

¹ Applies to marshalling only. Unmarshalling "000" would produce a field value of 0.

As hinted to above, padding settings can be applied to any field for any stream type.

4.4. Constants

If a bean property does not map to a field in the stream, a constant property value can still be set using a property element. Like a field, all properties must specify a name attribute, which by default, is used to get and set the property value from the bean object. Properties also require a value attribute for setting the textual representation of the property value. The value text is type converted using the same rules and attributes (type, typeHandler and format) used for field type conversion described above. Collection type properties are not supported.

<beanio>

  <stream name="employeeFile" format="csv">
    <record name="employee" class="map">
      <property name="recordType" value="employee" />
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" format="MMddyyyy" />
    </record>
  </stream>

</beanio>

Constant properties may be useful in two scenarios:

When reading an input stream (unmarshalling), if multiple records are mapped to the same bean class, such as a Map, a property can be used to set a property, or a Map key, for identifying the record type without querying the BeanReader.
When writing an output stream (marshalling), a record mapping can be selected based on a record identifying property value by setting rid to true. This allows the same bean class to be unmarshalled to different record types based on a property that may not exist in the output stream.

4.5. Segments

A segment is a group of fields within a record. Segments are most often used to bind a group of fields to a nested bean object or collection of bean objects, and are configured in a mapping file using a segment element.

Prior to release 2.x, the bean element performed this task. A segment supports all the functionality of a bean element, but unlike the original bean element, a segment is not required to be bound to a bean object. This allows repeating segments to be fully validated during unmarshalling, without necessarily binding the fields to a bean object. An "unbound" segment also allows an arbitrary number of XML fields to be wrapped by other XML nodes without creating bean objects that mirror the same hierarchy.

4.5.1. Nested Beans

As mentioned, a record can be divided into nested bean objects using a segment element. First, let's suppose we store an address in our CSV employee file, so that the record layout might look like this:

Position	Field	Format
0	First Name	Text
1	Last Name	Text
2	Job Title	Text
3	Salary	Number
4	Hire Date	Date (MMDDYYYY)
5	Street	Text
6	City	Text
7	State	Text
8	Zip	Text

Second, lets suppose we want to store address information in a new Address bean object like the one below, and add an Address reference to our Employee class.

package example;

public class Address {
    String street;
    String city;
    String state;
    String zip;

    // getters and setters not shown...
}

package example;
import java.util.Date;

public class Employee {
    String firstName;
    String lastName;
    String title;
    int salary;
    Date hireDate;
    Address mailingAddress;

    // getters and setters not shown...
}

With this information, we can now update our employee CSV mapping file to accomodate the nested Address object. A segment must include a name attribute, and may optionally provide a class attribute to bind its children to a bean object. If class is set, the attribute must be set to the fully qualified class name of the bean object, or to map, or to the class name of any concrete java.util.Map implementation. If the bean class is of type java.util.Map, field values are stored in the Map using the configured field names for keys. By default, the name attribute is used to determine the getter and setter on its parent bean or record. Alternatively, getter or setter attributes can be used to override the default property name similar to a field property.

<beanio>

  <stream name="employeeFile" format="csv">
    <record name="employee" class="example.Employee">
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" format="MMddyyyy" />
      <segment name="mailingAddress" class="example.Address">
        <field name="street" />
        <field name="city" />
        <field name="state" />
        <field name="zip" />
      </segment>
    </record>
  </stream>

</beanio>

If class is not set, fields will be automatically bound to the segment's parent bean object, which would be the Employee object in the example above.

If needed, segments can be further divided into other segments. There is no limit to the number of nested levels that can be configured in a mapping file.

4.5.2. Repeating Segments

Similar to repeating fields, BeanIO supports repeating segments, which may be bound to a collection of bean objects. Continuing our previous example, let's suppose the employee CSV file may contain 1 or more addresses for each employee. Thus our Employee bean object might look like this:

package example;
import java.util.Date;

public class Employee {
    String firstName;
    String lastName;
    String title;
    int salary;
    Date hireDate;
    List<Address> addressList;

    // getters and setters not shown...
}

And our input file might look like this:

Joe,Smith,Developer,75000,10012009,123 State St,Chicago,IL,60614
Jane,Doe,Architect,80000,01152008,456 Main St,Chicago,IL,60611,111 Michigan Ave,Chicago,IL,60611
Jon,Anderson,Manager,85000,03182007,1212 North Ave,Chicago,IL,60614

In our mapping file, in order to bind a segment to a collection, simply set it's collection attribute to the fully qualified class name of a java.util.Collection or java.util.Map subclass, or to one of the collection type aliases below.

Class	Alias	Default Implementation
java.util.Collection	`collection`	java.util.ArrayList
java.util.List	`list`	java.util.ArrayList
java.util.Set	`set`	java.util.HashSet
java.util.Map	`map`	java.util.LinkedHashMap
(Java Array)	`array`	N/A

Repeating segments can declare the number of occurrences using the minOccurs and maxOccurs attributes. If not declared, minOccurs will default to 1, and maxOccurs will default to the minOccurs value or 1, whichever is greater.

Just like repeating fields, if the number of occurrences of a segment is dependent on a preceding field in the same record, the occursRef attribute can be set to the name of the field that controls the number of occurrences.

<beanio>

  <stream name="employeeFile" format="csv">
    <record name="employee" class="example.Employee">
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" format="MMddyyyy" />
      <segment name="addressList" collection="list" minOccurs="1" maxOccurs="unbounded" class="example.Address">
        <field name="street" />
        <field name="city" />
        <field name="state" />
        <field name="zip" />
      </segment>
    </record>
  </stream>

</beanio>

When working with repeating segments, there are a few restrictions to keep in mind:

Repeating segments must appear consecutively in the record.
Every field in a repeating segment must be declared. (There can be no field gaps in the segment configuration.)
A repeating segment may not contain repeating descendants with variable occurrences.
Repeating fields or fields that belong to a repeating segment may not be used for record identification.

4.5.2.1. Inline Maps

As noted above, a segment can also be bound to a java.util.Map which provides support for "inline" maps. For example, given the following CSV file of users,

id1,firstName1,lastName1,id2,firstName2,lastName2
jsmith,Joe,Smith,jdoe,Jane,Doe

The following mapping file could be used to create a Map of User objects by ID. The key attribute is used to set the name of a descendant field to use for the Map key.

<beanio>

  <stream name="employeeFile" format="csv">
    <record name="employee" target="userMap">
      <segment name="userMap" class="example.User" collection="map" key="id"
          minOccurs="1" maxOccurs="unbounded">
        <field name="id" />
        <field name="firstName" />
        <field name="lastName" />
      </segment>
    </record>
  </stream>

</beanio>

If a Map of last names by ID is needed instead, simply replace the class attribute with value and specify the name of the descendant field to use for the Map value. In this case, first name is effectively ignored.

<beanio>

  <stream name="employeeFile" format="csv">
    <record name="employee" target="userMap">
      <segment name="userMap" collection="map" key="id" value="lastName"
          minOccurs="1" maxOccurs="unbounded">
        <field name="id" />
        <field name="firstName" />
        <field name="lastName" />
      </segment>
    </record>
  </stream>

</beanio>

4.6. Stream Validation

A BeanReader will throw an InvalidRecordException if a record or one of its fields fails a configured validation rule. There are two types of errors reported for an invalid record: record level errors and field level errors. If a record level error occurs, further processing of the record is aborted and an excception is immediately thrown. If a field level error is reported, the BeanReader will continue to process the record's other fields before throwing an exception.

When an InvalidRecordException is thrown, the exception will contain the reported record and field level errors. The following code shows how this information can be accessed using the RecordContext.

    try (BeanReader in = ...) {
        Object record = in.read();
        if (record != null) {
            // process record...
        }
    }
    catch (InvalidRecordException ex) {
        RecordContext context = ex.getRecordContext();
        if (context.hasRecordErrors()) {
            for (String error : context.getRecordErrors()) {
                // handle record errors...
            }
        }
        if (context.hasFieldErrors()) {
            for (String field : context.getFieldErrors().keySet()) {
                for (String error : context.getFieldErrors(field)) {
                    // handle field error...
                }
            }
        }
    }
}

Alternatively, it may be simpler to register a BeanReaderErrorHandler for handling non-fatal exceptions. The example below shows how invalid records could be written to a reject file by extending BeanReaderErrorHandlerSupport. (Note that the example assumes the mapping file does not bind a record group to a bean object.)

    try (BeanReader input = ...;
         BufferedWriter rejects = ...) {
        input.setErrorHandler(new BeanReaderErrorHandlerSupport() {
            public void invalidRecord(InvalidRecordException ex) throws Exception {
                rejects.write(ex.getRecordContext().getRecordText());
                rejects.newLine();
            }
        });

        Object record = null;
        while ((record = input.read()) != null) {
            // process a valid record
        }

        rejects.flush();
    }

Record and field level error messages can be customized and localized through the use of resource bundles. A resource bundle is configured at the stream level using the resourceBundle attribute as shown below.

<beanio>

  <typeHandler type="java.util.Date" class="org.beanio.types.DateTypeHandler">
    <property name="pattern" value="MMddyyyy" />
  </typeHandler>

  <stream name="employeeFile" format="csv" resourceBundle="example.messages" >
    <record name="employee" class="map">
      <field name="recordType" rid="true" literal="Detail" />
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" />
    </record>
  </stream>

</beanio>

Record level error messages are retrieved using the following prioritized list of keys. If a message is not configured under the name of the first key, the next key will be tried until a message is found, or a default message is used.

recorderror.[record name].[rule]
recorderror.[rule]

Similarly, field level error messages are retrieved using the following priortized list of keys:

fielderror.[record name].[field name].[rule]
fielderror.[record name].[rule]
fielderror.[rule]

More descriptive or localized labels can be configured for record and field names using the keys label.[record name] and label.[record name].[field name] respectively.

For example, the following resource bundle could be used to customize a few error messages for the employee file.

## 'employee' record label:
label.employee = Employee Record
## 'firstName' field label:
label.employee.firstName = First Name Field
## Unidentified record error message:
recorderror.unidentified = Unidentified record at line {0}
## Type conversion error message for the 'hireDate' field:
fielderror.employee.hireDate.type = Invalid date format
## Maximum field length error message for all fields:
fielderror.maxLength = Maximum field length exceeded for {3}

Error messages are formatted using a java.text.MessageFormat. Depending on the validation rule that was violated, different parameters are passed to the MessageFormat. Appendix B documents the parameters passed to the MessageFormat for each validation rule.

4.6.1. Record Validation

The following record level validation rules may be configured on a record element.

Attribute	Argument Type	Description
`minLength`	Integer	Validates the record contains at least `minLength` fields for delimited and CSV formatted streams, or has at least `minLength` characters for fixed length formatted streams.
`maxLength`	Integer	Validates the record contains at most `maxLength` fields for delimited and CSV formatted streams, or has at most `maxLength` characters for fixed length formatted streams.

4.6.2. Field Validation

BeanIO supports several common field validation rules when reading an input stream. All field validation rules are validated against the field text before type conversion. When field trimming is enabled, trim="true", all validations are performed after the field's text has first been trimmed. Field validations are ignored when writing to an output stream.

The following table lists supported field attributes for validation.

Attribute	Argument Type	Description
`required`	Boolean	When set to `true`, validates the field is present and the field text is not the empty string.
`minLength`	Integer	Validates the field text is at least N characters.
`maxLength`	Integer	Validates the field text does not exceed N characters.
`literal`	String	Validates the field text exactly matches the literal value.
`regex`	String	Validates the field text matches the given regular expression pattern.
`minOccurs`	String	Validates the minimum occurrences of the field in a stream. If the field is present in the stream, `minOccurs` is satisfied, and the `required` setting determines whether a value is required.

4.7. Templates

When a common set of fields is used by multiple record types, configuration may be simplified using templates. A template is a reusable list of components (segments, fields, and properties/constants) that can be included by a record, segment or other template. The following example illustrates some of the ways a template can be used:

<beanio>

  <template name="address">
    <field name="street1" />
    <field name="street2" />
    <field name="city" />
    <field name="state" />
    <field name="zip" />
  </template>

  <template name="employee">
    <field name="firstName" />
    <field name="lastName" />
    <field name="title" />
    <field name="salary" />
    <field name="hireDate" format="MMddyyyy" />
    <segment name="mailingAddress" template="address" class="example.Address" />
  </template>

  <stream name="employeeFile" format="csv">
    <record name="employee" template="employee" class="example.Employee" />
  </stream>

  <stream name="addressFile" format="csv">
    <record name="address" class="example.Address">
      <field name="location" />
      <include template="address"/>
      <field name="attention" />
    </record>
  </stream>

</beanio>

Templates are essentially copied into their destination using the include element. For convenience, record and segment elements support a template attribute which includes the template before any other children.

The include element can optionally specify a positional offset for included fields using the offset attribute. The following example illustrates this behavior. Even when using templates, remember that position must be declared for all fields or none.

<beanio>

  <template name="address">
    <field name="street1" position="0" />
    <field name="street2" position="1" />
    <field name="city" position="2" />
    <field name="state" position="3" />
    <field name="zip" position="4" />
  </template>

  <stream name="addressFile" format="csv">
    <record name="address" class="example.Address">
      <field name="location" position="0" />
      <include template="address" offset="1"/>
      <field name="attention" position="6" />
    </record>
  </stream>

</beanio>

4.8. Advanced Topics

4.8.1. Mapping Bean Objects that Span Multiple Records

Since release 2.0, BeanIO supports the binding of multiple consecutive records to a single bean object. This can be achieved by assigning a bean class to a stream or group containing the record configurations bound to the bean.

Let's suppose we are reading a CSV input file of orders that contains an order, followed by the customer that placed the order, followed by a detailed list of items that make up the order. A sample input file might look like this:

Order,101,2012-02-01,5.00
Customer,John,Smith
Item,Apple,2,2.00
Item,Orange,1,1.00
Order,102,2012-02-01,3.00
Customer,Jane,Johnson
Item,Ham,1,3.00

Let's then suppose we want to read the following Order class from the stream, which contains a reference to Customer and Item classes. (For brevity, getters and setters are not shown.)

package example;
import java.util.Date;

public class Order {
    String id;
    Date date;
    BigDecimal amount;
    Customer customer;
    List&lt;Item&gt; items;
}

public class Customer {
    String firstName;
    String lastName;
}

public class Item {
    String name;
    int quantity;
    BigDecimal amount;
}

Now to read and write Order objects from our example stream, the following mapping file can be used:

<beanio>

  <stream name="orders" format="csv">
    <group name="order" class="example.Order" minOccurs="0" maxOccurs="unbounded">
      <record name="orderRecord" order="1" minOccurs="1">
        <field name="recordType" rid="true" literal="Order" ignore="true" />
        <field name="id" />
        <field name="date" format="yyyy-MM-dd" />
        <field name="amount" />
      </record>
      <record name="customer" class="example.Customer" order="2" minOccurs="1" maxOccurs="1">
        <field name="recordType" rid="true" literal="Customer" ignore="true" />
        <field name="firstName" />
        <field name="lastName" />
      </record>
      <record name="items" class="example.Item" collection="list" order="3" minOccurs="1" maxOccurs="unbounded">
        <field name="recordType" rid="true" literal="Item" ignore="true" />
        <field name="name" />
        <field name="quantity" />
        <field name="amount" />
      </record>
    </group>
  </stream>

</beanio>

By configuring a class on a group component, BeanIO will automatically marshal or unmarshal all of the group's descendants in a single call to read or write from the stream. Also note that by not configuring a class on a record , in this case our "orderRecord", the fields are instead set on the bean class assigned to it's parent group. Finally, repeating records can be aggregated into a collection using a collection attribute at the record level, as used for the "items" record. If necessary, getter and setter attributes can be configured on a record component as well.

If any record included in a group bound to a bean object is invalid, an InvalidRecordException is thrown, but only after reading all the other records in the group. In such cases, the InvalidRecordException will contain RecordContext objects for every record in the group read from the stream. If multiple records in the group are invalid, only one InvalidRecordException is thrown.

If a malformed or unidentified record is read from the stream while unmarsahalling a record group, an exception is immediately thrown, and the BeanReader will most likely not be able to recover. For this reason, when unmarshalling untrusted sources, it is recommended that you read the stream twice, using the first pass to validate the integrity of the file including syntax, record identification, record ordering, possible header/trailer counts, etc. For example, the following mapping file might be used to validate our orders file.

<beanio>

  <stream name="orders-validation" format="csv">
    <group name="order" minOccurs="0" maxOccurs="unbounded">
      <record name="orderRecord" order="1" minOccurs="1">
        <field name="recordType" rid="true" literal="Order" ignore="true" />
      </record>
      <record name="customer" order="2" minOccurs="1">
        <field name="recordType" rid="true" literal="Customer" ignore="true" />
      </record>
      <record name="items" order="3" minOccurs="1" maxOccurs="unbounded">
        <field name="recordType" rid="true" literal="Item" ignore="true" />
      </record>
    </group>
  </stream>

</beanio>

In this case, we are validating syntax, record ordering and record identification for the entire file in a single call to beanReader.read(), while leaving other record and field level validations for unmarshalling, which can be caught and handled without worrying whether the BeanReader will be able to recover.

5.0. Mapping XML Streams

This section provides further details for using BeanIO to marshall and unmarshall Java objects to and from XML formatted streams. This section assumes you are already familiar with the mapping file concepts documented in previous sections.

5.1. Introduction

BeanIO is similar to other OXM (Object to XML Mapping) libraries, except that it is also capable of marshalling and unmarshalling extremely large XML files by reading and writing Java beans one record at a time. BeanIO uses a streaming XML (StAX) parser to read and write XML, and will never hold more than the minimum amount of XML in memory needed to marshall or unmarshall a single bean object. That said, it is still possible to run out of memory (heap space) with poorly designed XML documents and/or misconfigured mapping files.

5.1.1. My First XML Stream

Before diving into the details, let's start with a basic example using the employee input file from Section 2.1 after it's been converted to XML (shown below).

<?xml version="1.0"?>
<employeeFile>
  <employee>
    <firstName>Joe</firstName>
    <lastName>Smith</lastName>
    <title>Developer</title>
    <salary>75000</salary>
    <hireDate>2009-10-12</hireDate>
  </employee>
  <employee>
    <firstName>Jane</firstName>
    <lastName>Doe</lastName>
    <title>Architect</title>
    <salary>80000</salary>
    <hireDate>2008-01-15</hireDate>
  </employee>
  <employee>
    <firstName>Jon</firstName>
    <lastName>Andersen</lastName>
    <title>Manager</title>
    <salary>85000</salary>
    <hireDate>2007-03-18</hireDate>
  </employee>
</employeeFile>

In this example, let's suppose we are unmarshalling the XML employee file into the same Employee bean object from Section 2.1 and repeated below.

package example;
import java.util.Date;

public class Employee {
    String firstName;
    String lastName;
    String title;
    int salary;
    Date hireDate;

    // getters and setters not shown...
}

Our original mapping file from Section 2.1 can now be updated to parse XML instead of CSV with only two minor changes. First, the stream format is changed to xml. And second, the hire date field format is removed and replaced with type="date". With XML, the date format does not need to be explicity declared because it conforms to the W3C XML Schema date syntax. (This will be further explained in Section 5.7.1).

<beanio xmlns="http://www.beanio.org/2012/03"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.beanio.org/2012/03 http://www.beanio.org/2012/03/mapping.xsd">

  <stream name="employeeFile" format="xml">
    <record name="employee" class="example.Employee">
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" type="date" />
    </record>
  </stream>
</beanio>

That's it! No Java code changes are required, and as before, Employee bean objects will be unmarshalled from the XML input stream each time beanReader.read() is called.

And also as before, Employee objects can be marshalled to an XML output stream using beanWriter.write(Object). However, please note that when marshalling/writing XML, it is even more important to call beanWriter.close() so that the XML document can be properly completed.

5.1.2. A Note on XML Validation

Because BeanIO is built like a pull parser, it does not support XML validation against a DTD or XML schema. Where this functionality is needed, it is recommended to make two passes on the input document. The first pass can use a SAX parser or other means to validate the XML, and the second pass can use BeanIO to parse and process bean objects read from the document.

5.2. XML Names

Each BeanIO mapping component (stream, group, record, segment and field), is mapped to an XML element with the same local name. If the name of the stream, group, etc. does not match the XML element name, the xmlName attribute can be used. For example, if the name of the root element in the previous example's employee file is changed from " employeeFile" to "employees", and "title" was renamed "position", the mapping file could be updated as shown below.

<beanio>

  <stream name="employeeFile" format="xml" xmlName="employees">
    <record name="employee" class="example.Employee">
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" xmlName="position" />
      <field name="salary" />
      <field name="hireDate" type="date" />
    </record>
  </stream>

</beanio>

5.3. XML Namespaces

XML namespaces can be enabled through the use of the xmlNamespace attribute on any mapping component (stream, group, record, segment or field). By default, all mapping elements inherit their namespace (or lack thereof) from their parent. When a namespace is declared, the local name and namespace must match when unmarshalling XML, and appropriate namespace declarations are included when marshalling bean objects. For example, let's suppose our employee file contains namespaces as shown below.

<?xml version="1.0"?>
<employeeFile xmlns="http://example.com/employeeFile" xmlns:n="http://example.com/name">
  <e:employee xmlns:e="http://example.com/employee">
    <n:firstName>Joe</n:firstName>
    <n:lastName>Smith</n:lastName>
    <e:title>Developer</e:title>
    <e:salary>75000</e:salary>
    <e:hireDate>2009-10-12</e:hireDate>
  </e:employee>
  .
  .
  .
</employeeFile>

To unmarshall the file using namespaces, and to marshall Employee bean objects in the same fashion as they appear above, the following mapping file can be used.

<beanio>

  <stream name="employeeFile" format="xml" xmlNamespace="http://example.com/employeeFile">
    <parser>
      <property name="namespaces" value="n http://example.com/name"/>
    </parser>
    <record name="employee" class="example.Employee" xmlNamespace="http://example.com/employee" xmlPrefix="e">
      <field name="firstName" xmlNamespace="http://example.com/name" />
      <field name="lastName" xmlNamespace="http://example.com/name" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" type="date" />
    </record>
  </stream>

</beanio>

From this example, the following behavior can be observed:

An xmlPrefix attribute can be used to assign a namespace prefix anywhere a xmlNamespace is declared.
If a prefix is configured, the namespace is assigned to the prefix and the prefix is used from that point forward. This can be seen on the 'employee' element (record configuration).
If a prefix is not configured, a namespace declaration will replace the default namespace. This can be seen on the ' employeeFile' element (stream configuration).
As previously mentioned, namespace are by default inherited from parent mapping elements. This can be seen on the ' title', 'salary' and 'hireDate' elements (field configurations).
Namespaces can be eagerly declared on the root element using the writer's namespaces property. Multiple namespaces can be declared with space delimiters such as 'prefix1 namespace1 prefix2 namespace2...'.

BeanIO also supports a special wildcard namespace. If xmlNamespace is set to '*', any namespace is allowed when unmarshalling XML, and no namespace declaration will be made when marshalling XML.

The following table summarizes namespace configuration options and their effect on the configured element and a child that inherits it's parent namespace.

Mapping Configuration	Marshalled Element And Child
[None]	<element> <child/> </element>
xmlNamespace="*"	<element> <child/> </element>
xmlNamespace=""	<element xmlns=""> <child/> </element>
xmlNamespace="http://example.com"	<element xmlns="http://example.com"> <child/> </element>
xmlNamespace="http://example.com" xmlPrefix="e"	<e:element xmlns="http://example.com"> <e:child/> </e:element>

5.4. Streams

When unmarshalling multiple records from an XML document, the stream configuration is mapped to the root element in the XML formatted stream. This default behavior has been demonstrated in previous examples. If on the other hand, an XML document contains only a single record, the document can be fully read or written by setting the stream configuration's xmlType attribute to none. This behavior is similar to other OXM libraries that marshall or unmarshall one bean object per XML document.

For example, if BeanIO was used to unmarshall a single employee record submitted via a web service, the XML document might look like the following. Notice there is no 'employeeFile' root element for containing multiple employee records.

<employee>
  <firstName>Joe</firstName>
  <lastName>Smith</lastName>
  <title>Developer</title>
  <salary>75000</salary>
  <hireDate>2009-10-12</hireDate>
</employee>

In this example, the following highlighted changes can be made to our mapping file to allow BeanIO to unmarshall/marshall a single employee record.

<beanio>

  <stream name="employeeFile" format="xml" xmlType="none">
    <record name="employee" class="example.Employee">
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" type="date" />
    </record>
  </stream>

</beanio>

5.5. Groups

Like other mapping elements, groups are also mapped to XML elements by default. Or if a group is used only for control purposes, the group's xmlType attribute can be set to none.

5.6. Records

A record is always mapped to an XML element. As we've seen before, records are matched based on their group context and configured record identifying fields. XML records are further matched using their XML element name, as defined by xmlName, or if not present, name. Other than configured record identifying fields, segment and field names declared within the record are not used to identify records.

For example, let's suppose our employee file differentiated managers using 'manager' tags.

<?xml version="1.0"?>
<employeeFile>
  <employee>
    <firstName>Joe</firstName>
    <lastName>Smith</lastName>
    <title>Developer</title>
    <salary>75000</salary>
    <hireDate>2009-10-12</hireDate>
  </employee>
  <employee>
    <firstName>Jane</firstName>
    <lastName>Doe</lastName>
    <title>Architect</title>
    <salary>80000</salary>
    <hireDate>2008-01-15</hireDate>
  </employee>
  <manager>
    <firstName>Jon</firstName>
    <lastName>Andersen</lastName>
    <title>Manager</title>
    <salary>85000</salary>
    <hireDate>2007-03-18</hireDate>
  </manager>
</employeeFile>

To bind managers to a new Manager bean we could use the following mapping configuration.

<beanio>

  <stream name="employeeFile" format="xml">
    <record name="employee" class="example.Employee">
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" type="date" />
    </record>
    <record name="manager" class="example.Manager">
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" type="date" />
    </record>
  </stream>

</beanio>

5.7. Fields

A field is mapped to XML using the field's xmlType attribute, which defaults to element. The field XML type can be set to element, attribute, text, or none. The following table illustrates possible configurations, except for none which is not covered here.

Record Definition	Sample Record
<record name="person" className="map"> <field name="name" xmlType="element"/> </record>	<person> <name>John</name> </person>
<record name="person" className="map"> <field name="name" xmlType="attribute"/> </recors>	<person name="John"/>
<record name="person" className="map"> <field name="name" xmlType="text"/> </record>	<person>John</person>

Record Definition

Sample Record

<record name="person" className="map">
<field name="name" xmlType="element"/>
</record>

<person>
<name>John</name>
</person>

<record name="person" className="map">
<field name="name" xmlType="attribute"/>
</recors>

<person name="John"/>

<record name="person" className="map">
<field name="name" xmlType="text"/>
</record>

<person>John</person>

5.7.1. Field Type Conversion

Field type conversion works the same way for XML formatted streams as it does for other formats. However, several default type handlers are overridden specifically for XML formatted streams to conform with W3C XML Schema built-in data types according to this specification. The following table summarizes overriden type handlers:

Class or Type Alias	XML Schema Data Type	Example
`date calendar-date`	date	`2011-01-01`
`datetime java.util.Date calendar calendar-datetime java.util.Calendar`	dateTime	`2011-01-01T15:14:13`
`time calendar-time`	time	`15:14:13`
`boolean`	boolean	`true`

Like other type handlers, XML specific type handlers can be customized or completely replaced. Please consult BeanIO javadocs for customization details.

5.7.2. Marshalling Null Field Values

The nillable and minOccurs field attributes control how a null field value is marshalled. If minOccurs is 0, an element or attribute is not marshalled for the field. If an element type field has nillable set to true and minOccurs set to 1, the W3C XML Schema Instance attribute nil is set to true.

This behavior is illustrated in the following table.

Field Type	Record Definition	Marshalled Record (Field Value is Null)
element	<record name="person" className="map"> <field name="name" /> </record>	<person> <name/> </person>
	<record name="person" className="map"> <field name="name" minOccurs="0" /> </record>	<person/>
	<record name="person" className="map"> <field name="name" nillable="true"/> </record>	<person> <name xsi:nil="true"/> </person>
attribute	<record name="person" className="map"> <field name="name" xmlType="attribute"/> </record>	<person/>
attribute	<record name="person" className="map"> <field name="name" xmlType="attribute" minOccurs="1"/> </record>	<person name=""/>
text	<record name="person" className="map"> <field name="name" xmlType="text"/> </record>	<person/>

5.8. Segments

A segment can be used to bind a group of fields to a nested bean object, or to wrap a field or group of fields under an XML element.

5.8.1. Nested Beans

Segments can be used to bind a group of fields to a bean object. The xmlType assigned to the segment determines the format of the XML. Possible values are element (default) and none. The difference can be explored using the Address and Employee beans defined in Section 4.4 and repeated here.

package example;

public class Address {
    String street;
    String city;
    String state;
    String zip;

    // getters and setters not shown...
}

package example;
import java.util.Date;

public class Employee {
    String firstName;
    String lastName;
    String title;
    int salary;
    Date hireDate;
    Address mailingAddress;

    // getters and setters not shown...
}

By default, a segment's xmlType is set to element, so it is not necessary to declare it in the mapping file below.

<beanio>

  <stream name="employeeFile" format="xml">
    <record name="employee" class="example.Employee">
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" type="date" />
      <segment name="mailingAddress" class="example.Address" xmlType="element">
        <field name="street" />
        <field name="city" />
        <field name="state" />
        <field name="zip" />
      </segment>
    </record>
  </stream>

</beanio>

This mapping configuration can be used to process the sample XML document below. When a segment is mapped to an XML element, nillable and minOccurs will control the marshalling behavior of null bean objects in the same fashion as a field (see Section 5.7.2).

<?xml version="1.0"?>
<employeeFile>
  <employee>
    <firstName>Joe</firstName>
    <lastName>Smith</lastName>
    <title>Developer</title>
    <salary>75000</salary>
    <hireDate>2009-10-12</hireDate>
    <mailingAddress>
      <street>123 Main Street</street>
      <city>Chicago</city>
      <state>IL</state>
      <zip>12345</zip>
    </mailingAddress>
  </employee>
  .
  .
  .
</employeeFile>

Alternatively, if the segment's xmlType is set to none, the following XML document can be processed.

<?xml version="1.0"?>
<employeeFile>
  <employee>
    <firstName>Joe</firstName>
    <lastName>Smith</lastName>
    <title>Developer</title>
    <salary>75000</salary>
    <hireDate>2009-10-12</hireDate>
    <street>123 Main Street</street>
    <city>Chicago</city>
    <state>IL</state>
    <zip>12345</zip>
  </employee>
  .
  .
  .
</employeeFile>

5.8.2. Wrapped Segments

In some cases, an XML document may contain extraneous elements that do not map directly to a bean object or property value. In these cases, a segment (without a class attribute) can be used to wrap a field or group of fields.

Extending the previous example, let's suppose the Employee bean object is modified to hold a list of addresses.

package example;
import java.util.Date;

public class Employee {
    String firstName;
    String lastName;
    String title;
    int salary;
    Date hireDate;
    List<Address> addressList;

    // getters and setters not shown...
}

And let's further suppose that each employee's list of addresses is enclosed in a new element called addresses.

<?xml version="1.0"?>
<employeeFile>
  <employee>
    <firstName>Joe</firstName>
    <lastName>Smith</lastName>
    <title>Developer</title>
    <salary>75000</salary>
    <hireDate>2009-10-12</hireDate>
    <addresses>
      <mailingAddress>
        <street>123 Main Street</street>
        <city>Chicago</city>
        <state>IL</state>
        <zip>12345</zip>
      </mailingAddress>
    </addresses>
  </employee>
  .
  .
  .
</employeeFile>

The mapping file can now be updated as follows:

<beanio>

  <stream name="employeeFile" format="xml">
    <record name="employee" class="example.Employee">
      <field name="firstName" />
      <field name="lastName" />
      <field name="title" />
      <field name="salary" />
      <field name="hireDate" type="date" />
      <segment name="addresses">
        <segment name="mailingAddress" class="example.Address" collection="list" minOccurs="0" maxOccurs="unbounded">
          <field name="street" />
          <field name="city" />
          <field name="state" />
          <field name="zip" />
        </segment>
      </segment>
    </record>
  </stream>

</beanio>

The following table illustrates various effects using a segment based on the xmlType of a field, and the effect of minOccurs and nillable when marshalling null field values.

Field Mapping	Non-Null Field Value	Null Field Value
<segment name="wrapper"> <field name="field" /> </segment>	<wrapper> <field>value</field> </wrapper>	<wrapper> <field/> </wrapper>
<segment name="wrapper" minOccurs="0"> <field name="field" /> </segment>		-
<segment name="wrapper" nillable="true"> <field name="field" /> </segment>		<wrapper xsi:nil="true"/>
<segment name="wrapper"> <field name="field" nillable="true" /> </segment>		<wrapper> <field xsi:nil="true"/> </wrapper>
<segment name="wrapper"> <field name="field" minOccurs="0"/> </segment>		<wrapper/>
<segment name="wrapper"> <field name="field" xmlType="attribute" /> </segment>	<wrapper field="value"/>	<wrapper/>
<segment name="wrapper"> <field name="field" xmlType="attribute" minOccurs="1" /> </segment>		<wrapper field=""/>
<segment name="wrapper" minOccurs="0"> <field name="field" xmlType="attribute" minOccurs="1" /> </segment>		-
<segment name="wrapper"> <field name="field" xmlType="text" /> </segment>	<wrapper>value</wrapper>	<wrapper/>
<segment name="wrapper" nillable="true"> <field name="field" xmlType="text" /> </segment>		<wrapper xsi:nil="true"/>
<segment name="wrapper" minOccurs="0"> <field name="field" xmlType="text"/> </segment>		-

Similarly, a segment can be used to wrap a repeating field as illustrated below.

Field Mapping	Collection	Null or Empty Collection
<segment name="wrapper"> <field name="field" collection="list" minOccurs="0" maxOccurs="10" /> </segment name="wrapper">	<wrapper> <field>value1</field> <field>value2</field> </wrapper>	<wrapper />
<segment name="wrapper"> <field name="field" collection="list" minOccurs="1" maxOccurs="10" /> </wrapper>		<wrapper> <field/> </wrapper>
<segment name="wrapper" minOccurs="0"> <field name="field" collection="list" minOccurs="1" maxOccurs="10" /> </wrapper>		-
<segment name="wrapper" nillable="true"> <field name="field" collection="list" minOccurs="1" maxOccurs="10" /> </wrapper>		<wrapper xsi:nil="true"/>

6.0. Annotations and the Stream Builder API

Since release 2.1, BeanIO includes support for Java annotations and a stream builder API.

6.1. The Stream Builder API

The stream builder API can be used to programatically create a stream mapping without the need for a mapping file.

    StreamFactory factory = StreamFactory.newInstance();

    // create a new StreamBuilder and define its layout
    StreamBuilder builder = new StreamBuilder("employeeFile")
        .format("delimited")
        .parser(new DelimitedParserBuilder(','))
        .addRecord(new RecordBuilder("employee")
            .type(Employee.class)
            .minOccurs(1)
            .addField(new FieldBuilder("type").rid().literal("EMP").ignore())
            .addField(new FieldBuilder("recordType").rid(
            .addField(new FieldBuilder("firstName"))
            .addField(new FieldBuilder("lastName"))
            .addField(new FieldBuilder("title"))
            .addField(new FieldBuilder("salary"))
            .addField(new FieldBuilder("hireDate").format("MMddyyyy")));

    // pass the StreamBuilder to the factory
    factory.define(builder);

    BeanReader in = factory.createReader("employeeFile", new File("employee.csv"));
    // etc...

Like in a mapping file, components are assumed to be ordered as they are added to their parent, unless at (for fields) or order (for records and groups) is explicitly set. For more information, refer to the Javadocs for the org.beanio.builder package.

6.2. Annotations

Java classes can also be annotated to augment the use of the stream builder API or mapping file. Classes may be annotated with @Record, and class attributes or getter/setter methods may be annotated with @Field. Any component annotated with @Record or @Segment may be further annotated using @Fields to include fields not bound to a Java property.

Continuing our previous example, the example.Employee class could be annotated like so:

@Record(minOccurs=1)
@Fields({
    @Field(name="type", at=0, rid=true, literal="EMP")
})
public class Employee {

    @Field(at=1)
    private String firstName;
    @Field(at=2)
    private String lastName;
    @Field(at=3)
    private String title;
    @Field(at=4)
    private String salary,
    @Field(at=5, format="MMddyyyy")
    private Date hireDate;

    // getters and setters...
}

Using an annotated Employee class, the stream builder example can be greatly simplified:

    StreamFactory factory = StreamFactory.newInstance();

    StreamBuilder builder = new StreamBuilder("employeeFile")
        .format("delimited")
        .parser(new DelimitedParserBuilder(','))
        .addRecord(Employee.class);

    factory.define(builder);

As can a mapping file:

<beanio>

  <stream name="employeeFile" format="delimited">
    <parser>
      <property name="delimiter" value="," />
    </parser>
    <record name="employee" class="example.Employee" />
  </stream>

</beanio>

When using annotations, it is strongly recommended to explicitly set the position (using at) for all fields and segments. BeanIO does not guarrantee the order in which annotated components are added to a layout.

Annotation settings are generally named according to their mapping file counterparts and follow the same convention as well. Refer to Appendix A for a complete explanation of all settings.

Where used, annotated records can not be overridden by mapping file components. Configuration settings other than class and descendent components will be ignored.

8.0. Configuration

In some cases, BeanIO behavior can be controlled by setting optional property values. Properties can be set using System properties or a property file. BeanIO will load configuration setting in the following order of priority:

System properties.
A property file named beanio.properties. The file will be looked for first in the application's working directory, and then on the classpath.

The name and location of beanio.properties can be overridden using the System property org.beanio.configuration. In the following example, configuration settings will be loaded from the file named config/settings.properties, first relative to the application's working directory, and if not found, then from the root of the application's classpath.

java -Dorg.beanio.configuration=config/settings.properties example.Main

8.1. Settings

The following configuration settings are supported by BeanIO:

Property	Description	Default
`org.beanio.allowProtectedAccess`	Whether private and protected class variables and constructors can be accessed (i.e. make accessible using the reflection API).	`true`
`org.beanio.lazyIfEmpty`	Whether objects are lazily instantiated if String properties are empty (and not just null).	`true`
`org.beanio.errorIfNullPrimitive`	Whether null field values will cause an exception if bound to a primitive property.	`false`
`org.beanio.useDefaultIfMissing`	Whether default values apply to fields missing from the stream.	`true`
`org.beanio.propertyEscapingEnabled`	Whether `property` values (for `typeHandler`, `reader` and `writer` elements) support escape patterns for line feeds, carriage returns, tabs, etc. Set to `true` or `false`.	`true`
`org.beanio.nullEscapingEnabled`	Whether the null character can be escaped using `\0` when property escaping is enabled. Set to `true` or `false`.	`true`
`org.beanio.marshalDefaultEnabled`	Whether a configured `field` default is marshalled for null property values. May be disabled for backwards compatibility by setting the value to `false`.	`true`
`org.beanio.defaultTypeHandlerLocale`	Sets the default type handler locale.	`Locale.getDefault()`
`org.beanio.defaultDateFormat`	Sets the default `SimpleDateFormat` pattern for `date` and `calendar-date` type fields in CSV, delimited and fixed length file formats.	`DateFormat. getDateInstance()`
`org.beanio.defaultDateTimeFormat`	Sets the default `SimpleDateFormat` pattern for `datetime`, `calendar-datetime` and `calendar` type fields in CSV, delimited and fixed length file formats..	`DateFormat. getDateTimeInstance()`
`org.beanio.defaultTimeFormat`	Sets the default `SimpleDateFormat` pattern for `time` and `calendar-time` type fields in CSV, delimited and fixed length file formats..	`DateFormat. getTimeInstance()`
`org.beanio.group.minOccurs`	Sets the default `minOccurs` for a `group`.	`0`
`org.beanio.record.minOccurs`	Sets the default `minOccurs` for a `record`.	`0`
`org.beanio.field.minOccurs.[format]`	Sets the default `minOccurs` for a `field` by stream format.	`1`
`org.beanio.propertyAccessorFactory`	Sets the method of property invocation to use. Defaults to `reflection`.
`org.beanio.xml.defaultXmlType`	Sets the default XML type for a field in an XML formatted stream. May be set to `element` or `attribute`.	`element`
`org.beanio.xml.xsiNamespacePrefix`	Sets the default prefix for the namespace `http://www.w3.org/2001/XMLSchema-instance`.	`xsi`
`org.beanio.xml.sorted`	Whether XML fields are sorted by position (if assigned).	`true`
`org.beanio.fixedlength.countMode`	Determines how field lengths are computed, either `chars` or `bytes`.	`chars`
`org.beanio.fixedlength.charset`	The charset used to compute field lengths and values when `org.beanio.fixedlength.countMode` is `bytes`	`utf-8`

Appendix A: XML Mapping File Reference

Appendix A is the complete reference for the BeanIO 2.x XML mapping file schema. The root element of a mapping file is [beanio](#beanio) with namespace http://www.beanio.org/2012/03. The following notation is used to indicate the allowed number of child elements:

* Zero, one or more
+ One or more
? Zero or one

Ranges

Where noted, some attributes can be configured using a range notation. A range is expressed using the following syntax, where N and M are integer values:

N	Upper and lower boundaries are set to N.
N-M	Lower boundary is set to N. Upper boundary is set to M.
N+	Lower boundary is set to N. No upper boundary.

A.1. `beanio`

The beanio element is the root element for a BeanIO mapping file.

Children: property*, import*, typeHandler*, template *, stream*

A.2. `import`

The import element is used to import type handlers, templates and streams from an external mapping file. Streams declared in a mapping file being imported are not affected by global type handlers or templates declared in the file that imported it.

Attributes:

Attribute	Description	Required
`resource`	The name of the resource to import. The resource name must be qualified with 'classpath:' to load the resource from the classpath, or with 'file:' to load the file relative to the application's working directory.	Yes

A.3. `typeHandler`

A typeHandler element is used to declare a custom field type handler that implements the org.beanio.types.TypeHandler interface. A type handler can be registered for a specific Java type, or registered for a Java type and stream format combination, or explicitly named.

Attribute	Description	Required
`name`	The type handler name. A field can always reference a type handler by name, even if the stream format does not match the configured type handler `format` attribute. When configured, the name of a globally declared type handler must be unique within a mapping and any imported mapping files.	One of `name` or `type` is required.
`type`	The fully qualified classname or type alias to register the type handler for. If `format` is also set, the type handler will only be used by streams that match the configured format.	One of `name` or `type` is required.
`class`	The fully qualified classname of the `TypeHandler` implementation.	Yes
`format`	When used in conjunction with the `type` attribute, a type handler can be registered for a specific stream format. Set to `xml`, `csv`, `delimited`, or `fixedlength`. If not set, the type handler may be used by any stream format.	No

Children: property*

A.4. `property`

A property element has several uses.

When used at the top of a mapping file as a direct child of beanio, a property may declare properties to use for property substitution in other attributes within the mapping file. Property substitution uses the syntax ${propertyName,default}, where all whitespace between the brackets is retained. Properties cannot be imported from another file.
Or, a property element may be used to customize other elements, such as a typeHandler or parser.
Or finally, a property value can be used to set constant values on a bean object, which is further described below.

Attribute Description Required

name The property name. Yes

Attribute	Description	Required
`name`	The property name.	Yes
`value`	The property value. When used to customize a `typeHandler` or `parser`, default type handlers only are used to convert property text to an object value. String and Character type property values can use the following escape sequences: `\` (Backslash), `\n` (Line Feed), `\r` (Carriage Return), `\t` (Tab), `\0` (Null) and `\f` (Form Feed). A backslash preceding any other character is ignored.	Yes

value

The property value.

When used to customize a typeHandler or parser, default type handlers only are used to convert property text to an object value. String and Character type property values can use the following escape sequences: \ (Backslash), \n (Line Feed), \r (Carriage Return), \t (Tab), \0 (Null) and \f (Form Feed). A backslash preceding any other character is ignored.

Yes

A property element, when used as child of a record or segment element, can be used to set constant values on a record or bean object that do not map to a field in the input or output stream. The following additional attributes are accepted in this scenario:

Attributes:

Attribute	Description	Required	Format(s)
`getter`	The getter method used to retrieve the property value from its parent bean class. By default, the getter method is determined through introspection using the property name.	No	*
`setter`	The setter method used to set the property value on its parent bean class. By default, the setter method is determined through introspection using the property name.	No	*
`rid`	Record identifier indicator for marshalling/writing only. Set to `true` if this property is used to identify the record mapping configuration used to marshall a bean object. More than one property or field can be used for identification. Defaults to `false`.	No	*
`type`	The fully qualified class name or type alias of the property value. By default, BeanIO will derive the property type from the bean class. This attribute can be used to override the default or may be required if the bean class is of type `Map`.	No	*
`typeHandler`	The name of the type handler to use for type conversion. By default, BeanIO will select a type handler based on `type` when set, or through introspection of the property's parent bean class.	No	*
`format`	The decimal format pattern for `Number` type properties, or the simple date format pattern for `Date` type properties. The `format` value can accessed by any custom type handler that implements `ConfigurableTypeHandler`.	No	*

A.5. `template`

The template element is used to create reusable lists of bean properties.

Note that templates are "expanded" at the time they are included. This means an imported template that relies on property substitution will use property values from the mapping file that included it and not the mapping file where the template was declared.

Attributes:

Attribute	Description	Required
`name`	The name of the template. Template names must be unique within a mapping file and any imported mapping files.	Yes

Children: ( field | property | segment | include )*

A.6. `include`

The include element is used to include a template in a record, segment, or another template.

Attributes:

Attribute	Description	Required
`template`	The name of the template to include.	Yes
`offset`	The offset added to field positions included by the template. Defaults to 0.	No

A.7. `stream`

A stream element defines the record layout of an input or output stream.

Attributes:

Attribute	Description	Required	Format(s)
`name`	The name of the stream.	Yes	*
`format`	The stream format. Either `xml`, `csv`, `delimited` or `fixedlength`	Yes	*
`mode`	By default, a stream mapping can be used for both reading input streams and writing output streams, called `readwrite` mode. Setting mode to `read` or `write` instead, respectively restricts usage to a `BeanReader` or a `BeanWriter` only, but relaxes some validations on the mapping configuration. When mode is set to `read`, a bean class does not require getter methods. When mode is set to `write`, a bean class may be abstract or an interface, and does not require setter methods.	No	*
`resourceBundle`	The name of the resource bundle for customizing error messages.	No	*
`strict`	When set to `true`, BeanIO will calculate and enforce record ordering based on the order records are declared. The record `order` attribute can still be used to override a particular section of the stream. When set to `true`, BeanIO will also calculate and enforce record lengths based on configured fields and their occurrences. The record `minLength` and `maxLength` attributes can still be used to override BeanIO defaults. Defaults to `false`.	No	*
`minOccurs`	The minimum number of times the record layout must be read from an input stream. Defaults to `0`.	No	*
`maxOccurs`	The maximum number of times the record layout can repeat when read from an input stream. Defaults to `1`.	No	*
`occurs`	An alternative to specifying both `minOccurs` and `maxOccurs` that uses range notation.	No	*
`ignoreUnidentified Records`	If set to true, BeanIO will skip records that cannot be identified, otherwise an `UnidentifiedRecordException` is thrown. This feature is not recommended for use with record groups, since a record sequencing error could cause large portions of a stream to go unprocessed without any exception.	No	*
`xmlType`	The XML node type mapped to the stream. If not specified or set to `element`, the stream is mapped to the root element of the XML document being marshalled or unmarshalled. If set to `none`, the XML input stream will be fully read and mapped to a child `group` or `record`.	No	xml
`xmlName`	The local name of the XML element mapped to the stream. Defaults to the stream name.	No	xml
`xmlNamespace`	The namespace of the XML element mapped to the stream. Defaults to '*' which will ignore namespaces while marshalling and unmarshalling.	No	xml
`xmlPrefix`	The namespace prefix assigned to the declared `xmlNamespace` for marshalling XML. If not specified, the default namespace (i.e. `xmlns="..."`) is used.	No	xml

Children: parser?, typeHandler*, ( record | group )+

A.8. `parser`

A parser element is used to customize or replace the default record parser factory for a stream.

Attributes:

Attribute Description Required

Attribute	Description	Required
`class`	The fully qualified class name of the `org.beanio.stream.RecordParserFactory` implementation to use for this stream. If not specified, one of the following default factories is used based on the stream format: csv - `org.beanio.stream.csv.CsvRecordParserFactory` delimited - `org.beanio.stream.delimited.DelimitedRecordParserFactory` fixedlength - `org.beanio.stream.fixedlength.FixedLengthRecordParserFactory` xml - `org.beanio.stream.xml.XmlRecordParserFactory` Overriding the record parser factory for XML is not supported (but also not prevented).	No

class

The fully qualified class name of the org.beanio.stream.RecordParserFactory implementation to use for this stream. If not specified, one of the following default factories is used based on the stream format:

csv - org.beanio.stream.csv.CsvRecordParserFactory
delimited - org.beanio.stream.delimited.DelimitedRecordParserFactory
fixedlength - org.beanio.stream.fixedlength.FixedLengthRecordParserFactory
xml - org.beanio.stream.xml.XmlRecordParserFactory

Overriding the record parser factory for XML is not supported (but also not prevented).

Children: property*

A.9. `group`

A group element is used to group records together for validating occurrences of the group as a whole.

Attributes:

Attribute	Description	Required	Format(s)
`name`	The name of the group.	Yes	*
`class`	The fully qualified class name of the bean object mapped to this group. A `class` may be bound to a group when its marshalled form spans multiple consecutive records. During umarshalling, if any record in the group fails validation, an `InvalidRecordGroupException` is thrown.	No	*
`value target`	The name of a child component (typically a record) to return in lieu of an assigned class. There can be only one iteration of the named value. For example, if a repeating segment bound to a collection contains a repeating field (also bound to a collection), the segment can be targeted, but the field cannot.	No	*
`collection`	The collection type for repeating groups bound to a parent bean object (configured on a `group`). The value may be set to any fully qualified class name assignable to `java.util.Collection`, or to one of the collection type aliases: `list`, `set` or `array`. A collection type can only be set if `class` is also set. BeanIO will not derive the collection type from it's parent bean object.	No	*
`getter`	The getter method used to get the bean object bound to this group from it's parent. By default, the getter method is determined through introspection using the group name. Ignored if `class` is not set.	No	*
`setter`	The setter method used to set the bean object bound to this group on the bean object of it's parent. By default, the setter method is determined through introspection using the group name. Ignored if `class` is not set.	No	*
`order`	The order this group must appear within its parent group or stream. If `strict` is set to true at the stream level, `order` will default to the order assigned to its preceding sibling plus one (i.e. the record or group that shares the same parent), or 1 if this group is the first child in its parent group or stream. If `strict` is false, defaults to 1. If `order` is explicitly set for one group, it must be set for all other siblings that share the same parent.	No	*
`minOccurs`	The minimum number of occurences of this group within its parent group or stream. Defaults to 1.	No	*
`maxOccurs`	The maximum number of occurences of this group within its parent group or stream. Defaults to `unbounded`.	No	*
`occurs`	An alternative to specifying both `minOccurs` and `maxOccurs` that uses range notation.	No	*
`xmlType`	The XML node type mapped to this group. If not specified or set to `element`, this group is mapped to an XML element. When set to `none`, this group is used only to define expected record sequencing.	No	xml
`xmlName`	The local name of the XML element mapped to this group. Defaults to the group name.	No	xml
`xmlNamespace`	The namespace of the XML element mapped to this group. Defaults to the namespace declared for the parent stream or group definition.	No	xml
`xmlPrefix`	The namespace prefix assigned to the declared `xmlNamespace` for marshalling XML. If not specified, the default namespace is used (i.e. `xmlns="..."`).	No	xml

Children: record*

A.10. `record`

A record is used to define a record mapping within a stream.

Attributes:

Attribute	Description	Required	Format(s)
`name`	The name of the record.	Yes	*
`class`	The fully qualified class name of the bean object mapped to this record. If set to `map` or any `java.util.Map` implementation, a Map object will be used with field names for keys and field values for values. If set to `list`, `set`, `collection`, or any `java.util.Collection` implementation, child fields are added to the declared collection, including null values for missing or null fields. If neither `class` or `target` is set, a `BeanReader` will fully validate the record, but no bean object will be returned and the reader will continue reading the next record.	No	*
`value target`	The name of a child segment or field to return in lieu of an assigned class. There can be only one iteration of a named value. For example, if a repeating segment bound to a collection contains a repeating field (also bound to a collection), the segment can be targeted, but the field cannot. If neither `class` or `value` is set, a `BeanReader` will fully validate the record, but no bean object will be returned and the reader will continue reading the next record.	No	*
`getter`	The getter method used to get the bean object bound to this record from it's parent. By default, the getter method is determined through introspection using the record name. Ignored if `class` is not set.	No	*
`setter`	The setter method used to set the bean object bound to this record on the bean object of it's parent. By default, the setter method is determined through introspection using the record name. Ignored if `class` is not set.	No	*
`collection`	The collection type for repeating records bound to a parent bean object (configured on a `group`). The value may be set to any fully qualified class name assignable to `java.util.Collection` or `java.util.Map`, or to one of the collection type aliases: `list`, `set`, `map` or `array`. A collection type can only be set if `class` or `target` is also set. BeanIO will not derive the collection type from it's parent bean object.	No	*
`key`	The name of a descendant field to use for the Map key when `collection` is assignable to a `java.util.Map`.	No	*
`order`	The order this record must appear within its parent group or stream. If `strict` is set to true at the stream level, `order` will default to the order assigned to its preceding sibling plus one (i.e. the record or group that shares the same parent), or 1 if this record is the first child in its parent group or stream. If `strict` is false, defaults to 1. If `order` is explicitly set for one record, it must be set for all other siblings that share the same parent.	No	*
`minOccurs`	The minimum number of occurences of this record within its parent group or stream. Defaults to 0.	No	*
`maxOccurs`	The maximum number of occurrences of this record within its parent group or stream. Defaults to `unbounded`.	No	*
`occurs`	An alternative to specifying both `minOccurs` and `maxOccurs` that uses range notation.	No	*
`lazy`	If set to `true`, the class or collection bound to this record will only be instantiated if at least one child attribute is not null or the empty String. Defaults to `false`. [Only applies if this record is bound to an attribute of a parent group.]	No	*
`template`	The name of the template to include. The template is added to the record layout before any child of this record.	No	*
`ridLength`	The expected length of this record for identifying it. The value uses range notation. If the stream format is `delimited` or `csv`, record length is measured by number of fields. If the stream format is `fixedlength`, record length is measured in characters.	No	csv, delimited, fixedlength
`minLength`	If the stream format is `delimited` or `csv`, `minLength` is the minimum number of fields required by this record. If `strict` is true, defaults to the number of fields defined for the record, otherwise 0. If the stream format is `fixedlength`, `minLength` is the minimum number of characters required by this record. If `strict` is true, defaults to the sum of all field lengths definied for the record, otherwise 0.	No	csv, delimited, fixedlength
`maxLength`	If the stream format is `delimited` or `csv`, `maxLength` is the maximum number of fields allowed by this record. If `strict` is true, defaults to the number of fields defined for the record, or if no fields are declared or `strict` is false, then `unbounded`. If the stream format is `fixedlength`, `maxLength` is the maximum number of characters allowed by this record. If `strict` is true, defaults to the sum of all field lengths defined for the record, or if no fields are declared or `strict` is false, then `unbounded`.	No	csv, delimited, fixedlength
`xmlName`	The local name of the XML element mapped to this record. Defaults to the record name.	No	xml
`xmlNamespace`	The namespace of the XML element mapped to this record. Defaults to the namespace declared for this record's parent group or stream.	No	xml
`xmlPrefix`	The namespace prefix assigned to the declared `xmlNamespace` for marshalling XML. If not specified, the default namespace is used (i.e. `xmlns="..."`).	No	xml

Children: ( field | property | segment | include )*

A.11. `segment`

A segment is used to bind groups of fields to a nested bean object, or to validate repeating groups of fields, or in an XML formatted stream, to wrap one or more fields in an element.

Attributes:

Attribute	Description	Required	Format(s)
`name`	The name of the segment. If the segment is bound to a bean object, the segment name is used for the name of the bean property unless a `getter` or `setter` is set.	Yes	*
`class`	The fully qualified class name of the bean object bound to this segment. If set to `map` or any `java.util.Map` implementation, a Map object will be used with field/segment names for keys and field/segment values for values.	No	*
`value` ~~`target`~~	The name of a child segment or field to return in lieu of an assigned class. If set, all other descendants are not bound to the parent bean property.	No	*
`getter`	The getter method used to get the bean object bound to this segment from it's parent. By default, the getter method is determined through introspection using the segment name. Ignored if `class` is not set.	No	*
`setter`	The setter method used to set the bean object bound to this segment on the bean object of it's parent. By default, the setter method is determined through introspection using the segment name. Ignored if `class` is not set.	No	*
`collection`	The collection type for repeating segments bound to a parent bean object. The value may be set to any fully qualified class name assignable to `java.util.Collection` or `java.util.Map`, or to one of the collection type aliases: `list`, `set`, `map` or `array`. A collection type can only be set if `class` or `target` is also set. BeanIO will not derive the collection type from it's parent bean object. There are a few restrictions specific to repeating segments in any "flat" format (delimited, CSV or fixedlength): Repeating segments must appear in the stream consecutively. A repeating segment cannot contain repeating fields or segments where the length is indeterminate (i.e. where `minOccurs` does not match `maxOccurs`). Repeating segments must fully declare all child fields- there can be no gaps in the definition. (However, you can still skip over unbound values using the `ignore` field attribute.)	No	*
`key`	The name of a descendant field to use for the Map key when `collection` is assignable to a `java.util.Map`.	No	*
`minOccurs`	The minimum consecutive occurrences of this segment. Defaults to 1. If `minOccurs` is 0, a null bean object bound to this segment will not be marshalled (unless a subsequent field is marshalled for CSV, delimited and fixed legnth stream formats) During unmarshalling, if the configured minimum occurrences is not met, an `InvalidRecordException` is thrown.	No	*
`maxOccurs`	The maximum consecutive occurrences of this segment. By default, `maxOccurs` is set to `minOccurs` or 1, whichever is greater. If there is no limit to the number of occurrences, the value may be set to `unbounded`. If set for a CSV, delimited or fixed length stream, the value can only exceed `minOccurs` if the segment appears at the end of a record. Maximum occurrences is not used for validation. If bounded, the size of a bound collection will not exceed the configured value, and additional occurrences are ignored.	No	*
`occurs`	An alternative to specifying both `minOccurs` and `maxOccurs` that uses range notation.	No	*
`occursRef`	The name of a preceding field in the same record that controls the number of occurrences of this segment. If the controlling field is not bound to a separate property (i.e. `ignore="true"`), its automatically set based on the size of the segment collection during marshalling.	No	csv, delimited, fixedlength
`lazy`	If set to `true`, the class or collection bound to this segment will only be instantiated if at least one child attribute is not null or the empty String. Defaults to `false`. This functionality differs from `minOccurs` in that the fields may still exist in the input stream.	No	*
`template`	The name of the template to include. The template is added to the layout before any child of this segment.	No	*
`xmlType`	The XML node type mapped to this segment. If not specified or set to `element`, this bean is mapped to an XML element. If set to `none`, children of this segment are expected to be contained by this segment's parent.	No	xml
`xmlName`	The local name of the XML element mapped to this segment. Defaults to the segment name.	No	xml
`xmlNamespace`	The namespace of the XML element mapped to this segmnet. Defaults to the namespace declared for the parent `record` or `segmnet`.	No	xml
`xmlPrefix`	The namespace prefix assigned to the declared `xmlNamespace` for marshalling XML. If not specified, the default namespace is used (i.e. `xmlns="..."`).	No	xml
`nillable`	Set to `true` if the W3C Schema Instance attribute `nil` should be set to true when marshalling a null bean object. Defaults to `false`. During unmarshalling, a nillable element will cause an `InvalidRecordException` if `nillable` is false..	No	xml

Children: ( field | property | segment | include )*

A.12. `field`

A field element is used to bind a field belonging to a record or segment to a bean property.

Attributes:

Attribute	Description	Required	Formats
`name`	The name of field. Unless a getter and/or setter is defined, the field name is used for the bean property name.	Yes	*
`getter`	The getter method used to retrieve the property value for this field from its parent bean class. By default, the getter method is determined through introspection using the field name.	No	*
`setter`	The setter method used to set the property value of this field on its parent bean class. By default, the setter method is determined through introspection using the field name. If the field is a constructor argument, `setter` may be set to '#N', where N is the position of the argumnet in the constructor beginning at 1.	No	*
`rid`	Record identifier indicator. Set to `true` if this field is used to identify a record. More than one field can be used to identify a record. Defaults to `false`. Record identifying fields must have `regex` or `literal` configured to match a record, unless the field is a named XML element or attribute.	No	*
`at` `position`	For delimited and CSV formatted streams, `position` (or `at`) is the index of the field within the record, beginning at 0. And for fixed length formatted streams, `position` is the index of the first character of the field within the record, beginning at 0. Negative numbers can be used to indicate the position is relative to the end of the record. For example, the position -2 indicates the second to last field in a delimited record. If the field repeats, or the field belongs to a segment that repeats, `position` should be set based on the first occurrence of the field in a record. A position must be specified for all fields in a record, or for none at all. If positions are not specified, BeanIO will automatically calculate field positions based on the order in which the fields are defined in the mapping file. Position, if defined, is also used in XML formatted streams for ordering fields within their parent record or segment. This is typically not needed when using a mapping file, but can be useful when using annotations. If a position is configured for a parent segment (with annotations), the positions declared for fields added to the segment are assumed to be relative to their parent.	No	*
`until`	The maximum position of the field in the record. Only applies to fields that repeat where the number of occurrences is indeterminate (i.e. `maxOccurs` is greater than `minOccurs`). `until` must always be specified relative to the end of the record, and is therefore always a negative number.	No	csv, delimited, fixedlength
`trim`	Set to `true` to trim the field text before validation and type conversion. Defaults to `false`.	No	*
`lazy`	Set to true to convert empty field text to null before type conversion. For repeating fields bound to a collection, the collection will not be created if all field values are null or the empty String. Defaults to false.	No	*
`required`	Set to `true` if this field is required. If a field is required and its field text is empty, a `BeanReader` will throw an `InvalidRecordException` when reading the record. Defaults to `false`.	No	*
`minLength`	The minimum length of the field text before type conversion. Minimum length is only validated if the field length is greater than 0. Defaults to 0.	No	*
`maxLength`	The maximum length of the field text before type conversion. Defaults to `unbounded`.	No	*
`regex`	The regular expression pattern the field text must match.	No	*
`literal`	Sets the literal or constant value of this field. When unmarshalled, an `InvalidRecordException` is thrown if the field text does not exactly match the literal value.	No	*
`default`	The default value of this field. When unmarshalling a stream, this value is set on the bean object when the field text is null or the empty string. And when marshalling, the default value is used when the property value is null or ignore is set to `true` (unless disabled). A default value is converted to a Java object using the same type handler configured for the field.	No	*
`type`	The fully qualified class name or type alias of the field value. By default, BeanIO will derive the field type from its parent bean class. This attribute can be used to override the default, or may be needed when its parent class is of type `java.util.Map`.	No	*
`collection`	If a repeating field is bound to a collection object, `collection` is the fully qualified class name of the `java.util.Collection` implementation, or a collection type alias. When a `collection` is configured, the `type` attribute is used to declare the property type of an item stored in the collection. May be set to `array` if the collection type is a Java array. Repeating fields bound to a property value must have `collection` configured. BeanIO will not derive the collection type from a field's parent bean class.	No	*
`minOccurs`	The minimum consecutive occurrences of this field in a record. Defaults to 1, with one exception: a field in an XML formatted stream bound to an attribute defaults to 0. `minOccurs` controls whether a field is marshalled for a null field value, and whether the field must be present during unmarshalling. If `minOccurs` is 1 or greater and the field is not present during unmarshalling, an `InvalidRecordException` is thrown.	No	*
`maxOccurs`	The maximum consecutive occurrences of this field in a record. By default, `maxOccurs` is set to `minOccurs` or 1, whichever is greater. If overridden for a non-XML stream format, the value can only exceed `minOccurs` if this is the last field in the record. The value may be set to `unbounded` if there is no limit to the number of occurrences of this field. Maximum occurrences is not used for validation. When bounded, the size of a collection will not exceed the configured value, and additional occurrences are ignored.	No	*
`occurs`	An alternative to specifying both `minOccurs` and `maxOccurs` that uses range notation.	No	*
`occursRef`	The name of a preceding field in the same record that controls the number of occurrences of this field. If the controlling field is not bound to a separate property (i.e. `ignore="true"`), its automatically set based on the size of the field collection during marshalling.	No	csv, delimited, fixedlength
`format`	The decimal format pattern for `java.lang.Number` field values, or the simple date format pattern for `java.util.Date` field properties. The `format` value can also be accessed by any custom type handler that implements `org.beanio.types.ConfigurableTypeHandler`.	No	*
`typeHandler`	The name of the type handler to use for type conversion. By default, BeanIO will select a type handler based on the field type when set, or through introspection of this field's parent bean class.	No	*
`ignore`	Set to `true` if this field is not a property of it's parent bean class. Defaults to `false`. Note that any configured validation rule on an ignored field is still performed.	No	*
`length`	The padded length of this field measured in characters. Length is required for fixed length formatted streams, and can be set for fields in other stream formats (along with a padding character) to enable field padding. The length of the last field in a fixed length record may be set to `unbounded` to disable padding and allow a single variable length field at the end of the otherwise fixed length record.	Yes¹	`*`
`padding`	The character used to pad this field. For fixed length formatted streams, `padding` defaults to a space. For non-fixed length formatted streams, padding is disabled unless a padding character and length are specified. If padding is enabled, the `required` field attribute has some control over the marshalling and unmarshalling of null values. When unmarshalling a field consisting of all spaces in a fixed length stream, if `required` is false, the field is accepted regardless of the padding character. If `required` is true, a required field validation error is triggered. And when marshalling a null field value, if `required` is false, the field text is formatted as spaces regardless of the configured padding character. In other stream formats that are not fixed length, null field values are unmarshalled and marshalled as empty strings when `required` is false. When `required` is true, unmarshalling an empty string will trigger a required field validation error, and marshalling a null value will fill the field text with the padding character up to the padded length of the field.	No	`*`
`keepPadding`	Set to `true` if field padding should not be removed when unmarshalling a fixed length field. Defaults to `false`.	No	fixedlength
`lenientPadding`	Set to `true` to disable enforcement of the padded field length when unmarshalling a fixed length field. Defaults to `false`.	No	fixedlength
`justify align`	The justification (i.e. alignment) of the field text within its padding. Either `left` or `right`. Defaults to `left`.	No	`*`
`xmlType`	The XML node type mapped to this field. The type can be set to `element` (default) to map this field to an XML element, `attribute` to map to an XML attribute, or `text` to map the field value to the enclosed text of it's parent record or segment. When set to `text`, `xmlName` and `xmlNamespace` have no effect.	No	xml
`xmlName`	The local name of the XML element or attribute mapped to this field. Defaults to the field name.	No	xml
`xmlNamespace`	The namespace of the XML element mapped to this field. Defaults to the namespace configured for it's immediate parent record or segment.	No	xml
`xmlPrefix`	The namespace prefix assigned to the configured `xmlNamespace` for marshalling XML. If not specified, the default namespace (i.e. `xmlns="..."`) is used.	No	xml
`nillable`	Set to `true` if the W3C Schema Instance attribute `nil` should be set to true when the marshalled field value is null. Defaults to `false`. Unmarshalling a non-nillalbe field where `nil="true"` will cause an `InvalidRecordException`.	No	xml

¹Only required for fixed length fields. If a literal value is supplied for a fixed length field, length will default to the length of the literal value.

Appendix B: Error Message Parameters

The following table shows the message parameters used to format an error message for each configurable validation rule.

Type	Rule Name	Index	Value
Record Error	`malformed`	0	Line Number
	`unidentified`	0	Line Number
	`unexpected`	0	Line Number
	`unexpected`	1	Record Label/Name
	`minLength`	0	Line Number
		1	Record Label/Name
		2	Minimum Length
		3	Maximum Length
	`maxLength`	0	Line Number
		1	Record Label/Name
		2	Minimum Length
		3	Maximum Length
Field Error	`required`	0	Line Number
		1	Record Label/Name
		2	Field Label/Name
		3	Field Text
	`nillable`	0	Line Number
		1	Record Label/Name
		2	Field Label/Name
		3	Field Text
	`minLength`	0	Line Number
		1	Record Label/Name
		2	Field Label/Name
		3	Field Text
		4	Minimum Length
		5	Maximum Length
	`maxLength`	0	Line Number
		1	Record Label/Name
		2	Field Label/Name
		3	Field Text
		4	Minimum Length
		5	Maximum Length
	`length`	0	Line Number
		1	Record Label/Name
		2	Field Label/Name
		3	Field Text
		4	Fixed Length Field Length
	`regex`	0	Line Number
		1	Record Label/Name
		2	Field Label/Name
		3	Field Text
		4	Regular Expression Pattern
	`type`	0	Line Number
		1	Record Label/Name
		2	Field Label/Name
		3	Field Text
		4	`TypeConversionException` error message.
	`literal`	0	Line Number
		1	Record Label/Name
		2	Field Label/Name
		3	Field Text
		4	Literal value
	`minOccurs`	0	Line Number
		1	Record Label/Name
		2	Field or Bean Label/Name
		3	-
		4	Minimum occurrences
		5	Maximum occurences
	`maxOccurs`	0	Line Number
		1	Record Label/Name
		2	Field or Bean Label/Name
		3	-
		4	Minimum occurrences
		5	Maximum occurences

Appendix C: Upgrading a 1.x Mapping File Example

This appendix illustrates typical changes required to update an 1.x mapping file to 2.x.

Given the following 1.x mapping file:

<beanio xmlns="http://www.beanio.org/2011/01"
		xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
		xsi:schemaLocation="http://www.beanio.org/2011/01 http://www.beanio.org/2011/01/mapping.xsd">

	<stream name="employees" format="delimited">
		<reader>
			<property name="delimiter" value="," />
		</reader>
		<writer>
			<property name="delimiter" value="," />
		</writer>
		<record name="header" class="example.Header" maxOccurs="1">
			<field name="recordType" rid="true" literal="H" />
			<field name="fileDate" format="yyyy-MM-dd" />
		</record>
		<record name="employee" class="example.Employee" minOccurs="0" minLength="6" maxLength="7">
			<field name="recordType" rid="true" literal="D" />
			<field name="firstName" />
			<field name="lastName" />
			<bean name="address" class="example.Address" >
				<field name="city" />
				<field name="state" />
				<field name="zip" />
			</bean>
			<field name="phoneNumber" />
		</record>
		<record name="trailer" class="example.Trailer" maxOccurs="1">
			<field name="recordType" rid="true" literal="T" />
			<field name="recordCount" />
		</record>
	</stream>

	<stream name="contacts" format="xml" ordered="false">
		<record name="person" class="example.Person" minOccurs="0">
			<field name="firstName" />
			<field name="lastName" minOccurs="1" />
			<field name="phone" collection="list" minOccurs="0" maxOccurs="5" xmlWrapper="phoneList" />
		</record>
		<record name="company" class="example.Person" minOccurs="0">
			<field name="companyName" minOccurs="1" />
			<field name="phone" />
		</record>
	</stream>

</beanio>

The following 2.x mapping file can be created:

<!-- Namespace updated -->
<beanio xmlns="http://www.beanio.org/2012/03"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.beanio.org/2012/03 http://www.beanio.org/2012/03/mapping.xsd">

  <!-- Use 'strict' to have BeanIO calculate and enforce default record lengths and ordering -->
  <stream name="employees" format="delimited" strict="true">
    <!-- Combine 'reader' and 'writer' elements into 'parser' -->
    <parser>
      <property name="delimiter" value="," />
    </parser>
    <!-- 'minOccurs' defaults to 0 if not specified -->
    <record name="header" class="example.Header" minOccurs="1" maxOccurs="1">
      <field name="recordType" rid="true" literal="H" />
      <field name="fileDate" format="yyyy-MM-dd" />
    </record>
    <!-- 'minLength/maxLength' not needed, see 'phoneNumber' field below for explanation -->
    <record name="employee" class="example.Employee" minOccurs="0" minLength="6" maxLength="7">
      <field name="recordType" rid="true" literal="D" />
      <field name="firstName" />
      <field name="lastName" />
      <!-- Change 'bean' elements to 'segment' elements -->
      <segment name="address" class="example.Address" >
        <field name="city" />
        <field name="state" />
        <field name="zip" />
      </segment>
      <!-- Use 'minOccurs' to denote optional fields at the end of a record.  When used with
        -- 'strict', there is no need to set 'minLength' and 'maxLength' on the record, unless
        -- you are not mapping every field -->
      <field name="phoneNumber" minOccurs="0" />
    </record>
    <record name="trailer" class="example.Trailer" minOccurs="1" maxOccurs="1">
      <field name="recordType" rid="true" literal="T" />
      <field name="recordCount" />
    </record>
  </stream>

  <!-- Records are not ordered by default.  -->
  <stream name="contacts" format="xml" ordered="false">
    <!-- minOccurs defaults to 0 -->
    <record name="person" class="example.Person" minOccurs="0">
      <!-- Optional XML elements must set minOccurs to 0 -->
      <field name="firstName" minOccurs="0"/>
      <field name="lastName" minOccurs="1" />
      <!-- Use a 'segment' instead of an 'xmlWrapper' -->
      <segment name="phoneList" minOccurs="0">
        <field name="phone" collection="list" minOccurs="0" maxOccurs="5" xmlWrapper="phoneList" />
      </segment>
    </record>
    <record name="company" class="example.Person" minOccurs="0">
      <field name="companyName" minOccurs="1" />
      <!-- Optional XML elements must set minOccurs to 0 -->
      <field name="phone" minOccurs="0"/>
    </record>
  </stream>

</beanio>

1.0. Introduction​

1.1. What's new in 3.0?​

1.2. Migrating from 2.x to 3.0​

2.0. Getting Started​

2.1. My First Stream​

3.0. Core Concepts​

3.1. BeanReader​

3.2. BeanWriter​

3.3. Unmarshaller​

3.4. Marshaller​

3.5. Mapping Files​

3.6. StreamFactory​

3.7. Exception Handling​

3.7.1. BeanReaderErrorHandler​

4.0. Stream Components​

4.1. Streams​

4.1.1. CSV Streams​

4.1.2. Delimited Streams​

4.1.3. Fixed Length Streams​

4.1.4. XML Streams​

4.2. Records​

4.2.1. Record Identification​

4.2.2. Record Ordering​

4.2.3. Record Grouping​

4.3. Fields​

4.3.1. Field Type Conversion​

4.3.2. Custom Type Handlers​

4.3.3. Repeating Fields​

4.3.4. Fixed Length Fields​

4.4. Constants​

4.5. Segments​

4.5.1. Nested Beans​

4.5.2. Repeating Segments​

4.5.2.1. Inline Maps​

4.6. Stream Validation​

4.6.1. Record Validation​

4.6.2. Field Validation​

4.7. Templates​

4.8. Advanced Topics​

4.8.1. Mapping Bean Objects that Span Multiple Records​

5.0. Mapping XML Streams​

5.1. Introduction​

5.1.1. My First XML Stream​

5.1.2. A Note on XML Validation​

5.2. XML Names​

5.3. XML Namespaces​

5.4. Streams​

5.5. Groups​

5.6. Records​

5.7. Fields​

5.7.1. Field Type Conversion​

5.7.2. Marshalling Null Field Values​

5.8. Segments​

5.8.1. Nested Beans​

5.8.2. Wrapped Segments​

6.0. Annotations and the Stream Builder API​

6.1. The Stream Builder API​

6.2. Annotations​

8.0. Configuration​

8.1. Settings​

Appendix A: XML Mapping File Reference​

Ranges​

A.1. beanio​

A.2. import​

A.3. typeHandler​

A.4. property​

A.5. template​

A.6. include​

A.7. stream​

A.8. parser​

A.9. group​

A.10. record​

A.11. segment​

A.12. field​

Appendix B: Error Message Parameters​

Appendix C: Upgrading a 1.x Mapping File Example​

1.0. Introduction

1.1. What's new in 3.0?

1.2. Migrating from 2.x to 3.0

2.0. Getting Started

2.1. My First Stream

3.0. Core Concepts

3.1. BeanReader

3.2. BeanWriter

3.3. Unmarshaller

3.4. Marshaller

3.5. Mapping Files

3.6. StreamFactory

3.7. Exception Handling

3.7.1. BeanReaderErrorHandler

4.0. Stream Components

4.1. Streams

4.1.1. CSV Streams

4.1.2. Delimited Streams

4.1.3. Fixed Length Streams

4.1.4. XML Streams

4.2. Records

4.2.1. Record Identification

4.2.2. Record Ordering

4.2.3. Record Grouping

4.3. Fields

4.3.1. Field Type Conversion

4.3.2. Custom Type Handlers

4.3.3. Repeating Fields

4.3.4. Fixed Length Fields

4.4. Constants

4.5. Segments

4.5.1. Nested Beans

4.5.2. Repeating Segments

4.5.2.1. Inline Maps

4.6. Stream Validation

4.6.1. Record Validation

4.6.2. Field Validation

4.7. Templates

4.8. Advanced Topics

4.8.1. Mapping Bean Objects that Span Multiple Records

5.0. Mapping XML Streams

5.1. Introduction

5.1.1. My First XML Stream

5.1.2. A Note on XML Validation

5.2. XML Names

5.3. XML Namespaces

5.4. Streams

5.5. Groups

5.6. Records

5.7. Fields

5.7.1. Field Type Conversion

5.7.2. Marshalling Null Field Values

5.8. Segments

5.8.1. Nested Beans

5.8.2. Wrapped Segments

6.0. Annotations and the Stream Builder API

6.1. The Stream Builder API

6.2. Annotations

8.0. Configuration

8.1. Settings

Appendix A: XML Mapping File Reference

Ranges

A.1. `beanio`

A.2. `import`

A.3. `typeHandler`

A.4. `property`

A.5. `template`

A.6. `include`

A.7. `stream`

A.8. `parser`

A.9. `group`

A.10. `record`

A.11. `segment`

A.12. `field`

Appendix B: Error Message Parameters

Appendix C: Upgrading a 1.x Mapping File Example