01 Oct 11

The 3 R’s of Spring Batch


This page describes how you can Read, wRite and perform aRithmetic on flat files using the Spring Batch framework. We will take a comma-separated (CSV) file that contains employee information, add some information to it, and write it back to the file system.

A newer version of this page is available here.

Background

The basic building blocks of any batch process are

  1. Reading an Item
  2. Performing an operation on it
  3. Writing the Item back

Please take some time to review The Domain Language of Batch before proceeding. It covers many of the fundamental concepts we will be using here.

Batch Steps

This page is focused on an individual step of the batch process.

The following is from the Spring Batch documentation:

A Step is a domain object that encapsulates an independent, sequential phase of a batch job. Therefore, every Job is composed entirely of one or more steps. A Step contains all of the information necessary to define and control the actual batch processing. This is a necessarily vague description because the contents of any given Step are at the discretion of the developer writing a Job. A Step can be as simple or complex as the developer desires. A simple Step might load data from a file into the database, requiring little or no code. (depending upon the implementations used) A more complex Step may have complicated business rules that are applied as part of the processing.

Step Processing types

There are two ways a step can process data:

Tasklet

If a step only needs to execute a single task, you can use a tasklet. A typical use case is running a stored procedure or copying a file from one location to another. In the “Hello World” example we used a Tasklet to print the message to the console.

Chunk oriented

Chunk oriented processing involves specifying a reader, a processor and a writer. Items are read one at a time, in sequence, and passed to the processor. The processed items are collected into a chunk within a transaction boundary, and once the number of items reaches the commit interval the whole chunk is handed to the writer and the transaction is committed. Chunk oriented processing is what we will cover on this page.
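The read, process and write loop can be sketched in plain Java without any Spring classes. This is only an illustration of the idea, not how the framework is implemented: ChunkSketch, run and the use of java.util.function types are all assumptions made for this example, and no real transaction is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration only; ChunkSketch is not a Spring Batch class.
public class ChunkSketch {

    // Reads and processes items one at a time, then hands them to the writer
    // in chunks of commitInterval items. Returns the number of chunks written.
    public static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                                 Consumer<List<O>> writer, int commitInterval) {
        List<O> chunk = new ArrayList<>();
        int chunksWritten = 0;
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == commitInterval) {
                // A real step would commit its transaction after this write.
                writer.accept(new ArrayList<>(chunk));
                chunk.clear();
                chunksWritten++;
            }
        }
        if (!chunk.isEmpty()) {
            // The input ran out before the chunk filled up: write the remainder.
            writer.accept(new ArrayList<>(chunk));
            chunksWritten++;
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        Iterator<String> reader = Arrays.asList("a", "b", "c", "d", "e").iterator();
        Function<String, String> processor = s -> s.toUpperCase();
        Consumer<List<String>> writer = chunk -> System.out.println("writing " + chunk);
        int chunks = run(reader, processor, writer, 2);
        System.out.println(chunks + " chunks written");
    }
}
```

With a commit interval of 2 and five input items, the writer is called three times: twice with a full chunk and once with the single leftover item. The job on this page uses a commit interval of 1, so every item ends up in its own chunk.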

Requirements

Project Setup

We will be modifying an existing project so please review the articles listed in the section above.

Input Data

The following is the input CSV file that will be read. Please create it in the project's resources directory.

src/main/resources/input_data.txt

7876,ADAMS,CLERK,1100
7499,ALLEN,SALESMAN,1600
7698,BLAKE,MANAGER,2850
7782,CLARK,MANAGER,2450
7902,FORD,ANALYST,3000
7900,JAMES,CLERK,950
7566,JONES,MANAGER,2975
7839,KING,PRESIDENT,5000
7654,MARTIN,SALESMAN,1250
7934,MILLER,CLERK,1300
7788,SCOTT,ANALYST,3000
7369,SMITH,CLERK,800
7844,TURNER,SALESMAN,1500
7521,WARD,SALESMAN,1250

Employee Bean

This is a simple bean that represents a single Employee.

src/main/java/com/test/Employee.java

package com.test;

public class Employee {

	private Integer empId;
	private String lastName;
	private String title;
	private Integer salary;
	private String rank;
	
	public Integer getEmpId() {
		return empId;
	}
	public void setEmpId(Integer empId) {
		this.empId = empId;
	}
	public String getLastName() {
		return lastName;
	}
	public void setLastName(String lastName) {
		this.lastName = lastName;
	}
	public String getTitle() {
		return title;
	}
	public void setTitle(String title) {
		this.title = title;
	}
	public Integer getSalary() {
		return salary;
	}
	public void setSalary(Integer salary) {
		this.salary = salary;
	}
	public void setRank(String rank) {
		this.rank = rank;
	}
	public String getRank() {
		return rank;
	}
	@Override
	public String toString() {
		return "Employee [empId=" + empId + ", lastName=" + lastName
				+ ", title=" + title + ", salary=" + salary + ", rank=" + rank
				+ "]";
	}	
}

Reading

In order to read data from a file we only need to write a FieldSetMapper class that takes a FieldSet object and maps its contents into a bean.

src/main/java/com/test/EmployeeFieldSetMapper.java

package com.test;

import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

public class EmployeeFieldSetMapper implements FieldSetMapper<Employee> {

	public Employee mapFieldSet(FieldSet fieldSet) throws BindException {
		if(fieldSet == null) return null;
		
		Employee emp = new Employee();	
		// unlike jdbc the index is 0 based	
		emp.setEmpId(fieldSet.readInt(0)); 
		emp.setLastName(fieldSet.readString(1));
		emp.setTitle(fieldSet.readString(2));
		emp.setSalary(fieldSet.readInt(3));
		
		return emp;
	}

}
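To make the zero-based indexing concrete, here is a rough plain-Java sketch of what happens to one input line before the mapper sees it. It only approximates the tokenizer with String.split (no quoting or escaping support), and FieldIndexSketch and tokenize are hypothetical names used for illustration.

```java
// Hypothetical illustration only; not part of Spring Batch.
public class FieldIndexSketch {

    // Splits one comma-delimited line into its fields.
    // The -1 limit keeps trailing empty fields instead of dropping them.
    public static String[] tokenize(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String[] fields = tokenize("7876,ADAMS,CLERK,1100");

        // Index 0 is the first column, just like fieldSet.readInt(0) in the mapper.
        int empId = Integer.parseInt(fields[0]);
        String lastName = fields[1];
        String title = fields[2];
        int salary = Integer.parseInt(fields[3]);

        System.out.println(empId + " " + lastName + " " + title + " " + salary);
    }
}
```

Running it prints 7876 ADAMS CLERK 1100, the same four values the mapper copies into the Employee bean.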

Arithmetic

Not really! All we are doing is assigning a rank based on the salary amount. The item processor takes an input bean and converts it to an output bean. In this case the two beans are of the same type, but they don’t have to be.

src/main/java/com/test/EmployeeProcessor.java

package com.test;

import org.springframework.batch.item.ItemProcessor;

public class EmployeeProcessor implements ItemProcessor<Employee, Employee> {

	public Employee process(Employee emp) throws Exception {
		// if salary >= 2500 then set rank as "Director"		
		if(emp.getSalary() >= 2500 ) {
			emp.setRank("Director");			
		} else {
			emp.setRank("N/A");
		}
		return emp;
	}

}
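The rule is easy to check in isolation. The sketch below pulls the same comparison out into a standalone method so the boundary at 2500 can be tried without running a job; RankSketch and rankFor are hypothetical names, not part of the project above.

```java
// Hypothetical illustration only; mirrors the comparison in EmployeeProcessor.
public class RankSketch {

    // Salaries of 2500 and above are ranked "Director", everything else "N/A".
    public static String rankFor(int salary) {
        return salary >= 2500 ? "Director" : "N/A";
    }

    public static void main(String[] args) {
        // Salaries taken from the input file: CLARK, BLAKE and FORD.
        int[] salaries = {2450, 2850, 3000};
        for (int salary : salaries) {
            System.out.println(salary + " -> " + rankFor(salary));
        }
    }
}
```

Because the comparison is >=, a salary of exactly 2500 would also be ranked as Director.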

Writing

All the objects needed for writing the file are configured in XML. See the XML file below for more details.

Job Configuration

Modify the job XML file to define the new beans.

src/main/resources/simpleJob.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans:beans xmlns="http://www.springframework.org/schema/batch"
     xmlns:beans="http://www.springframework.org/schema/beans"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="
        http://www.springframework.org/schema/batch
        http://www.springframework.org/schema/batch/spring-batch-2.1.xsd
        http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans-2.0.xsd">
 
    <beans:import resource="applicationContext.xml"/>
 
<!-- Tokenizer - Converts a delimited string into a Set of Fields -->
<beans:bean name="defaultTokenizer" 
    class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer"/>

<!-- FieldSetMapper - Populates a bean's attributes using the FieldSet -->
<beans:bean name="employeeFieldSetMapper" class="com.test.EmployeeFieldSetMapper"/>

<!-- LineMapper - Uses the tokenizer and Mapper to create instances of a Bean. -->
<beans:bean name="employeeLineMapper" class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
    <beans:property name="lineTokenizer" ref="defaultTokenizer"/>    
    <beans:property name="fieldSetMapper" ref="employeeFieldSetMapper"/>        
</beans:bean>

<!-- Reader - used by the chunk oriented step to read one Item at a time from the input. -->
<beans:bean name="empReader" class="org.springframework.batch.item.file.FlatFileItemReader">
    <beans:property name="lineMapper" ref="employeeLineMapper"/>
    
    <!-- use spring integrations for the following, but for now filename is hard coded -->
    <beans:property name="resource" value="input_data.txt"/>    
</beans:bean>

<!-- Processor -->

<beans:bean name="empProcessor" class="com.test.EmployeeProcessor">
</beans:bean>

<!-- Writer -->
<beans:bean id="empWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
    <beans:property name="resource" value="file:target/output_data.txt" />
    <beans:property name="lineAggregator">
        <beans:bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
            <beans:property name="delimiter" value=","/>
            <beans:property name="fieldExtractor">
                <beans:bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                    <beans:property name="names" value="empId,lastName,title,salary,rank"/>
                </beans:bean>
            </beans:property>
        </beans:bean>
    </beans:property>
</beans:bean>

<job id="helloWorldJob">
    <step id="step1" next="step2">
        <tasklet ref="helloWorldTasklet" />
    </step>
    <step id="step2">
        <tasklet>
            <chunk reader="empReader" processor="empProcessor" writer="empWriter" commit-interval="1"/>
        </tasklet>
    </step>
</job>
 
<beans:bean name="helloWorldTasklet" class="com.test.HelloWorldTasklet"/>
 
<!--
To run the job from the command line type the following:
mvn clean compile exec:java -Dexec.mainClass=org.springframework.batch.core.launch.support.CommandLineJobRunner -Dexec.args="simpleJob.xml helloWorldJob"
 -->
</beans:beans>

Execute the job

Go to the command line and type the following:

mvn clean compile exec:java -Dexec.mainClass=org.springframework.batch.core.launch.support.CommandLineJobRunner -Dexec.args="simpleJob.xml helloWorldJob"

View the Results

The output file will appear in the target/ folder of the project.
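As a rough guide to what each output line should look like, the sketch below joins the five bean properties with a comma, in the order listed in the writer's names property. It is a plain-Java approximation of the writer configuration, not the framework's own code; OutputLineSketch and aggregate are hypothetical names.

```java
import java.util.StringJoiner;

// Hypothetical illustration only; approximates the delimited writer configuration.
public class OutputLineSketch {

    // Joins the given field values with a comma, in the order supplied.
    public static String aggregate(Object... fields) {
        StringJoiner joiner = new StringJoiner(",");
        for (Object field : fields) {
            joiner.add(String.valueOf(field));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Field order: empId, lastName, title, salary, rank.
        System.out.println(aggregate(7876, "ADAMS", "CLERK", 1100, "N/A"));
        System.out.println(aggregate(7698, "BLAKE", "MANAGER", 2850, "Director"));
    }
}
```

Each input record should therefore reappear with one extra column at the end, for example 7876,ADAMS,CLERK,1100,N/A for a salary below 2500 and 7698,BLAKE,MANAGER,2850,Director for one at or above it.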

Further Reading

To keep things simple we read and wrote files located in the project's own folders. There are many enterprise design patterns that describe best practices for feeding data into batch programs. For further reading on this topic please see the Spring Integration Framework Homepage.
