Archive for October 1st, 2011

01
Oct
11

Spring Batch Headers and Footers

This page describes the process of creating and running a Spring Batch job that creates an output file with header and footer information.

Background

In prior articles we created a batch job that read information and wrote the result to another file. The output file was missing header and footer lines. In this article we will look into generating this information.

Requirements

  • Java 5
  • Maven 2
  • This page picks up where the following article left off. Please review and implement that article first.

Header

The header of the output file will simply contain the file name and the timestamp.

fileName,Timestamp
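
As a small illustration of this layout, the sketch below builds the header line from a file name and a date. The class name HeaderLine, the fixed timestamp pattern and the UTC time zone are assumptions made for this example only; the writer shown later in this article simply relies on the default Date formatting.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper, not part of the project: formats "fileName,Timestamp".
class HeaderLine {

    static String format(String fileName, Date timestamp) {
        // A fixed pattern and time zone keep the output stable across machines.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fileName + "," + fmt.format(timestamp);
    }
}
```

A fixed pattern is easier for downstream programs to parse than the locale dependent text produced by Date.toString().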

Footer

The footer of the output file will be composed of 2 fields.

#ofRecords,salarySum

where:
#ofRecords – Count of the number of records in the output
salarySum – Sum of all values in the salary field
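
To make the two fields concrete, here is a minimal sketch that computes the footer line for a list of salaries. The FooterLine class is hypothetical and exists only for this illustration; the real totals are accumulated by the item writer shown in the Solution section.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds "#ofRecords,salarySum" from a list of salaries.
class FooterLine {

    static String format(List<Integer> salaries) {
        BigDecimal sum = BigDecimal.ZERO;
        for (Integer salary : salaries) {
            sum = sum.add(new BigDecimal(salary));
        }
        return salaries.size() + "," + sum;
    }

    public static void main(String[] args) {
        // Three records with salaries 1100, 1600 and 2850.
        System.out.println(format(Arrays.asList(1100, 1600, 2850)));
    }
}
```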

Solution

Create an ItemWriter that wraps the “FlatFileItemWriter”. The FlatFileItemWriter calls the objects registered in its “headerCallback” and “footerCallback” properties at the beginning and end of the file. This lets developers write their own information to the header and footer of the output file.
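
The order of events can be modelled without the framework. The sketch below uses two simplified stand-in interfaces (HeaderCallback and FooterCallback are hypothetical names, not the Spring Batch types) to show that the header is written once before the records and the footer once after them.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT the Spring Batch types.
interface HeaderCallback { void writeHeader(Writer writer) throws IOException; }
interface FooterCallback { void writeFooter(Writer writer) throws IOException; }

class CallbackOrderDemo {

    // Mimics the order of events: header once, then every line, then footer once.
    static String writeFile(List<String> lines, HeaderCallback header, FooterCallback footer)
            throws IOException {
        Writer out = new StringWriter();
        header.writeHeader(out);
        out.write("\n");
        for (String line : lines) {
            out.write(line + "\n");
        }
        footer.writeFooter(out);
        out.write("\n");
        return out.toString();
    }

    static String demo() {
        final List<String> lines = Arrays.asList("7876,ADAMS", "7499,ALLEN");
        try {
            return writeFile(lines,
                    new HeaderCallback() {
                        public void writeHeader(Writer w) throws IOException {
                            w.write("HEADER");
                        }
                    },
                    new FooterCallback() {
                        public void writeFooter(Writer w) throws IOException {
                            w.write("" + lines.size());
                        }
                    });
        } catch (IOException e) {
            // A StringWriter does not fail in practice; rethrow to keep the demo simple.
            throw new IllegalStateException(e);
        }
    }
}
```

Because the footer callback runs last, it can report values (such as a record count) that are only known after all records have been written.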

src/main/java/com/test/EmployeeItemWriter.java

package com.test;

import java.io.IOException;
import java.io.Writer;
import java.math.BigDecimal;
import java.util.Date;
import java.util.List;

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemStream;
import org.springframework.batch.item.ItemStreamException;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.FlatFileFooterCallback;
import org.springframework.batch.item.file.FlatFileHeaderCallback;
import org.springframework.batch.item.file.FlatFileItemWriter;

public class EmployeeItemWriter implements ItemWriter<Employee>,
        FlatFileFooterCallback, FlatFileHeaderCallback, ItemStream {

    // The wrapped writer that performs the actual file output.
    private FlatFileItemWriter<Employee> delegate;

    // Running totals used to build the footer.
    private BigDecimal totalAmount = BigDecimal.ZERO;
    private int recordCount = 0;

    // Footer callback: writes "#ofRecords,salarySum".
    public void writeFooter(Writer writer) throws IOException {
        writer.write(recordCount + "," + totalAmount);
    }

    // Header callback: writes "fileName,Timestamp" (the file name is hard coded for now).
    public void writeHeader(Writer writer) throws IOException {
        writer.write("output_file.txt" + "," + new Date());
    }

    public void write(List<? extends Employee> items) throws Exception {
        BigDecimal chunkTotal = BigDecimal.ZERO;
        int chunkRecord = 0;
        for (Employee employee : items) {
            chunkRecord++;
            chunkTotal = chunkTotal.add(new BigDecimal(employee.getSalary()));
        }
        delegate.write(items);
        // Update the running totals only after all items were written successfully.
        totalAmount = totalAmount.add(chunkTotal);
        recordCount += chunkRecord;
    }

    public void setDelegate(FlatFileItemWriter<Employee> delegate) {
        this.delegate = delegate;
    }

    // The ItemStream methods simply forward to the wrapped writer.
    public void close() throws ItemStreamException {
        this.delegate.close();
    }

    public void open(ExecutionContext executionContext) throws ItemStreamException {
        this.delegate.open(executionContext);
    }

    public void update(ExecutionContext executionContext) throws ItemStreamException {
        this.delegate.update(executionContext);
    }
}
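
The write method above adds the chunk totals to the running totals only after delegate.write returns. The following framework-free sketch illustrates why that ordering matters: a chunk that fails to write is not counted. SimpleWriter and RunningTotalWriter are hypothetical names used only for this illustration, and the failure is simulated.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for the wrapped writer; NOT the Spring Batch interface.
interface SimpleWriter { void write(List<Integer> salaries) throws Exception; }

class RunningTotalWriter implements SimpleWriter {

    private final SimpleWriter delegate;
    private long total = 0;
    private int count = 0;

    RunningTotalWriter(SimpleWriter delegate) { this.delegate = delegate; }

    public void write(List<Integer> salaries) throws Exception {
        long chunkTotal = 0;
        for (Integer salary : salaries) {
            chunkTotal += salary;
        }
        delegate.write(salaries);   // may throw; the totals below are then left untouched
        total += chunkTotal;
        count += salaries.size();
    }

    String footer() { return count + "," + total; }

    static String demo() {
        RunningTotalWriter writer = new RunningTotalWriter(new SimpleWriter() {
            public void write(List<Integer> salaries) throws Exception {
                // Simulated failure: any chunk containing -1 cannot be written.
                if (salaries.contains(-1)) throw new Exception("simulated write failure");
            }
        });
        try {
            writer.write(Arrays.asList(1100, 1600)); // succeeds
            writer.write(Arrays.asList(-1));          // fails, must not be counted
        } catch (Exception expected) {
            // Ignored for the purpose of the demo.
        }
        return writer.footer();
    }
}
```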

Batch XML Configuration

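The following XML allows the ItemWriter to wrap the FlatFileItemWriter and maintain a running total.

The changes from the prior version are the “footerCallback” and “headerCallback” properties on the empWriter bean, the new empHeaderFooterWriter bean, and the writer attribute of the chunk element, which now refers to empHeaderFooterWriter.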

src/main/resources/simpleJob.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans:beans xmlns="http://www.springframework.org/schema/batch"
     xmlns:beans="http://www.springframework.org/schema/beans"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="
        http://www.springframework.org/schema/batch
        http://www.springframework.org/schema/batch/spring-batch-2.1.xsd
        http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans-2.0.xsd">
 
    <beans:import resource="applicationContext.xml"/>
 
<!-- Tokenizer - Converts a delimited string into a Set of Fields -->
<beans:bean name="defaultTokenizer" 
    class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer"/>

<!-- FieldSetMapper - Populates a bean's attributes using the FieldSet -->
<beans:bean name="employeeFieldSetMapper" class="com.test.EmployeeFieldSetMapper"/>

<!-- LineMapper - Uses the tokenizer and Mapper to create instances of a Bean. -->
<beans:bean name="employeeLineMapper" class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
    <beans:property name="lineTokenizer" ref="defaultTokenizer"/>    
    <beans:property name="fieldSetMapper" ref="employeeFieldSetMapper"/>        
</beans:bean>

<!-- Reader - used by the tasklet to process one Item from the input. -->
<beans:bean name="empReader" class="org.springframework.batch.item.file.FlatFileItemReader">
    <beans:property name="lineMapper" ref="employeeLineMapper"/>
    
    <!-- use spring integrations for the following, but for now filename is hard coded -->
    <beans:property name="resource" value="input_data.txt"/>    
</beans:bean>

<!-- Processor -->

<beans:bean name="empProcessor" class="com.test.EmployeeProcessor">
</beans:bean>

<!-- Writer -->
<beans:bean id="empWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
    <beans:property name="resource" value="file:target/output_data.txt" />
    <beans:property name="lineAggregator">
        <beans:bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
            <beans:property name="delimiter" value=","/>
            <beans:property name="fieldExtractor">
                <beans:bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                    <beans:property name="names" value="empId,lastName,title,salary,rank"/>
                </beans:bean>
            </beans:property>
        </beans:bean>
    </beans:property>
    <beans:property name="footerCallback" ref="empHeaderFooterWriter" />
    <beans:property name="headerCallback" ref="empHeaderFooterWriter" />
</beans:bean>

<beans:bean id="empHeaderFooterWriter" class="com.test.EmployeeItemWriter">
    <beans:property name="delegate" ref="empWriter"/>
</beans:bean>

<job id="helloWorldJob">
    <step id="step1" next="step2">
        <tasklet ref="helloWorldTasklet" />
    </step>
    <step id="step2">
        <tasklet>
            <chunk reader="empReader" processor="empProcessor" writer="empHeaderFooterWriter" commit-interval="1"/>
        </tasklet>
    </step>
</job>
 
<beans:bean name="helloWorldTasklet" class="com.test.HelloWorldTasklet"/>
 
<!--
To run the job from the command line type the following:
mvn clean compile exec:java -Dexec.mainClass=org.springframework.batch.core.launch.support.CommandLineJobRunner -Dexec.args="simpleJob.xml helloWorldJob"
 -->
</beans:beans>

Execute the job

Go to the command line and type the following:

mvn clean compile exec:java -Dexec.mainClass=org.springframework.batch.core.launch.support.CommandLineJobRunner -Dexec.args="simpleJob.xml helloWorldJob"

View the results

The output file, output_data.txt, will appear in the target/ folder of the project.

The file should look like this:

output_file.txt,Sat Oct 01 22:50:17 EDT 2011
7876,ADAMS,CLERK,1100,N/A
7499,ALLEN,SALESMAN,1600,N/A
7698,BLAKE,MANAGER,2850,Director
7782,CLARK,MANAGER,2450,N/A
7902,FORD,ANALYST,3000,Director
7900,JAMES,CLERK,950,N/A
7566,JONES,MANAGER,2975,Director
7839,KING,PRESIDENT,5000,Director
7654,MARTIN,SALESMAN,1250,N/A
7934,MILLER,CLERK,1300,N/A
7788,SCOTT,ANALYST,3000,Director
7369,SMITH,CLERK,800,N/A
7844,TURNER,SALESMAN,1500,N/A
7521,WARD,SALESMAN,1250,N/A
14,29025
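
If you want to double check a generated file, the footer can be recomputed from the detail lines. The sketch below is a hypothetical helper (FooterCheck is not part of the project) and assumes the layout shown above: one header line, detail lines with the salary in the fourth column, and one footer line.

```java
import java.util.List;

// Hypothetical check: recomputes the footer from the detail lines of the output file.
class FooterCheck {

    // Expects: header line, detail lines with salary in the 4th column, footer line.
    static boolean footerMatches(List<String> lines) {
        int count = 0;
        long sum = 0;
        // Skip the first line (header) and the last line (footer).
        for (int i = 1; i < lines.size() - 1; i++) {
            String[] fields = lines.get(i).split(",");
            sum += Long.parseLong(fields[3]);
            count++;
        }
        String footer = lines.get(lines.size() - 1);
        return footer.equals(count + "," + sum);
    }
}
```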

That’s all for now!


The 3 R’s of Spring Batch

This page describes how you can Read, wRite and perform aRithmetic on flat files using the Spring Batch framework. We will take a comma separated (csv) file that contains employee information, add some information to it, and write it back to the file system.

Background

The basic building blocks of any batch process are:

  1. Reading an Item
  2. Performing an operation on it
  3. Writing the Item back

Please take some time to review The Domain Language of Batch before proceeding. It covers many of the fundamental concepts used here.

Batch Steps

This page is focused on an individual step of the batch process.

The following is from the Spring Batch documentation:

A Step is a domain object that encapsulates an independent, sequential phase of a batch job. Therefore, every Job is composed entirely of one or more steps. A Step contains all of the information necessary to define and control the actual batch processing. This is a necessarily vague description because the contents of any given Step are at the discretion of the developer writing a Job. A Step can be as simple or complex as the developer desires. A simple Step might load data from a file into the database, requiring little or no code. (depending upon the implementations used) A more complex Step may have complicated business rules that are applied as part of the processing.

Step Processing types

There are 2 ways a step can process data:

Tasklet

If a step only needs to execute a single task, you can use a tasklet. Typical use cases are running a stored procedure or copying a file from one location to another. In the “Hello World” example we used a tasklet to print a message to the console.

Chunk oriented

Chunk oriented processing involves specifying a reader, a processor and a writer. The input is read one item at a time, passed to the processor, and collected into a chunk. Once the number of items reaches the commit interval, the whole chunk is passed to the writer and the transaction is committed. Chunk oriented processing is what we will cover on this page.
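
The loop behind a chunk oriented step can be sketched in plain Java. This is a simplified model, not the Spring Batch implementation: ChunkLoopDemo is a hypothetical class, the "processor" just upper-cases each item, and the "writer" collects each chunk into a list where a real step would write it and commit the transaction.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified model of a chunk oriented step; not the Spring Batch implementation.
class ChunkLoopDemo {

    static List<List<String>> run(Iterator<String> reader, int commitInterval) {
        if (commitInterval < 1) {
            throw new IllegalArgumentException("commitInterval must be at least 1");
        }
        List<List<String>> written = new ArrayList<List<String>>();
        while (reader.hasNext()) {
            // Read items one at a time until the chunk is full or the input ends.
            List<String> items = new ArrayList<String>();
            while (items.size() < commitInterval && reader.hasNext()) {
                items.add(reader.next());
            }
            // Process each item of the chunk (here: upper-case it).
            List<String> processed = new ArrayList<String>();
            for (String item : items) {
                processed.add(item.toUpperCase());
            }
            // Write the whole chunk at once; a real step would commit here.
            written.add(processed);
        }
        return written;
    }
}
```

With three input items and a commit interval of 2, the writer is called twice: once with two items and once with the remaining one. The job configuration later on this page uses a commit interval of 1, so every item forms its own chunk.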

Requirements

Project Setup

We will be modifying an existing project so please review the articles listed in the section above.

Input Data

The following is the input csv file that will be read. Please create the following file in the project's resources directory.

src/main/resources/input_data.txt

7876,ADAMS,CLERK,1100
7499,ALLEN,SALESMAN,1600
7698,BLAKE,MANAGER,2850
7782,CLARK,MANAGER,2450
7902,FORD,ANALYST,3000
7900,JAMES,CLERK,950
7566,JONES,MANAGER,2975
7839,KING,PRESIDENT,5000
7654,MARTIN,SALESMAN,1250
7934,MILLER,CLERK,1300
7788,SCOTT,ANALYST,3000
7369,SMITH,CLERK,800
7844,TURNER,SALESMAN,1500
7521,WARD,SALESMAN,1250

Employee Bean

This is a simple bean that represents a single Employee.

src/main/java/com/test/Employee.java

package com.test;

public class Employee {

	private Integer empId;
	private String lastName;
	private String title;
	private Integer salary;
	private String rank;
	
	public Integer getEmpId() {
		return empId;
	}
	public void setEmpId(Integer empId) {
		this.empId = empId;
	}
	public String getLastName() {
		return lastName;
	}
	public void setLastName(String lastName) {
		this.lastName = lastName;
	}
	public String getTitle() {
		return title;
	}
	public void setTitle(String title) {
		this.title = title;
	}
	public Integer getSalary() {
		return salary;
	}
	public void setSalary(Integer salary) {
		this.salary = salary;
	}
	public void setRank(String rank) {
		this.rank = rank;
	}
	public String getRank() {
		return rank;
	}
	@Override
	public String toString() {
		return "Employee [empId=" + empId + ", lastName=" + lastName
				+ ", title=" + title + ", salary=" + salary + ", rank=" + rank
				+ "]";
	}	
}

Reading

In order to read data from a file we only need to write the FieldSetMapper class, which takes a FieldSet object and maps its contents into a bean.

src/main/java/com/test/EmployeeFieldSetMapper.java

package com.test;

import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

public class EmployeeFieldSetMapper implements FieldSetMapper<Employee> {

	public Employee mapFieldSet(FieldSet fieldSet) throws BindException {
		if(fieldSet == null) return null;
		
		Employee emp = new Employee();	
		// unlike jdbc the index is 0 based	
		emp.setEmpId(fieldSet.readInt(0)); 
		emp.setLastName(fieldSet.readString(1));
		emp.setTitle(fieldSet.readString(2));
		emp.setSalary(fieldSet.readInt(3));
		
		return emp;
	}

}
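
To see what the tokenizer and the mapper do together, here is a framework-free sketch that splits one input line and reads the fields by their 0 based index. LineMappingDemo is a hypothetical class for illustration; the plain split(",") is only a rough approximation of the tokenizer and ignores details such as quoted fields.

```java
// Hypothetical, framework-free version of the mapping: split a line and read by index.
class LineMappingDemo {

    static String describe(String line) {
        String[] fields = line.split(",");          // rough equivalent of the tokenizer
        int empId = Integer.parseInt(fields[0]);    // index 0, like fieldSet.readInt(0)
        String lastName = fields[1];                // like fieldSet.readString(1)
        String title = fields[2];                   // like fieldSet.readString(2)
        int salary = Integer.parseInt(fields[3]);   // like fieldSet.readInt(3)
        return empId + "|" + lastName + "|" + title + "|" + salary;
    }
}
```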

Arithmetic

Not really! All we are doing is assigning a rank based on the salary amount. The item processor takes an input bean and converts it to an output bean. In this case both beans are of the same type, but they don’t have to be.

src/main/java/com/test/EmployeeProcessor.java

package com.test;

import org.springframework.batch.item.ItemProcessor;

public class EmployeeProcessor implements ItemProcessor<Employee, Employee> {

	public Employee process(Employee emp) throws Exception {
		// if salary >= 2500 then set rank as "Director"		
		if(emp.getSalary() >= 2500 ) {
			emp.setRank("Director");			
		} else {
			emp.setRank("N/A");
		}
		return emp;
	}

}
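
The rule itself is small enough to check in isolation. The sketch below restates it as a hypothetical static helper (RankRule is not part of the project) so that the boundary case is easy to see: a salary of exactly 2500 is ranked "Director".

```java
// Hypothetical restatement of the ranking rule used by the processor.
class RankRule {

    static String rankFor(int salary) {
        // Salaries of 2500 and above are ranked "Director", everything else "N/A".
        return salary >= 2500 ? "Director" : "N/A";
    }
}
```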

Writing

All the objects needed for writing the file are configured using xml. See the xml file below for more details.
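
Conceptually, the writer configuration extracts the named properties from each bean and joins them with the delimiter. The sketch below shows only the joining step in plain Java; LineAggregationDemo is a hypothetical class for illustration and the field values are passed in directly instead of being read from a bean.

```java
// Hypothetical illustration of joining extracted field values with a delimiter.
class LineAggregationDemo {

    static String aggregate(Object[] fields, String delimiter) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append(delimiter);
            }
            line.append(fields[i]);
        }
        return line.toString();
    }
}
```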

Job Configuration

Modify the job xml file to define the new beans.

src/main/resources/simpleJob.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans:beans xmlns="http://www.springframework.org/schema/batch"
     xmlns:beans="http://www.springframework.org/schema/beans"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="
        http://www.springframework.org/schema/batch
        http://www.springframework.org/schema/batch/spring-batch-2.1.xsd
        http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans-2.0.xsd">
 
    <beans:import resource="applicationContext.xml"/>
 
<!-- Tokenizer - Converts a delimited string into a Set of Fields -->
<beans:bean name="defaultTokenizer" 
    class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer"/>

<!-- FieldSetMapper - Populates a bean's attributes using the FieldSet -->
<beans:bean name="employeeFieldSetMapper" class="com.test.EmployeeFieldSetMapper"/>

<!-- LineMapper - Uses the tokenizer and Mapper to create instances of a Bean. -->
<beans:bean name="employeeLineMapper" class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
    <beans:property name="lineTokenizer" ref="defaultTokenizer"/>    
    <beans:property name="fieldSetMapper" ref="employeeFieldSetMapper"/>        
</beans:bean>

<!-- Reader - used by the tasklet to process one Item from the input. -->
<beans:bean name="empReader" class="org.springframework.batch.item.file.FlatFileItemReader">
    <beans:property name="lineMapper" ref="employeeLineMapper"/>
    
    <!-- use spring integrations for the following, but for now filename is hard coded -->
    <beans:property name="resource" value="input_data.txt"/>    
</beans:bean>

<!-- Processor -->

<beans:bean name="empProcessor" class="com.test.EmployeeProcessor">
</beans:bean>

<!-- Writer -->
<beans:bean id="empWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
    <beans:property name="resource" value="file:target/output_data.txt" />
    <beans:property name="lineAggregator">
        <beans:bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
            <beans:property name="delimiter" value=","/>
            <beans:property name="fieldExtractor">
                <beans:bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                    <beans:property name="names" value="empId,lastName,title,salary,rank"/>
                </beans:bean>
            </beans:property>
        </beans:bean>
    </beans:property>
</beans:bean>

<job id="helloWorldJob">
    <step id="step1" next="step2">
        <tasklet ref="helloWorldTasklet" />
    </step>
    <step id="step2">
        <tasklet>
            <chunk reader="empReader" processor="empProcessor" writer="empWriter" commit-interval="1"/>
        </tasklet>
    </step>
</job>
 
<beans:bean name="helloWorldTasklet" class="com.test.HelloWorldTasklet"/>
 
<!--
To run the job from the command line type the following:
mvn clean compile exec:java -Dexec.mainClass=org.springframework.batch.core.launch.support.CommandLineJobRunner -Dexec.args="simpleJob.xml helloWorldJob"
 -->
</beans:beans>

Execute the job

Go to the command line and type the following:

mvn clean compile exec:java -Dexec.mainClass=org.springframework.batch.core.launch.support.CommandLineJobRunner -Dexec.args="simpleJob.xml helloWorldJob"

View the Results

The output file, output_data.txt, will appear in the target/ folder of the project.

Further Reading

To keep things simple we were reading and writing files located in the project's own folders. There are many enterprise design patterns that describe best practices for feeding data into batch programs. For further reading on this topic please see the Spring Integration framework homepage.


Hello World With Spring Batch 2.1.x

This page describes how to convert the Spring Batch Hello World example application to work with Spring Batch 2.1.x.

Background

The Hello World With Spring Batch article was written a while ago. Since then a new version of Spring Batch has been released. Instead of changing the original article, I chose to write a new one that describes only the changes needed to get the application described there to work.

The original article is located here:

http://numberformat.wordpress.com/2010/02/05/hello-world-with-spring-batch/

Step 1

Update the pom.xml dependency for spring-batch-core.

pom.xml

        <dependency>
            <groupId>org.springframework.batch</groupId>
            <artifactId>spring-batch-core</artifactId>
            <version>2.1.8.RELEASE</version>
        </dependency>

Step 2

Update the XML. In the following file, replace the 2.0 schema location:

src/main/resources/simpleJob.xml

http://www.springframework.org/schema/batch/spring-batch-2.0.xsd">

with

http://www.springframework.org/schema/batch/spring-batch-2.1.xsd">

Finally, run the command just like before:

mvn clean compile exec:java -Dexec.mainClass=org.springframework.batch.core.launch.support.CommandLineJobRunner -Dexec.args="simpleJob.xml helloWorldJob"

That’s all for now!



