Spring Boot batch processing is the automated processing of large amounts of data without human intervention. Spring Boot batch simplifies this by organizing the work into a job made up of steps. Each step can contain tasks such as reading, processing, and writing data, and the chunk size of a step controls how many items are written and committed together. This post walks you through configuring and running Spring Boot batch with a simple example, step by step.
Spring Boot batch tasks fall into three categories: reading, processing, and writing data. Three interfaces, ItemReader, ItemProcessor, and ItemWriter, define these tasks. A step describes how the tasks are carried out, and a job groups one or more steps. Jobs and steps are built with the JobBuilderFactory and StepBuilderFactory classes, and a job is launched through the JobLauncher interface.
Spring Boot batch supports two styles of step: Tasklet and chunk-oriented. A Tasklet lets you write a complete task as a single unit. The Tasklet interface has one method, execute, which holds the code that gets, processes, and saves the data. Tasklets suit simple batch work and modest amounts of data.
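To make the Tasklet idea concrete, here is a minimal plain-Java sketch. It does not use Spring Batch; SimpleTask, Status, and UppercaseTask are hypothetical stand-ins for the Tasklet interface and its repeat status, chosen only for illustration. The runner keeps calling execute until the task reports that it is finished, which mirrors how a Tasklet step is driven.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of the Tasklet idea. SimpleTask and Status are
// hypothetical stand-ins, not Spring Batch types.
final class TaskletSketch {

    enum Status { CONTINUABLE, FINISHED }

    interface SimpleTask {
        Status execute();
    }

    // Calls the task until it reports FINISHED and returns the number of calls.
    static int runUntilFinished(SimpleTask task) {
        int calls = 0;
        Status status;
        do {
            status = task.execute();
            calls++;
        } while (status == Status.CONTINUABLE);
        return calls;
    }

    // A one-shot task: gets, processes, and saves all of its data in one call.
    static final class UppercaseTask implements SimpleTask {
        private final List<String> input;
        final List<String> saved = new ArrayList<>();

        UppercaseTask(List<String> input) {
            this.input = input;
        }

        @Override
        public Status execute() {
            for (String item : input) {
                saved.add(item.toUpperCase());
            }
            return Status.FINISHED;
        }
    }

    public static void main(String[] args) {
        UppercaseTask task = new UppercaseTask(List.of("zero", "one"));
        int calls = runUntilFinished(task);
        System.out.println(calls + " " + task.saved); // prints "1 [ZERO, ONE]"
    }
}
```

Everything happens inside a single execute call, which is why this style fits small jobs better than large data sets.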
Chunk-oriented processing consists of three stages: read, process, and write, each defined by its own interface. Items are read and processed individually and collected into a chunk. When the chunk reaches the configured size, the whole chunk is passed to the writer and committed in one transaction. Writing and committing per chunk rather than per item reduces database round trips and transaction overhead.
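The read, process, and write cycle can be sketched in plain Java. This is an illustration of the idea only, not Spring Batch's actual implementation: ChunkLoopSketch and its run method are hypothetical names, an Iterator stands in for the reader, a Function for the processor, and the returned list records what each writer call would receive.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Plain-Java sketch of a chunk-oriented loop (hypothetical names throughout).
final class ChunkLoopSketch {

    // Reads up to chunkSize items, processes each one, then hands the
    // whole chunk to the "writer" in a single call. Each inner list in
    // the result is one writer call, which is also one commit point.
    static List<List<String>> run(Iterator<String> reader,
                                  Function<String, String> processor,
                                  int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<String>> writes = new ArrayList<>();
        while (true) {
            // Read stage: collect items one at a time.
            List<String> inputs = new ArrayList<>();
            while (inputs.size() < chunkSize && reader.hasNext()) {
                inputs.add(reader.next());
            }
            if (inputs.isEmpty()) {
                break; // reader exhausted
            }
            // Process stage: transform each item.
            List<String> outputs = new ArrayList<>();
            for (String item : inputs) {
                outputs.add(processor.apply(item));
            }
            // Write stage: one call per chunk.
            writes.add(outputs);
        }
        return writes;
    }

    public static void main(String[] args) {
        List<String> items = List.of("Zero", "One", "Two", "Three", "Four", "Five");
        System.out.println(run(items.iterator(), String::toUpperCase, 4));
        // prints [[ZERO, ONE, TWO, THREE], [FOUR, FIVE]]
    }
}
```

With a chunk size of 4, six items produce two writer calls instead of six, which is where the saving in commits comes from.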
The following steps configure Spring Boot batch and build up a simple working example.
1. pom.xml
The pom.xml file contains the Spring Boot batch and database dependencies. Spring Boot batch requires a database to store batch-related information. The spring-boot-starter-batch dependency pulls in all the jars and dependent classes needed for batch execution. The h2 dependency adds the H2 database driver to the Spring Boot application.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.5.2</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.yawintutor</groupId>
    <artifactId>SpringBootBatch2</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>SpringBootBatch2</name>
    <description>Demo project for Spring Boot</description>
    <properties>
        <java.version>11</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-batch</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.batch</groupId>
            <artifactId>spring-batch-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>com.h2database</groupId>
            <artifactId>h2</artifactId>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
2. Spring Boot Main class
The default Spring Boot main class starts the batch application. It needs no additional annotations or configuration.
SpringBootBatch2Application.java
package com.yawintutor;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class SpringBootBatch2Application {

    public static void main(String[] args) {
        SpringApplication.run(SpringBootBatch2Application.class, args);
    }
}
3. Application properties
Two application properties should be added. The first sets the database URL used to connect to the database. The second lets Spring Boot batch create its metadata tables when the application starts.
application.properties
spring.datasource.url=jdbc:h2:file:./DB
spring.batch.initialize-schema=ALWAYS
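As a hedged variation, assuming Spring Boot 2.5 or later: the schema property is also available under the spring.batch.jdbc prefix, and the older name is deprecated from that release on. The second property below is optional and is an assumption about how you want to run the example: it stops Spring Boot from running every Job bean at startup, so that launches are left to a scheduler.

```properties
# Assumed alternative to spring.batch.initialize-schema on Spring Boot 2.5+
spring.batch.jdbc.initialize-schema=always
# Optional: do not run Job beans automatically at application startup
spring.batch.job.enabled=false
```

Check the property names against the Spring Boot version you are actually using before relying on them.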
4. Start Spring Boot Application
If you launch the Spring Boot application now, it should start without errors. Resolve any database errors before configuring Spring Boot batch.
5. ItemReader Implementation
The reader class implements the ItemReader interface and contains the code for reading the data. Spring Boot batch calls the read method once per item. Returning null signals that there is no more data.
MyCustomReader.java
package com.yawintutor;

import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.NonTransientResourceException;
import org.springframework.batch.item.ParseException;
import org.springframework.batch.item.UnexpectedInputException;

public class MyCustomReader implements ItemReader<String> {

    private String[] stringArray = { "Zero", "One", "Two", "Three", "Four", "Five" };
    private int index = 0;

    // Returns the next item, or null once every item has been read.
    @Override
    public String read() throws Exception, UnexpectedInputException,
            ParseException, NonTransientResourceException {
        if (index >= stringArray.length) {
            return null;
        }
        String data = index + " " + stringArray[index];
        index++;
        System.out.println("MyCustomReader : Reading data : " + data);
        return data;
    }
}
6. ItemProcessor Implementation
The processor class implements the ItemProcessor interface and contains the code for processing the data. Spring Boot batch calls the process method once for each item that was read.
MyCustomProcessor.java
package com.yawintutor;

import org.springframework.batch.item.ItemProcessor;

public class MyCustomProcessor implements ItemProcessor<String, String> {

    // Converts each item to upper case before it is written.
    @Override
    public String process(String data) throws Exception {
        System.out.println("MyCustomProcessor : Processing data : " + data);
        data = data.toUpperCase();
        return data;
    }
}
7. ItemWriter Implementation
The writer class implements the ItemWriter interface and contains the code for writing the data after processing. Spring Boot batch calls the write method with the list of processed items in the current chunk.
MyCustomWriter.java
package com.yawintutor;

import java.util.List;

import org.springframework.batch.item.ItemWriter;

public class MyCustomWriter implements ItemWriter<String> {

    // Receives all processed items of one chunk in a single call.
    @Override
    public void write(List<? extends String> list) throws Exception {
        for (String data : list) {
            System.out.println("MyCustomWriter : Writing data : " + data);
        }
        System.out.println("MyCustomWriter : Writing data : completed");
    }
}
8. Spring Boot Batch Configurations
The Spring Boot batch configuration class defines the batch job and its steps. A batch job is created with the JobBuilderFactory class, and a batch step is created with the StepBuilderFactory class. The job invokes the steps, and each step wires together its tasks: the ItemReader, ItemProcessor, and ItemWriter. The configuration also sets the chunk size, which controls how many items are written per transaction.
BatchConfig.java
package com.yawintutor;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class BatchConfig {

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    // The job runs a single step and increments run.id on each run.
    @Bean
    public Job createJob() {
        return jobBuilderFactory.get("MyJob")
                .incrementer(new RunIdIncrementer())
                .flow(createStep()).end().build();
    }

    // A chunk size of 1 means every item is written and committed on its own.
    @Bean
    public Step createStep() {
        return stepBuilderFactory.get("MyStep")
                .<String, String> chunk(1)
                .reader(new MyCustomReader())
                .processor(new MyCustomProcessor())
                .writer(new MyCustomWriter())
                .build();
    }
}
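One point to watch in this configuration: the step is a singleton bean, so the single MyCustomReader instance created in createStep() is reused by every launch of the job, and its index is never reset. The plain-Java sketch below illustrates the effect. CountingReader mirrors the counting logic of MyCustomReader without the Spring Batch interface, and reset() and drain() are hypothetical helpers added only for this illustration. In a real Spring Batch application, a step-scoped reader bean (the @StepScope annotation) is one common way to get a fresh reader per execution.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of reusing a stateful reader (hypothetical names).
final class ReaderReuseSketch {

    // Mirrors the index-based logic of MyCustomReader.
    static final class CountingReader {
        private final String[] data = { "Zero", "One", "Two", "Three", "Four", "Five" };
        private int index = 0;

        String read() {
            if (index >= data.length) {
                return null; // no more data
            }
            String item = index + " " + data[index];
            index++;
            return item;
        }

        // Hypothetical helper: rewinds the reader for another run.
        void reset() {
            index = 0;
        }
    }

    // Reads until the reader returns null, as a step would.
    static List<String> drain(CountingReader reader) {
        List<String> items = new ArrayList<>();
        String item;
        while ((item = reader.read()) != null) {
            items.add(item);
        }
        return items;
    }

    public static void main(String[] args) {
        CountingReader reader = new CountingReader();
        System.out.println(drain(reader).size()); // prints 6
        System.out.println(drain(reader).size()); // prints 0, same instance is exhausted
        reader.reset();
        System.out.println(drain(reader).size()); // prints 6
    }
}
```

The second run over the same instance sees no data, which is what a repeated launch of the job would experience with the reader as configured.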
9. Spring Boot Batch Schedulers
A Spring Boot scheduler runs automatically and launches the batch job. The job itself is executed through the JobLauncher. In this example, a scheduled method launches the batch job at regular intervals and passes the current time as a job parameter.
SchedulerConfig.java
package com.yawintutor;

import java.text.SimpleDateFormat;
import java.util.Calendar;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;

@Configuration
@EnableScheduling
public class SchedulerConfig {

    @Autowired
    JobLauncher jobLauncher;

    @Autowired
    Job job;

    SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.S");

    // Despite the method name, fixedDelay is used: the first run starts after
    // 5 seconds, and each later run starts 5 seconds after the previous one ends.
    @Scheduled(fixedDelay = 5000, initialDelay = 5000)
    public void scheduleByFixedRate() throws Exception {
        System.out.println("Batch job starting");
        // The "time" parameter differs on every launch.
        JobParameters jobParameters = new JobParametersBuilder()
                .addString("time", format.format(Calendar.getInstance().getTime())).toJobParameters();
        jobLauncher.run(job, jobParameters);
        System.out.println("Batch job executed successfully\n");
    }
}
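The scheduler formats the current time with SimpleDateFormat, which is not thread-safe. That is unlikely to matter with a single scheduler thread, but if the formatter were ever shared across threads, the java.time API would be a safer choice. The sketch below is a plain-Java illustration under that assumption; RunTimestampSketch and its format method are hypothetical names, and the pattern uses SSS so that milliseconds print as three digits.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Plain-Java sketch: building the "time" parameter value with java.time.
final class RunTimestampSketch {

    // DateTimeFormatter is immutable and safe to share between threads.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    static String format(LocalDateTime time) {
        return FORMAT.format(time);
    }

    public static void main(String[] args) {
        // A fixed instant is used here so the output is reproducible.
        LocalDateTime sample = LocalDateTime.of(2021, 7, 22, 16, 22, 57, 293_000_000);
        System.out.println(format(sample)); // prints 2021-07-22 16:22:57.293
    }
}
```

In the scheduler, the resulting string could be passed to addString("time", ...) in place of the SimpleDateFormat output, with LocalDateTime.now() as the input.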
10. Start Spring Boot Application
The configuration for Spring Boot batch is now complete. Start the application. The scheduler will launch the batch job, which executes its steps and tasks. The log looks like the output below and shows the read, process, and write tasks. The cycle repeats until all the data has been processed, at which point the job completes.
2021-07-22 23:00:55.897 INFO 39139 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [FlowJob: [name=MyJob]] launched with the following parameters: [{run.id=3, time=2021-07-22 16:22:57.293}]
2021-07-22 23:00:55.927 INFO 39139 --- [ main] o.s.batch.core.job.SimpleStepHandler : Executing step: [MyStep]
MyCustomReader : Reading data : 0 Zero
MyCustomProcessor : Processing data : 0 Zero
MyCustomWriter : Writing data : 0 ZERO
MyCustomWriter : Writing data : completed
MyCustomReader : Reading data : 1 One
MyCustomProcessor : Processing data : 1 One
MyCustomWriter : Writing data : 1 ONE
MyCustomWriter : Writing data : completed
MyCustomReader : Reading data : 2 Two
MyCustomProcessor : Processing data : 2 Two
MyCustomWriter : Writing data : 2 TWO
MyCustomWriter : Writing data : completed
MyCustomReader : Reading data : 3 Three
MyCustomProcessor : Processing data : 3 Three
MyCustomWriter : Writing data : 3 THREE
MyCustomWriter : Writing data : completed
MyCustomReader : Reading data : 4 Four
MyCustomProcessor : Processing data : 4 Four
MyCustomWriter : Writing data : 4 FOUR
MyCustomWriter : Writing data : completed
MyCustomReader : Reading data : 5 Five
MyCustomProcessor : Processing data : 5 Five
MyCustomWriter : Writing data : 5 FIVE
MyCustomWriter : Writing data : completed
2021-07-22 23:00:55.954 INFO 39139 --- [ main] o.s.batch.core.step.AbstractStep : Step: [MyStep] executed in 27ms
2021-07-22 23:00:55.958 INFO 39139 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [FlowJob: [name=MyJob]] completed with the following parameters: [{run.id=3, time=2021-07-22 16:22:57.293}] and the following status: [COMPLETED] in 42ms