Creation of tasklet model job

Overview

This section explains how to create a tasklet model job. For the architecture of the tasklet model, refer to Spring Batch architecture.

Components

A tasklet model job does not register multiple components. It only requires implementing org.springframework.batch.core.step.tasklet.Tasklet and setting it in the Bean definition. ItemReader and ItemWriter, which are components of the chunk model, can also be used as building blocks in the implementation.

How to use

The implementation of a tasklet model job is explained in the following order.

Job configuration
Implementation of tasklet

Job configuration

Define the tasklet model job in the Bean definition file. An example is shown below.

Example of Bean definition file (Tasklet model)

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:batch="http://www.springframework.org/schema/batch"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
             http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
             http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">

    <!-- (1) -->
    <import resource="classpath:META-INF/spring/job-base-context.xml"/>

    <!-- (2) -->
    <context:component-scan
          base-package="org.terasoluna.batch.functionaltest.app.common"/>

    <!-- (3) -->
    <batch:job id="simpleJob" job-repository="jobRepository"> <!-- (4) -->
        <batch:step id="simpleJob.step01"> <!-- (5) -->
            <batch:tasklet transaction-manager="jobTransactionManager"
                           ref="simpleJobTasklet"/> <!-- (6) -->
        </batch:step>
    </batch:job>

</beans>

Example of tasklet implementation class

package org.terasoluna.batch.functionaltest.app.common;

@Component // (2)
public class SimpleJobTasklet implements Tasklet {
  // omitted.
}

Sr. No.	Explanation
(1)	Import the settings that always load the Bean definitions required when using TERASOLUNA Batch 5.x.
(2)	Set the base package for component-scan. The tasklet model assumes annotation-based Bean definitions, so the Tasklet implementation class does not need a Bean definition in the XML.
(3)	Job configuration. The `id` attribute must be unique across all jobs included in one batch application.
(4)	`JobRepository` configuration. The value of the `job-repository` attribute should be fixed to `jobRepository` unless there is a special reason. This allows all jobs to be managed in one `JobRepository`. The Bean definition of `jobRepository` is resolved by (1).
(5)	Step configuration. The `id` attribute does not have to be unique across all jobs in one batch application, but a unique id is used so that a failure can be traced easily. Unless there is a special reason to use a different format, use the `id` specified in (3) followed by [step+serial number], as in `simpleJob.step01`.
(6)	Tasklet configuration. The value of the `transaction-manager` attribute should be fixed to `jobTransactionManager` unless there is a special reason. With this setting, the processing of the entire tasklet is managed in one transaction. For details, refer to Transaction control. The Bean definition of `jobTransactionManager` is resolved by (1). The `ref` attribute specifies the Bean ID of the Tasklet implementation class resolved by (2). For the implementation class `SimpleJobTasklet`, the Bean ID is the class name with its first letter in lower case, `simpleJobTasklet`.
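
The naming convention in (5) can be pictured with a few lines of plain Java. This is only a sketch: `StepIds.stepId` is a hypothetical helper, not part of TERASOLUNA Batch or Spring Batch, and the `.` separator and two-digit padding are assumptions taken from the sample id `simpleJob.step01`.

```java
import java.util.Locale;

// Hypothetical helper illustrating the step id convention; not a framework API.
final class StepIds {

    private StepIds() {
    }

    // Builds "<jobId>.step<NN>", for example "simpleJob.step01".
    // The "." separator and two-digit padding follow the sample id in this section.
    static String stepId(String jobId, int serialNumber) {
        if (jobId == null || jobId.isEmpty()) {
            throw new IllegalArgumentException("jobId must not be empty");
        }
        if (serialNumber < 1) {
            throw new IllegalArgumentException("serialNumber must be 1 or greater");
        }
        return String.format(Locale.ROOT, "%s.step%02d", jobId, serialNumber);
    }

    public static void main(String[] args) {
        System.out.println(stepId("simpleJob", 1));
        System.out.println(stepId("simpleJob", 2));
    }
}
```

Because the job id is unique within the batch application, a step id built this way is also unique, which is what makes a failed step easy to locate.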

Bean name when using annotation

When the @Component annotation is used, the Bean name is generated by org.springframework.context.annotation.AnnotationBeanNameGenerator. Refer to the Javadoc of this class to confirm the naming rules.
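
When no explicit name is given, the default name is the short class name decapitalized according to JavaBeans rules. The sketch below approximates that rule with `java.beans.Introspector.decapitalize` from the JDK. `BeanNames` is a hypothetical helper written for illustration, not the Spring implementation, and it ignores cases such as nested classes. `URLTasklet` is an invented class name.

```java
import java.beans.Introspector;

// Approximation of the default naming rule; BeanNames is a hypothetical helper.
final class BeanNames {

    private BeanNames() {
    }

    // Accepts a simple or fully qualified class name and returns the
    // decapitalized short name.
    static String defaultBeanName(String className) {
        String shortName = className.substring(className.lastIndexOf('.') + 1);
        // decapitalize lowers the first letter, but leaves the name unchanged
        // when the first two characters are both upper case.
        return Introspector.decapitalize(shortName);
    }

    public static void main(String[] args) {
        System.out.println(defaultBeanName(
                "org.terasoluna.batch.functionaltest.app.common.SimpleJobTasklet"));
        System.out.println(defaultBeanName("URLTasklet"));
    }
}
```

The second call shows the case that tends to surprise: a class name starting with two upper-case letters keeps its name as is, so the value of the `ref` attribute has to match it exactly.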

Implementation of tasklet

First, understand the overview through a simple implementation, then proceed to an implementation that uses the components of the chunk model.

The explanation proceeds in the following order.

Implementation of simple tasklet
Implementation of tasklet using the components of chunk model

Implementation of simple tasklet

The basic points are explained using a tasklet that only outputs a log.

Example of simple tasklet implementation class

package org.terasoluna.batch.functionaltest.app.common;

// omitted.

@Component
public class SimpleJobTasklet implements Tasklet { // (1)

    private static final Logger logger =
            LoggerFactory.getLogger(SimpleJobTasklet.class);

    @Override
    public RepeatStatus execute(StepContribution contribution,
            ChunkContext chunkContext) throws Exception {  // (2)
        logger.info("called tasklet."); // (3)
        return RepeatStatus.FINISHED; // (4)
    }
}

Sr. No.	Explanation
(1)	Implement `org.springframework.batch.core.step.tasklet.Tasklet` interface using `implements`.
(2)	Implement the `execute` method defined by `Tasklet` interface. The arguments `StepContribution` and `ChunkContext` are used as necessary but the explanation is omitted here.
(3)	Implement any process. INFO log is output here.
(4)	Return whether or not the tasklet processing is completed. Always specify `return RepeatStatus.FINISHED;`.
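
The reason behind (4) is that Spring Batch calls `execute` again for as long as it returns `RepeatStatus.CONTINUABLE`, and stops only when `RepeatStatus.FINISHED` is returned or an exception is thrown. The plain-Java sketch below imitates that loop. `Status`, `SimpleTasklet` and `runStep` are hypothetical stand-ins so that the example runs without Spring Batch; they are not the framework types.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java imitation of how a tasklet step repeats; all types are stand-ins.
final class RepeatSketch {

    // Stand-in for org.springframework.batch.repeat.RepeatStatus.
    enum Status { CONTINUABLE, FINISHED }

    // Stand-in for the Tasklet interface, reduced to a no-argument method.
    interface SimpleTasklet {
        Status execute();
    }

    // Calls the tasklet until it reports FINISHED and returns the number of calls.
    static int runStep(SimpleTasklet tasklet) {
        int calls = 0;
        Status status;
        do {
            status = tasklet.execute();
            calls++;
        } while (status == Status.CONTINUABLE);
        return calls;
    }

    public static void main(String[] args) {
        // A tasklet that always returns FINISHED is called exactly once.
        System.out.println(runStep(() -> Status.FINISHED));

        // A tasklet that returns CONTINUABLE twice is called three times in total.
        AtomicInteger remaining = new AtomicInteger(2);
        System.out.println(runStep(() ->
                remaining.getAndDecrement() > 0 ? Status.CONTINUABLE : Status.FINISHED));
    }
}
```

A tasklet that never returns the finished status would keep the step running, which is why returning `RepeatStatus.FINISHED` is the safe default.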

Implementation of tasklet using the components of chunk model

Spring Batch itself does not describe using the components of the chunk model inside a tasklet implementation. In TERASOLUNA Batch 5.x, this approach may be chosen in the following situations.

When multiple resources are combined and processed, making it difficult to fit the processing into the chunk model format
When processing would be spread over multiple places in the chunk model, so that the tasklet model makes the overall picture easier to understand
When recovery should be kept simple, so the batch commit of the tasklet model is preferred over the intermediate commit of the chunk model

Note that the processing unit must also be considered when implementing a Tasklet with components of the chunk model. The following 3 patterns can be considered as units of output records.

Units and features of output records
Output records	Features
1 record	Data is input, processed and output one record at a time, so the processing is easy to picture. Note that performance is likely to deteriorate because of frequent I/O when the amount of data is large.
All records	Data is input and processed one record at a time and held in memory, and all records are output together at the end. Data consistency can be ensured and performance can be improved when the amount of data is small. However, note that a high load is likely to be placed on resources (CPU, memory) when the amount of data is large.
Fixed records	Data is input and processed one record at a time and held in memory, and data is output each time a fixed number of records is reached. A performance improvement can be expected because a large amount of data is processed efficiently with a fixed amount of resources (CPU, memory). Also, since data is processed in units of a fixed number of records, intermediate commit can be used by implementing transaction control. However, note that with the intermediate commit method, processed and unprocessed data are likely to be mixed at recovery time if the job terminates abnormally.
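
The three units differ mainly in how often output is performed and how many records are held in memory at once. The sketch below counts the write calls for the same input under each unit. `OutputUnits.process` and its parameters are hypothetical, and the `Consumer` writer stands in for real I/O such as a file or database write.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Compares output units by counting how often the writer is called; illustration only.
final class OutputUnits {

    private OutputUnits() {
    }

    // Buffers items and hands them to the writer every unitSize items,
    // plus once more for any remainder. Returns the number of writer calls.
    static <T> int process(List<T> input, int unitSize, Consumer<List<T>> writer) {
        if (unitSize < 1) {
            throw new IllegalArgumentException("unitSize must be 1 or greater");
        }
        List<T> buffer = new ArrayList<>();
        int writes = 0;
        for (T item : input) {
            buffer.add(item);
            if (buffer.size() == unitSize) {
                writer.accept(new ArrayList<>(buffer)); // pass a copy, then reuse the buffer
                buffer.clear();
                writes++;
            }
        }
        if (!buffer.isEmpty()) {
            writer.accept(new ArrayList<>(buffer));
            writes++;
        }
        return writes;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            items.add(i);
        }
        Consumer<List<Integer>> noOpWriter = batch -> { };

        System.out.println("1 record:      " + process(items, 1, noOpWriter));
        System.out.println("All records:   " + process(items, items.size(), noOpWriter));
        System.out.println("Fixed records: " + process(items, 10, noOpWriter));
    }
}
```

For 25 input records this gives 25 write calls with a unit of 1 record, 1 call when all records are output together, and 3 calls with a fixed unit of 10. The buffer never holds more than `unitSize` items, which is the memory side of the same trade-off.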

The tasklet implementation that uses ItemReader and ItemWriter which are the components of the chunk model is explained below.

The first implementation example processes data one record at a time.

Tasklet implementation example that uses the components of chunk model

@Component
@Scope("step") // (1)
public class SalesPlanChunkTranTask implements Tasklet {

    @Inject
    @Named("detailCSVReader") // (2)
    ItemStreamReader<SalesPlanDetail> itemReader; // (3)

    @Inject
    SalesPlanDetailRepository repository; // (4)

    @Override
    public RepeatStatus execute(StepContribution contribution,
            ChunkContext chunkContext) throws Exception {

        SalesPlanDetail item;

        try {
            itemReader.open(chunkContext.getStepContext().getStepExecution()
                    .getExecutionContext()); // (5)

            while ((item = itemReader.read()) != null) { // (6)

                // do some processes.

                repository.create(item); // (7)
            }
        } finally {
            itemReader.close(); // (8)
        }
        return RepeatStatus.FINISHED;
    }
}

Bean definition example 1

<!-- omitted -->
<import resource="classpath:META-INF/spring/job-base-context.xml"/>

<context:component-scan
    base-package="org.terasoluna.batch.functionaltest.app.plan" />
<context:component-scan
    base-package="org.terasoluna.batch.functionaltest.ch05.transaction.component" />

<!-- (9) -->
<mybatis:scan
    base-package="org.terasoluna.batch.functionaltest.app.repository.plan"
    factory-ref="jobSqlSessionFactory"/>

<!-- (10) -->
<bean id="detailCSVReader"
      class="org.springframework.batch.item.file.FlatFileItemReader" scope="step"
      p:resource="file:#{jobParameters['inputFile']}">
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <property name="lineTokenizer">
                <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer"
                      p:names="branchId,year,month,customerId,amount"/>
            </property>
            <property name="fieldSetMapper">
                <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper"
                      p:targetType="org.terasoluna.batch.functionaltest.app.model.plan.SalesPlanDetail"/>
            </property>
        </bean>
    </property>
</bean>

<!-- (11) -->
<batch:job id="createSalesPlanChunkTranTask" job-repository="jobRepository">
    <batch:step id="createSalesPlanChunkTranTask.step01">
        <batch:tasklet transaction-manager="jobTransactionManager"
                       ref="salesPlanChunkTranTask"/>
    </batch:step>
</batch:job>

Sr. No.	Explanation
(1)	Set the step scope, the same Bean scope as that of the ItemReader used in this class.
(2)	Access input resources (flat files in this example) through `ItemReader`. The Bean name `detailCSVReader` is specified with `@Named` for clarity, but this is optional.
(3)	Define the type as `ItemStreamReader`, a sub-interface of `ItemReader`, because the resource must be opened and closed in (5) and (8). This is explained further below.
(4)	Access output resources (database in this example) through a MyBatis Mapper. The Mapper is used directly for the sake of simplicity. `ItemWriter` does not always have to be used, although `MyBatisBatchItemWriter` can of course be used.
(5)	Open the input resource.
(6)	Loop over all input data sequentially. `ItemReader#read` returns `null` when all the input data has been read and the end is reached.
(7)	Output to the database.
(8)	The resource must always be closed. Exception handling should be implemented. When an exception occurs, the transaction of the entire tasklet is rolled back, the stack trace of the exception is output, and the job terminates abnormally.
(9)	MyBatis-Spring settings. For details of MyBatis-Spring settings, refer to Database access.
(10)	To input from a file, add a Bean definition of `FlatFileItemReader`. The details are not explained here.
(11)	Since all the components are resolved by annotation, this is the same as in Implementation of simple tasklet.

On unification of scope

The tasklet implementation class and the Beans injected into it should have the same scope.

For example, if FlatFileItemReader receives the input file path from an argument (job parameter), its Bean scope must be step. In this case, the scope of the tasklet implementation class should also be step.

Consider the case where the scope of the Tasklet implementation class is singleton. The Tasklet implementation class is instantiated when the ApplicationContext is created at application startup, and at that point an attempt is made to resolve and inject an instance of FlatFileItemReader. However, FlatFileItemReader has step scope, so its instance does not exist yet because it is created at step execution. As a result, the Tasklet implementation class cannot be instantiated and creation of the ApplicationContext fails.

Regarding the type of field assigned with @Inject

Use one of the following types, depending on the implementation class to be used.

ItemReader/ItemWriter
- Used when there is no need to open/close the target resource.
ItemStreamReader/ItemStreamWriter
- Used when there is a need to open/close the target resource.

Always check the Javadoc before deciding which type to use. Typical examples are shown below.

In case of FlatFileItemReader/Writer: handle by ItemStreamReader/ItemStreamWriter
In case of MyBatisCursorItemReader: handle by ItemStreamReader
In case of MyBatisBatchItemWriter: handle by ItemWriter
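
The difference between the two kinds of types can be pictured with plain-Java stand-ins. `Reader`, `StreamReader` and `LineReader` below are hypothetical and only mimic the shape of `ItemReader` and `ItemStreamReader`. The real interfaces differ in detail, for example `open` takes an `ExecutionContext` and `read` declares checked exceptions.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-in types showing why a field needs the "stream" type when open/close is required.
final class ReaderTypes {

    // Stand-in for ItemReader: read only, returns null at the end of input.
    interface Reader<T> {
        T read();
    }

    // Stand-in for ItemStreamReader: adds the open/close lifecycle.
    interface StreamReader<T> extends Reader<T> {
        void open();
        void close();
    }

    // Reads in-memory lines but, like a file reader, must be opened first.
    static final class LineReader implements StreamReader<String> {
        private final List<String> lines;
        private Iterator<String> cursor;

        LineReader(List<String> lines) {
            this.lines = lines;
        }

        @Override
        public void open() {
            cursor = lines.iterator();
        }

        @Override
        public String read() {
            if (cursor == null) {
                throw new IllegalStateException("reader must be opened before reading");
            }
            return cursor.hasNext() ? cursor.next() : null;
        }

        @Override
        public void close() {
            cursor = null;
        }

        boolean isOpen() {
            return cursor != null;
        }
    }

    // Reads everything and always closes the reader; returns the number of items read.
    static int countAll(StreamReader<String> reader) {
        int count = 0;
        try {
            reader.open();
            while (reader.read() != null) {
                count++;
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) {
        LineReader reader = new LineReader(Arrays.asList("a,1", "b,2", "c,3"));
        System.out.println(countAll(reader));
        System.out.println(reader.isOpen());
    }
}
```

If the field were declared with the read-only type, `countAll` could not call `open` or `close` at all, which is the situation the list above is meant to avoid for resources such as files.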

The second implementation example imitates the chunk model and processes a fixed number of records at a time.

Tasklet implementation example 2 that uses the components of chunk model

@Component
@Scope("step")
public class SalesPerformanceTasklet implements Tasklet {


    @Inject
    ItemStreamReader<SalesPerformanceDetail> reader;

    @Inject
    ItemWriter<SalesPerformanceDetail> writer; // (1)

    int chunkSize = 10; // (2)

    @Override
    public RepeatStatus execute(StepContribution contribution,
            ChunkContext chunkContext) throws Exception {

        try {
            reader.open(chunkContext.getStepContext().getStepExecution()
                    .getExecutionContext());

            List<SalesPerformanceDetail> items = new ArrayList<>(chunkSize); // (2)
            SalesPerformanceDetail item = null;
            do {
                // Pseudo operation of ItemReader
                for (int i = 0; i < chunkSize; i++) { // (3)
                    item = reader.read();
                    if (item == null) {
                        break;
                    }
                    // Pseudo operation of ItemProcessor
                    // do some processes.

                    items.add(item);
                }

                // Pseudo operation of ItemWriter
                if (!items.isEmpty()) {
                    writer.write(items); // (4)
                    items.clear();
                }
            } while (item != null);
        } finally {
            try {
                reader.close();
            } catch (Exception e) {
                // do nothing.
            }
        }

        return RepeatStatus.FINISHED;
    }
}

Bean definition example 2

<!-- omitted -->
<import resource="classpath:META-INF/spring/job-base-context.xml"/>

<context:component-scan
    base-package="org.terasoluna.batch.functionaltest.app.common,
        org.terasoluna.batch.functionaltest.app.performance,
        org.terasoluna.batch.functionaltest.ch06.exceptionhandling"/>
<mybatis:scan
    base-package="org.terasoluna.batch.functionaltest.app.repository.performance"
    factory-ref="jobSqlSessionFactory"/>

<bean id="detailCSVReader"
      class="org.springframework.batch.item.file.FlatFileItemReader" scope="step"
      p:resource="file:#{jobParameters['inputFile']}">
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <property name="lineTokenizer">
                <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer"
                      p:names="branchId,year,month,customerId,amount"/>
            </property>
            <property name="fieldSetMapper">
                <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper"
                      p:targetType="org.terasoluna.batch.functionaltest.app.model.performance.SalesPerformanceDetail"/>
            </property>
        </bean>
    </property>
</bean>

<!-- (1) -->
<bean id="detailWriter"
      class="org.mybatis.spring.batch.MyBatisBatchItemWriter"
      p:statementId="org.terasoluna.batch.functionaltest.app.repository.performance.SalesPerformanceDetailRepository.create"
      p:sqlSessionTemplate-ref="batchModeSqlSessionTemplate"/>


<batch:job id="jobSalesPerfTasklet" job-repository="jobRepository">
    <batch:step id="jobSalesPerfTasklet.step01">
        <batch:tasklet ref="salesPerformanceTasklet"
                       transaction-manager="jobTransactionManager"/>
    </batch:step>
</batch:job>

Sr. No.	Explanation
(1)	Use `MyBatisBatchItemWriter` as the implementation of `ItemWriter`.
(2)	`ItemWriter` outputs a fixed number of records at a time. Here, records are processed and output in units of 10.
(3)	Following the behavior of the chunk model, the processing order is read→process→read→process→…→write.
(4)	Output through `ItemWriter` collectively.

Decide case by case whether to use the implementation classes of ItemReader and ItemWriter. For file access, the implementation classes of ItemReader and ItemWriter can be used. For other access such as database access, there is no need to force their use; use them when they improve performance.