This section explains how to choose between the chunk model and the tasklet model by summarizing the features of each. Some points are described in detail in later chapters, so refer to the corresponding chapters as appropriate.
The following should be read as examples that illustrate the concepts, not as constraints or recommendations. Refer to them when creating a job, according to the characteristics of your users and systems.
The main differences between the chunk model and the tasklet model are given below.
Item | Chunk | Tasklet |
---|---|---|
Components | Consists mainly of three components. | Consolidated into one component. |
Transaction | Processes a fixed number of records at a time, issuing an intermediate commit after each set. A single batch commit is not possible. | Basically processes everything in a single batch commit. Intermediate commits must be implemented by the user. |
Restart | Can be restarted based on the record count. | Cannot be restarted based on the record count. |
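
The transaction row of the table can be illustrated with a minimal plain-Java sketch. It deliberately does not use the Spring Batch API: `CommitSketch`, `chunkCommits`, and `taskletCommits` are hypothetical names, and a "commit" is only simulated by recording how many records each transaction covered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: simulates commit boundaries only; not the Spring Batch API.
class CommitSketch {

    // Chunk style: commit after every `chunkSize` records (intermediate commit).
    static List<Integer> chunkCommits(int totalRecords, int chunkSize) {
        List<Integer> commits = new ArrayList<>();
        int inTransaction = 0;
        for (int i = 0; i < totalRecords; i++) {
            inTransaction++;                  // process one record
            if (inTransaction == chunkSize) {
                commits.add(inTransaction);   // intermediate commit
                inTransaction = 0;
            }
        }
        if (inTransaction > 0) {
            commits.add(inTransaction);       // commit the final partial chunk
        }
        return commits;
    }

    // Tasklet style: all records in one transaction, one commit at the end.
    static List<Integer> taskletCommits(int totalRecords) {
        List<Integer> commits = new ArrayList<>();
        commits.add(totalRecords);
        return commits;
    }

    public static void main(String[] args) {
        System.out.println(chunkCommits(25, 10));  // [10, 10, 5]
        System.out.println(taskletCommits(25));    // [25]
    }
}
```

With 25 records and a chunk size of 10, the chunk style holds at most 10 uncommitted records at any time, while the tasklet style holds all 25 until the end. This is why the number of target records matters for resource usage.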
Based on these differences, some examples of when to use each model are introduced below.
- To make recovery as simple as possible

  When a failed job is to be recovered only by re-running that job, the tasklet model can be chosen to keep recovery simple. In the chunk model, recovery requires measures such as returning the processed data to its state before the job was executed, or building the job so that it processes only the unprocessed data.

- To consolidate the process contents

  When you want to prioritize an easy overview of the job, such as one job in one class, the tasklet model can be chosen.

- To process large data stably

  For example, when batch processing 10 million records, consider using the chunk model if the number of target records affects resource usage. The intermediate commits stabilize the processing. Intermediate commits can also be used in the tasklet model, but they are simpler to implement in the chunk model.

- To restart based on the record count for recovery after an error

  When the batch window is tight and you want to resume from the erroneous data onward, choose the chunk model to use the record-count-based restart provided by Spring Batch. This eliminates the need to build that mechanism for each job.
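
The idea of restarting based on the record count can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not how Spring Batch implements it: `RestartSketch`, `committedCount`, `run`, and `demo` are hypothetical names, the committed count is assumed to survive a failed run, and a record equal to `"BAD"` stands in for erroneous data. Note that in this sketch the restart resumes from the first uncommitted record, that is, from the start of the chunk that failed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of record-count-based restart; not the Spring Batch API.
class RestartSketch {
    // Number of records already committed. Assumed to survive a failed run,
    // in the way a framework would persist it in job metadata.
    int committedCount = 0;
    // Stands in for data that has been committed to the output resource.
    final List<String> output = new ArrayList<>();

    // Processes records starting at the last committed position.
    // A record equal to "BAD" simulates erroneous data.
    void run(List<String> records, int chunkSize) {
        List<String> buffer = new ArrayList<>();
        for (int i = committedCount; i < records.size(); i++) {
            String record = records.get(i);
            if (record.equals("BAD")) {
                // The uncommitted buffer is discarded, like a rollback.
                throw new IllegalStateException("failed at record index " + i);
            }
            buffer.add(record.toUpperCase());
            boolean lastRecord = (i == records.size() - 1);
            if (buffer.size() == chunkSize || lastRecord) {
                output.addAll(buffer);   // commit the chunk
                committedCount = i + 1;  // remember how far processing got
                buffer.clear();
            }
        }
    }

    // Runs once with bad data, then again after the data is corrected.
    static List<String> demo() {
        RestartSketch job = new RestartSketch();
        try {
            job.run(Arrays.asList("a", "b", "c", "BAD", "e"), 2);
        } catch (IllegalStateException e) {
            // First run fails; records 0 and 1 are already committed.
        }
        job.run(Arrays.asList("a", "b", "c", "d", "e"), 2);  // restart
        return job.output;
    }

    public static void main(String[] args) {
        System.out.println(demo());  // [A, B, C, D, E]
    }
}
```

The second run does not reprocess the two records committed by the first run. A tasklet-style job that commits once at the end would have nothing committed after the failure and would start again from the first record.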
The chunk model and the tasklet model are basically used in combination. For example, it is natural to use the tasklet model as the basis for most jobs, where the number of records and the processing time are not a concern, and to choose the chunk model only for the very small number of jobs that process large numbers of records.