Batch processing

In contrast to the Pacemaker Community Edition (CE), the Pacemaker Professional Edition (PE) provides a built-in batch processing function that stacks the data that needs to be added or updated and executes a multi-value SQL statement based on a set of configurable triggers.

By default, there should be no need for customization or configuration, but for exceptional cases, or when Pacemaker is used as a framework, it allows you to configure when the batch processing is triggered.

  • The first trigger is the stack size.

  • The second trigger is a set of dedicated attributes that start the execution of the multi-value SQL statement.
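
The following simplified sketch illustrates the general idea behind the batching. It is not the actual GenericBatchProcessor implementation; the class, table, and column names are only assumptions for illustration. Rows are collected on a stack and, as soon as the configured maximum stack size is reached, one multi-value SQL statement is executed and the stack is cleared.

<?php

/**
 * Illustrative sketch only, NOT the actual GenericBatchProcessor implementation.
 * Rows are stacked and, as soon as the configured maximum stack size is reached,
 * one multi-value SQL statement is executed and the stack is cleared.
 */
class BatchStackSketch
{
    /** @var array<int, array<string, mixed>> rows waiting to be persisted */
    private array $stack = [];

    public function __construct(
        private \PDO $connection,
        private string $table,            // e.g. catalog_product_entity_datetime
        private array $columns,           // e.g. ['entity_id', 'attribute_id', 'store_id', 'value']
        private int $maxStackSize = 1000  // first trigger: the maximum stack size
    ) {
    }

    /** Stacks a row; flushes first if the next value would exceed the maximum stack size. */
    public function execute(array $row): void
    {
        if (count($this->stack) >= $this->maxStackSize) {
            $this->flush();
        }
        $this->stack[] = $row;
    }

    /** Builds and executes one multi-value INSERT for all stacked rows, then clears the stack. */
    public function flush(): void
    {
        if ($this->stack === []) {
            return;
        }

        // one multi-value statement, e.g. INSERT INTO <table> (<columns>) VALUES (...), (...), ...
        $rowPlaceholder = '(' . implode(', ', array_fill(0, count($this->columns), '?')) . ')';
        $sql = sprintf(
            'INSERT INTO %s (%s) VALUES %s',
            $this->table,
            implode(', ', $this->columns),
            implode(', ', array_fill(0, count($this->stack), $rowPlaceholder))
        );

        $values = [];
        foreach ($this->stack as $row) {
            foreach ($this->columns as $column) {
                $values[] = $row[$column];
            }
        }

        $this->connection->prepare($sql)->execute($values);
        $this->stack = [];
    }
}

Calling flush() once more after the import has finished persists the rows that are still left on the stack; this corresponds to the additional trigger described below.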

Configure the stack size

By default, the maximum stack size is 1000.

When the maximum stack size is reached, the SQL statement is executed and the stack is cleared before the next value is added to it.

Additionally, a trigger ensures that the stack is also processed when the import has finished but the stack still contains values.

Unlike most other configuration options, the maximum stack size cannot be configured through the configuration of the workflow engine. Instead, it has to be configured via the DI configuration.

The constructor of the TechDivision\Import\Batch\Actions\Processors\GenericBatchProcessor implementation, which provides the batch functionality, expects four arguments.

The fourth argument is the maximum stack size after which the stack will be cleaned up.

If the maximum stack size of the processor that handles the creation of the product datetime attribute values has to be changed, the DI configuration can be overridden, e.g. with

<service
    id="import_product.action.processor.product.datetime.create"
    class="TechDivision\Import\Batch\Actions\Processors\GenericBatchProcessor">
    <argument type="service" id="connection"/>
    <argument type="service" id="import_batch.repository.sql.statement"/>
    <argument type="collection">
        <argument type="constant">
            TechDivision\Import\Batch\Utils\SqlStatementKeys::CREATE_UPDATE_PRODUCT_DATETIME
        </argument>
    </argument>
    <argument type="integer">2000</argument>
</service>
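
With this override in place, the processor executes the multi-value statement and clears the stack after 2000 values instead of the default 1000.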

Configure the dedicated attributes

Besides the url_key attribute, the option to change the dedicated attributes that trigger the stack clean-up allows you to register additional attributes.

If the stack should also be cleaned up when a value for the url_path attribute has been added to it, the DI configuration can be overridden by adding the appropriate attribute to the loader's collection argument, e.g. with

<service
    id="import_batch.loader.product.varchar.processor.attribute.id"
    class="TechDivision\Import\Batch\Loaders\GenericAttributeIdLoader">
    <argument type="service" id="configuration"/>
    <argument type="service" id="import.processor.import"/>
    <argument type="collection">
        <argument type="constant">TechDivision\Import\Product\Utils\MemberNames::URL_KEY</argument>
        <argument type="constant">TechDivision\Import\Product\Utils\MemberNames::URL_PATH</argument>
    </argument>
</service>

The loaded attribute IDs will be passed as the fifth argument to the processor that handles the creation of the product varchar attribute values, which is based on the TechDivision\Import\Batch\Actions\Processors\GenericAttributeBatchProcessor implementation.

The url_key attribute triggers the stack clean-up of the processor that handles the creation of the product varchar attribute values, because URL key management requires the actual URL keys to always be available in the database.
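
Building on the sketch above, the following snippet illustrates how such a dedicated-attribute trigger can work; again, it is only an assumption-based sketch and not the actual GenericAttributeBatchProcessor code. Whenever a value for one of the configured attribute IDs, e.g. url_key or url_path, has been added to the stack, the stack is flushed immediately, so the value is available in the database right away.

<?php

/**
 * Illustrative sketch only, NOT the actual GenericAttributeBatchProcessor code.
 * Besides the maximum stack size, values for the dedicated attribute IDs
 * (e.g. url_key and url_path) trigger an immediate stack clean-up.
 */
class AttributeBatchStackSketch extends BatchStackSketch
{
    /** @param int[] $dedicatedAttributeIds attribute IDs resolved by the attribute ID loader */
    public function __construct(
        \PDO $connection,
        string $table,
        array $columns,
        int $maxStackSize,
        private array $dedicatedAttributeIds = []  // second trigger (the fifth argument)
    ) {
        parent::__construct($connection, $table, $columns, $maxStackSize);
    }

    public function execute(array $row): void
    {
        // first trigger: stack the row, flushing when the maximum stack size is reached
        parent::execute($row);

        // second trigger: flush immediately when a value for a dedicated attribute was stacked
        if (in_array($row['attribute_id'], $this->dedicatedAttributeIds, true)) {
            $this->flush();
        }
    }
}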