Caching and cache warming

PE adds extended caching functionality as well as the possibility to warm the caches. In general, caching in an import scenario may not improve speed significantly, but it provides an additional option to improve speed, or at least to save money in a cloud environment, by reducing database queries to a minimum, since these are indeed slower there than on a dedicated server.

The cache is separated into two types. The first is a static cache, which can NOT be deactivated because it is used to cache the global data as well as the data of the artefacts that are created during the import process, for example to create the variants after the simples have been created. It thereby prevents unnecessary database queries and reduces the amount of transferred data. The second is the configurable cache, which is described in the following sections.

Console

The configurable cache is DISABLED by default to avoid uncontrolled memory usage. It can be enabled on the console by adding the parameter --cache-enabled=true to one of the import commands, for example

vendor/bin/import-pro import:products --cache-enabled=true

Enable/Disable cache

In some cases it makes sense to enable the cache functionality, especially in scenarios where Pacemaker has to work in a distributed environment, e.g. on AWS. To enable or disable the configurable cache, or to change the TTL, without adding the console parameter on every invocation, add a snippet <magento-install-dir>/app/etc/configuration/cache.json

{
  "caches": [
    {
      "type": "cache.static"
    },
    {
      "type": "cache.configurable",
      "enabled" : true,
      "time": 1440
    }
  ]
}

and set "enabled" to true or false, depending on the requirements. The parameter time adjusts the TTL of the cache entries and should only be changed when there are good reasons for it.
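To illustrate what the "enabled" flag and the TTL mean for individual cache entries, here is a minimal Python sketch. It is not Pacemaker's implementation: the class TtlCache and its methods are hypothetical, and because the unit of "time" is not stated here, the sketch counts in abstract ticks passed in by the caller.

```python
class TtlCache:
    """Tiny in-memory cache whose entries expire after `ttl` ticks (hypothetical)."""

    def __init__(self, ttl, enabled=True):
        self.ttl = ttl
        self.enabled = enabled
        self._entries = {}  # key -> (value, expires_at)

    def set(self, key, value, now):
        # A disabled cache silently stores nothing.
        if self.enabled:
            self._entries[key] = (value, now + self.ttl)

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            # The TTL has been reached, so the entry is dropped.
            del self._entries[key]
            return None
        return value


cache = TtlCache(ttl=1440)
cache.set("sku-1", {"entity_id": 1}, now=0)
fresh = cache.get("sku-1", now=1439)    # still within the TTL
expired = cache.get("sku-1", now=1440)  # TTL reached, lookup misses

disabled = TtlCache(ttl=1440, enabled=False)
disabled.set("sku-1", {"entity_id": 1}, now=0)
never_stored = disabled.get("sku-1", now=1)  # nothing was cached
```

A shorter TTL frees memory earlier but causes more lookups to fall through to the database, which is one reason to leave the default untouched unless there is a concrete need.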

It doesn’t make sense to disable the cache when the cache warming functionality has been enabled. Loading the data for cache warming can take some time, depending on the size of the database, but with the cache disabled the data itself will not be cached and cannot be used to reduce database access in the end.

Cache warming

The cache warming functionality is also part of the PE. When cache warming is activated, Pacemaker uses optimized queries to load as much data as possible into the cache, which lowers database access during the import process. Cache warming therefore has the greatest impact on performance, because all products as well as their attributes and many relations are pre-loaded and no database queries are necessary anymore. Keep in mind that this advantage comes with a massive memory footprint, as the data is kept in memory until the import process has finished. The default cache-warmer configuration looks like

{
  "operations": {
    "general": {
      "catalog_product": {
        "cache-warmer": {
          "plugins": {
            "cache-warmer": {
              "id": "import_caching.plugin.cache.warmer",
              "params": {
                "cache-warmers": [
                  "import_caching.repository.cache.warmer.eav.attribute.option.value",
                  "import_caching.repository.cache.warmer.product",
                  "import_caching.repository.cache.warmer.product.varchar",
                  "import_caching.repository.cache.warmer.product.int",
                  "import_caching.repository.cache.warmer.product.text",
                  "import_caching.repository.cache.warmer.product.decimal",
                  "import_caching.repository.cache.warmer.product.datetime"
                ]
              }
            }
          }
        }
      }
    }
  }
}

So currently, Pacemaker provides cache warmers for the product entity as well as for the EAV attribute option values.
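The idea behind a list of cache warmers can be sketched in a few lines of Python. This is only an illustration under assumptions: FakeDatabase, warm_cache and the table names are hypothetical stand-ins and do not mirror Pacemaker's plugin or its queries. Each configured warmer issues one bulk query and stores the rows in the cache, so the later import phase does not touch the database for that data.

```python
class FakeDatabase:
    """Stand-in for the real database that counts executed queries."""

    def __init__(self, tables):
        self.tables = tables
        self.query_count = 0

    def fetch_all(self, table):
        self.query_count += 1
        return list(self.tables[table])


def warm_cache(db, cache, warmers):
    """Run every configured warmer: one bulk query per table."""
    for table in warmers:
        for row in db.fetch_all(table):
            cache[(table, row["key"])] = row


db = FakeDatabase({
    "product": [{"key": "sku-1"}, {"key": "sku-2"}],
    "product_varchar": [{"key": "sku-1/name"}, {"key": "sku-2/name"}],
})
cache = {}
warm_cache(db, cache, warmers=["product", "product_varchar"])
queries_after_warming = db.query_count

# Import phase: every lookup is answered from the warmed cache.
hits = [cache[("product", "sku-1")], cache[("product_varchar", "sku-2/name")]]
queries_after_import = db.query_count
```

In this sketch the query count stays at one per warmer, while the cache holds every loaded row for the whole run, which is the memory trade-off described above.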

To enable the cache warming functionality, assuming you are using Magento Community, simply add the operation general/catalog_product/cache-warmer to the shortcuts in a snippet <magento-install-dir>/app/etc/configuration/shortcuts.json, which could then look like

{
  "shortcuts": {
    "ce": {
      "catalog_product": {
        "add-update": [
          "general/general/global-data",
          "general/general/move-files",
          "general/catalog_product/cache-warmer",
          "general/catalog_product/collect-data",
          "general/eav_attribute/convert",
          "general/eav_attribute/add-update.options",
          "general/eav_attribute/add-update.option-values",
          "general/eav_attribute/add-update.swatch-values",
          "general/catalog_category/convert",
          "ce/catalog_category/sort",
          "ce/catalog_category/add-update",
          "ce/catalog_category/add-update.path",
          "ce/catalog_category/add-update.url-rewrite",
          "ce/catalog_category/children-count",
          "ce/catalog_product/validate",
          "ce/catalog_product/add-update",
          "ce/catalog_product/add-update.variants",
          "ce/catalog_product/add-update.bundles",
          "ce/catalog_product/add-update.links",
          "ce/catalog_product/add-update.grouped",
          "ce/catalog_product/add-update.media",
          "general/catalog_product/add-update.msi",
          "general/catalog_product/add-update.url-rewrites"
        ]
      }
    }
  }
}

In general, the cache warming functionality is only useful for the add-update operation, as the delete and replace operations, by the nature of their implementation, do not make use of pre-cached data.

Finders

Last but not least, the library techdivision/caching, which is also part of Pacemaker PE, comes with replacements for the general finder implementations that use the configurable cache type. The default finders are overwritten via DI; the replacements use the configurable cache to cache the results of database queries and make sure the cache is invalidated when existing data has been replaced.

{
  "finder-mappings": {
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_DATETIMES":
        "import.repository.finder.factory.yielded",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_DECIMALS":
        "import.repository.finder.factory.yielded",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_INTS":
        "import.repository.finder.factory.yielded",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_TEXTS":
        "import.repository.finder.factory.yielded",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_VARCHARS":
        "import.repository.finder.factory.yielded",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCTS":
        "import.repository.finder.factory.yielded",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT":
        "import_caching.repository.finder.factory.unique.entity.cached",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_DATETIMES_BY_PK_AND_STORE_ID":
        "import_caching.repository.finder.factory.yielded.cached",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_DECIMALS_BY_PK_AND_STORE_ID":
        "import_caching.repository.finder.factory.yielded.cached",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_INTS_BY_PK_AND_STORE_ID":
        "import_caching.repository.finder.factory.yielded.cached",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_TEXTS_BY_PK_AND_STORE_ID":
        "import_caching.repository.finder.factory.yielded.cached",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_VARCHARS_BY_PK_AND_STORE_ID":
        "import_caching.repository.finder.factory.yielded.cached",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_VARCHAR_BY_ATTRIBUTE_CODE_AND_ENTITY_TYPE_ID_AND_STORE_ID":
        "import_caching.repository.finder.factory.yielded.cached",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_VARCHAR_BY_ATTRIBUTE_CODE_AND_ENTITY_TYPE_ID_AND_STORE_ID_AND_VALUE" :
    "import_caching.repository.finder.factory.unique.cached"
  }
}

Each of the cached finders replaces a version that doesn’t use the cache. The finder mappings can therefore be used for a fine-grained cache configuration and allow you to adjust the memory consumption of Pacemaker during the import process.
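What a cached finder adds on top of a non-cached one can be sketched as a thin wrapper. The Python classes below are hypothetical and only illustrate the pattern of caching query results and invalidating them when data is replaced; they are not the techdivision/caching implementation.

```python
class Finder:
    """Non-cached finder: every call runs the (simulated) query."""

    def __init__(self, rows):
        self.rows = rows
        self.query_count = 0

    def find(self, key):
        self.query_count += 1
        return self.rows.get(key)


class CachedFinder:
    """Wraps a finder and remembers results until they are invalidated."""

    def __init__(self, finder):
        self.finder = finder
        self.cache = {}

    def find(self, key):
        if key not in self.cache:
            self.cache[key] = self.finder.find(key)
        return self.cache[key]

    def invalidate(self, key):
        # Called when existing data has been replaced.
        self.cache.pop(key, None)


rows = {"sku-1": {"name": "old"}}
finder = CachedFinder(Finder(rows))
first = finder.find("sku-1")
second = finder.find("sku-1")  # answered from the cache
queries_before_update = finder.finder.query_count

rows["sku-1"] = {"name": "new"}  # the row is replaced
finder.invalidate("sku-1")
third = finder.find("sku-1")     # queried again after invalidation
```

Mapping a statement key to the non-cached variant corresponds to using Finder directly: more queries, but nothing is held in memory for that statement.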

Fine grained cache configuration

As described above, cache warming may have a significant impact on memory consumption. In combination with the configuration of the finders to be used, it is possible to adjust the memory consumption to your needs. To do so, add a snippet <magento-install-dir>/app/etc/configuration/operations.json that overwrites the default cache warmers like

{
  "operations": {
    "general": {
      "catalog_product": {
        "cache-warmer": {
          "plugins": {
            "cache-warmer": {
              "id": "import_caching.plugin.cache.warmer",
              "params": {
                "cache-warmers": [
                  "import_caching.repository.cache.warmer.eav.attribute.option.value",
                  "import_caching.repository.cache.warmer.product.varchar",
                  "import_caching.repository.cache.warmer.product.int",
                  "import_caching.repository.cache.warmer.product.text",
                  "import_caching.repository.cache.warmer.product.decimal",
                  "import_caching.repository.cache.warmer.product.datetime"
                ]
              }
            }
          }
        }
      }
    }
  }
}

and removes the cache warmer "import_caching.repository.cache.warmer.product" for the products. Additionally, add a snippet like <magento-install-dir>/app/etc/configuration/finder-mappings.json that overwrites the default finder configuration and replaces the cached finder implementation import_caching.repository.finder.factory.unique.entity.cached for the SQL statement TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT with the non-cached version import.repository.finder.factory.unique. This should result in a snippet that looks like

{
  "finder-mappings": {
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_DATETIMES":
      "import.repository.finder.factory.yielded",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_DECIMALS":
      "import.repository.finder.factory.yielded",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_INTS":
      "import.repository.finder.factory.yielded",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_TEXTS":
      "import.repository.finder.factory.yielded",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_VARCHARS":
      "import.repository.finder.factory.yielded",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCTS":
      "import.repository.finder.factory.yielded",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT":
      "import.repository.finder.factory.unique",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_DATETIMES_BY_PK_AND_STORE_ID":
      "import_caching.repository.finder.factory.yielded.cached",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_DECIMALS_BY_PK_AND_STORE_ID":
      "import_caching.repository.finder.factory.yielded.cached",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_INTS_BY_PK_AND_STORE_ID":
      "import_caching.repository.finder.factory.yielded.cached",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_TEXTS_BY_PK_AND_STORE_ID":
      "import_caching.repository.finder.factory.yielded.cached",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_VARCHARS_BY_PK_AND_STORE_ID":
      "import_caching.repository.finder.factory.yielded.cached",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_VARCHAR_BY_ATTRIBUTE_CODE_AND_ENTITY_TYPE_ID_AND_STORE_ID":
      "import_caching.repository.finder.factory.yielded.cached",
    "TechDivision\\Import\\Product\\Utils\\SqlStatementKeys::PRODUCT_VARCHAR_BY_ATTRIBUTE_CODE_AND_ENTITY_TYPE_ID_AND_STORE_ID_AND_VALUE" :
      "import_caching.repository.finder.factory.unique.cached"
  }
}

and prevents Pacemaker from caching product entities.

If only the cache warmer is removed, the products will be loaded into the cache anyway. However, they will be loaded one after another instead of all at once, and it is the bulk loading that gives the significant performance boost. This will finally result in the same level of memory consumption but lower performance.
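The difference between the two cases can be made concrete with a small Python sketch. It rests on assumptions: CountingDatabase and both import functions are hypothetical, and the import is assumed to touch every product exactly once. Both variants end up with the same number of cached products, but the variant without a warmer needs one query per product.

```python
class CountingDatabase:
    """Stand-in for the product table that counts executed queries."""

    def __init__(self, products):
        self.products = products
        self.query_count = 0

    def fetch_all(self):
        self.query_count += 1
        return dict(self.products)

    def fetch_one(self, sku):
        self.query_count += 1
        return self.products[sku]


def import_with_warmer(db, skus):
    cache = db.fetch_all()  # one bulk query up front
    for sku in skus:
        cache[sku]          # served from the cache
    return len(cache), db.query_count


def import_without_warmer(db, skus):
    cache = {}
    for sku in skus:
        if sku not in cache:
            cache[sku] = db.fetch_one(sku)  # one query per product
    return len(cache), db.query_count


products = {f"sku-{i}": {"entity_id": i} for i in range(1000)}
warm = import_with_warmer(CountingDatabase(products), list(products))
lazy = import_without_warmer(CountingDatabase(products), list(products))
```

Here warm and lazy report the same cache size, while the query counts differ by a factor equal to the number of products. This is why removing the warmer alone saves no memory and the finder mapping has to be changed as well.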