Batch Processing
The Batch Processing feature, also known as Bulk Processing, directs the agent to group and cache a set of source records into a batch before sending that data to the target Connector. When the target Connector receives the batch, it uses its API's bulk, batch, or asynchronous methods to complete the appropriate operation for that set of records. Not all Connector APIs support Batch Processing. See the diagram below for an overview of how the agent executes a flow with Batch Processing enabled.
For Connectors that support Batch Processing, such as Salesforce, performance is improved significantly because their APIs accept batches of data and can process the data set much faster. However, if you use target Lookup functions or Lookup blocks in your flows, performance improvements derived from using Batch Processing are reduced due to the time it takes to look up data in the target for each record. See Lookups And Batch Processing, Lookup Block, and Lookup Functions.
If you configure a batch size limit that is larger than your target application permits, most Connectors reduce the batch during processing to the largest size the API accepts. For example, if you have 2,000 source records and the Upsert block is set to a batch size of 2,000 for the Salesforce SOAP API, TIBCO Cloud™ Integration - Connect sends 10 batches of 200 records.
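The batch size correction described above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: the function `plan_batches` and its `api_max` parameter are hypothetical names, and the limit of 200 records per call is taken from the Salesforce SOAP API example above.

```python
def plan_batches(record_count, configured_size, api_max):
    """Return the sizes of the batches that would be sent.

    Hypothetical helper: reduces the configured batch size to the
    largest size the target API accepts, then splits the records.
    """
    if record_count < 0 or configured_size < 1 or api_max < 1:
        raise ValueError("record_count must be >= 0 and sizes must be >= 1")
    effective = min(configured_size, api_max)  # correct an oversized setting
    full, remainder = divmod(record_count, effective)
    sizes = [effective] * full
    if remainder:
        sizes.append(remainder)  # final, partially filled batch
    return sizes


# 2,000 records, batch size set to 2,000, API limit assumed to be 200
batches = plan_batches(record_count=2000, configured_size=2000, api_max=200)
print(len(batches), "batches of", batches[0], "records")
```

With these inputs the sketch plans 10 batches of 200 records, matching the example above.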
Benefits Of Batch Processing
- Improve performance when there are a large number of source records to process.
- Reduce the number of round-trips between TIBCO Cloud™ Integration - Connect and the target datastore.
- Reduce the number of API calls to the target application. Some applications, such as Salesforce, limit the number of API calls per customer.
Batch Processing Issues
- When Batch Processing is enabled, records with errors are processed by subsequent blocks.
- Batch processing does not reduce the number of records processed.
- Blocks with Batch Processing enabled cannot be used as a resource for data fields and Result Fields in subsequent blocks, because the data may not have been stored when those blocks executed.
- Blocks that perform Batch Processing cannot be used in control flow.
- The Lookup block does not support Batch Processing. Batch processing provides the greatest benefit when all blocks in the flow are using it. If your flow includes a block that doesn't, such as a Lookup, the benefit is reduced. This is also true if any of your field mappings use a target lookup function. See Lookup Block.
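To see why lookups reduce the benefit, a rough count of round trips to the target can help. The sketch below uses a deliberately simple, hypothetical cost model: one call per batch and one call per lookup per record. Actual Connector behavior may differ, and `estimate_round_trips` is an illustrative name, not a product function.

```python
import math


def estimate_round_trips(records, batch_size, lookups_per_record=0):
    """Estimate calls to the target under an assumed cost model.

    Assumption: each batch costs one call and each target lookup
    costs one call per record.
    """
    batch_calls = math.ceil(records / batch_size)
    lookup_calls = records * lookups_per_record
    return batch_calls + lookup_calls


print(estimate_round_trips(2000, batch_size=200))                        # batched, no lookups
print(estimate_round_trips(2000, batch_size=200, lookups_per_record=1))  # batched, one lookup per record
print(estimate_round_trips(2000, batch_size=1))                          # no batching, no lookups
```

Under these assumptions, a single per-record lookup adds as many calls as there are records, so the lookups dominate the total regardless of batch size.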
When Batch Processing is enabled for a target block, that block executes independently from the rest of the flow and does not prevent the rest of the flow from executing before the batch is complete. Even though the agent is caching records for the batch-enabled block until the batch is complete, those same records are being processed by the other blocks in the flow. Depending on the design of your flow, blocks with Batch Processing enabled may continue to execute after all records have been processed by the rest of the flow.
Example:
Assume you have a very simple flow with three blocks that do the following:
- Block 1 - Query SQL DB
- Block 2 - Update Salesforce with Batch Processing set at 50
- Block 3 - Update HubSpot without Batch Processing
In this example the records would be processed as follows:
- Block 1 returns 150 records.
- The first record goes to Block 2 and is cached.
- The first record goes to Block 3 and is written to HubSpot.
- The second record goes to Block 2 and is cached with record 1.
- The second record goes to Block 3 and is written to HubSpot.
  Note: This sequence continues until Block 2 receives the 50th record. At that point the first 50 records are sent to Salesforce. The first 50 records have already been processed by Block 3 by the time Block 2 sends the batch. Block 3 does not wait for the cached records to be sent.
- The 51st record goes to Block 2 and is cached. All subsequent records are cached until the 100th record is received. Then records 51-100 are sent to Salesforce.
- The 51st record goes to Block 3 and is written to HubSpot. Each subsequent record is written to HubSpot as it is received.
If Block 3 attempted to do an update to Salesforce instead of to HubSpot, the updates for records 1-50 done by Block 2 would not have been written to the target by the time Block 3 processed updates for records 1-50 to the same target. Depending on the fields being changed for each record, Block 3 could have many record errors because the Block 2 updates were not done yet.
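The ordering in this example can be reproduced with a small simulation. This is a simplified, single-threaded sketch rather than the agent's actual execution model: `run_flow` and the event names are hypothetical, the batch is logged as sent at the moment Block 2 caches the record that fills it, and the flush of a partially filled final batch is an assumption.

```python
def run_flow(record_ids, batch_size):
    """Simulate the three-block flow and return an ordered event log."""
    events = []  # (action, payload) tuples in the order they occur
    cache = []   # records held by Block 2 until the batch is full

    for rec in record_ids:                     # Block 1: records arrive one at a time
        cache.append(rec)                      # Block 2: cache the record
        if len(cache) == batch_size:
            events.append(("salesforce_batch", tuple(cache)))
            cache.clear()
        events.append(("hubspot_write", rec))  # Block 3: written immediately

    if cache:  # assumption: any leftover records are sent as a final batch
        events.append(("salesforce_batch", tuple(cache)))
    return events


events = run_flow(range(1, 151), batch_size=50)
first_batch = next(i for i, (action, _) in enumerate(events)
                   if action == "salesforce_batch")
writes_before = sum(1 for action, _ in events[:first_batch]
                    if action == "hubspot_write")
print("HubSpot writes before the first Salesforce batch:", writes_before)
```

In this model, nearly all of the records in a batch reach HubSpot before that batch is sent to Salesforce, which is why a downstream block that depends on the batched updates can hit record errors.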
Enabling Batch Processing
Batch Processing can be enabled only if the target Connector supports it. This option can be enabled on the General tab of the Properties dialog for the following TIBCO Cloud™ Integration - Connect operation blocks:
For Data replication apps, if the target Connection configured in the app supports Bulk operations, then Batch Processing is used automatically.