Data Change Poller Activity

This activity polls for new and updated records in a given source table and copies it over to a reference table for future reference. After each polling interval, the changed records are returned as output.

Before you begin

  1. The table which is polled should have an identity column that is auto-incremented sequentially with each data insert.

  2. The table should also have a sort column or a column with a timestamp type that is updated with the time of each insert and update activity.

If the table is not defined with the above details, it should be altered with an identity column and a sort column. The template script provided with the samples can be used for this purpose.

Note: The reference table created for tracking and saving changed data has an additional process_status column. When changed data is polled and populated to this reference table, the process_status column for each row displays N. After the changed data is queried and returned, the status is updated to C. This status update is supported for both types of authentication.

General

The General panel contains the following fields:

Property Module Property?  Description
Name No The name that is displayed as the label for the shared resource in the process.
Redshift Connection Resource No Click to select a Redshift Connection shared resource. If no matching Redshift Connection shared resource is found, click Create Shared Resource to create one. For more information, see Creating and Configuring the Redshift Connection Resource
Entity No Click to select an entity.
The entities downloaded on the Schema tab of the Redshift Connection shared resource are available for selection.
Reference Table No This is the name of the table where new or updated records are stored at the polling interval.
Identity Column Yes The column is defined as the identity column to the table, which is sequentially incrementing.
Sort Column Yes The column records the time of new inserts and updates in the table. Normally it is the moddate column.

Description

Provide a short description for the activity.

Advanced

The Advanced panel contains the following fields:

Property Module Property?  Description
Sequence Key No It contains an XPath expression that specifies the processes that must run in order. Process instances with sequencing keys that evaluate to the same value are executed sequentially in the order the process instance was created.
Poll for new Records No When selected, this check box polls the table defined in the entity for only new records.
Poll for New and Updated Records No When selected, the poller polls for both new inserts and updates.
Note: Polling for deleted records is not supported.
Enable Load Balancing No When selected, the application runs in load balancing mode or multi-node configuration to share polling tasks.
Polling Interval (ms) Yes The interval at which the source table is polled for changed records

Tags

Output

The output for the activity contains the following items:

Output Item Data Type Description
status? string Provides a success or failure
error_message? string Provides an error message or exception during the data polling process
changed_record_count? integer The number of records that are changed
polled_records integer The record data polled is returned as part of the response payload.

Fault

The Fault tab lists exceptions that are generated by this activity.

Error Schema Element Data Type Description
msg string The error message returned by the plug-in.
msgCode string The error code returned by the plug-in.

Load balancing

When enabled, the load-balancing mode distributes the work involved in polling for changed data and the delivery of the changed records among multiple app nodes. This mode reduces the latency between the polling intervals and provides faster delivery of data with implicit batching. By default, the poller operates in non-load-balancing mode.

The Enable Load Balancing check box in the property section must be selected to enable load balancing.

An implicit mutually exclusive locking mechanism is used to orchestrate sharing of work between the processes running in load-balancing mode. The forced stopping and restart might interfere with the locking mechanism. Any conflicts, deadlocks, or the hung session might require the following steps and might be taken to rectify the issue:

• The hung processes might be terminated or canceled by using PG_CANCEL_BACKEND or PG_TERMINATE_BACKEND functions.

• The hung transaction or conflicting data base operations/queries can be detected by querying SVV_LOCKS, STL_TR_CONFLICTS, and SVV_TRANSACTIONS tables and views.

These also provide the process IDs of troubled sessions to be passed to the functions mentioned earlier.

For more information about Redshift additional guidance and techniques, see the Troubleshooting guide in AWS documentation. Also, a database administrator must be involved in troubleshooting and rectifying the issues on the database side.