Overview

TIBCO ActiveMatrix BusinessWorks™ Plug-in for Large XML can be used to process large XML files.

Processing large XML documents is a critical use case for most integration projects. Memory and CPU usage must be optimized to achieve the required throughput and performance. The current ActiveMatrix BusinessWorks XML palette provides basic XML processing in which the XML is always loaded into memory. This poses significant challenges and limitations when processing large XML documents.

ActiveMatrix BusinessWorks Plug-in for Large XML addresses large XML processing for most use cases. The key to large XML processing is streaming, where the large XML is never loaded into memory in its entirety. The plug-in provides the following high-level capabilities:

The user can split a stream of large XML into fragments based on configurable criteria and then process the fragments without bloating memory. Smart recovery options are also available.
The user can validate a stream of large XML against a configured schema without loading the large XML into memory.
The user can transform a stream of large XML by configuring an XSLT.

The user can read streams from and write streams to the file system. When required, the user can explicitly close the stream generated by the FileToStream activity.

XML processing with the plug-in proceeds as follows:

1. Input Large XML: You can source large XML from an existing stream or create a stream directly from a file.

2. XML Validation (optional): You can validate the large XML stream against a schema. Memory and CPU usage stay optimized because the large XML is not loaded into memory.

3. XML Transformation (optional): You can transform the large XML based on a configured XSLT. Memory and CPU utilization is optimized compared with the out-of-the-box XML processing in TIBCO ActiveMatrix BusinessWorks.

4. XML Processing: You can configure the criteria for splitting the XML by providing the split element configuration together with either the number of records or the fragment size. With the XMLSplitter activity, you can specify a destination location to which all the XML fragments are written as files. Alternatively, you can use the GetFragment activity in a group to process every XML fragment.

During fragment processing, you get additional metadata about each fragment being created, which helps correlate the fragment with the large XML.
The previous fragment is automatically cleared from memory before the GetFragment activity, running in a loop, outputs the next fragment. This guarantees that only the fragment being processed is in memory at any given time.
The plug-in also provides a smart recovery mechanism: you can add the Checkpoint activity to the group while processing a fragment of the large XML.
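
The fragment loop described above can be illustrated outside the product with a small Python sketch. This is not the plug-in's implementation or API: the names Fragment, get_fragments, and resume_after are hypothetical, the split criterion is limited to a record count, and the resume_after parameter only approximates what a checkpoint-based recovery might look like. The sketch assumes split elements are not nested inside one another.

```python
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import BinaryIO, Iterator, List


@dataclass
class Fragment:
    """Hypothetical fragment record; the metadata fields are illustrative."""
    index: int          # 1-based position of the fragment in the large XML
    first_record: int   # 1-based number of the first record in the fragment
    record_count: int   # number of split elements in the fragment
    xml: str            # the records, wrapped in a <fragment> root element


def _build(index: int, seen: int, records: List[str]) -> Fragment:
    first = seen - len(records) + 1
    return Fragment(index, first, len(records),
                    "<fragment>" + "".join(records) + "</fragment>")


def get_fragments(stream: BinaryIO, split_element: str,
                  records_per_fragment: int,
                  resume_after: int = 0) -> Iterator[Fragment]:
    """Yield fragments of at most records_per_fragment split elements.

    resume_after is an assumed recovery knob: fragments whose index is
    less than or equal to it are parsed but not yielded again.
    """
    records: List[str] = []
    index = 0
    seen = 0
    root = None
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if root is None:
            root = elem  # the first "start" event is the document root
        if event != "end" or elem.tag != split_element:
            continue
        seen += 1
        elem.tail = None  # drop whitespace that follows the record
        records.append(ET.tostring(elem, encoding="unicode"))
        elem.clear()      # release the record's content
        root.clear()      # release references held by the root
        if len(records) == records_per_fragment:
            index += 1
            if index > resume_after:
                yield _build(index, seen, records)
            records = []  # the previous fragment is no longer referenced
    if records:           # trailing partial fragment
        index += 1
        if index > resume_after:
            yield _build(index, seen, records)


if __name__ == "__main__":
    sample = io.BytesIO(b"<orders><order/><order/><order/></orders>")
    for fragment in get_fragments(sample, "order", 2):
        print(fragment.index, fragment.first_record, fragment.record_count)
```

Because get_fragments is a generator, only the records of the fragment currently being assembled are held in memory, and a caller that persists the index of the last fully processed fragment can pass it back as resume_after after a failure.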

5. Output: You can output the XML as a string for further processing or write the string directly to a file.

6. Splitter Output: You can output the total number of fragments created and the location where the split files are stored.
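
As a rough illustration of a splitter that writes fragments to disk and reports the fragment count and storage location, the following Python sketch may help. It is not the XMLSplitter activity itself: the names SplitResult and split_to_files, the fragment_NNNNN.xml file naming, and the <fragment> wrapper element are all assumptions made for the example, and only the record-count criterion is shown.

```python
import io
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import BinaryIO, NamedTuple


class SplitResult(NamedTuple):
    """Hypothetical summary of a split run."""
    fragment_count: int  # total number of fragment files written
    output_dir: Path     # location where the split files are stored


def split_to_files(stream: BinaryIO, split_element: str,
                   records_per_fragment: int, output_dir: str) -> SplitResult:
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0      # fragments started so far
    pending = 0    # records written to the currently open fragment
    handle = None
    root = None
    try:
        for event, elem in ET.iterparse(stream, events=("start", "end")):
            if root is None:
                root = elem  # the first "start" event is the document root
            if event != "end" or elem.tag != split_element:
                continue
            if handle is None:  # open the next fragment file lazily
                count += 1
                path = out / f"fragment_{count:05d}.xml"
                handle = open(path, "w", encoding="utf-8")
                handle.write("<fragment>")
            elem.tail = None
            handle.write(ET.tostring(elem, encoding="unicode"))
            elem.clear()  # release the record once it is on disk
            root.clear()
            pending += 1
            if pending == records_per_fragment:
                handle.write("</fragment>")
                handle.close()
                handle, pending = None, 0
        if handle is not None:  # close the trailing partial fragment
            handle.write("</fragment>")
    finally:
        if handle is not None:
            handle.close()
    return SplitResult(count, out)


if __name__ == "__main__":
    sample = io.BytesIO(b"<orders><order/><order/><order/></orders>")
    with tempfile.TemporaryDirectory() as tmp:
        result = split_to_files(sample, "order", 2, tmp)
        print(result.fragment_count, result.output_dir)
```

Each record is written to its fragment file as soon as it is parsed, so memory use stays bounded by the size of a single record rather than by the size of the input.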
