Split
The Split process splits multi-page image files into individual files based on a user-defined number of pages, file size, or specific metadata conditions. This node works with PDFs and TIFFs.
Note: This node does not support AES (Advanced Encryption Standard) PDFs.
After the Split process runs, each output filename consists of the original filename followed by a hyphen and a sequence number (-1, -2, -3, and so on). For example, if the original filename was “MyFileOne” and it was split into two images, the resulting filenames will be “MyFileOne-1” and “MyFileOne-2.”
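The naming rule can be illustrated with a short Python sketch. The function name split_output_names is hypothetical and is not part of Dispatcher Phoenix; it only mirrors the hyphen-plus-sequence pattern described above and leaves file extensions aside.

```python
def split_output_names(original_name: str, parts: int) -> list[str]:
    """Return the names expected when a file is split into `parts` pieces.

    Hypothetical helper: it follows the "<original>-<n>" pattern from the
    text, numbering from 1.
    """
    return [f"{original_name}-{number}" for number in range(1, parts + 1)]


print(split_output_names("MyFileOne", 2))  # ['MyFileOne-1', 'MyFileOne-2']
```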
To open the Split Node window, add a process node for Split and double-click on it.
-
Check the Enabled box so that the process will be run. When unchecked, this process will be ignored. Documents will pass through as if the node were not present (i.e., continue along the default or ‘positive’ path). Note that a disabled node will not check for logic or error conditions.
-
In the Node Name field, enter a meaningful name for the Split node.
-
In the Description field, enter a description for the Split node. Although this is not required, it can be helpful to distinguish one process from another. If the description is long, you can hover the mouse over the field to read its entire contents.
-
Select the Save button to keep the Split definition. You can also select the Help button to access online help and select the Cancel button to exit the window without saving any changes.
Split On Fixed Pages
To split the incoming file based on pages, select the Split on fixed pages radio button; then enter the number of pages that each output file should contain. For example, enter 3 to separate one file into multiple 3-page files.
You can use a metadata value in this field by clicking on the Metadata Browser button to open up the Metadata Browser window and drag-dropping the metadata key into the field, as in the following illustration:
To specify a page-level and/or occurrence number for the metadata reference, enter the correct syntax after drag-dropping the metadata into the field.
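As a rough illustration of fixed-page splitting, the following Python sketch computes which input pages would land in each output file. The function name is hypothetical, and the assumption that any leftover pages form a final, shorter file is not confirmed by this documentation.

```python
def fixed_page_ranges(total_pages: int, pages_per_file: int) -> list[tuple[int, int]]:
    """Return (first_page, last_page) pairs, 1-based and inclusive.

    Assumption: pages left over after the last full chunk go into one
    final, shorter file.
    """
    if pages_per_file < 1:
        raise ValueError("pages_per_file must be at least 1")
    ranges = []
    first = 1
    while first <= total_pages:
        last = min(first + pages_per_file - 1, total_pages)
        ranges.append((first, last))
        first = last + 1
    return ranges


# A 10-page file split every 3 pages.
print(fixed_page_ranges(10, 3))  # [(1, 3), (4, 6), (7, 9), (10, 10)]
```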
Split On Size
To split the incoming file based on size, do the following:
-
Select the Split on size radio button.
-
Specify the maximum file size for each document in the entry field provided.
-
You can use a metadata value in this field by clicking on the Metadata Browser button to open up the Metadata Browser window and drag-dropping the metadata key into the field. See the illustration above for an example of the Metadata Browser window. To specify a page-level and/or occurrence number for the metadata reference, enter the correct syntax after drag-dropping the metadata into the field.
-
Choose the appropriate size units (bytes, KB, MB, GB).
For example, if you choose to split every 5 megabytes, all output files will be less than or equal to 5 MB.
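The size rule can be pictured with a simplified Python sketch. It is not the product's algorithm: it assumes that each page has a known size, that an output file's size is the sum of its pages, and that KB, MB, and GB are powers of 1024. All names are hypothetical.

```python
# Assumption: binary units; the product may define KB/MB/GB differently.
UNIT_BYTES = {"bytes": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def split_by_size(page_sizes, max_size, unit="MB"):
    """Group 1-based page numbers so each group stays within the limit.

    Returns (groups, errors). A page that exceeds the limit on its own
    still ends up in a group, but is counted as an error, loosely
    mirroring the recoverable error described for Split on size.
    """
    limit = max_size * UNIT_BYTES[unit]
    groups, current, current_size, errors = [], [], 0, 0
    for page_number, size in enumerate(page_sizes, start=1):
        # Start a new group when adding this page would exceed the limit.
        if current and current_size + size > limit:
            groups.append(current)
            current, current_size = [], 0
        current.append(page_number)
        current_size += size
        if size > limit:
            errors += 1
    if current:
        groups.append(current)
    return groups, errors


# Four 2 MB pages with a 5 MB limit give two files of two pages each.
print(split_by_size([2 * 1024 ** 2] * 4, 5, "MB"))  # ([[1, 2], [3, 4]], 0)
```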
Split On Metadata
To split the incoming file on metadata that meets specific conditions, select the Split on metadata radio button; then do the following:
-
Select which metadata to use as a condition using the Select Metadata Key drop-down list.
Note: The Select Metadata Key window will not display Date, File, or File System dynamic variables since files cannot be split on those variables.
-
Specify a condition and metadata value using the Condition drop-down list and Metadata Value input field. To make the comparison against the Metadata Value case-sensitive, check the Match case box.
-
Optionally, remove the page(s) that triggered the split by checking the Remove separator pages box.
When you are done, you can click on the plus icon to add more conditions, or click on the X icon to delete a condition.
Step #1: Selecting Metadata
You can choose to use any metadata associated with documents in the workflow. When you click on the Select Metadata Key button, the Metadata Browser will appear, listing all of the metadata available, categorized by metadata type. For example, if you have created a zone in an Advanced OCR node, that zone would be listed in the pop-up window, under the OCR heading. On the Metadata Browser, you can do the following:
-
Expand the list by clicking on the + sign next to the metadata that you are interested in. For example, if you are interested in LPR metadata, click on the + sign next to LPR and the following expandable list will appear:
-
Collapse the list by clicking on the - sign next to the appropriate metadata.
-
Choose metadata to use as the split trigger by clicking on the metadata and clicking on the Select button. At this point, the Metadata Browser will close and you will return to the Split window.
-
Search for metadata by entering the appropriate text string in the empty Search field on the right-hand side of the window.
Step #2: Specifying Conditions And Values
Next, you can choose the appropriate option from the Condition drop-down list. Options are: Is, Contains, Regular Expression, and Exists. If you choose Exists, then the Metadata Value field will disappear. Otherwise, enter the appropriate value in the Metadata Value field. In the following illustration, a rule has been set up to split documents every time an invoice containing ‘195’ is encountered:
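The four condition options can be approximated in Python as follows. This is a sketch under assumptions: the function name is hypothetical, Regular Expression is assumed to match anywhere in the value, and Match case is assumed to apply to all comparisons. The product's exact matching rules may differ.

```python
import re
from typing import Optional


def condition_matches(value: Optional[str], condition: str,
                      expected: str = "", match_case: bool = False) -> bool:
    """Evaluate one split condition against a metadata value.

    `value` is None when the metadata key is absent from the page.
    """
    if condition == "Exists":
        return value is not None
    if value is None:
        return False
    if condition == "Regular Expression":
        # Assumption: the pattern may match anywhere in the value.
        flags = 0 if match_case else re.IGNORECASE
        return re.search(expected, value, flags) is not None
    if not match_case:
        value, expected = value.lower(), expected.lower()
    if condition == "Is":
        return value == expected
    if condition == "Contains":
        return expected in value
    raise ValueError(f"Unknown condition: {condition}")


# An invoice number containing '195' triggers a split.
print(condition_matches("INV-1950", "Contains", "195"))  # True
```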
Step #3: Remove Separator Pages
As an option, you can choose to remove the separator page(s) containing barcodes or other metadata that met the specified rules and were used to split the file. For example, this option is useful to remove any barcode separator sheets. Do the following:
-
Check the Remove Separator Pages box.
-
To account for separator sheets that were scanned on both sides (front and back), check the Remove next page after the separator box.
-
When specifying that pages be removed, you may still want to keep the metadata that is associated with them (e.g., if you want the barcode that triggered the split to still be referenced later in the workflow). In this case, check the Keep metadata associated with deleted pages box. Since the page that the metadata referred to will no longer exist in the output document, this metadata can be referenced by using a page number of “0,” which means document-level.
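The effect of these options can be sketched in Python. The model is deliberately simple and rests on assumptions: each page either triggers the split or does not, a new document starts at every triggering page, and a document left with zero pages is counted as an error instead of being output. The function name is hypothetical.

```python
def split_on_separators(is_separator, remove_separator=False, remove_next=False):
    """Return (documents, errors); documents hold 1-based page numbers.

    `is_separator[i]` is True when page i + 1 met the split condition.
    `remove_next` only has an effect together with `remove_separator`.
    """
    documents, errors = [], 0
    current = None   # None until the first page has been seen
    skip = 0         # pages still to drop after a removed separator
    for page, trigger in enumerate(is_separator, start=1):
        if trigger:
            if current is not None:
                # Close the previous document; empty ones count as errors.
                if current:
                    documents.append(current)
                else:
                    errors += 1
            current = []
            if remove_separator:
                skip = 1 if remove_next else 0
                continue
        elif skip:
            skip -= 1
            continue
        if current is None:
            current = []
        current.append(page)
    if current is not None:
        if current:
            documents.append(current)
        else:
            errors += 1
    return documents, errors


# Pages 1 and 4 are barcode separator sheets and are removed.
print(split_on_separators([True, False, False, True, False],
                          remove_separator=True))  # ([[2, 3], [5]], 0)
```

In this sketch, a separator sheet on the last page with Remove Separator Pages checked yields an empty trailing document, which is reported through the error count.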
Split Metadata
Once a Split node is defined in a workflow, its metadata can be referenced by other nodes in the workflow. You can use the Metadata Browser window to choose the Split metadata key. Split metadata includes:
-
{split:errors} – The number of recoverable errors encountered while processing the input document. The following errors are considered “recoverable”:
-
Split on metadata with the Remove Separator Pages option results in a 0 page document (e.g., the workflow splits on a barcode cover sheet but the user scans two sheets as duplex, or the cover sheet is the last page).
-
Split on size cannot fulfill the requirements (e.g., even after splitting into a one-page document, the file is larger than the target file size).
If no errors are encountered during processing, {split:errors} metadata will NOT be added to the output files.
-
{split:from} – The page number of the input file that is used as the first page of the new document.
-
{split:id} – The index of the output document (e.g., the “2” in “2 of 5”).
-
{split:input.ext} – The input file’s extension (file name starting from the last period and going to the end).
-
{split:input.fullname} – The input file’s name and extension.
-
{split:input.name} – The input file’s name (up to the last period and not including the file extension).
-
{split:input.size} – The input file’s size (in bytes).
-
{split:to} – The page number of the input file that is used as the last page of the new document.
-
{split:total} – The number of files that were generated by the split of the input file (e.g., the “5” in “2 of 5”). When Dispatcher Phoenix is set to collect all files at once (“as a group”), the total is still counted per input file, not across the whole group. For example, when processing a group of five 10-page files set to split every 1 page, Split will output 50 files; {split:total} will be 10 for all 50 files.
-
{split:unmodified} – This metadata key exists only if the input file remains unchanged (no split was performed).
Important! Fatal errors that cause the Split node to output the input file on Error will NOT add any split metadata to the input file (e.g., if the file format is not supported).
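To show how the keys relate to one another, the following Python sketch builds the metadata that each output file might carry for a single input file. It is an illustration only: the function name and the dictionary representation are hypothetical, the values are shown as Python types, and the extension is assumed to include the leading period. The {split:errors} and {split:unmodified} keys are omitted.

```python
def split_metadata(input_fullname, input_size, page_ranges):
    """Build one metadata dictionary per output file.

    `page_ranges` holds (first_page, last_page) pairs of the input file.
    """
    name, dot, ext = input_fullname.rpartition(".")
    if dot:
        ext = dot + ext          # assumption: extension keeps the period
    else:
        name, ext = input_fullname, ""   # no period in the file name
    total = len(page_ranges)
    records = []
    for index, (first, last) in enumerate(page_ranges, start=1):
        records.append({
            "{split:id}": index,
            "{split:total}": total,
            "{split:from}": first,
            "{split:to}": last,
            "{split:input.fullname}": input_fullname,
            "{split:input.name}": name,
            "{split:input.ext}": ext,
            "{split:input.size}": input_size,
        })
    return records


# A 7-page file split every 3 pages produces "1 of 3", "2 of 3", "3 of 3".
for record in split_metadata("scan.pdf", 2048, [(1, 3), (4, 6), (7, 7)]):
    print(record["{split:id}"], "of", record["{split:total}"])
```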