Description:

Merges a group of FlowFiles together based on a user-defined strategy and packages them into a single FlowFile. It is recommended that the Processor be configured with only a single incoming connection, because groups of FlowFiles will not be created from FlowFiles in different connections. This processor updates the mime.type attribute as appropriate.

Tags:

merge, content, correlation, tar, zip, stream, concatenation, archive, flowfile-stream, flowfile-stream-v3

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, whether a property supports the NiFi Expression Language, and whether a property is considered "sensitive", meaning that its value will be encrypted. Before entering a value in a sensitive property, ensure that the nifi.properties file has an entry for the property nifi.sensitive.props.key.

Name | Default Value | Valid Values | Description
Merge Strategy | Bin-Packing Algorithm
  • Bin-Packing Algorithm Generates 'bins' of FlowFiles and fills each bin as full as possible. FlowFiles are placed into a bin based on their size and optionally their attributes (if the <Correlation Attribute Name> property is set)
  • Defragment Combines fragments that are associated by attributes back into a single cohesive FlowFile. If using this strategy, all FlowFiles must have the attributes <fragment.identifier>, <fragment.count>, and <fragment.index> or alternatively (for backward compatibility purposes) <segment.identifier>, <segment.count>, and <segment.index>
Specifies the algorithm used to merge content. The 'Defragment' algorithm combines fragments that are associated by attributes back into a single cohesive FlowFile. The 'Bin-Packing Algorithm' generates a FlowFile populated by arbitrarily chosen FlowFiles
Merge Format | Binary Concatenation
  • TAR A bin of FlowFiles will be combined into a single TAR file. The FlowFiles' <path> attribute will be used to create a directory in the TAR file if the <Keep Path> property is set to true; otherwise, all FlowFiles will be added at the root of the TAR file. If a FlowFile has an attribute named <tar.permissions> that is 3 characters long, each between 0 and 7, that attribute will be used as the TAR entry's 'mode'.
  • ZIP A bin of FlowFiles will be combined into a single ZIP file. The FlowFiles' <path> attribute will be used to create a directory in the ZIP file if the <Keep Path> property is set to true; otherwise, all FlowFiles will be added at the root of the ZIP file. The <Compression Level> property indicates the ZIP compression to use.
  • FlowFile Stream, v3 A bin of FlowFiles will be combined into a single Version 3 FlowFile Stream
  • FlowFile Stream, v2 A bin of FlowFiles will be combined into a single Version 2 FlowFile Stream
  • FlowFile Tar, v1 A bin of FlowFiles will be combined into a single Version 1 FlowFile Package
  • Binary Concatenation The contents of all FlowFiles will be concatenated together into a single FlowFile
Determines the format that will be used to merge the content.
Attribute Strategy | Keep Only Common Attributes
  • Keep Only Common Attributes
  • Keep All Unique Attributes
Determines which FlowFile attributes should be added to the bundle. If 'Keep All Unique Attributes' is selected, any attribute on any FlowFile that gets bundled will be kept unless its value conflicts with the value from another FlowFile. If 'Keep Only Common Attributes' is selected, only the attributes that exist on all FlowFiles in the bundle, with the same value, will be preserved.
Correlation Attribute Name | | | If specified, like FlowFiles will be binned together, where 'like FlowFiles' means FlowFiles that have the same value for this Attribute. If not specified, FlowFiles are bundled by the order in which they are pulled from the queue.
Minimum Number of Entries | 1 | | The minimum number of files to include in a bundle.
Maximum Number of Entries | | | The maximum number of files to include in a bundle. If not specified, there is no maximum.
Minimum Group Size | 0 B | | The minimum size for the bundle.
Maximum Group Size | | | The maximum size for the bundle. If not specified, there is no maximum.
Max Bin Age | | | The maximum age of a bin, after which the bin is considered complete. Expected format is <duration> <time unit>, where <duration> is a positive integer and <time unit> is one of seconds, minutes, hours.
Maximum number of Bins | 100 | | Specifies the maximum number of bins that can be held in memory at any one time.
Header File | | | Filename specifying the header to use. If not specified, no header is supplied. This property is valid only when using the Binary Concatenation merge format; otherwise, it is ignored.
Footer File | | | Filename specifying the footer to use. If not specified, no footer is supplied. This property is valid only when using the Binary Concatenation merge format; otherwise, it is ignored.
Demarcator File | | | Filename specifying the demarcator to use. If not specified, no demarcator is supplied. This property is valid only when using the Binary Concatenation merge format; otherwise, it is ignored.
Compression Level | 1
  • 0
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
Specifies the compression level to use when using the Zip Merge Format; if not using the Zip Merge Format, this value is ignored
Keep Path | false
  • true
  • false
If using the Zip or Tar Merge Format, specifies whether or not the FlowFiles' paths should be included in their entry names; if using any other merge format, this value is ignored.
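
The Defragment strategy and the Binary Concatenation format described in the table above can be illustrated with a minimal Python sketch. It is not the processor's implementation: the dict-based FlowFile representation and the function names `defragment` and `binary_concatenate` are hypothetical, the header, footer, and demarcator are passed as bytes rather than read from files, and placing the demarcator between entries is an assumption.

```python
from collections import defaultdict


def defragment(flowfiles):
    """Group fragments by fragment.identifier and return only complete groups,
    each ordered by fragment.index.

    Each FlowFile is modeled as a hypothetical dict:
    {"attributes": {...}, "content": b"..."}.
    """
    groups = defaultdict(list)
    for ff in flowfiles:
        groups[ff["attributes"]["fragment.identifier"]].append(ff)

    complete = {}
    for identifier, fragments in groups.items():
        expected = int(fragments[0]["attributes"]["fragment.count"])
        if len(fragments) != expected:
            # Not all fragments have arrived; what happens to stale groups
            # is not modeled in this sketch.
            continue
        fragments.sort(key=lambda ff: int(ff["attributes"]["fragment.index"]))
        complete[identifier] = fragments
    return complete


def binary_concatenate(contents, header=b"", footer=b"", demarcator=b""):
    """Concatenate contents, with an optional header, footer, and a demarcator
    placed between entries (assumed placement)."""
    return header + demarcator.join(contents) + footer


flowfiles = [
    {"attributes": {"fragment.identifier": "a", "fragment.count": "2",
                    "fragment.index": "1"}, "content": b"world"},
    {"attributes": {"fragment.identifier": "a", "fragment.count": "2",
                    "fragment.index": "0"}, "content": b"hello"},
    {"attributes": {"fragment.identifier": "b", "fragment.count": "3",
                    "fragment.index": "0"}, "content": b"partial"},
]

ready = defragment(flowfiles)
merged = binary_concatenate(
    [ff["content"] for ff in ready["a"]],
    header=b"[", footer=b"]", demarcator=b" ",
)
print(merged)  # b'[hello world]'
```

Group "b" is left out of `ready` because only one of its three declared fragments is present.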

Relationships:

Name | Description
merged | The FlowFile containing the merged content
original | The FlowFiles that were used to create the bundle
failure | If the bundle cannot be created, all FlowFiles that would have been used to create the bundle will be transferred to failure
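
The attributes carried by the FlowFile routed to merged depend on the Attribute Strategy property. The following Python sketch shows one reading of the two strategies as described above. The function name `merge_attributes` and the plain-dict attribute maps are hypothetical, and dropping a conflicting attribute entirely under 'Keep All Unique Attributes' is an interpretation of "kept unless its value conflicts".

```python
def merge_attributes(attribute_maps, strategy="Keep Only Common Attributes"):
    """Combine the attribute maps of bundled FlowFiles (hypothetical helper)."""
    if not attribute_maps:
        return {}

    if strategy == "Keep Only Common Attributes":
        # Keep an attribute only if every FlowFile has it with the same value.
        first, rest = attribute_maps[0], attribute_maps[1:]
        return {
            key: value
            for key, value in first.items()
            if all(key in attrs and attrs[key] == value for attrs in rest)
        }

    if strategy == "Keep All Unique Attributes":
        # Keep every attribute seen, unless two FlowFiles disagree on its value.
        merged = {}
        conflicting = set()
        for attrs in attribute_maps:
            for key, value in attrs.items():
                if key in conflicting:
                    continue
                if key in merged and merged[key] != value:
                    del merged[key]
                    conflicting.add(key)
                else:
                    merged[key] = value
        return merged

    raise ValueError(f"Unknown strategy: {strategy}")


a = {"path": "/in", "filename": "a.txt", "source": "x"}
b = {"path": "/in", "filename": "b.txt", "batch": "7"}

common = merge_attributes([a, b], "Keep Only Common Attributes")
unique = merge_attributes([a, b], "Keep All Unique Attributes")
print(common)  # {'path': '/in'}
print(unique)  # {'path': '/in', 'source': 'x', 'batch': '7'}
```

In both cases `filename` is dropped because the two FlowFiles disagree on its value; only the second strategy retains `source` and `batch`, which each appear on a single FlowFile.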