Description:

Scans the content of FlowFiles for terms that are found in a user-supplied dictionary. If a term is matched, the UTF-8 encoded version of the term is added to the FlowFile as the 'matching.term' attribute.

Tags: aho-corasick, scan, content, byte sequence, search, find, dictionary

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, whether a property supports the NiFi Expression Language (or simply EL), and whether a property is considered "sensitive", meaning that its value will be encrypted. Before entering a value in a sensitive property, ensure that the nifi.properties file has an entry for the property nifi.sensitive.props.key.

Name | Description | Default Value | Valid Values | EL | Sensitive
Dictionary File | The filename of the terms dictionary | | | No | No
Dictionary Encoding | Indicates how the dictionary is encoded. If 'text', dictionary terms are new-line delimited and UTF-8 encoded; if 'binary', dictionary terms are denoted by a 4-byte integer indicating the term length followed by the term itself | text | text, binary | No | No
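
The 'binary' Dictionary Encoding described above can be illustrated with a small round trip. This is a minimal sketch, not the processor's own code: the class name BinaryDictionarySketch and its methods are hypothetical, and the byte order of the 4-byte length is an assumption (big-endian, which is what java.io.DataOutputStream.writeInt produces), since the description does not state it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
public class BinaryDictionarySketch {

    // Writes each term as a 4-byte length followed by the term bytes.
    // Assumption: the length is a big-endian int (DataOutputStream.writeInt).
    static void write(List<byte[]> terms, OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        for (byte[] term : terms) {
            data.writeInt(term.length);
            data.write(term);
        }
        data.flush();
    }

    // Reads length-prefixed terms until the stream is exhausted.
    static List<byte[]> read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        List<byte[]> terms = new ArrayList<>();
        while (true) {
            int length;
            try {
                length = data.readInt();
            } catch (EOFException endOfDictionary) {
                return terms; // no further length prefix: dictionary is complete
            }
            if (length < 0) {
                throw new IOException("Negative term length: " + length);
            }
            byte[] term = new byte[length];
            data.readFully(term); // throws EOFException if the term is truncated
            terms.add(term);
        }
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> terms = Arrays.asList(
                "alpha".getBytes(StandardCharsets.UTF_8),
                "beta".getBytes(StandardCharsets.UTF_8));

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        write(terms, buffer);
        System.out.println(buffer.size()); // 17 = (4 + 5) + (4 + 4)

        for (byte[] term : read(new ByteArrayInputStream(buffer.toByteArray()))) {
            System.out.println(new String(term, StandardCharsets.UTF_8));
        }
    }
}
```

A 'text' dictionary needs no such framing: it is a UTF-8 file with one term per line. The binary form is useful when terms are arbitrary byte sequences that may themselves contain new-line bytes.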

Relationships:

Name | Description
matched | FlowFiles that match at least one term in the dictionary are routed to this relationship
unmatched | FlowFiles that do not match any term in the dictionary are routed to this relationship
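
The routing rule above can be sketched as follows. This is an illustration under stated assumptions, not the processor's implementation: it uses a naive byte-by-byte search rather than Aho-Corasick, the class ScanRoutingSketch and its methods are hypothetical, and which term is reported when several match (here, the first matching term in dictionary order) is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the matched / unmatched decision.
public class ScanRoutingSketch {

    // Returns the index of the first occurrence of term in content, or -1.
    static int indexOf(byte[] content, byte[] term) {
        if (term.length == 0) {
            return -1; // assumption: empty terms never match
        }
        for (int start = 0; start + term.length <= content.length; start++) {
            int i = 0;
            while (i < term.length && content[start + i] == term[i]) {
                i++;
            }
            if (i == term.length) {
                return start;
            }
        }
        return -1;
    }

    // Returns the chosen relationship and, on a match, the 'matching.term' value.
    static Map<String, String> scan(byte[] content, List<byte[]> terms) {
        Map<String, String> result = new HashMap<>();
        for (byte[] term : terms) {
            if (indexOf(content, term) >= 0) {
                result.put("relationship", "matched");
                result.put("matching.term", new String(term, StandardCharsets.UTF_8));
                return result;
            }
        }
        result.put("relationship", "unmatched");
        return result;
    }

    public static void main(String[] args) {
        List<byte[]> terms = Arrays.asList(
                "cat".getBytes(StandardCharsets.UTF_8),
                "brown".getBytes(StandardCharsets.UTF_8));

        byte[] content = "the quick brown fox".getBytes(StandardCharsets.UTF_8);
        Map<String, String> result = scan(content, terms);
        System.out.println(result.get("relationship"));  // matched
        System.out.println(result.get("matching.term")); // brown
    }
}
```

The naive search costs time proportional to content length times total dictionary size. An Aho-Corasick automaton, as the processor's tags suggest, avoids that by scanning the content once regardless of how many terms the dictionary holds.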