Scans the content of FlowFiles for terms that are found in a user-supplied dictionary. If a term is matched, the UTF-8 encoded version of the term will be added to the FlowFile using the 'matching.term' attribute.
Tags: aho-corasick, scan, content, byte sequence, search, find, dictionary
Properties:
In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, whether a property supports the NiFi Expression Language (or simply EL), and whether a property is considered "sensitive", meaning that its value will be encrypted. Before entering a value in a sensitive property, ensure that the nifi.properties file has an entry for the property nifi.sensitive.props.key.
Name | Description | Default Value | Valid Values | EL | Sensitive |
---|---|---|---|---|---|
Dictionary File | The filename of the terms dictionary | | | No | No |
Dictionary Encoding | Indicates how the dictionary is encoded. If 'text', dictionary terms are new-line delimited and UTF-8 encoded; if 'binary', dictionary terms are denoted by a 4-byte integer indicating the term length followed by the term itself | text | text, binary | No | No |
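The two dictionary encodings described above can be sketched as follows. This is a minimal illustration and not the processor's own code: the function names are hypothetical, and the byte order of the 4-byte length prefix is an assumption (big-endian, signed), since the description only says "4-byte integer".

```python
import io
import struct


def write_binary_dictionary(terms):
    """Serialize byte-string terms as [4-byte length][term bytes] records."""
    out = io.BytesIO()
    for term in terms:
        # ASSUMPTION: big-endian signed 32-bit length prefix.
        out.write(struct.pack(">i", len(term)))
        out.write(term)
    return out.getvalue()


def read_binary_dictionary(data):
    """Parse [4-byte length][term bytes] records into a list of byte strings."""
    terms = []
    stream = io.BytesIO(data)
    while True:
        header = stream.read(4)
        if not header:
            break  # clean end of input
        if len(header) < 4:
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack(">i", header)
        if length < 0:
            raise ValueError("negative term length")
        term = stream.read(length)
        if len(term) != length:
            raise ValueError("truncated term")
        terms.append(term)
    return terms


def read_text_dictionary(data):
    """Parse a new-line delimited dictionary; blank lines are skipped here."""
    lines = (line.rstrip(b"\r") for line in data.split(b"\n"))
    return [line for line in lines if line]
```

A binary dictionary is useful when terms are arbitrary byte sequences that may themselves contain new-line characters, which the text encoding cannot represent.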
Relationships:
Name | Description |
---|---|
matched | FlowFiles that match at least one term in the dictionary are routed to this relationship |
unmatched | FlowFiles that do not match any term in the dictionary are routed to this relationship |
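The routing behavior can be illustrated with a small Aho-Corasick search, the technique named in the tags. This is a sketch under stated assumptions, not the processor's implementation: the function names are hypothetical, the scan stops at the first term found, and the attribute value is produced by decoding the matched term as UTF-8.

```python
from collections import deque


def build_automaton(terms):
    """Build Aho-Corasick tables (goto, fail, output) for byte-string terms."""
    goto, fail, output = [{}], [0], [None]
    for term in terms:
        if not term:
            continue  # ignore empty terms
        node = 0
        for byte in term:
            nxt = goto[node].get(byte)
            if nxt is None:
                goto.append({})
                fail.append(0)
                output.append(None)
                nxt = len(goto) - 1
                goto[node][byte] = nxt
            node = nxt
        if output[node] is None:
            output[node] = term
    # Breadth-first pass: compute failure links and inherit outputs.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for byte, child in goto[node].items():
            f = fail[node]
            while f and byte not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(byte, 0)
            if output[child] is None:
                output[child] = output[fail[child]]
            queue.append(child)
    return goto, fail, output


def scan(content, automaton):
    """Return the first dictionary term found in content, or None."""
    goto, fail, output = automaton
    node = 0
    for byte in content:
        while node and byte not in goto[node]:
            node = fail[node]
        node = goto[node].get(byte, 0)
        if output[node] is not None:
            return output[node]
    return None


def route(content, terms):
    """Return (relationship, attributes) for the given content (hypothetical helper)."""
    term = scan(content, build_automaton(terms))
    if term is None:
        return "unmatched", {}
    # ASSUMPTION: undecodable bytes are replaced rather than rejected.
    return "matched", {"matching.term": term.decode("utf-8", errors="replace")}
```

For example, `route(b"the quick brown fox", [b"cat", b"brown"])` yields the `matched` relationship with `matching.term` set to `brown`, while content containing no dictionary term yields `unmatched` with no added attribute.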