ExecuteDocumentPython

Runs a Python script on a FlowFile or document.

Properties

Name | Default Value | Description
IDOL License Service | | An IdolLicenseServiceImpl that provides a way to communicate with an IDOL License Server.
Python script file | | The path of the script to run (or the actual Python script).
Python script function | handler | The name of the function to call when the script runs.
Route to | success | The connection to route an input FlowFile to.
Route returned to | | The connection to route FlowFiles to when they are returned by the "Python script function". You can return a single FlowFile or a list, set, or tuple of FlowFiles. You can also return a FlowFileDocument.
Route untransferred to | | If this property is configured, any FlowFile that has not already been routed is routed to this connection. If this property is not configured and there are unrouted FlowFiles, the processor fails and any FlowFiles that have been processed are rolled back to their original state and returned to the input queue.
Additional output relationships | | A comma-separated list of output relationships to create. Your script can transfer FlowFiles to these relationships.

TIP: To use a Python script to filter out irrelevant FlowFiles, you could return FlowFiles that you want to continue processing, and return None for any FlowFiles that you want to discard. The relationship specified by "Route returned to" will contain the documents that you want to continue processing and the success relationship will contain documents that you want to discard.
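
The filtering approach described in the tip for "Route returned to" could look something like the following. This is only a sketch under assumptions: it assumes that the document passed to the handler can itself be returned (the documentation states only that a FlowFileDocument can be returned), and the field name CATEGORY and value spam are hypothetical. The Stub classes are hypothetical stand-ins that let the sketch run outside NiFi; they are not part of the processor.

```python
# Hypothetical field name and value that mark a document as irrelevant.
FILTER_FIELD = "CATEGORY"
DISCARD_VALUE = "spam"


def has_field_value(element, name, value):
    """Return True if any descendant element called `name` holds `value`."""
    for child in element.getChildren():
        if child.getName() == name and child.getValue() == value:
            return True
        if has_field_value(child, name, value):
            return True
    return False


def handler(context, session, document):
    result = {"discard": False}

    def check(doc_action):
        metadata = doc_action.getXmlMetadata()
        result["discard"] = has_field_value(metadata, FILTER_FIELD, DISCARD_VALUE)

    document.read(check)
    # Returning None leaves the input FlowFile to be routed by "Route to".
    # Returning the document (assumed to be allowed) sends it to "Route returned to".
    return None if result["discard"] else document


# --- Hypothetical stand-ins so that the sketch can run outside NiFi ---
class StubElement:
    def __init__(self, name, value=None, children=()):
        self._name, self._value, self._children = name, value, list(children)

    def getName(self):
        return self._name

    def getValue(self):
        return self._value

    def getChildren(self):
        return self._children


class StubAction:
    def __init__(self, root):
        self._root = root

    def getXmlMetadata(self):
        return self._root


class StubDocument:
    def __init__(self, root):
        self._root = root

    def read(self, action):
        action(StubAction(self._root))


keep = StubDocument(StubElement("DOCUMENT", children=[StubElement("CATEGORY", "news")]))
drop = StubDocument(StubElement("DOCUMENT", children=[StubElement("CATEGORY", "spam")]))
print(handler(None, None, keep) is keep)  # True
print(handler(None, None, drop) is None)  # True
```

With the default property values, documents that the handler returns would continue down the "Route returned to" connection, and the discarded ones would end up on success.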

Relationships

Name | Description
success | This is the default relationship for FlowFiles that have been processed by your script.
failure | FlowFiles that had an invalid or unknown format.

FlowFile Routing

FlowFiles are routed according to the following rules, which are applied in order:

  • When your script calls session.transfer, the specified FlowFile is routed to the specified relationship. The relationship must exist. You can add additional relationships by setting the property "Additional output relationships".
  • When your script returns one or more FlowFiles from the "Python script function", the FlowFiles are routed to the relationship specified by "Route returned to".
  • Any input FlowFile that has not yet been routed is routed automatically to the relationship specified by "Route to".
  • Any remaining FlowFiles are routed automatically to "Route untransferred to".
  • If the "Route untransferred to" property is not set and FlowFiles remain unrouted, the processor fails, and any FlowFiles that have been processed are rolled back to their original state and returned to the input queue.
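
The order of these rules can be modelled with a small, self-contained function. This is an illustrative model of the documented precedence only, not the processor's own code. The function name, its parameters, and the string FlowFile identifiers are all hypothetical.

```python
def resolve_relationship(flowfile, transferred, returned, inputs,
                         route_returned_to, route_to="success",
                         route_untransferred_to=None):
    """Return the relationship a FlowFile would reach under the documented rule order."""
    # Rule 1: an explicit session.transfer call decides the relationship.
    if flowfile in transferred:
        return transferred[flowfile]
    # Rule 2: FlowFiles returned by the script function go to "Route returned to".
    if flowfile in returned:
        return route_returned_to
    # Rule 3: input FlowFiles that are still unrouted go to "Route to".
    if flowfile in inputs:
        return route_to
    # Rule 4: anything left goes to "Route untransferred to", if it is set.
    if route_untransferred_to is not None:
        return route_untransferred_to
    # Rule 5: otherwise the processor fails and the session is rolled back.
    raise RuntimeError(f"FlowFile {flowfile!r} was not routed")


transferred = {"a": "custom"}  # "a" was passed to session.transfer
returned = {"b"}               # "b" was returned by the script function
inputs = {"a", "c"}            # "a" and "c" arrived on the input queue

for name in ("a", "b", "c", "d"):
    print(name, resolve_relationship(name, transferred, returned, inputs,
                                     route_returned_to="kept",
                                     route_untransferred_to="other"))
# a custom
# b kept
# c success
# d other
```

Note that "a" is both an input FlowFile and explicitly transferred; the explicit transfer wins because that rule is applied first.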

Advanced Configuration

The ExecuteDocumentPython processor has an advanced configuration interface. This includes a code editor so that you can write Python scripts. To open the advanced configuration interface, right-click the processor and click Configure. Then, after the Configure Processor dialog box opens, click Advanced.

Write Log Messages

The ExecuteDocumentPython processor can write log messages to NiFi log files.

  • You can use the Python print() function to log messages at the INFO log level.
  • You can use the logging functions in the idolnifi package, for example logError().
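
Both logging options might be combined as in the following sketch. It assumes that logError can be imported directly from the idolnifi package and that it accepts a single message string; neither detail is confirmed here, so check the Python reference documentation for the actual signature. The fallback definition is hypothetical and exists only so that the sketch runs outside NiFi.

```python
try:
    # Assumption: logError is importable like this and takes one message string.
    from idolnifi import logError
except ImportError:
    import sys

    # Hypothetical fallback for running the sketch outside NiFi.
    def logError(message):
        print(f"ERROR: {message}", file=sys.stderr)


def handler(context, session, document):
    # print() output is logged at the INFO level when run by the processor.
    print("Reading document metadata")
    document.read(check_metadata)


def check_metadata(doc_action):
    children = doc_action.getXmlMetadata().getChildren()
    if len(children) == 0:
        logError("Document has no metadata fields")
    else:
        print(f"Document has {len(children)} top-level metadata fields")
```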

Example Script

The following example demonstrates how to access IDOL document metadata and create a Python dictionary containing the field names and values.

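def handler(context, session, document):
    # Read the document; myAction is called with an object that exposes the metadata
    document.read(myAction)


def myAction(docAction):
    xml_metadata = docAction.getXmlMetadata()
    document_metadata = processElement(xml_metadata)

    # Print metadata field names and values to the log
    printMetadata(document_metadata)


def processElement(element):
    # Convert a metadata element into a nested dictionary that maps
    # each field name to a list of values
    children = element.getChildren()

    # An element without children is a leaf, so return its value
    if len(children) == 0:
        return element.getValue()

    dictionary = dict()
    for child in children:
        # A field name can occur more than once, so collect its values in a list
        dictionary.setdefault(child.getName(), []).append(processElement(child))

    return dictionary


def printMetadata(metadata, depth=0):
    for field, value in metadata.items():
        for item in value:
            if isinstance(item, dict):
                # Nested fields are dictionaries; print them recursively
                print(f"Descending into field {field}, at depth {depth+1}")
                printMetadata(item, depth+1)
            else:
                print(f"Field {field} has value {item}")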
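
To make the shape of the resulting dictionary concrete, the conversion can be tried outside NiFi with a stand-in element class. StubElement and the sample field names below are hypothetical; only getChildren(), getName(), and getValue() come from the example script, and the conversion function is repeated here so that the sketch is self-contained.

```python
class StubElement:
    """Hypothetical stand-in for a metadata element, for experimentation only."""

    def __init__(self, name, value=None, children=()):
        self._name, self._value, self._children = name, value, list(children)

    def getName(self):
        return self._name

    def getValue(self):
        return self._value

    def getChildren(self):
        return self._children


def process_element(element):
    # Same logic as processElement in the example script.
    children = element.getChildren()
    if len(children) == 0:
        return element.getValue()
    dictionary = {}
    for child in children:
        dictionary.setdefault(child.getName(), []).append(process_element(child))
    return dictionary


root = StubElement("DOCUMENT", children=[
    StubElement("TITLE", "Report"),
    StubElement("AUTHOR", "A"),
    StubElement("AUTHOR", "B"),
    StubElement("SECTION", children=[StubElement("HEADING", "Intro")]),
])

print(process_element(root))
# {'TITLE': ['Report'], 'AUTHOR': ['A', 'B'], 'SECTION': [{'HEADING': ['Intro']}]}
```

Every field maps to a list, even when it occurs once, which is why printMetadata iterates over each value before checking whether an item is a nested dictionary.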

Python Reference Documentation

For information about the Python classes and methods that you can use in your scripts, right-click the ExecuteDocumentPython processor, and click View Usage. Then, after the documentation window opens, click Additional Details.... You can also access this documentation from the Advanced UI (click View Usage).