Optimize your Data Flow
To maximize throughput, monitor the queue sizes across your dataflow. If a large queue builds up in front of a processor while the queues for the subsequent processors are empty, that processor is the bottleneck. This is not unusual, because some tasks take longer than others. To improve throughput, you can configure the processor to process multiple FlowFiles concurrently:
- Right-click the processor and click Configure.
- Click the Scheduling tab.
- In the Concurrent Tasks box, type the number of CPU threads to use.
NOTE: The total number of CPU threads that you can use is limited by your hardware. If your server's CPU is already under maximum load, adding threads does not help. NiFi also places a limit on the maximum number of threads, which you might need to increase. For more information, see Maximum Number of Threads.
Sometimes, giving a processor more resources only moves the bottleneck to a processor further down the pipeline. You might need to repeat this process several times, monitoring the queue sizes after each change, until the CPU threads are distributed so that no queue keeps growing.
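
The bottleneck rule described in this section (a large queue in front of a processor, with empty queues after it) can be sketched in a few lines of Python. This is only an illustration and not part of NiFi: the function name, the processor names, and the `high` and `low` thresholds are all hypothetical, and the queue counts are assumed to have been read manually from the canvas or collected by some other means.

```python
from typing import List, Optional, Tuple


def find_bottleneck(
    queues: List[Tuple[str, int]],
    high: int = 1000,  # hypothetical threshold for a "large" queue
    low: int = 10,     # hypothetical threshold for an "empty" queue
) -> Optional[str]:
    """Return the first processor that looks like a bottleneck.

    `queues` lists processors in flow order, each paired with the number
    of FlowFiles waiting in the queue directly in front of it.
    """
    for i, (name, queued) in enumerate(queues):
        # Queues in front of every processor further down the pipeline.
        downstream = [count for _, count in queues[i + 1:]]
        if queued >= high and all(count <= low for count in downstream):
            return name
    return None


# Hypothetical snapshot of a three-processor flow.
snapshot = [("FetchFile", 0), ("TransformRecord", 8500), ("PutDatabase", 3)]
print(find_bottleneck(snapshot))  # prints "TransformRecord"
```

After you raise Concurrent Tasks on the reported processor, take a new snapshot and run the check again. If the result names a processor further down the pipeline, the bottleneck has moved rather than disappeared.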