17. High Performance Considerations
Depending on your hardware, Sagan can comfortably handle up to about 5k events per second (EPS) with the default configuration. At this rate and higher, there are a few configuration options to consider.
17.1. batch-size
The most important setting is the batch-size option in sagan.yaml. By default, when Sagan receives a log line, the data is sent to any available thread. Because of memory protections (pthread mutex lock/unlock), this is inefficient at high rates: the system starts to spend more time protecting the memory location of a single line of log data than processing the log line.
The batch-size option allows Sagan to send more data to worker threads while using fewer "locks". For example, with a batch-size of 10, Sagan can send 10 times more data with only one "lock" being applied. At even higher rates, you may want to consider setting the batch-size to 100.
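The effect of batching on lock traffic can be illustrated with a small Python sketch. This is not Sagan's implementation (Sagan is written in C and uses pthread mutexes); the `BatchingQueue` class and `locks_needed` function are hypothetical names used only to count how often a lock would be taken for a given batch size.

```python
import threading
from collections import deque


class BatchingQueue:
    """Hypothetical stand-in for the hand-off between a reader and worker threads."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.lock = threading.Lock()
        self.shared = deque()   # batches waiting for a worker thread
        self.pending = []       # lines collected but not yet handed off
        self.lock_count = 0     # how many times the lock was taken

    def receive(self, line):
        # Collect lines locally; only touch the shared queue when a batch is full.
        self.pending.append(line)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        with self.lock:         # one lock/unlock per batch, not per line
            self.lock_count += 1
            self.shared.append(self.pending)
        self.pending = []


def locks_needed(total_lines, batch_size):
    """Count lock acquisitions needed to hand off total_lines log lines."""
    queue = BatchingQueue(batch_size)
    for i in range(total_lines):
        queue.receive(f"log line {i}")
    queue.flush()               # hand off any partial final batch
    return queue.lock_count


if __name__ == "__main__":
    for size in (1, 10, 100):
        print(size, locks_needed(1000, size))
```

In this sketch, handing off 1000 lines takes 1000 lock acquisitions with a batch size of 1, 100 with a batch size of 10, and 10 with a batch size of 100.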
By default, batch-size can be set from 1 to 100. On very high performance systems (100k+ EPS), you may want to consider rebuilding Sagan to handle even larger batches. To do this, edit sagan-defs.h and change the following:
#define MAX_SYSLOG_BATCH 100
To
#define MAX_SYSLOG_BATCH 1000
Then rebuild Sagan and set your batch-size to 1000. While you will save CPU, Sagan will use more memory. If you set MAX_SYSLOG_BATCH to 1000 and only set the batch-size to 100, Sagan will still allocate memory for 1000 log lines. In fact, it will do this per thread!
Think of it this way:
( MAX_SYSLOG_BATCH * 10240 bytes ) * Threads = Total memory usage.
The default allocation per log line is 10240 bytes.
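As a worked example of the formula above, the following Python sketch computes the batch buffer memory for two values of MAX_SYSLOG_BATCH. The thread count of 100 is an assumption chosen purely for illustration, and `batch_memory_bytes` is a hypothetical helper name, not part of Sagan.

```python
BYTES_PER_LOG_LINE = 10240  # default allocation per log line, as stated above


def batch_memory_bytes(max_syslog_batch, threads):
    """(MAX_SYSLOG_BATCH * 10240 bytes) * Threads = total memory usage."""
    return max_syslog_batch * BYTES_PER_LOG_LINE * threads


if __name__ == "__main__":
    threads = 100  # assumed thread count, for illustration only
    for max_batch in (100, 1000):
        total = batch_memory_bytes(max_batch, threads)
        print(f"MAX_SYSLOG_BATCH={max_batch}: {total} bytes "
              f"({total / (1024 * 1024):.1f} MiB)")
```

With these assumed numbers, raising MAX_SYSLOG_BATCH from 100 to 1000 increases the batch buffers from 102,400,000 bytes (about 98 MiB) to 1,024,000,000 bytes (about 977 MiB), regardless of the batch-size set in sagan.yaml.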
17.2. Rule sets
At high rates, consideration should be given to the rules that you are loading. Unneeded and unused rules waste CPU.
If you are writing rules, make sure you use simple rule keywords first (content, meta_content, program, etc.) before moving to more complex rule options like pcre. The simpler rule keywords can be used to "short circuit" a rule before it has to do more complex operations.
Software like Snort attempts to arrange the rule set in memory to be more efficient. For example, when Snort detects multiple content modifiers, it shifts the shortest-length content to the front (first searched), regardless of where the content keywords are placed within the rule.
Because logs are inherently different from packets, Sagan does not do this! If you have multiple content keywords, Sagan will use them in the order they are placed in the rule. You will want to put the least frequently matched keyword in the first content. For example:
# This will use more CPU because "login" is common.
content: "login"; content: "mary";
# This will use less CPU because "mary" is likely less common.
content: "mary"; content: "login";
The same logic applies to pcre and meta_content.
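To see why keyword order matters, the Python sketch below counts substring checks for the two orderings shown above. It is only a model of in-order, short-circuit matching, not Sagan's matching code; the sample log lines and the `count_checks` function are hypothetical, and the number of checks is used as a rough proxy for CPU cost.

```python
# Hypothetical sample logs: most lines contain "login", only one contains "mary".
LOGS = [
    "sshd: login accepted for bob",
    "sshd: login accepted for alice",
    "sshd: login failed for bob",
    "sshd: login accepted for mary",
    "cron: job started",
]


def count_checks(logs, keywords):
    """Test keywords in the order given, stopping at the first miss.

    Returns (substring checks performed, number of lines that matched all keywords).
    """
    checks = 0
    matches = 0
    for line in logs:
        for keyword in keywords:
            checks += 1
            if keyword not in line:
                break           # short circuit: skip the remaining keywords
        else:
            matches += 1        # every keyword was found in this line
    return checks, matches


if __name__ == "__main__":
    print(count_checks(LOGS, ["login", "mary"]))
    print(count_checks(LOGS, ["mary", "login"]))
```

On these sample lines, both orderings match the same single line, but putting "login" first costs 9 checks while putting "mary" first costs 6, because the rarer keyword rejects most lines on the first check.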
17.3. Rule order of execution
Sagan attempts to use the least CPU-intensive rule options first. This means that if a Sagan rule has multiple content keywords and multiple pcre keywords, the content rule keywords are processed first. If the content keywords do not match, then there is no need to process the pcre keywords. The order of execution within a rule is as follows:
The program field is the very first thing to be evaluated.
The content is the next option Sagan takes into consideration.
The meta_content is next.
Finally, the pcre option, which is considered the heaviest, is evaluated last.
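The ordering described above can be sketched in Python as a stable sort by option type. This is an illustration of the documented order only, not Sagan's internal logic; `EVALUATION_ORDER`, `execution_order`, and the sample rule options are hypothetical.

```python
# Evaluation order as described above, from cheapest to heaviest.
EVALUATION_ORDER = ["program", "content", "meta_content", "pcre"]


def execution_order(rule_options):
    """Return rule options in the order they would be evaluated.

    rule_options is a list of (option_name, value) tuples as written in a rule.
    sorted() is stable, so options of the same type keep their written order,
    which is consistent with content keywords being used in the order placed.
    """
    rank = {name: position for position, name in enumerate(EVALUATION_ORDER)}
    return sorted(rule_options, key=lambda option: rank[option[0]])


if __name__ == "__main__":
    written = [
        ("pcre", "/fail(ed|ure)/"),
        ("content", "mary"),
        ("program", "sshd"),
        ("content", "login"),
    ]
    for name, value in execution_order(written):
        print(name, value)
```

For the sample rule options, the result is program first, then the two content keywords in their written order ("mary" before "login"), and pcre last.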