Processing Modules

CAPE’s processing modules are Python scripts that let you define custom ways to analyze the raw results generated by the sandbox. Each module appends its findings to a global container, which is later used by the signatures and the reporting modules.

You can create as many modules as you want, as long as they follow a predefined structure that we will present in this chapter.

Global Container

After an analysis is completed, CAPE will invoke all the processing modules available in the modules/processing/ directory. Any additional module you decide to create must be placed inside that directory.

Every module should also have a dedicated section in the file conf/processing.conf. For example, if you create a module modules/processing/foobar.py, you will have to append the following section to conf/processing.conf:

[foobar]
enabled = on
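As a rough illustration of how such a section can be read, the sketch below parses the same INI text with Python's standard configparser module. It is not CAPE's own configuration loader; the helper name is_module_enabled and the SAMPLE_CONF string are assumptions made for this example only.

```python
import configparser

# Text mirroring the section shown above; in CAPE it lives in conf/processing.conf.
SAMPLE_CONF = """
[foobar]
enabled = on
"""


def is_module_enabled(conf_text: str, module_name: str) -> bool:
    """Return True if the module has a section whose 'enabled' value is truthy.

    Hypothetical helper for illustration; CAPE uses its own config wrapper.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if not parser.has_section(module_name):
        return False
    # getboolean() understands on/off, yes/no, true/false and 1/0.
    return parser.getboolean(module_name, "enabled", fallback=False)


print(is_module_enabled(SAMPLE_CONF, "foobar"))   # True
print(is_module_enabled(SAMPLE_CONF, "missing"))  # False
```

The point of the sketch is only that "on" is a boolean value in INI terms, so switching a module off is a matter of changing it to "off".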

Every module is then initialized and executed, and the data it returns is appended to a data structure that we’ll call the global container.

This container is simply a large Python dictionary that holds the abstracted results produced by all the modules, stored under each module’s identification key.
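To make the idea concrete, here is a minimal sketch of how a dictionary like this could be assembled. It is a simplification and not CAPE's actual runner: the Processing class below is a stand-in for the real base class, and TargetInfoDemo, StringsDemo, and build_container are hypothetical names with made-up sample data.

```python
class Processing:
    """Stand-in for CAPE's Processing base class (assumption for this sketch)."""

    key = None

    def run(self):
        raise NotImplementedError


class TargetInfoDemo(Processing):
    """Hypothetical module returning a dictionary."""

    def run(self):
        self.key = "target"
        return {"file": {"name": "sample.exe"}}


class StringsDemo(Processing):
    """Hypothetical module returning a list."""

    def run(self):
        self.key = "strings"
        return ["MZ", "kernel32.dll"]


def build_container(module_classes):
    """Run each module and store its data under the key it declares."""
    results = {}
    for cls in module_classes:
        module = cls()
        data = module.run()
        # Skip modules that declared no key or returned nothing.
        if module.key and data is not None:
            results[module.key] = data
    return results


container = build_container([TargetInfoDemo, StringsDemo])
print(sorted(container))  # ['strings', 'target']
```

Signatures and reporting modules would then look up entries such as container["target"] rather than reading the raw analysis files again.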

CAPE already provides a default set of modules that will generate a standard global container. It’s important for the existing reporting modules (HTML report etc.) that these default modules are not modified. Otherwise, the structure of the resulting global container would change, and the reporting modules would no longer be able to recognize it and extract the information used to build the final reports.

The currently available default processing modules are:

AnalysisInfo (modules/processing/analysisinfo.py) - generates some basic information on the current analysis, such as timestamps, version of CAPE, and so on.
BehaviorAnalysis (modules/processing/behavior.py) - parses the raw behavioral logs and performs some initial transformations and interpretations, including the complete process tracing, a behavioral summary, and a process tree.
Debug (modules/processing/debug.py) - includes errors and the analysis.log generated by the analyzer.
Dropped (modules/processing/dropped.py) - includes information on the files dropped by the malware and dumped by CAPE.
Memory (modules/processing/memory.py) - executes Volatility on a full memory dump.
NetworkAnalysis (modules/processing/network.py) - parses the PCAP file and extracts some network information, such as DNS traffic, domains, IPs, HTTP requests, IRC, and SMTP traffic.
ProcMemory (modules/processing/procmemory.py) - analyzes process memory dumps. Note: the module can process user-defined Yara rules from data/yara/memory/index_memory.yar. Just edit this file to add your Yara rules.
StaticAnalysis (modules/processing/static.py) - performs some static analysis of PE32 files.
Strings (modules/processing/strings.py) - extracts strings from the analyzed binary.
TargetInfo (modules/processing/targetinfo.py) - includes information on the analyzed file, such as hashes.
VirusTotal (modules/processing/virustotal.py) - searches VirusTotal.com for antivirus signatures of the analyzed file. Note: the file is not uploaded to VirusTotal.com; if it was not previously uploaded to the website, no results will be retrieved.

Getting started

To make them available to CAPE, all processing modules must be placed inside the folder at modules/processing/.

A basic processing module could look like this:

from lib.cuckoo.common.abstracts import Processing

class MyModule(Processing):

    def run(self):
        self.key = "key"
        data = do_something()
        return data

Every processing module should contain:

A class inheriting Processing.
A run() function.
A self.key attribute defining the name to be used as a sub-container for the returned data.
A set of data (list, dictionary, string, etc.) returned by run(), which will be appended to the global container.

You can also specify an order value, which allows you to run the available processing modules in an ordered sequence. By default, every module has an order value of 1, and modules that share the same value are executed in alphabetical order.

If you want to change this value, your module would look like this:

from lib.cuckoo.common.abstracts import Processing

class MyModule(Processing):
    order = 2

    def run(self):
        self.key = "key"
        data = do_something()
        return data
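The ordering rule described above can be sketched as a sort on the pair (order, name). This is an illustration under assumptions, not CAPE's scheduler: the Processing class is a stand-in, the demo classes are hypothetical, and using the class name as the alphabetical tie-breaker is a guess (the real implementation may sort on the module name instead).

```python
class Processing:
    """Stand-in for CAPE's Processing base class (assumption for this sketch)."""

    order = 1  # default order value described in the text


class Zeta(Processing):
    pass


class Alpha(Processing):
    pass


class Late(Processing):
    order = 2


def execution_order(module_classes):
    """Sort by order value first, then alphabetically by class name."""
    return sorted(module_classes, key=lambda cls: (cls.order, cls.__name__))


ordered_names = [cls.__name__ for cls in execution_order([Zeta, Late, Alpha])]
print(ordered_names)  # ['Alpha', 'Zeta', 'Late']
```

A module that depends on data produced by another one would therefore be given a higher order value than the module it depends on.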

You can also manually disable a processing module by setting the enabled attribute to False:

from lib.cuckoo.common.abstracts import Processing

class MyModule(Processing):
    enabled = False

    def run(self):
        self.key = "key"
        data = do_something()
        return data

The processing modules are provided with some attributes that can be used to access the raw results for the given analysis:

self.analysis_path: path to the folder containing the results (e.g. storage/analysis/1)

self.log_path: path to the analysis.log file.

self.conf_path: path to the analysis.conf file.

self.file_path: path to the analyzed file.

self.dropped_path: path to the folder containing the dropped files.

self.logs_path: path to the folder containing the raw behavioral logs.

self.shots_path: path to the folder containing the screenshots.

self.pcap_path: path to the network pcap dump.

self.memory_path: path to the full memory dump, if created.

self.pmemory_path: path to the process memory dumps, if created.

With these attributes, you should be able to easily access all the raw results stored by CAPE and perform your analytic operations on them.
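As an example of using one of these attributes, the sketch below walks self.dropped_path and records a SHA-256 digest for every file it finds. The DroppedHashes module and its "dropped_hashes" key are hypothetical, and the Processing class is a stand-in so the snippet runs outside CAPE; inside CAPE you would import the real base class and the path would be set for you.

```python
import hashlib
import os
import tempfile


class Processing:
    """Stand-in for CAPE's Processing base class (assumption for this sketch)."""

    dropped_path = ""


class DroppedHashes(Processing):
    """Hypothetical module: SHA-256 digest of each dropped file."""

    def run(self):
        self.key = "dropped_hashes"
        hashes = {}
        # Nothing to do if the analysis produced no dropped files.
        if not os.path.isdir(self.dropped_path):
            return hashes
        for root, _dirs, files in os.walk(self.dropped_path):
            for name in sorted(files):
                path = os.path.join(root, name)
                with open(path, "rb") as handle:
                    digest = hashlib.sha256(handle.read()).hexdigest()
                # Key each digest by its path relative to the dropped folder.
                hashes[os.path.relpath(path, self.dropped_path)] = digest
        return hashes


# Demo with a temporary folder standing in for a real analysis directory.
with tempfile.TemporaryDirectory() as analysis_path:
    dropped = os.path.join(analysis_path, "dropped_demo")
    os.makedirs(dropped)
    with open(os.path.join(dropped, "a.bin"), "wb") as handle:
        handle.write(b"hello")

    module = DroppedHashes()
    module.dropped_path = dropped  # CAPE would normally set this attribute
    result = module.run()

print(result)
```

Reading whole files into memory keeps the sketch short; for large dropped files a chunked read would be the safer choice.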

As a last note, a good practice is to use the CuckooProcessingError exception whenever the module encounters an issue you want to report to CAPE. This can be done by importing the class like this:

from lib.cuckoo.common.exceptions import CuckooProcessingError
from lib.cuckoo.common.abstracts import Processing

class MyModule(Processing):

    def run(self):
        self.key = "key"

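        try:
            # do_something() and SomethingFailed are placeholders for your own code.
            data = do_something()
        except SomethingFailed as e:
            # Chain the original exception so the root cause stays in the traceback.
            raise CuckooProcessingError(f"Failed: {e}") from e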

        return data
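For a more concrete version of the same pattern, the sketch below wraps a real failure (a missing or malformed JSON file) in the processing exception. Both classes at the top are stand-ins so the snippet runs outside CAPE, and the JsonSummary module and the summary.json file name are hypothetical.

```python
import json
import os
import tempfile


class CuckooProcessingError(Exception):
    """Stand-in for lib.cuckoo.common.exceptions.CuckooProcessingError."""


class Processing:
    """Stand-in for CAPE's Processing base class (assumption for this sketch)."""

    analysis_path = ""


class JsonSummary(Processing):
    """Hypothetical module that loads a JSON file from the analysis folder."""

    def run(self):
        self.key = "json_summary"
        path = os.path.join(self.analysis_path, "summary.json")  # hypothetical file
        try:
            with open(path, "r", encoding="utf-8") as handle:
                return json.load(handle)
        except (OSError, json.JSONDecodeError) as exc:
            # Report the problem with enough context to locate the file.
            raise CuckooProcessingError(f"Unable to load {path}: {exc}") from exc


with tempfile.TemporaryDirectory() as analysis_path:
    module = JsonSummary()
    module.analysis_path = analysis_path

    # First run: the file does not exist, so the error is reported.
    try:
        module.run()
        outcome = "no error"
    except CuckooProcessingError:
        outcome = "reported"

    # Second run: the file exists and parses cleanly.
    with open(os.path.join(analysis_path, "summary.json"), "w", encoding="utf-8") as handle:
        json.dump({"score": 3}, handle)
    loaded = module.run()

print(outcome, loaded)
```

Catching only the exceptions the operation is expected to raise, rather than a bare except, keeps unrelated bugs visible instead of hiding them behind a generic processing error.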