Comparison of alert formats

We have compared IDMEF with five alert formats:

 

PROPRIETARY FORMATS OPEN STANDARD FORMATS

HP ArcSight CEF

IBM QRadar LEEF

ICSA/CISCO SDEE

DMTF CIM

CEE

The Open Group XDAS

HP ArcSight CEF

References

HP has published complete but brief documentation of the CEF format: https://protect724.hp.com/docs/DOC-1072. Some fields are explained too briefly, but overall the meaning of each field is well documented. Alerts can be formatted unambiguously in CEF.

 

Transport and encoding

CEF uses Syslog transport and a simple textual key/value encoding.
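
As an illustration of this key/value encoding, here is a minimal Python sketch that splits a CEF record into its header and extension parts. It assumes the layout given in the CEF documentation (a "CEF:" marker, seven pipe-separated header fields, then space-separated key=value pairs). The function name parse_cef, the header field names and the sample record are illustrative only, and escape sequences inside values are not decoded.

```python
import re

def parse_cef(line):
    """Split a CEF record into its seven header fields and its extension dict."""
    # Drop any Syslog prefix that precedes the "CEF:" marker.
    start = line.index("CEF:")
    # Header fields are pipe-separated; pipes preceded by a backslash are skipped.
    parts = re.split(r"(?<!\\)\|", line[start + 4:], maxsplit=7)
    names = ["version", "vendor", "product", "device_version",
             "signature_id", "name", "severity"]  # illustrative names
    header = dict(zip(names, parts[:7]))
    extension = {}
    if len(parts) > 7:
        # Each value runs until the next " key=" or the end of the line,
        # so values may themselves contain spaces.
        for key, value in re.findall(r"(\w+)=(.*?)(?=\s+\w+=|$)", parts[7]):
            extension[key] = value
    return header, extension

# Illustrative record, not taken from a real device.
sample = ("Sep 19 08:26:10 host CEF:0|Security|threatmanager|1.0|100|"
          "worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232")
header, extension = parse_cef(sample)
print(header["name"], extension)
```

The flat result (one dictionary of header fields, one dictionary of extensions) reflects the absence of nesting in the format.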

 

Format expressive power

The format is quite expressive and can embed information about sensors (devices), attack source and target, time, files, processes and users. Those fields map well to the different categories of information provided by IDMEF. However, IDMEF is far more expressive and provides more fields for each category. About twenty CEF fields seem difficult to translate into IDMEF. Some of them (for example NAT addresses) are possible candidates for future IDMEF evolution.
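
The kind of translation discussed above can be sketched as a lookup table from CEF extension keys to IDMEF locations. This is only a hedged illustration: the slash-separated path notation and the function name translate are assumptions made for this example, the table covers a handful of keys only, and the sample values are invented.

```python
# Partial, illustrative mapping from CEF extension keys to IDMEF classes.
# The "Alert/Source/..." path notation is an assumption of this sketch.
CEF_TO_IDMEF = {
    "src": "Alert/Source/Node/Address/address",
    "dst": "Alert/Target/Node/Address/address",
    "spt": "Alert/Source/Service/port",
    "dpt": "Alert/Target/Service/port",
    "suser": "Alert/Source/User/UserId/name",
    "duser": "Alert/Target/User/UserId/name",
    "fname": "Alert/Target/File/name",
}

def translate(extension):
    """Separate CEF fields that have an IDMEF counterpart from those that do not."""
    mapped, unmapped = {}, {}
    for key, value in extension.items():
        path = CEF_TO_IDMEF.get(key)
        if path is None:
            unmapped[key] = value   # e.g. NAT addresses
        else:
            mapped[path] = value
    return mapped, unmapped

# Invented sample values.
mapped, unmapped = translate({
    "src": "10.0.0.1",
    "dpt": "22",
    "sourceTranslatedAddress": "192.0.2.7",
})
print(mapped)
print(unmapped)
```

Fields that end up in the second dictionary are the ones that would need either an IDMEF extension or a change to the IDMEF schema.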

 

Format structure

The format is loosely structured and uses a flat schema. 91 fields are available. Different fields are sometimes used for the same type of information (for example IP addresses). The format defines no dictionaries, although they would be useful for fields such as outcome, reason or cat.

 

Format extensibility

The format provides only a few additional fields for custom data.

 

General remarks

The format is clearly designed to describe security events in general.

 

IBM QRadar LEEF

References

IBM has published complete but brief documentation of the LEEF format: https://www.ibm.com/developerworks/community/wikis/form/anonymous/api/wiki/9989d3d7-02c1-444e-92be-576b33d2f2be/page/3dc63f46-4a33-4e0b-98bf-4e55b74e556b/attachment/a19b9122-5940-4c89-ba3e-4b4fc25e2328/media/QRadar_LEEF_Format_Guide.pdf

All fields are documented, but their meaning is often explained too briefly. The meaning of some fields is unclear (for example, the identxxx fields).

 

Transport and encoding

LEEF uses Syslog transport and a simple textual key/value encoding.
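
A comparable sketch for LEEF is shown below. It assumes the LEEF 1.0 layout (a "LEEF:" marker, five pipe-separated header fields, then tab-separated key=value attributes). The function name parse_leef, the header field names and the sample record are illustrative, and other delimiters are not handled.

```python
def parse_leef(line):
    """Split a LEEF 1.0 record into its header fields and its attribute dict."""
    # Drop any Syslog prefix that precedes the "LEEF:" marker.
    start = line.index("LEEF:")
    parts = line[start + 5:].split("|", 5)
    names = ["leef_version", "vendor", "product",
             "product_version", "event_id"]  # illustrative names
    header = dict(zip(names, parts[:5]))
    attributes = {}
    if len(parts) > 5:
        # Assumption: attributes are separated by tab characters.
        for pair in parts[5].split("\t"):
            key, sep, value = pair.partition("=")
            if sep:
                attributes[key] = value
    return header, attributes

# Illustrative record, not taken from a real device.
sample = "LEEF:1.0|Vendor|Product|1.0|15345|src=192.0.2.1\tdst=192.0.2.2\tsev=5"
header, attributes = parse_leef(sample)
print(header["event_id"], attributes)
```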

 

Format expressive power

The format's expressiveness is limited: it can only embed information about sources, destinations, time, users and services. There are no fields to describe files, processes or sensors.

 

Format structure

The format is loosely structured and uses a flat schema. 51 fields are available. Different fields are sometimes used for the same type of information (for example IP addresses). The format uses no dictionaries, but few of its fields would require one.

 

Format extensibility

The specification provides no standard means to extend the format.

General remarks

The format is clearly designed to describe (network) security events. It uses encoding and transport similar to those used by CEF. However, the two formats differ in the number and types of fields.

 

DMTF CIM

References

CIM is a standard format described in a Distributed Management Task Force (DMTF) specification (DSP): http://www.dmtf.org/standards/cim. The CIM format is dedicated to distributed management in general. Only a subpart of the format is dedicated to security events (CIM_Security): http://dmtf.org/sites/default/files/cim/cim_schema_v2420/Visio-CIM_SecurityEvents.pdf

The online documentation provides UML diagrams of the classes in MS Visio, PDF, XSD and MOF format. The specification also provides a complete HTML documentation of the schema:

 

The documentation is complete, and the UML diagrams make it easy to understand the structure of alert messages quickly.

 

Transport and encoding

CIM is a data model and could be used with any transport and encoding. The DMTF provides an XML schema of CIM.
WBEM, another DMTF standard, uses HTTP transport and an XML implementation of CIM.

 

Format expressive power

The expressiveness of the subpart of CIM dedicated to security alerts is limited. It consists of only a few classes. SecurityIndication extends the AlertIndication class from the CIM core. It is extended in turn by IPNetworkSecurityIndication, which is itself extended by IPPacketFilterIndication. These classes are considered experimental.

There is no field in those classes to describe the probe precisely (for example, to give its IP address). There is no field to describe processes, files or users. This is quite paradoxical, because the core of CIM could be used to describe precisely this type of information. However, those classes do not reuse classes taken from the core of CIM (there is no aggregation).
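
The inheritance chain described above can be pictured with the following Python sketch. Only the class names come from the CIM schema as quoted in this section; every field name is hypothetical and chosen for illustration, and the helper field_types is not part of CIM.

```python
from dataclasses import dataclass, fields

@dataclass
class AlertIndication:                 # stands in for the CIM core class
    description: str = ""              # hypothetical field

@dataclass
class SecurityIndication(AlertIndication):
    pass

@dataclass
class IPNetworkSecurityIndication(SecurityIndication):
    source_address: str = ""           # hypothetical primitive field
    target_address: str = ""           # hypothetical primitive field

@dataclass
class IPPacketFilterIndication(IPNetworkSecurityIndication):
    filter_rule: str = ""              # hypothetical primitive field

def field_types(cls):
    """Return the name and declared type of every field, inherited ones included."""
    return {f.name: f.type for f in fields(cls)}

print(field_types(IPPacketFilterIndication))
```

Every field in the sketch is a primitive string. An aggregated design would instead hold references to richer objects (for example a node or user object from the CIM core), which is what the security event classes do not do.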

 

Format structure

CIM in general is a well-structured object-oriented format. However, the subpart dedicated to security events is less structured than the CIM core. It only uses inheritance. All the fields of the security event classes use primitive values (there is no aggregation). 59 fields are available in those classes. Different fields are sometimes used for the same type of information (for example IP addresses). The format defines only a few dictionaries.

 

Format extensibility

Extensions to the format are handled through inheritance.

 

General remarks

CIM in general is quite a mature solution for managing distributed systems. It is implemented in existing tools such as Microsoft WMI. CIM is used in other DMTF standards such as WBEM. It could be used in other alert formats to describe the different nodes (source, target, analyzer). This option seems to be under consideration by the XDAS working group.

CEE

Common Event Expression (CEE) is a format initiated by the MITRE Corporation, a US non-profit organization. The ambitious goal of the project was to design a standard for log events in general and not only for security events.

Some work was done by a dedicated working group that included US organizations (NIST, DoD, etc.) and companies (Microsoft, Novell, etc.). However, the project stopped when the US government decided to stop funding it: https://cee.mitre.org/

The CEE working group decided to make a clear separation between the data model (CEE Event Model or CEE Profile), the encoding (CEE Log Syntax) and the transport (CEE Log Transport). A CEE profile is composed of a schema, a field dictionary and an event taxonomy. The schema defines the message structure. The field dictionary defines the semantics of each field.

 

References

CEE specifications are only available in a very preliminary beta version. MITRE published those specifications on a dedicated web site for archival purposes: https://cee.mitre.org/language/1.0-beta1/

The CEE core profile specification is provided as an XML schema (XSD file). It is also described in a dedicated Web page and CSV files. This specification only briefly describes the semantics of each field. The semantics of some fields are ambiguous. For example, the difference between appname and application.name is not clear. It is often difficult to know which entity (target, source or probe) those fields relate to.

 

Transport and encoding

One of the goals of CEE is to make a clear distinction between data model, encoding and transport. Thus, it should be possible to use different encodings and transports.

In the beta specification, only JSON and XML encodings are provided: https://cee.mitre.org/language/1.0-beta1/cls.html

The CEE log transport specification only consists of requirements and some examples using JSON encoding over Syslog: https://cee.mitre.org/language/1.0-beta1/clt.html
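
A JSON-over-Syslog message of this kind could be produced and read back as in the sketch below. It assumes that the JSON payload is introduced by an "@cee:" marker inside the Syslog message, as in the beta specification's examples. The function names, the Syslog header and the field values are illustrative only.

```python
import json

# Assumption: the JSON payload follows this marker in the Syslog message.
CEE_COOKIE = "@cee:"

def encode_cee(event):
    """Serialize an event dict as a CEE-style JSON payload."""
    return CEE_COOKIE + " " + json.dumps(event, sort_keys=True)

def decode_cee(message):
    """Extract and parse the JSON payload from a Syslog message."""
    _, found, payload = message.partition(CEE_COOKIE)
    if not found:
        raise ValueError("no CEE marker in message")
    return json.loads(payload)

# Illustrative Syslog header and field values.
message = "<13>Oct 11 22:14:15 host app: " + encode_cee(
    {"action": "login", "status": "success"})
print(message)
print(decode_cee(message))
```

Because the data model is separate from the encoding, the same event dictionary could equally be serialized to XML without changing the calling code.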

 

Format expressive power

CEE is quite expressive. However, the exact semantics of many fields are not clear. It is often difficult to know whether a field relates to the source, the target or the probe that generated the alert. CEE seems to cover the different categories of information provided by IDMEF. However, IDMEF is more complete, and it is hard to translate an IDMEF event to CEE because the semantics of CEE fields are often ambiguous. This is probably because only the beta version of the specification is available.

 

Format structure

CEE is loosely structured compared to object-oriented formats such as IDMEF or CIM. However, it is more structured than purely flat formats such as CEF or LEEF. It is composed of 64 fields or subfields. Some fields are in fact subfields of another field. For example, app.name and app.vend are subfields of app.
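
The relation between fields and subfields can be illustrated with a short sketch that flattens a nested event into dotted field names. The function name flatten and the sample values are illustrative; only the app.name and app.vend field names come from the text above.

```python
def flatten(event, prefix=""):
    """Turn nested subfields into dotted names, e.g. {"app": {"name": ...}} -> "app.name"."""
    flat = {}
    for key, value in event.items():
        name = prefix + key
        if isinstance(value, dict):
            # Recurse into the subfields, extending the dotted prefix.
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat

# Invented sample values.
event = {"app": {"name": "sshd", "vend": "OpenBSD"}, "action": "login"}
print(flatten(event))
```

Once flattened, a CEE event looks much like a CEF or LEEF record, which shows how shallow the additional structure is.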

Some dictionaries are provided for fields such as action, domain or status. However, some dictionaries, such as service, are incomplete. Some of them, such as object or action, are quite generic and embed heterogeneous information.

 

Format extensibility

In theory, CEE extensibility could be managed using additional profiles. However, no additional profile has been published so far.

General remarks

 

The Open Group XDAS

References

Transport and encoding

Format expressive power

Format structure

Format extensibility

General remarks

Comparison of the formats

The results of the comparison of the different alert formats are summarized in the following Excel document: idmef-cef-leef-3.2.xlsx