Security

Overview

To find an overview of Gremlin’s security practices, check out gremlin.com/security.

Gremlin makes it easy to find weaknesses in your system before they cause problems for your customers. Gremlin is a simple, safe, and secure way to use Chaos Engineering to improve system resilience.

Gremlin experiments are generated on the Control Plane. Gremlin Agents make outbound TLS calls to poll for experiments. Gremlin provides secure command execution, security auditing, multi-factor authentication (MFA), and SAML SSO.

Linux

Gremlin is installed on Linux with a least-privilege setup. When installed directly on the host, Gremlin does not require root privileges on any machine in your infrastructure. Gremlin operations run as a gremlin user created with default Linux privileges.

Gremlin needs the following Linux capabilities to perform the corresponding experiments.

capability       purpose
cap_sys_boot     used by shutdown to shut down (and optionally reboot) your hosts
cap_sys_time     used by time travel to move your hosts forward and backward through time
cap_net_admin    used by the network gremlins for all network experiments
cap_kill         used by process killer to kill requested process(es)
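
To confirm that a host install has the capabilities listed above, one option is to compare them against the output of getcap. The following is a minimal sketch, assuming the capabilities are granted as file capabilities on the Gremlin binary; the function name missing_caps and the binary path in the usage comment are placeholders, not part of Gremlin.

```shell
# Capabilities from the table above that host-level experiments need.
REQUIRED_CAPS="cap_sys_boot cap_sys_time cap_net_admin cap_kill"

# missing_caps GETCAP_OUTPUT
# Prints each required capability that does not appear in the given
# `getcap` output, one per line. Prints nothing if all are present.
missing_caps() {
    for cap in $REQUIRED_CAPS; do
        # -w matches whole words only, so a longer capability name
        # that merely contains "$cap" does not count as a match.
        printf '%s\n' "$1" | grep -qw "$cap" || printf '%s\n' "$cap"
    done
}

# Example (the binary path is a placeholder; adjust it for your install):
#   missing_caps "$(getcap /usr/sbin/gremlin)"
```

An empty result means all four capabilities were found in the getcap output. Any name printed corresponds to an experiment type in the table that may not work on that host.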

When targeting containers, Gremlin spawns its own sidecars to impact those containers so that you don't need to restart the targets. This is necessary so that the attack impacts the container target specifically (e.g. its virtual network, resource limits, etc.). To do this, Gremlin may require additional capabilities when running without elevated/root privileges. These are the additional capabilities:

capability             purpose
cap_setfcap            allows Gremlin to pass the user id into the target container's namespace
cap_sys_chroot         allows Gremlin to enter the target container's filespace for IO and Disk based experiments
cap_audit_write        necessary for communicating with the Kernel's audit log for containers spawned by containerd and cri-o
cap_mknod              used to set up devices attached to a given container being targeted by Gremlin
cap_sys_admin          needed to enter the target container's process namespace (see: setns(2))
cap_dac_read_search    allows Gremlin to search directories (list contents) without access being granted by the file owner/mode, in order to obtain sockets for certificate expiry experiments
cap_sys_ptrace         used by process collection to access the absolute path of the process binary for hosts and container services, see proc(5) and ptrace(2)


Windows

The Gremlin daemon is installed as a Windows service under the LocalSystem account. Experiments created from the user interface run as a child process of the daemon, so they also run under the LocalSystem account.

Gremlin configuration and work files are placed in the %ALLUSERSPROFILE%\Gremlin\Agent directory. By default, Windows resolves that location to C:\ProgramData\Gremlin\Agent. The Gremlin folders and files inherit permissions from the parent %ALLUSERSPROFILE% (C:\ProgramData) folder. Normally the permissions are read-write for administrators and read-only for all others. Those permissions prevent non-administrators from running experiments from the command line.

The Gremlin agent includes a kernel driver, which is used for latency experiments. Like the Gremlin daemon, the Gremlin kernel driver loads with the operating system.

Network Access

Gremlin never intercepts the content or payload of any network traffic. Gremlin only looks at routing information in order to apply its impact to the intended network traffic.

No Ingress ports required

The primary communication between Gremlin installations and the Gremlin Control Plane is handled by the Gremlin daemon. However, when targeting a container or Kubernetes pod, Gremlin spawns a sidecar that communicates directly with the Gremlin Control Plane for the duration of the experiment. For this reason, the daemon and experiment targets (including containers and Kubernetes pods) must have an outbound network path to the Gremlin service (api.gremlin.com).

Proxy support

The Gremlin Agent supports HTTP and HTTPS proxies via the environment variables http_proxy and https_proxy. These set the proxy server used for HTTP and HTTPS traffic, respectively. Values should be of the form http[s]://[username:password@]address:port, such as export https_proxy=https://proxy.your_company.com:8080 or export https_proxy=https://your_username:your_password@proxy.your_company.com:8080.
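
Because a malformed proxy value is easy to miss, it can help to check the format before setting the variable. The following is a rough sketch that tests a value against the http[s]://[username:password@]address:port form described above; the function name is_proxy_url and the regular expression are illustrative assumptions, not something the Gremlin Agent provides, and the check is deliberately loose (it does not validate hostnames or port ranges).

```shell
# is_proxy_url VALUE
# Returns 0 if VALUE looks like http[s]://[username:password@]address:port,
# non-zero otherwise. This is a format check only; it does not contact
# the proxy.
is_proxy_url() {
    printf '%s\n' "$1" |
        grep -Eq '^https?://([^:@/]+:[^@/]+@)?[^:@/]+:[0-9]+/?$'
}

# Example:
#   is_proxy_url "$https_proxy" || echo "https_proxy looks malformed" >&2
```

A value with no scheme or no port fails the check, which covers the two most common mistakes when copying a proxy address from elsewhere.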

For Linux, the Gremlin daemon, which is typically run as a service, requires these environment variables to be set in /etc/default/gremlind:

bash
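# Append the proxy setting to the daemon's environment file
echo "https_proxy=https://localhost:8888" | sudo tee -a /etc/default/gremlind
# Restart the daemon so it picks up the new setting
sudo systemctl restart gremlind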

On Windows, the environment variables can be set through Control Panel or with PowerShell commands.

Note that the Gremlin Service only functions via encrypted communication (HTTPS). Attempts to connect to it via unencrypted protocols (HTTP) are denied.

Secure command execution

The Gremlin Daemon periodically communicates with our service over a TLS-protected channel which is authenticated using your organization's credentials. Once authenticated, the daemon sends heartbeat messages to the service and receives instructions from the service as responses to the heartbeat messages. If an experiment has been scheduled, the daemon receives the instructions for executing that experiment. Each instruction action is pre-defined within the daemon. Arbitrary instructions cannot be executed.
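
The idea that only pre-defined actions can run is a standard allowlist pattern. The sketch below is a conceptual illustration only, not Gremlin's actual implementation: the function name run_instruction and the action names are hypothetical. It shows that anything outside a fixed, built-in set is rejected rather than executed.

```shell
# run_instruction ACTION
# Accepts only actions from a fixed, built-in list. The action names are
# hypothetical and chosen purely for illustration.
run_instruction() {
    case "$1" in
        shutdown|time_travel|latency|process_killer)
            # A real daemon would call the built-in handler here.
            echo "accepted: $1"
            ;;
        *)
            # Unknown input is never passed to a shell or executed.
            echo "rejected: $1" >&2
            return 1
            ;;
    esac
}
```

With this shape, the received instruction only selects one of the known handlers; its text is never evaluated as a command.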

The service API only supports TLSv1.2 connections.
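
To check from a host or container that an outbound TLS 1.2 connection to the service is possible, a probe along the following lines may help. This is a sketch assuming curl is installed; the function name check_outbound_tls is a placeholder, and the default URL is built from the api.gremlin.com hostname mentioned under Network Access. Any HTTP response counts as success, since the goal is only to confirm that the network path and TLS handshake work.

```shell
# check_outbound_tls [URL]
# Returns 0 if an HTTPS request restricted to TLS 1.2 completes and
# receives any HTTP response; returns non-zero otherwise.
check_outbound_tls() {
    url="${1:-https://api.gremlin.com}"
    # --tlsv1.2 sets the minimum version and --tls-max 1.2 the maximum,
    # so the handshake can only succeed with TLS 1.2.
    curl --silent --show-error --output /dev/null \
         --tlsv1.2 --tls-max 1.2 --max-time 10 "$url"
}

# Example:
#   check_outbound_tls && echo "outbound TLS 1.2 path is available"
```

curl honors the https_proxy environment variable, so running the probe in the same environment as the daemon also exercises the proxy configuration.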

Security auditing

The Gremlin Agent, Daemon, API, and web app undergo regular security auditing, including penetration testing, by the external security auditor Bishop Fox. All identified vulnerabilities are remediated promptly and confirmed via remediation testing by our auditors. We can provide a Letter of Assessment from our auditors outlining our most recent audit findings and remediation results upon request.

Two Factor Authentication (MFA)

Gremlin offers two-factor authentication. See MFA under User Authentication.

SAML SSO

Gremlin supports SAML SSO. See SAML under User Authentication.