Monitors

A monitor defines what to check and how often. UpSlim supports two monitor types: http and tcp.

Common fields

These fields apply to both HTTP and TCP monitors.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | required | Unique name for this monitor |
| type | http or tcp | required | Monitor type |
| interval | duration | 60s | How often to run the check |
| timeout | duration | 30s | Maximum time to wait |
| failure_threshold | integer | 3 | Consecutive failures before alerting |
| success_threshold | integer | 2 | Consecutive successes to mark recovered |
| send_on_resolved | boolean | true | Send recovery notification |
| conditions | list | required | Conditions that must all pass |
| alerts | list | [] | Alert providers to notify |
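
As a minimal sketch of how the common fields fit together, the monitor below sets each optional common field to its documented default. The monitor name and URL are placeholders chosen for illustration, not values from a real deployment.

```yaml
monitors:
  - name: checkout-page                        # placeholder name; must be unique
    type: http
    url: "https://shop.example.com/checkout"   # placeholder URL
    interval: 60s              # run the check every 60 seconds
    timeout: 30s               # wait at most 30 seconds per check
    failure_threshold: 3       # alert after 3 consecutive failures
    success_threshold: 2       # mark recovered after 2 consecutive successes
    send_on_resolved: true     # send a notification on recovery
    conditions:
      - "[STATUS] == 200"
    alerts:
      - name: slack-ops
```

With these values, the third consecutive failed check triggers the alert, and two consecutive passing checks afterward mark the monitor as recovered. Omitting the five optional fields should produce the same behavior, since they match the defaults in the table.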

HTTP monitor

Performs an HTTP request and evaluates the response.

```yaml
monitors:
  - name: api-health
    type: http
    url: "https://api.example.com/health"
    interval: 30s
    timeout: 10s
    method: GET                           # optional, default: GET
    headers:
      Authorization: "Bearer ${API_TOKEN}"
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 500"
      - "[BODY].status == healthy"
    alerts:
      - name: slack-ops
```

HTTP-specific fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| url | string | required | Full URL including scheme |
| method | string | GET | HTTP method |
| headers | map | {} | Request headers |
| body | string | none | Request body (for POST/PUT) |
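
The example above only issues a GET request. A rough sketch of a monitor that uses method, headers, and body together might look like the following. The monitor name, endpoint, and JSON payload are hypothetical, and the Content-Type header is an assumption about what the target service expects.

```yaml
monitors:
  - name: echo-post                          # hypothetical name
    type: http
    url: "https://api.example.com/echo"      # hypothetical endpoint
    method: POST
    headers:
      Content-Type: "application/json"       # assumed to be required by the target service
    body: '{"ping": true}'                   # request body, given as a string
    conditions:
      - "[STATUS] < 400"
```

Single quotes around the body keep the inner double quotes of the JSON payload intact without escaping.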

Available conditions for HTTP

| Expression | Description |
| --- | --- |
| [STATUS] == 200 | HTTP status code equals 200 |
| [STATUS] < 400 | HTTP status code is less than 400 |
| [RESPONSE_TIME] < 500 | Response time is under 500 milliseconds |
| [BODY] == ok | Raw response body equals the string ok |
| [BODY].field == value | Value at the JSON dot-path field equals value |

See the Conditions DSL for the full syntax.
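
As an illustration of combining these expressions, the sketch below checks a plain-text endpoint rather than a JSON one. The monitor name and URL are hypothetical. Whether surrounding whitespace in the body is trimmed before comparison is not stated here, so treat the [BODY] check as an exact match.

```yaml
monitors:
  - name: worker-ping                        # hypothetical name
    type: http
    url: "https://worker.example.com/ping"   # hypothetical endpoint returning plain text
    conditions:
      - "[STATUS] < 400"           # status code below 400
      - "[RESPONSE_TIME] < 500"    # response time under 500 ms
      - "[BODY] == ok"             # raw body equals the string ok
```

Because all conditions must pass, a slow but otherwise correct response still counts as a failed check.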

TCP monitor

Opens a TCP connection and checks that it succeeds within the timeout.

```yaml
monitors:
  - name: postgres
    type: tcp
    host: "db.internal"
    port: 5432
    interval: 60s
    timeout: 5s
    conditions:
      - "[CONNECTED] == true"
    alerts:
      - name: slack-ops
```

TCP-specific fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| host | string | required | Hostname or IP address |
| port | integer | required | Port number (1–65535) |

Available conditions for TCP

| Expression | Description |
| --- | --- |
| [CONNECTED] == true | TCP handshake completed successfully |
| [RESPONSE_TIME] < 100 | Connection was established in under 100 milliseconds |
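
Both TCP conditions can be used on one monitor. The following sketch checks that a connection opens and that it opens quickly. The monitor name, interval, and timeout are illustrative choices.

```yaml
monitors:
  - name: redis-latency            # illustrative name
    type: tcp
    host: "redis.internal"
    port: 6379
    interval: 30s
    timeout: 2s
    conditions:
      - "[CONNECTED] == true"      # handshake must complete
      - "[RESPONSE_TIME] < 100"    # and complete in under 100 ms
    alerts:
      - name: slack-ops
```

Since all conditions must pass, a connection that succeeds but takes 100 ms or longer counts as a failed check.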

Per-monitor alert overrides

Thresholds can be overridden per alert reference, not just at the monitor level:

```yaml
monitors:
  - name: critical-api
    type: http
    url: "https://payments.example.com/health"
    conditions:
      - "[STATUS] == 200"
    failure_threshold: 3    # monitor-level default
    alerts:
      - name: slack-ops
        failure_threshold: 1  # alert fires after 1 failure for this provider
```

WARNING

Per-alert failure_threshold overrides are read by the alert state machine, so they change when an alert fires, not how often the check runs. The monitor still runs every interval, and the alert fires only once its overridden threshold is reached.
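
To show an override next to an alert that does not use one, the sketch below attaches two alert references to the same monitor. The email-oncall provider name is hypothetical, and the example assumes that an alert reference without its own failure_threshold falls back to the monitor-level value.

```yaml
monitors:
  - name: critical-api
    type: http
    url: "https://payments.example.com/health"
    conditions:
      - "[STATUS] == 200"
    failure_threshold: 3          # monitor-level value
    alerts:
      - name: slack-ops
        failure_threshold: 1      # fires after the 1st consecutive failure
      - name: email-oncall        # hypothetical provider; no override,
                                  # so it is assumed to use the monitor-level 3
```

Under that assumption, a sustained outage would notify slack-ops on the first failed check and email-oncall on the third.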

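Multiple monitors example

This configuration sets a shared interval and timeout under defaults and defines four monitors. The api monitor overrides the default interval.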

```yaml
defaults:
  interval: 60s
  timeout: 10s

monitors:
  - name: web
    type: http
    url: "https://example.com"
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 2000"

  - name: api
    type: http
    url: "https://api.example.com/health"
    interval: 30s        # override global default
    conditions:
      - "[STATUS] == 200"
      - "[BODY].status == healthy"

  - name: database
    type: tcp
    host: "db.internal"
    port: 5432
    conditions:
      - "[CONNECTED] == true"

  - name: cache
    type: tcp
    host: "redis.internal"
    port: 6379
    conditions:
      - "[CONNECTED] == true"
```

Released under the MIT License.