feat: shutdown race resilience
A significant rewrite to ensure that we don't suffer from shutdown race
conditions when the prune condition is met while additional resources
are still being created.

Previously this would remove resources that were still in use. Now we
retry if we detect that new resources have been created within a window
of the prune condition triggering.

This supports the following new environment configuration settings:
- RYUK_REMOVE_RETRIES - The number of times to retry removing a resource.
- RYUK_REQUEST_TIMEOUT - The timeout for any Docker requests.
- RYUK_RETRY_OFFSET - The offset added to the start time of the prune
  pass that is used as the minimum resource creation time.
- RYUK_SHUTDOWN_TIMEOUT - The duration after shutdown has been requested
  when the remaining connections are ignored and prune checks start.

Also bumps Go to v1.22 and golangci-lint to v1.59.1 to avoid false lint
failures.

Update the README to correct the example: health is only valid for
containers, not the other resources, so it would cause failures.
stevenh committed Sep 3, 2024
1 parent 2035aab commit 2169978
Showing 17 changed files with 1,650 additions and 916 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/golangci-lint.yml
@@ -22,5 +22,5 @@ jobs:
       - name: golangci-lint
         uses: golangci/golangci-lint-action@3cfe3a4abbb849e10058ce4af15d205b6da42804 # v4
         with:
-          version: v1.55.2
+          version: v1.59.1
           args: --timeout=3m
6 changes: 6 additions & 0 deletions .gitignore
@@ -5,3 +5,9 @@

vendor/
bin/
+
+# Binary
+moby-ryuk
+
+# VS Code
+.vscode
83 changes: 83 additions & 0 deletions .golangci.yaml
@@ -0,0 +1,83 @@
run:
  timeout: 2m

linters-settings:
  gosec:
    excludes:
      - G601 ## Implicit memory aliasing of items from a range statement - not possible in go 1.22.
  cyclop:
    max-complexity: 15
  nestif:
    min-complexity: 10
  govet:
    settings:
      shadow:
        strict: true
    enable-all: true
  nolintlint:
    require-explanation: true
  godot:
    scope: all

linters:
  enable-all: true
  disable:
    # Spammy / low value
    - varnamelen
    - exhaustruct
    - nlreturn
    - wsl
    - lll
    - paralleltest
    # Duplicate functionality.
    - funlen
    - gocognit
    # Deprecated.
    - execinquery
    - gomnd
    # Good but gets in the way too often.
    - testpackage
    # Unknown details about how Artemis works are flagged with TODO's.
    - godox
    # Seems to be broken.
    - depguard
    # Makes it messy for multiple optional tags.
    - tagalign
    # Not needed for go 1.22+.
    - exportloopref
    - errchkjson # Duplicate functionality for errcheck.

issues:
  include:
    - EXC0012
    - EXC0014
  exclude-rules:
    # Exclude linters which aren't an issue in tests.
    - path: _test\.go
      linters:
        - gochecknoglobals
        - wrapcheck

    # File mode permissions are fine for constants.
    - text: "Magic number: 0o\\d+"
      linters:
        - mnd

    # Field alignment in tests isn't a performance issue.
    - text: fieldalignment
      path: _test\.go

    # Dynamic errors can provide useful context.
    - text: "do not define dynamic errors, use wrapped static errors instead:"
      linters:
        - err113

    # We need to use the `err` named return for error handling.
    - text: 'named return "err" with type "error" found'
      linters:
        - nonamedreturns

    # Interface casting is fine in mock.
    - path: mock_test\.go
      linters:
        - forcetypeassert
44 changes: 33 additions & 11 deletions README.md
@@ -1,22 +1,36 @@
# Moby Ryuk

-This project helps you to remove containers/networks/volumes/images by given filter after specified delay.
+This project helps you to remove containers, networks, volumes and images by given filter after specified delay.

-# Usage
+## Building
+
+To build the binary only run:
+
+```shell
+go build
+```
+
+To build the Linux docker container as the latest tag:
+
+```shell
+docker build -f linux/Dockerfile -t testcontainers/ryuk:latest .
+```
+
+## Usage

1. Start it:

-$ RYUK_PORT=8080 ./bin/moby-ryuk
-$ # You can also run it with Docker
-$ docker run -v /var/run/docker.sock:/var/run/docker.sock -e RYUK_PORT=8080 -p 8080:8080 testcontainers/ryuk:0.6.0
+RYUK_PORT=8080 ./bin/moby-ryuk
+# You can also run it with Docker
+docker run -v /var/run/docker.sock:/var/run/docker.sock -e RYUK_PORT=8080 -p 8080:8080 testcontainers/ryuk:0.6.0

1. Connect via TCP:

-$ nc localhost 8080
+nc localhost 8080

1. Send some filters:

-label=testing=true&health=unhealthy
+label=testing=true&label=testing.sessionid=mysession
ACK
label=something
ACK
@@ -37,7 +51,15 @@ This project helps you to remove containers/networks/volumes/images by given fil

## Ryuk configuration

-- `RYUK_CONNECTION_TIMEOUT` - Environment variable that defines the timeout for Ryuk to receive the first connection (default: 60s). Value layout is described in [time.ParseDuration](https://golang.org/pkg/time/#ParseDuration) documentation.
-- `RYUK_PORT` - Environment variable that defines the port where Ryuk will be bound to (default: 8080).
-- `RYUK_RECONNECTION_TIMEOUT` - Environment variable that defines the timeout for Ryuk to reconnect to Docker (default: 10s). Value layout is described in [time.ParseDuration](https://golang.org/pkg/time/#ParseDuration) documentation.
-- `RYUK_VERBOSE` - Environment variable that defines if Ryuk should print debug logs (default: false).
+The following environment variables can be configured to change the behaviour:
+
+| Environment Variable | Default | Format | Description |
+| - | - | - | - |
+| `RYUK_CONNECTION_TIMEOUT` | `60s` | [Duration](https://golang.org/pkg/time/#ParseDuration) | The duration without receiving any connections which will trigger a shutdown |
+| `RYUK_PORT` | `8080` | `uint16` | The port to listen on for connections |
+| `RYUK_RECONNECTION_TIMEOUT` | `10s` | [Duration](https://golang.org/pkg/time/#ParseDuration) | The duration after the last connection closes which will trigger resource clean up and shutdown |
+| `RYUK_REQUEST_TIMEOUT` | `10s` | [Duration](https://golang.org/pkg/time/#ParseDuration) | The timeout for any Docker requests |
+| `RYUK_REMOVE_RETRIES` | `10` | `int` | The number of times to retry removing a resource |
+| `RYUK_RETRY_OFFSET` | `-1s` | [Duration](https://golang.org/pkg/time/#ParseDuration) | The offset added to the start time of the prune pass that is used as the minimum resource creation time. Any resource created after this calculated time will trigger a retry to ensure in use resources are not removed |
+| `RYUK_VERBOSE` | `false` | `bool` | Whether to enable verbose aka debug logging |
+| `RYUK_SHUTDOWN_TIMEOUT` | `10m` | [Duration](https://golang.org/pkg/time/#ParseDuration) | The duration after shutdown has been requested when the remaining connections are ignored and prune checks start |
66 changes: 66 additions & 0 deletions config.go
@@ -0,0 +1,66 @@
package main

import (
	"fmt"
	"log/slog"
	"time"

	"github.com/caarlos0/env/v11"
)

// config represents the configuration for the reaper.
type config struct {
	// ConnectionTimeout is the duration without receiving any connections which will trigger a shutdown.
	ConnectionTimeout time.Duration `env:"RYUK_CONNECTION_TIMEOUT" envDefault:"60s"`

	// ReconnectionTimeout is the duration after the last connection closes which will trigger
	// resource clean up and shutdown.
	ReconnectionTimeout time.Duration `env:"RYUK_RECONNECTION_TIMEOUT" envDefault:"10s"`

	// RequestTimeout is the timeout for any Docker requests.
	RequestTimeout time.Duration `env:"RYUK_REQUEST_TIMEOUT" envDefault:"10s"`

	// RemoveRetries is the number of times to retry removing a resource.
	RemoveRetries int `env:"RYUK_REMOVE_RETRIES" envDefault:"10"`

	// RetryOffset is the offset added to the start time of the prune pass that is
	// used as the minimum resource creation time. Any resource created after this
	// calculated time will trigger a retry to ensure in use resources are not removed.
	RetryOffset time.Duration `env:"RYUK_RETRY_OFFSET" envDefault:"-1s"`

	// ShutdownTimeout is the maximum amount of time the reaper will wait
	// for once signalled to shutdown before it terminates even if connections
	// are still established.
	ShutdownTimeout time.Duration `env:"RYUK_SHUTDOWN_TIMEOUT" envDefault:"10m"`

	// Port is the port to listen on for connections.
	Port uint16 `env:"RYUK_PORT" envDefault:"8080"`

	// Verbose is whether to enable verbose aka debug logging.
	Verbose bool `env:"RYUK_VERBOSE" envDefault:"false"`
}

// LogAttrs returns the configuration as a slice of attributes.
func (c config) LogAttrs() []slog.Attr {
	return []slog.Attr{
		slog.Duration("connection_timeout", c.ConnectionTimeout),
		slog.Duration("reconnection_timeout", c.ReconnectionTimeout),
		slog.Duration("request_timeout", c.RequestTimeout),
		slog.Duration("shutdown_timeout", c.ShutdownTimeout),
		slog.Int("remove_retries", c.RemoveRetries),
		slog.Duration("retry_offset", c.RetryOffset),
		slog.Int("port", int(c.Port)),
		slog.Bool("verbose", c.Verbose),
	}
}

// loadConfig loads the configuration from the environment
// applying defaults where necessary.
func loadConfig() (*config, error) {
	var cfg config
	if err := env.Parse(&cfg); err != nil {
		return nil, fmt.Errorf("parse env: %w", err)
	}

	return &cfg, nil
}
80 changes: 80 additions & 0 deletions config_test.go
@@ -0,0 +1,80 @@
package main

import (
	"os"
	"reflect"
	"testing"
	"time"

	"github.com/stretchr/testify/require"
)

// clearConfigEnv clears the environment variables for the config fields.
func clearConfigEnv(t *testing.T) {
	t.Helper()

	var cfg config
	typ := reflect.TypeOf(cfg)
	for i := range typ.NumField() {
		field := typ.Field(i)
		if name := field.Tag.Get("env"); name != "" {
			if os.Getenv(name) != "" {
				t.Setenv(name, "")
			}
		}
	}
}

func Test_loadConfig(t *testing.T) {
	tests := map[string]struct {
		setEnv   func(*testing.T)
		expected config
	}{
		"defaults": {
			setEnv: clearConfigEnv,
			expected: config{
				ConnectionTimeout:   time.Minute,
				Port:                8080,
				ReconnectionTimeout: time.Second * 10,
				RemoveRetries:       10,
				RequestTimeout:      time.Second * 10,
				RetryOffset:         -time.Second,
				ShutdownTimeout:     time.Minute * 10,
			},
		},
		"custom": {
			setEnv: func(t *testing.T) {
				t.Helper()

				clearConfigEnv(t)
				t.Setenv("RYUK_PORT", "1234")
				t.Setenv("RYUK_CONNECTION_TIMEOUT", "2s")
				t.Setenv("RYUK_RECONNECTION_TIMEOUT", "3s")
				t.Setenv("RYUK_REQUEST_TIMEOUT", "4s")
				t.Setenv("RYUK_REMOVE_RETRIES", "5")
				t.Setenv("RYUK_RETRY_OFFSET", "-6s")
				t.Setenv("RYUK_SHUTDOWN_TIMEOUT", "7s")
			},
			expected: config{
				Port:                1234,
				ConnectionTimeout:   time.Second * 2,
				ReconnectionTimeout: time.Second * 3,
				RequestTimeout:      time.Second * 4,
				RemoveRetries:       5,
				RetryOffset:         -time.Second * 6,
				ShutdownTimeout:     time.Second * 7,
			},
		},
	}
	for name, tc := range tests {
		t.Run(name, func(t *testing.T) {
			if tc.setEnv != nil {
				tc.setEnv(t)
			}

			cfg, err := loadConfig()
			require.NoError(t, err)
			require.Equal(t, tc.expected, *cfg)
		})
	}
}
18 changes: 18 additions & 0 deletions consts.go
@@ -0,0 +1,18 @@
package main

const (
	// labelBase is the base label for testcontainers.
	labelBase = "org.testcontainers"

	// ryukLabel is the label used to identify reaper containers.
	ryukLabel = labelBase + ".ryuk"

	// fieldError is the log field key for errors.
	fieldError = "error"

	// fieldAddress is the log field key for a client or listening address.
	fieldAddress = "address"

	// fieldClients is the log field key used for client counts.
	fieldClients = "clients"
)