crabcakes

A lightweight S3-compatible server that serves files from your filesystem.

Features

S3-compatible API
AWS Signature V4 authentication with IAM policy-based authorization
Path-style and virtual-hosted-style requests
Streaming uploads with AWS chunked encoding support
Smart body buffering (memory/disk spillover)
Works with AWS CLI and SDKs

Quick Start

If you’re working in the repository, build the binary:

cargo build --release

Or install it with cargo install crabcakes, or use the Docker container ghcr.io/yaleman/crabcakes:latest.

# Start server (default: https://localhost:9000, serving ./data)
crabcakes

# Custom host, port, and base dir for data
crabcakes --host 0.0.0.0 --port 8080 --root-dir /path/to/files

# With debug logging
RUST_LOG=debug crabcakes

Usage with AWS CLI

# List buckets
aws s3 ls --endpoint-url https://localhost:9000

# Create bucket
aws s3 mb s3://mybucket --endpoint-url https://localhost:9000

# Upload object
aws s3 cp file.txt s3://mybucket/ --endpoint-url https://localhost:9000

# Download object
aws s3 cp s3://mybucket/file.txt . --endpoint-url https://localhost:9000

# Delete multiple objects
aws s3api delete-objects --bucket mybucket --delete '{"Objects":[{"Key":"file1.txt"},{"Key":"file2.txt"}]}' --endpoint-url https://localhost:9000

# Copy object (server-side)
aws s3api copy-object --bucket mybucket --key dest.txt --copy-source mybucket/source.txt --endpoint-url https://localhost:9000

Testing

cargo test       # Run all tests
./manual_test.sh # Test with AWS CLI, tends to find weirdness

Configuration

Crabcakes uses a filesystem-based configuration system that stores credentials, policies, and metadata in a configurable directory. This page covers the structure, requirements, and management of configuration files.

Configuration Directory

Default Location

By default, Crabcakes looks for configuration in the ./config directory relative to where the server is started. This can be customized using:

CLI flag: --config-dir <PATH>
Environment variable: CRABCAKES_CONFIG_DIR
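
The lookup above can be sketched as a small helper. This is only an illustration: the precedence order (CLI flag first, then the environment variable, then ./config) is an assumption, and resolve_config_dir is a hypothetical name, not a function from the crabcakes source.

```rust
use std::path::PathBuf;

/// Hypothetical helper: pick the configuration directory.
/// Assumed precedence: CLI flag, then CRABCAKES_CONFIG_DIR, then ./config.
fn resolve_config_dir(cli_flag: Option<&str>, env_value: Option<&str>) -> PathBuf {
    cli_flag
        .or(env_value)
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("./config"))
}
```

A caller would pass the parsed --config-dir value and the result of std::env::var("CRABCAKES_CONFIG_DIR").ok(), if it wanted to reproduce this order by hand.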

Directory Structure

config/
├── credentials/          # Credential JSON files (one per credential)
│   ├── alice.json
│   └── bob.json
├── policies/            # Policy JSON files (one per policy)
│   ├── admin.json
│   └── read-only.json
└── crabcakes.sqlite3    # SQLite database for metadata

Automatic Creation: If the configuration directory or its subdirectories don’t exist, they will be created automatically when the server starts.

Credentials

Credentials are stored as individual JSON files in the credentials/ subdirectory. Each file represents one set of AWS-compatible access credentials.

Credential File Format

Each credential file must be a valid JSON file with exactly two fields: access_key_id and secret_access_key. The secret_access_key must be exactly 40 characters long.

{
  "access_key_id": "alice",
  "secret_access_key": "alicesecret12345678901234567890123456789"
}

Field Requirements

`access_key_id`

Type: String
Usage: Used as the username for authentication and authorization

`secret_access_key`

Type: String
Length: MUST be exactly 40 characters (AWS standard length)
Validation: Enforced at load time and creation time
Critical: Credentials with invalid secret length will be rejected with an error
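
The field rules can be expressed as a short validation function. This is a minimal sketch, not the crabcakes implementation: the Credential struct and validate_credential are hypothetical names, counting Unicode characters rather than bytes is an assumption, and the access_key_id check mirrors the path traversal rule listed under Security Considerations.

```rust
/// Hypothetical in-memory form of a credential file.
struct Credential {
    access_key_id: String,
    secret_access_key: String,
}

/// Applies the documented rules: no path traversal sequences in the
/// access key id, and a secret of exactly 40 characters.
fn validate_credential(cred: &Credential) -> Result<(), String> {
    let id = &cred.access_key_id;
    if id.contains("..") || id.contains('/') || id.contains('\\') {
        return Err(format!("access_key_id {:?} contains a blocked sequence", id));
    }
    // Assumption: length is measured in characters, not bytes.
    let len = cred.secret_access_key.chars().count();
    if len != 40 {
        return Err(format!(
            "secret_access_key must be exactly 40 characters, got {}",
            len
        ));
    }
    Ok(())
}
```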

Credential Loading Behavior

All .json files in the credentials/ directory are loaded at server startup
Files are processed asynchronously
Invalid credentials are logged but don’t prevent server startup
If no valid credentials are loaded, the server will start but no authentication will succeed
Credentials are cached in memory for fast signature verification

Duplicate Access Key Prevention

During Startup (File Loading):

If multiple credential files contain the same access_key_id, the first file processed wins
A warning is logged when duplicate access_key_id values are encountered: “Duplicate access_key_id found, ignoring this credential file (first credential loaded takes precedence)”
Subsequent credential files with the same access_key_id are ignored
Only the first credential loaded will be active
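
The first-wins rule can be illustrated with a plain map. This is a sketch under stated assumptions: load_first_wins is a hypothetical function, entries stand in for already-parsed credential files, and the real server logs a warning where the sketch only records the ignored id.

```rust
use std::collections::HashMap;

/// Loads (access_key_id, secret) pairs in the order given.
/// Returns the active credentials and the ids ignored as duplicates.
fn load_first_wins(entries: &[(&str, &str)]) -> (HashMap<String, String>, Vec<String>) {
    let mut active: HashMap<String, String> = HashMap::new();
    let mut ignored: Vec<String> = Vec::new();
    for (id, secret) in entries {
        if active.contains_key(*id) {
            // Duplicate: keep the credential that was loaded first.
            ignored.push(id.to_string());
        } else {
            active.insert(id.to_string(), secret.to_string());
        }
    }
    (active, ignored)
}
```

Because the result depends on processing order, and files are processed asynchronously, it may not be predictable which of two duplicate files ends up active. Unique access_key_id values avoid the question entirely.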

When Creating Credentials via Web UI:

The server explicitly checks if a credential with the same access_key_id already exists
If found, returns HTTP error with message: “Credential with the same identifier already exists”
Creation is blocked - you must delete the existing credential first

Best Practice: Use unique access_key_id values and avoid creating multiple credential files with the same identifier.

Security Considerations

Never commit production credentials to git - Add config/ to your .gitignore
Secret access keys are stored as SecretString in memory to prevent accidental logging
Credentials cannot use path traversal sequences in access_key_id (.., /, \ are blocked)

Policies

Policies define authorization rules using AWS IAM-compatible policy syntax. Policy files are stored in the policies/ subdirectory.

See Policies for more details.

Configuration Options

CLI Flags

crabcakes [OPTIONS]

Server Options:

--host <HOST> - Listener address (default: 127.0.0.1)
-p, --port <PORT> - Port number (default: 9000)
-r, --root-dir <PATH> - Root directory for file storage (default: ./data)

Configuration:

-c, --config-dir <PATH> - Configuration directory (default: ./config)
--region <REGION> - AWS region name (default: crabcakes)

TLS:

--tls-cert <PATH> - Path to TLS certificate file
--tls-key <PATH> - Path to TLS private key file

Authentication:

--oidc-client-id <ID> - OIDC client ID for OAuth2 authentication (required for admin UI)
--oidc-discovery-url <URL> - OIDC issuer URL (required for admin UI)
--frontend-url <URL> - Frontend URL for OIDC redirect URIs when behind reverse proxy

Environment Variables

All CLI flags can be set via environment variables:

CRABCAKES_LISTENER_ADDRESS - Listener address
CRABCAKES_PORT - Port number
CRABCAKES_ROOT_DIR - Root directory for files
CRABCAKES_CONFIG_DIR - Configuration directory
CRABCAKES_REGION - AWS region name
CRABCAKES_TLS_CERT - TLS certificate path
CRABCAKES_TLS_KEY - TLS key path
CRABCAKES_OIDC_CLIENT_ID - OIDC client ID
CRABCAKES_OIDC_DISCOVERY_URL - OIDC discovery URL
CRABCAKES_FRONTEND_URL - Frontend URL for reverse proxy

Examples

Basic setup:

crabcakes --config-dir /etc/crabcakes

Custom host and port:

crabcakes --host 0.0.0.0 --port 8080

Using environment variables:

export CRABCAKES_CONFIG_DIR=/etc/crabcakes
export CRABCAKES_PORT=8080
export CRABCAKES_OIDC_CLIENT_ID=your-client-id
export CRABCAKES_OIDC_DISCOVERY_URL=https://accounts.google.com
crabcakes

With TLS:

crabcakes \
  --tls-cert /path/to/cert.pem \
  --tls-key /path/to/key.pem \
  --frontend-url https://s3.example.com

Database

Crabcakes uses SQLite to store metadata including object tags, OAuth PKCE state, temporary credentials, and bucket website configurations.

Database Location: {config_dir}/crabcakes.sqlite3

Features:

Automatically created on first startup
Migrations run automatically on startup
WAL mode enabled for better concurrency
Auto-vacuum enabled for disk space management

For complete database schema and details, see the Database Documentation.

Reserved Names

The following bucket names are reserved for the admin UI and cannot be created as S3 buckets:

admin
api
login
logout
oauth2
.well-known
config
oidc
crabcakes
docs
help
.multipart
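
A check against this list might look like the following sketch. The names RESERVED_BUCKET_NAMES and is_reserved_bucket_name are hypothetical, and the exact, case-sensitive comparison is an assumption; the server may normalize names differently.

```rust
/// Reserved names copied from the list above.
const RESERVED_BUCKET_NAMES: [&str; 12] = [
    "admin", "api", "login", "logout", "oauth2", ".well-known",
    "config", "oidc", "crabcakes", "docs", "help", ".multipart",
];

/// Hypothetical check; assumes an exact, case-sensitive comparison.
fn is_reserved_bucket_name(name: &str) -> bool {
    RESERVED_BUCKET_NAMES.contains(&name)
}
```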

Best Practices

Security

Never commit credentials to git: Add config/ to .gitignore
Use strong secrets: Generate random 40-character secret access keys
Principle of least privilege: Grant minimum permissions needed
Test policies: Use the Policy Troubleshooter before deploying

Organization

Naming conventions: Use descriptive names for credentials and policies
One policy per use case: Create separate policy files for different roles
Document policies: Use meaningful Sid values in policy statements
Regular audits: Review credentials and policies periodically

Production Deployment

Use TLS: Always enable TLS in production with --tls-cert and --tls-key
Restrict host: Use --host 127.0.0.1 or specific IP, not 0.0.0.0
Configure OIDC: Set up proper OIDC provider for admin UI authentication
Set frontend URL: Use --frontend-url when behind reverse proxy
Monitor logs: Use RUST_LOG environment variable for logging control

Troubleshooting

Credentials not loading

Symptoms: Authentication fails, logs show “No credentials loaded”

Solutions:

Verify credential files are in {config_dir}/credentials/
Check files have .json extension
Verify JSON is valid (use jq or JSON validator)
Ensure secret_access_key is exactly 40 characters
Check file permissions (must be readable by server process)

Policies not taking effect

Symptoms: Authorization denied unexpectedly

Solutions:

Verify policy files are in {config_dir}/policies/
Check JSON syntax is valid
Use Policy Troubleshooter to test evaluation
Check principal ARN matches credential’s access_key_id
Verify resource ARN matches bucket/key being accessed
Remember: explicit Deny wins over Allow

Database errors

Symptoms: Errors related to SQLite or migrations

Solutions:

Check {config_dir} directory is writable
Verify disk space is available
Delete crabcakes.sqlite3* files and restart (data will be lost)
Check for file permission issues

Admin UI not accessible

Symptoms: Cannot access /admin URL

Solutions:

Verify OIDC is configured (--oidc-client-id and --oidc-discovery-url)
Check OIDC discovery URL is correct and accessible
Verify redirect URI is registered with OIDC provider
Use --frontend-url if behind reverse proxy
Check browser console for errors

Policies

Policy File Format

Policies follow standard AWS IAM policy format:

{
    "Version": "2012-10-17",
    "Id": "S3BucketPolicy",
    "Statement": [
        {
            "Sid": "AllowS3All",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam:::user/testuser"
            },
            "Action": [
                "s3:*"
            ],
            "Resource": "arn:aws:s3:::bucket1/testuser/*"
        }
    ]
}

Policy Components

Version

Standard AWS IAM policy version: "2012-10-17"

Statement Array

Each policy contains one or more statements with the following fields:

Sid (optional)

Statement identifier for documentation purposes

Effect (required)

"Allow" - Grants permission
"Deny" - Explicitly denies permission (takes precedence over Allow)

Principal (required)

Specifies who the policy applies to
AWS user: {"AWS": "arn:aws:iam:::user/username"}
Wildcard (anonymous): "*"

Action (required)

S3 action or actions to allow/deny
Single action: "s3:GetObject"
Multiple actions: ["s3:GetObject", "s3:PutObject"]
Wildcard: "s3:*"

Resource (required)

S3 resource ARN or ARNs
Specific object: "arn:aws:s3:::bucket/key"
Bucket objects: "arn:aws:s3:::bucket/*"
Multiple resources: ["arn:aws:s3:::bucket1", "arn:aws:s3:::bucket1/*"]
Wildcard: "*"
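
The ARN shapes used in Principal and Resource can be built with two small helpers. These are illustrative only; principal_arn and resource_arn are hypothetical names, and the formats are taken directly from the examples above.

```rust
/// Principal ARN for a user, in the form shown above.
fn principal_arn(username: &str) -> String {
    format!("arn:aws:iam:::user/{}", username)
}

/// Resource ARN for a bucket, or for an object when a key is given.
fn resource_arn(bucket: &str, key: Option<&str>) -> String {
    match key {
        Some(k) => format!("arn:aws:s3:::{}/{}", bucket, k),
        None => format!("arn:aws:s3:::{}", bucket),
    }
}
```

Note that the bucket ARN and the object ARN are distinct resources, which is why a policy covering both listing and reading usually names both "arn:aws:s3:::bucket1" and "arn:aws:s3:::bucket1/*".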

Supported S3 Actions

Crabcakes supports the following S3 actions in policies:

Object Operations:

s3:GetObject - Read objects
s3:PutObject - Write objects
s3:DeleteObject - Delete objects
s3:GetObjectTagging - Read object tags
s3:PutObjectTagging - Write object tags
s3:DeleteObjectTagging - Delete object tags
s3:GetObjectAttributes - Read object metadata

Bucket Operations:

s3:ListBucket - List bucket contents
s3:CreateBucket - Create new buckets
s3:DeleteBucket - Delete buckets
s3:HeadBucket - Check bucket existence
s3:GetBucketLocation - Get bucket region
s3:ListAllMyBuckets - List all buckets
s3:GetBucketWebsite - Get website configuration
s3:PutBucketWebsite - Set website configuration
s3:DeleteBucketWebsite - Delete website configuration

Multipart Upload Operations:

s3:AbortMultipartUpload - Cancel multipart upload
s3:ListBucketMultipartUploads - List in-progress uploads
s3:ListMultipartUploadParts - List parts of an upload

Wildcards:

s3:* - All S3 actions

Policy Name Validation

Policy filenames must meet the following requirements:

Pattern: ^[a-zA-Z0-9]{1}[a-zA-Z0-9-_]*[a-zA-Z0-9]{1}$
Must start and end with alphanumeric characters
Can contain letters, numbers, hyphens (-), and underscores (_)
Minimum 2 characters
Cannot contain .., /, or \ (path traversal protection)

Valid examples: admin-policy, read_only, testUser123

Invalid examples: -admin, policy-, a, ../etc/passwd
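
The naming rules can be checked without a regex engine. The sketch below is a hand-written equivalent of the pattern above, restricted to ASCII; is_valid_policy_name is a hypothetical name and not the function crabcakes uses.

```rust
/// Mirrors ^[a-zA-Z0-9]{1}[a-zA-Z0-9-_]*[a-zA-Z0-9]{1}$ using byte checks.
fn is_valid_policy_name(name: &str) -> bool {
    let bytes = name.as_bytes();
    // The pattern needs a first and a last character, so at least 2 bytes.
    if bytes.len() < 2 {
        return false;
    }
    let edge_ok = |b: u8| b.is_ascii_alphanumeric();
    let middle_ok = |b: u8| b.is_ascii_alphanumeric() || b == b'-' || b == b'_';
    edge_ok(bytes[0])
        && edge_ok(bytes[bytes.len() - 1])
        && bytes[1..bytes.len() - 1].iter().all(|&b| middle_ok(b))
}
```

Since '.', '/', and '\' are not in the allowed set, names such as ../etc/passwd fail the character check before any separate path traversal test is needed.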

Policy Evaluation

Crabcakes uses the iam-rs library for AWS-compatible policy evaluation:

Default deny: All requests denied unless explicitly allowed
Explicit deny wins: Deny statements override Allow statements
Evaluation caching: Results cached for 5 minutes using SHA256 hash of request
Cache invalidation: Cleared when policies are added, updated, or deleted
Wildcard principals: Supports anonymous access with "Principal": "*"
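
The first two rules (default deny, explicit deny wins) can be sketched in a few lines. This is a deliberately simplified model and not the iam-rs API: the Statement struct, the matches helper, and evaluate are hypothetical, wildcards are only supported as a trailing "*", and the no-match case is folded into Deny.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Effect { Allow, Deny }

#[derive(Debug, PartialEq)]
enum Decision { Allow, Deny }

/// Simplified statement: one principal, one action, one resource pattern.
struct Statement {
    effect: Effect,
    principal: &'static str,
    action: &'static str,
    resource: &'static str,
}

/// Simplified matcher: "*" matches anything, a trailing "*" matches a prefix.
fn matches(pattern: &str, value: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => value.starts_with(prefix),
        None => pattern == value,
    }
}

fn evaluate(statements: &[Statement], principal: &str, action: &str, resource: &str) -> Decision {
    let mut allowed = false;
    for s in statements {
        let applies = matches(s.principal, principal)
            && matches(s.action, action)
            && matches(s.resource, resource);
        if !applies {
            continue;
        }
        match s.effect {
            // Explicit deny wins, regardless of any matching Allow.
            Effect::Deny => return Decision::Deny,
            Effect::Allow => allowed = true,
        }
    }
    // Default deny: without a matching Allow the request is refused.
    if allowed { Decision::Allow } else { Decision::Deny }
}
```

With an Allow on "s3:*" for alice and a Deny on "s3:DeleteObject" for a locked prefix, a GetObject request is allowed while a DeleteObject request under that prefix is denied, even though the Allow statement also matches it.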

Policy Loading Behavior

All .json files in the policies/ directory are loaded at server startup
Invalid policies are logged and skipped
Policies can be hot-reloaded via the admin UI
If a policy file is removed from disk, it’s removed from memory on next reload

Example Policies

Allow all operations for a specific user:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam:::user/alice"
            },
            "Action": "s3:*",
            "Resource": "*"
        }
    ]
}

Read-only access to a specific bucket:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam:::user/bob"
            },
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::public",
                "arn:aws:s3:::public/*"
            ]
        }
    ]
}

User-specific prefix access:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam:::user/charlie"
            },
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::shared/charlie/*"
        }
    ]
}

Web-Based Policy Management

Crabcakes provides web-based tools for managing and troubleshooting policies. These tools are available in the admin UI at /admin (requires OIDC authentication).

Policy Editor

Access: Navigate to /admin/policies in your browser after authenticating.

The Policy Editor provides a full-featured interface for managing IAM policies:

Operations:

List Policies: View all loaded policies with their details
Create Policy: Form-based policy creation with JSON editor and syntax highlighting
Edit Policy: Modify existing policy JSON with validation
View Policy: See policy details and permissions breakdown
Delete Policy: Remove policies from the system

How to Use:

Log in to the admin UI at /admin using OIDC authentication
Click “Policies” in the navigation menu
Use the interface to:
- View the list of all policies
- Click “New Policy” to create a policy
- Click “Edit” next to a policy to modify it
- Click “View” to see detailed permissions
- Click “Delete” to remove a policy

Editor Features:

JSON syntax highlighting using Prism.js
Real-time validation before saving
Principal permissions breakdown view
Automatic policy cache refresh after changes
Direct filesystem integration (changes persist to policies/ directory)

Policy Troubleshooter

Access: Navigate to /admin/policy_troubleshooter in your browser after authenticating.

The Policy Troubleshooter helps debug authorization issues by simulating policy evaluation without making actual S3 requests.

How to Use:

Log in to the admin UI at /admin
Click “Policy Troubleshooter” in the navigation menu
Fill in the evaluation form:
- User: Principal username (e.g., “alice”)
- Action: S3 action from dropdown (e.g., “s3:GetObject”)
- Bucket: Bucket name
- Key: Object key (optional, for object-level actions)
- Policy Name: Specific policy to test (optional, tests all policies if empty)
Click “Test Policy” to see the result

Output:

Decision: Allow, Deny, or NotApplicable
Matched Statements: Which policy statements applied
Evaluation Context: Detailed information about the evaluation

Use Cases:

Debug why a user can’t access a resource
Verify policy changes before deploying to production
Understand which policies are granting/denying access
Test new policies before creating credentials

Example:

To test if user “alice” can read bucket1/test.txt:

User: alice
Action: s3:GetObject
Bucket: bucket1
Key: test.txt

The troubleshooter will show whether the request would be allowed based on loaded policies and which policy statements matched.

Development

All changes must pass cargo clippy (which is configured fairly aggressively), cargo fmt, and cargo test.

There are manual/integration tests which use the AWS CLI to test “real world” usage (./manual_test.sh and scripts/integration/*.sh).

Database Design

Overview

Crabcakes uses SQLite for storing metadata, sessions, and temporary credentials. The database is located at {config_dir}/crabcakes.sqlite3 (default: ./config/crabcakes.sqlite3) and is automatically created on first startup.

Database migrations are managed using SeaORM’s migration framework and run automatically on server startup.

Entity Relationship Diagram

erDiagram
    object_tags {
        INTEGER id PK
        TEXT bucket
        TEXT key
        TEXT tag_key
        TEXT tag_value
        DATETIME created_at
    }

    bucket_website_configs {
        TEXT bucket PK
        TEXT index_document_suffix
        TEXT error_document_key
        DATETIME created_at
        DATETIME updated_at
    }

    oauth_pkce_state {
        TEXT state PK
        TEXT code_verifier
        TEXT nonce
        TEXT pkce_challenge
        TEXT redirect_uri
        DATETIME expires_at
        DATETIME created_at
    }

    temporary_credentials {
        TEXT access_key_id PK
        TEXT secret_access_key
        TEXT session_id FK
        TEXT user_email
        TEXT user_id
        DATETIME expires_at
        DATETIME created_at
    }

    tower_sessions {
        TEXT id PK
        BLOB data
        DATETIME expiry_date
    }

    temporary_credentials ||--o| tower_sessions : "session_id"

Tables

object_tags

Stores S3 object tags with validation and indexing for efficient lookups.

classDiagram
    class object_tags {
        +INTEGER id
        +TEXT bucket
        +TEXT key
        +TEXT tag_key
        +TEXT tag_value
        +DATETIME created_at
    }

Constraints:

Primary key: id
Unique index: (bucket, key, tag_key) - ensures one value per tag key per object
Lookup index: (bucket, key) - optimizes tag retrieval for objects

Validation:

Maximum 10 tags per object
Tag keys: maximum 128 characters
Tag values: maximum 256 characters
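
The three limits can be enforced before any row is written. The sketch below is an assumption about how such a check could look; validate_tags and the constant names are hypothetical, and measuring length in characters rather than bytes is also an assumption.

```rust
const MAX_TAGS: usize = 10;
const MAX_KEY_CHARS: usize = 128;
const MAX_VALUE_CHARS: usize = 256;

/// Validates a tag set against the documented limits.
fn validate_tags(tags: &[(String, String)]) -> Result<(), String> {
    if tags.len() > MAX_TAGS {
        return Err(format!("too many tags: {} (maximum {})", tags.len(), MAX_TAGS));
    }
    for (key, value) in tags {
        if key.chars().count() > MAX_KEY_CHARS {
            return Err(format!("tag key exceeds {} characters", MAX_KEY_CHARS));
        }
        if value.chars().count() > MAX_VALUE_CHARS {
            return Err(format!("tag value exceeds {} characters", MAX_VALUE_CHARS));
        }
    }
    Ok(())
}
```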

Purpose: Supports S3 tagging operations (PutObjectTagging, GetObjectTagging, DeleteObjectTagging)

bucket_website_configs

Configuration for S3 static website hosting mode per bucket.

classDiagram
    class bucket_website_configs {
        +TEXT bucket
        +TEXT index_document_suffix
        +TEXT error_document_key
        +DATETIME created_at
        +DATETIME updated_at
    }

Constraints:

Primary key: bucket
index_document_suffix is required (NOT NULL)
error_document_key is optional (nullable)

Purpose:

Enables S3-compatible static website hosting per bucket
Configures index document suffix (e.g., “index.html”) for directory requests
Optionally configures error document (e.g., “error.html”) for 404 responses
Managed via the PutBucketWebsite, GetBucketWebsite, and DeleteBucketWebsite operations

Behavior:

When configured, GET /bucket/ automatically serves bucket/index.html (or configured suffix)
Directory paths ending with / append the index document suffix
404 errors automatically serve the error document if configured
Error document served with 404 status code and proper headers
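
The index document rule amounts to a small path rewrite. The following sketch assumes the path has already been stripped of the bucket name; resolve_website_key is a hypothetical name and the real request handling may differ.

```rust
/// Maps a request path within a bucket to the object key to serve.
/// Empty paths and paths ending in '/' get the index suffix appended.
fn resolve_website_key(path: &str, index_suffix: &str) -> String {
    let key = path.trim_start_matches('/');
    if key.is_empty() || key.ends_with('/') {
        format!("{}{}", key, index_suffix)
    } else {
        key.to_string()
    }
}
```

For a bucket configured with the suffix index.html, a request for the bucket root resolves to index.html and a request for docs/ resolves to docs/index.html, while a request for a regular key is left unchanged.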

oauth_pkce_state

Temporary storage for OAuth 2.0 PKCE flow state during OIDC authentication.

classDiagram
    class oauth_pkce_state {
        +TEXT state
        +TEXT code_verifier
        +TEXT nonce
        +TEXT pkce_challenge
        +TEXT redirect_uri
        +DATETIME expires_at
        +DATETIME created_at
    }

Constraints:

Primary key: state (OAuth state parameter)
Index: expires_at - optimizes cleanup operations

Purpose:

Stores PKCE (Proof Key for Code Exchange) parameters during OAuth flow
Validates callback requests from OIDC provider
Automatically cleaned up by background task after expiration

temporary_credentials

AWS-style temporary credentials generated for authenticated web UI users.

classDiagram
    class temporary_credentials {
        +TEXT access_key_id
        +TEXT secret_access_key
        +TEXT session_id
        +TEXT user_email
        +TEXT user_id
        +DATETIME expires_at
        +DATETIME created_at
    }

Constraints:

Primary key: access_key_id
Index: session_id - links to tower-sessions for session management
Index: expires_at - optimizes cleanup operations

Purpose:

Generated on successful OIDC login
Allows web UI users to make S3 API calls
Linked to user session for lifecycle management
Automatically cleaned up after expiration

tower_sessions

Auto-managed session store for the admin web UI (created by tower-sessions library).

Purpose:

Manages user sessions for admin UI
Stores session data including authentication state
Referenced by temporary_credentials.session_id

Database Operations

DBService API

The DBService struct (src/db/service.rs) provides all database operations:

Tag Operations:

put_tags(bucket: &str, key: &str, tags: &[(String, String)])
get_tags(bucket: &str, key: &str) -> Vec<(String, String)>
delete_tags(bucket: &str, key: &str)

Bucket Website Configuration Operations:

put_website_config(bucket: &str, index_suffix: &str, error_key: Option<&str>)
get_website_config(bucket: &str) -> Option<BucketWebsiteConfig>
delete_website_config(bucket: &str)

OAuth PKCE Operations:

store_pkce_state(state, code_verifier, nonce, pkce_challenge, redirect_uri, expires_at)
get_pkce_state(state: &str) -> Option<PkceState>
delete_pkce_state(state: &str)
cleanup_expired_pkce_states() -> u64

Temporary Credentials Operations:

store_temporary_credentials(access_key_id, secret_access_key, session_id, user_email, user_id, expires_at)
get_temporary_credentials(access_key_id: &str) -> Option<TemporaryCredential>
get_credentials_by_session(session_id: &str) -> Vec<TemporaryCredential>
delete_temporary_credentials(access_key_id: &str)
delete_credentials_by_session(session_id: &str)
cleanup_expired_credentials() -> u64

Background Cleanup

A background task (CleanupTask in src/cleanup.rs) runs every 5 minutes to remove expired data:

sequenceDiagram
    participant Server
    participant CleanupTask
    participant DBService
    participant SQLite

    Server->>CleanupTask: spawn on startup
    loop Every 5 minutes
        CleanupTask->>DBService: cleanup_expired_pkce_states()
        DBService->>SQLite: DELETE WHERE expires_at < NOW
        SQLite-->>DBService: count
        DBService-->>CleanupTask: records deleted
        CleanupTask->>DBService: cleanup_expired_credentials()
        DBService->>SQLite: DELETE WHERE expires_at < NOW
        SQLite-->>DBService: count
        DBService-->>CleanupTask: records deleted
    end

Cleanup Operations:

Removes expired OAuth PKCE states
Removes expired temporary credentials
Logs info messages when records are cleaned
Continues running on errors (with error logging)
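
Each cleanup pass boils down to "delete rows whose expires_at is in the past and report how many went away". The sketch below models that against an in-memory list instead of SQLite; the PkceState struct, the integer timestamps, and cleanup_expired are hypothetical stand-ins, not the DBService code.

```rust
/// Hypothetical in-memory stand-in for a row in oauth_pkce_state.
struct PkceState {
    state: String,
    expires_at: u64, // seconds since the Unix epoch (assumed representation)
}

/// Removes records with expires_at < now and returns how many were removed,
/// mirroring the DELETE ... WHERE expires_at < NOW step in the diagram.
fn cleanup_expired(records: &mut Vec<PkceState>, now: u64) -> u64 {
    let before = records.len();
    records.retain(|r| r.expires_at >= now);
    (before - records.len()) as u64
}
```

The returned count corresponds to the "records deleted" value that the task logs after each pass.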

Migrations

Location

Migrations are stored in src/db/migration/:

Format: mYYYYMMDD_HHMMSS_description.rs
Each migration implements up() and down() methods
Registered in src/db/migration/mod.rs

Adding a New Migration

Create migration file:

// src/db/migration/m20250119_000001_create_my_table.rs
use sea_orm_migration::prelude::*;

pub struct Migration;

impl MigrationName for Migration {
    fn name(&self) -> &str {
        "m20250119_000001_create_my_table"
    }
}

#[async_trait::async_trait]
impl MigrationTrait for Migration {
    async fn up(&self, manager: &SchemaManager) -> Result<(), DbErr> {
        manager.create_table(/* ... */).await
    }

    async fn down(&self, manager: &SchemaManager) -> Result<(), DbErr> {
        manager.drop_table(/* ... */).await
    }
}

Register the migration in src/db/migration/mod.rs:

vec![
    // ...existing migrations...
    Box::new(m20250119_000001_create_my_table::Migration),
]

The migration then runs automatically on the next server startup.

Existing Migrations

Current migrations create:

object_tags table with indexes
oauth_pkce_state table with indexes
temporary_credentials table with indexes
bucket_website_configs table

Testing Recommendations

Unit Tests:

Use in-memory database: sqlite::memory:
Fastest option for testing business logic
No cleanup required

Integration Tests:

Use temporary directory for database file
Tests full migration and persistence
Cleanup temp directory after test

Example:

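// Assumes the sea-orm and tempfile crates are available as dependencies.
use sea_orm::Database;
use tempfile::tempdir;

async fn connect_examples() -> Result<(), Box<dyn std::error::Error>> {
    // In-memory for unit tests
    let db = Database::connect("sqlite::memory:").await?;

    // Temp directory for integration tests
    let temp_dir = tempdir()?;
    let db_path = temp_dir.path().join("test.sqlite3");
    // mode=rwc asks SQLite to create the file if it does not exist yet
    let db = Database::connect(format!("sqlite:{}?mode=rwc", db_path.display())).await?;
    Ok(())
}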
