crabcakes
A lightweight S3-compatible server that serves files from your filesystem.
Features
- S3-compatible API
- AWS Signature V4 authentication with IAM policy-based authorization
- Path-style and virtual-hosted style requests
- Streaming uploads with AWS chunked encoding support
- Smart body buffering (memory/disk spillover)
- Works with AWS CLI and SDKs
Quick Start
If you’re working in the repository, build the binary:
cargo build --release
Or install it with cargo install crabcakes, or use the Docker container ghcr.io/yaleman/crabcakes:latest.
# Start server (default: https://localhost:9000, serving ./data)
crabcakes
# Custom host, port, and data directory
crabcakes --host 0.0.0.0 --port 8080 --root-dir /path/to/files
# With debug logging
RUST_LOG=debug crabcakes
Usage with AWS CLI
# List buckets
aws s3 ls --endpoint-url https://localhost:9000
# Create bucket
aws s3 mb s3://mybucket --endpoint-url https://localhost:9000
# Upload object
aws s3 cp file.txt s3://mybucket/ --endpoint-url https://localhost:9000
# Download object
aws s3 cp s3://mybucket/file.txt . --endpoint-url https://localhost:9000
# Delete multiple objects
aws s3api delete-objects --bucket mybucket --delete '{"Objects":[{"Key":"file1.txt"},{"Key":"file2.txt"}]}' --endpoint-url https://localhost:9000
# Copy object (server-side)
aws s3api copy-object --bucket mybucket --key dest.txt --copy-source mybucket/source.txt --endpoint-url https://localhost:9000
Testing
cargo test # Run all tests
./manual_test.sh # Test with AWS CLI, tends to find weirdness
Credits
- Syntax highlighting powered by Prism.js
Configuration
Crabcakes uses a filesystem-based configuration system that stores credentials, policies, and metadata in a configurable directory. This page covers the structure, requirements, and management of configuration files.
Configuration Directory
Default Location
By default, Crabcakes looks for configuration in the ./config directory relative to where the server is started. This can be customized using:
- CLI flag: --config-dir <PATH>
- Environment variable: CRABCAKES_CONFIG_DIR
Directory Structure
config/
├── credentials/ # Credential JSON files (one per credential)
│ ├── alice.json
│ └── bob.json
├── policies/ # Policy JSON files (one per policy)
│ ├── admin.json
│ └── read-only.json
└── crabcakes.sqlite3 # SQLite database for metadata
Automatic Creation: If the configuration directory or its subdirectories don’t exist, they will be created automatically when the server starts.
Credentials
Credentials are stored as individual JSON files in the credentials/ subdirectory. Each file represents one set of AWS-compatible access credentials.
Credential File Format
Each credential file must be a valid JSON file with exactly two fields, and secret_access_key must be 40 characters in length.
{
"access_key_id": "alice",
"secret_access_key": "alicesecret12345678901234567890123456712"
}
Field Requirements
access_key_id
- Type: String
- Usage: Used as the username for authentication and authorization
secret_access_key
- Type: String
- Length: MUST be exactly 40 characters (AWS standard length)
- Validation: Enforced at load time and creation time
- Critical: Credentials with invalid secret length will be rejected with an error
Credential Loading Behavior
- All .json files in the credentials/ directory are loaded at server startup
- Files are processed asynchronously
- Invalid credentials are logged but don’t prevent server startup
- If no valid credentials are loaded, the server will start but no authentication will succeed
- Credentials are cached in memory for fast signature verification
Duplicate Access Key Prevention
During Startup (File Loading):
- If multiple credential files contain the same access_key_id, the first file processed wins
- A warning is logged when duplicate access_key_id values are encountered: “Duplicate access_key_id found, ignoring this credential file (first credential loaded takes precedence)”
- Subsequent credential files with the same access_key_id are ignored
- Only the first credential loaded will be active
When Creating Credentials via Web UI:
- The server explicitly checks if a credential with the same access_key_id already exists
- If found, returns HTTP error with message: “Credential with the same identifier already exists”
- Creation is blocked - you must delete the existing credential first
Best Practice: Use unique access_key_id values and avoid creating multiple credential files with the same identifier.
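The first-wins rule described above can be sketched with a map that never overwrites an existing entry. This is a minimal illustration, not the Crabcakes loader: the function name `load_first_wins` is hypothetical, and the slice order stands in for whatever order the files happen to be processed in.

```rust
use std::collections::HashMap;

/// Keeps the first secret seen for each access_key_id and returns the
/// IDs that were skipped as duplicates, in the order they were encountered.
/// (Hypothetical helper for illustration only.)
pub fn load_first_wins(files: &[(&str, &str)]) -> (HashMap<String, String>, Vec<String>) {
    let mut active: HashMap<String, String> = HashMap::new();
    let mut skipped: Vec<String> = Vec::new();
    for (access_key_id, secret) in files {
        if active.contains_key(*access_key_id) {
            // A real loader would log a warning here and move on.
            skipped.push(access_key_id.to_string());
        } else {
            active.insert(access_key_id.to_string(), secret.to_string());
        }
    }
    (active, skipped)
}
```

Because file processing is asynchronous, which duplicate counts as "first" may not be predictable, which is one more reason to keep access_key_id values unique.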
Security Considerations
- Never commit production credentials to git: add config/ to your .gitignore
- Secret access keys are stored as SecretString in memory to prevent accidental logging
- Credentials cannot use path traversal sequences in access_key_id (.., /, \ are blocked)
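The two credential rules documented on this page, a 40-character secret and no path traversal sequences in the access key ID, can be checked before a file is ever written. The sketch below is an assumption-laden illustration: `CredentialError` and `validate_credential` are hypothetical names, and counting characters rather than bytes is a choice made here, not something the source specifies.

```rust
/// Hypothetical error type for this sketch; not taken from the Crabcakes source.
#[derive(Debug, PartialEq)]
pub enum CredentialError {
    InvalidSecretLength(usize),
    UnsafeAccessKeyId,
}

/// Checks the documented rules: no "..", "/" or "\" in the access key ID,
/// and a secret that is exactly 40 characters long.
pub fn validate_credential(
    access_key_id: &str,
    secret_access_key: &str,
) -> Result<(), CredentialError> {
    if access_key_id.contains("..")
        || access_key_id.contains('/')
        || access_key_id.contains('\\')
    {
        return Err(CredentialError::UnsafeAccessKeyId);
    }
    // Assumption: length is measured in characters, not bytes.
    let len = secret_access_key.chars().count();
    if len != 40 {
        return Err(CredentialError::InvalidSecretLength(len));
    }
    Ok(())
}
```

Running a check like this in a provisioning script catches the most common cause of a credential being rejected at load time.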
Policies
Policies define authorization rules using AWS IAM-compatible policy syntax. Policy files are stored in the policies/ subdirectory.
See Policies for more details.
Configuration Options
CLI Flags
crabcakes [OPTIONS]
Server Options:
- --host <HOST> - Listener address (default: 127.0.0.1)
- -p, --port <PORT> - Port number (default: 9000)
- -r, --root-dir <PATH> - Root directory for file storage (default: ./data)
Configuration:
- -c, --config-dir <PATH> - Configuration directory (default: ./config)
- --region <REGION> - AWS region name (default: crabcakes)
TLS:
- --tls-cert <PATH> - Path to TLS certificate file
- --tls-key <PATH> - Path to TLS private key file
Authentication:
- --oidc-client-id <ID> - OIDC client ID for OAuth2 authentication (required for admin UI)
- --oidc-discovery-url <URL> - OIDC issuer URL (required for admin UI)
- --frontend-url <URL> - Frontend URL for OIDC redirect URIs when behind reverse proxy
Environment Variables
All CLI flags can be set via environment variables:
- CRABCAKES_LISTENER_ADDRESS - Listener address
- CRABCAKES_PORT - Port number
- CRABCAKES_ROOT_DIR - Root directory for files
- CRABCAKES_CONFIG_DIR - Configuration directory
- CRABCAKES_REGION - AWS region name
- CRABCAKES_TLS_CERT - TLS certificate path
- CRABCAKES_TLS_KEY - TLS key path
- CRABCAKES_OIDC_CLIENT_ID - OIDC client ID
- CRABCAKES_OIDC_DISCOVERY_URL - OIDC discovery URL
- CRABCAKES_FRONTEND_URL - Frontend URL for reverse proxy
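As an illustration of how an environment variable can fall back to a documented default, here is a small sketch for CRABCAKES_PORT. It is not how Crabcakes parses its options; `resolve_port` is a hypothetical helper, and the lookup is passed in as a closure so the logic can be exercised without touching the real process environment.

```rust
/// Resolves the listen port from a lookup function standing in for the
/// process environment, falling back to the documented default of 9000.
/// (Hypothetical helper; the real server has its own option parsing.)
pub fn resolve_port(lookup: impl Fn(&str) -> Option<String>) -> Result<u16, String> {
    match lookup("CRABCAKES_PORT") {
        Some(raw) => raw
            .trim()
            .parse::<u16>()
            .map_err(|e| format!("invalid CRABCAKES_PORT {raw:?}: {e}")),
        // Variable not set: use the default from the CLI flag table.
        None => Ok(9000),
    }
}
```

In a real program the lookup would typically be `|name| std::env::var(name).ok()`.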
Examples
Basic setup:
crabcakes --config-dir /etc/crabcakes
Custom host and port:
crabcakes --host 0.0.0.0 --port 8080
Using environment variables:
export CRABCAKES_CONFIG_DIR=/etc/crabcakes
export CRABCAKES_PORT=8080
export CRABCAKES_OIDC_CLIENT_ID=your-client-id
export CRABCAKES_OIDC_DISCOVERY_URL=https://accounts.google.com
crabcakes
With TLS:
crabcakes \
--tls-cert /path/to/cert.pem \
--tls-key /path/to/key.pem \
--frontend-url https://s3.example.com
Database
Crabcakes uses SQLite to store metadata including object tags, OAuth PKCE state, temporary credentials, and bucket website configurations.
Database Location: {config_dir}/crabcakes.sqlite3
Features:
- Automatically created on first startup
- Migrations run automatically on startup
- WAL mode enabled for better concurrency
- Auto-vacuum enabled for disk space management
For complete database schema and details, see the Database Documentation.
Reserved Names
The following bucket names are reserved for the admin UI and cannot be created as S3 buckets:
- admin
- api
- login
- logout
- oauth2
- .well-known
- config
- oidc
- crabcakes
- docs
- help
- .multipart
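A client or provisioning script can reject reserved names before asking the server to create a bucket. The sketch below uses the reserved names as read from this section; `is_reserved_bucket_name` is a hypothetical helper, and the exact, case-sensitive comparison is an assumption rather than documented behavior.

```rust
/// Reserved names as read from the documentation section above.
const RESERVED_BUCKET_NAMES: [&str; 12] = [
    "admin", "api", "login", "logout", "oauth2", ".well-known",
    "config", "oidc", "crabcakes", "docs", "help", ".multipart",
];

/// Returns true when `name` is one of the reserved bucket names.
/// Assumption: the comparison is exact and case-sensitive.
pub fn is_reserved_bucket_name(name: &str) -> bool {
    RESERVED_BUCKET_NAMES.contains(&name)
}
```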
Best Practices
Security
- Never commit credentials to git: Add config/ to .gitignore
- Use strong secrets: Generate random 40-character secret access keys
- Principle of least privilege: Grant minimum permissions needed
- Test policies: Use the Policy Troubleshooter before deploying
Organization
- Naming conventions: Use descriptive names for credentials and policies
- One policy per use case: Create separate policy files for different roles
- Document policies: Use meaningful Sid values in policy statements
- Regular audits: Review credentials and policies periodically
Production Deployment
- Use TLS: Always enable TLS in production with --tls-cert and --tls-key
- Restrict host: Use --host 127.0.0.1 or specific IP, not 0.0.0.0
- Configure OIDC: Set up proper OIDC provider for admin UI authentication
- Set frontend URL: Use --frontend-url when behind reverse proxy
- Monitor logs: Use RUST_LOG environment variable for logging control
Troubleshooting
Credentials not loading
Symptoms: Authentication fails, logs show “No credentials loaded”
Solutions:
- Verify credential files are in {config_dir}/credentials/
- Check files have .json extension
- Verify JSON is valid (use jq or JSON validator)
- Ensure secret_access_key is exactly 40 characters
- Check file permissions (must be readable by server process)
Policies not taking effect
Symptoms: Authorization denied unexpectedly
Solutions:
- Verify policy files are in {config_dir}/policies/
- Check JSON syntax is valid
- Use Policy Troubleshooter to test evaluation
- Check principal ARN matches credential’s access_key_id
- Verify resource ARN matches bucket/key being accessed
- Remember: explicit Deny wins over Allow
Database errors
Symptoms: Errors related to SQLite or migrations
Solutions:
- Check {config_dir} directory is writable
- Verify disk space is available
- Delete crabcakes.sqlite3* files and restart (data will be lost)
- Check for file permission issues
Admin UI not accessible
Symptoms: Cannot access /admin URL
Solutions:
- Verify OIDC is configured (--oidc-client-id and --oidc-discovery-url)
- Check OIDC discovery URL is correct and accessible
- Verify redirect URI is registered with OIDC provider
- Use --frontend-url if behind reverse proxy
- Check browser console for errors
Policies
Policy File Format
Policies follow standard AWS IAM policy format:
{
"Version": "2012-10-17",
"Id": "S3BucketPolicy",
"Statement": [
{
"Sid": "AllowS3All",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam:::user/testuser"
},
"Action": [
"s3:*"
],
"Resource": "arn:aws:s3:::bucket1/testuser/*"
}
]
}
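The Resource in the example ends in a wildcard, so it covers every key under bucket1/testuser/. The sketch below shows IAM-style wildcard matching, where `*` matches any run of characters and `?` matches exactly one. It is a generic glob matcher written for illustration; Crabcakes relies on the iam-rs library for the real evaluation, and `wildcard_match` is a hypothetical name.

```rust
/// Matches `text` against `pattern`, where `*` matches any run of
/// characters (including none) and `?` matches exactly one character.
pub fn wildcard_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    // Pattern index of the last `*` seen and the text index it was tried at.
    let mut backtrack: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            // Remember the star and first try matching it against nothing.
            backtrack = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((star_pi, star_ti)) = backtrack {
            // Let the last `*` absorb one more character and retry.
            pi = star_pi + 1;
            ti = star_ti + 1;
            backtrack = Some((star_pi, star_ti + 1));
        } else {
            return false;
        }
    }
    // Any trailing `*` in the pattern can match the empty string.
    p[pi..].iter().all(|&c| c == '*')
}
```

Note that a pattern such as arn:aws:s3:::bucket/* does not match the bare bucket ARN arn:aws:s3:::bucket, which is why bucket-level and object-level resources are usually listed together.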
Policy Components
Version
Standard AWS IAM policy version: "2012-10-17"
Statement Array
Each policy contains one or more statements with the following fields:
Sid (optional)
- Statement identifier for documentation purposes
Effect (required)
- "Allow" - Grants permission
- "Deny" - Explicitly denies permission (takes precedence over Allow)
Principal (required)
- Specifies who the policy applies to
- AWS user: {"AWS": "arn:aws:iam:::user/username"}
- Wildcard (anonymous): "*"
Action (required)
- S3 action or actions to allow/deny
- Single action: "s3:GetObject"
- Multiple actions: ["s3:GetObject", "s3:PutObject"]
- Wildcard: "s3:*"
Resource (required)
- S3 resource ARN or ARNs
- Specific object: "arn:aws:s3:::bucket/key"
- Bucket objects: "arn:aws:s3:::bucket/*"
- Multiple resources: ["arn:aws:s3:::bucket1", "arn:aws:s3:::bucket1/*"]
- Wildcard: "*"
Supported S3 Actions
Crabcakes supports the following S3 actions in policies:
Object Operations:
- s3:GetObject - Read objects
- s3:PutObject - Write objects
- s3:DeleteObject - Delete objects
- s3:GetObjectTagging - Read object tags
- s3:PutObjectTagging - Write object tags
- s3:DeleteObjectTagging - Delete object tags
- s3:GetObjectAttributes - Read object metadata
Bucket Operations:
- s3:ListBucket - List bucket contents
- s3:CreateBucket - Create new buckets
- s3:DeleteBucket - Delete buckets
- s3:HeadBucket - Check bucket existence
- s3:GetBucketLocation - Get bucket region
- s3:ListAllMyBuckets - List all buckets
- s3:GetBucketWebsite - Get website configuration
- s3:PutBucketWebsite - Set website configuration
- s3:DeleteBucketWebsite - Delete website configuration
Multipart Upload Operations:
- s3:AbortMultipartUpload - Cancel multipart upload
- s3:ListBucketMultipartUploads - List in-progress uploads
- s3:ListMultipartUploadParts - List parts of an upload
Wildcards:
- s3:* - All S3 actions
Policy Name Validation
Policy filenames must meet the following requirements:
- Pattern: ^[a-zA-Z0-9]{1}[a-zA-Z0-9-_]*[a-zA-Z0-9]{1}$
- Must start and end with alphanumeric characters
- Can contain letters, numbers, hyphens (-), and underscores (_)
- Minimum 2 characters
- Cannot contain .., /, or \ (path traversal protection)
Valid examples: admin-policy, read_only, testUser123
Invalid examples: -admin, policy-, a, ../etc/passwd
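The naming rules can be checked without a regex engine. The sketch below is a hand-written equivalent of the documented requirements using only the standard library; `is_valid_policy_name` is a hypothetical helper, not the function Crabcakes uses internally.

```rust
/// Returns true when `name` satisfies the documented policy name rules:
/// at least 2 characters, alphanumeric at both ends, only ASCII letters,
/// digits, '-' and '_' in between, and no path traversal sequences.
pub fn is_valid_policy_name(name: &str) -> bool {
    let chars: Vec<char> = name.chars().collect();
    if chars.len() < 2 {
        return false;
    }
    // Redundant with the character check below, but kept explicit to
    // mirror the documented path traversal rule.
    if name.contains("..") || name.contains('/') || name.contains('\\') {
        return false;
    }
    let edges_ok = chars[0].is_ascii_alphanumeric()
        && chars[chars.len() - 1].is_ascii_alphanumeric();
    let body_ok = chars
        .iter()
        .all(|c| c.is_ascii_alphanumeric() || *c == '-' || *c == '_');
    edges_ok && body_ok
}
```

The valid and invalid examples listed above all behave as documented under this check.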
Policy Evaluation
Crabcakes uses the iam-rs library for AWS-compatible policy evaluation:
- Default deny: All requests denied unless explicitly allowed
- Explicit deny wins: Deny statements override Allow statements
- Evaluation caching: Results cached for 5 minutes using SHA256 hash of request
- Cache invalidation: Cleared when policies are added, updated, or deleted
- Wildcard principals: Supports anonymous access with "Principal": "*"
Policy Loading Behavior
- All .json files in the policies/ directory are loaded at server startup
- Invalid policies are logged and skipped
- Policies can be hot-reloaded via the admin UI
- If a policy file is removed from disk, it’s removed from memory on next reload
Example Policies
Allow all operations for a specific user:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam:::user/alice"
},
"Action": "s3:*",
"Resource": "*"
}
]
}
Read-only access to a specific bucket:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam:::user/bob"
},
"Action": [
"s3:GetObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::public",
"arn:aws:s3:::public/*"
]
}
]
}
User-specific prefix access:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam:::user/charlie"
},
"Action": "s3:*",
"Resource": "arn:aws:s3:::shared/charlie/*"
}
]
}
Web-Based Policy Management
Crabcakes provides web-based tools for managing and troubleshooting policies. These tools are available in the admin UI at /admin (requires OIDC authentication).
Policy Editor
Access: Navigate to /admin/policies in your browser after authenticating.
The Policy Editor provides a full-featured interface for managing IAM policies:
Operations:
- List Policies: View all loaded policies with their details
- Create Policy: Form-based policy creation with JSON editor and syntax highlighting
- Edit Policy: Modify existing policy JSON with validation
- View Policy: See policy details and permissions breakdown
- Delete Policy: Remove policies from the system
How to Use:
- Log in to the admin UI at /admin using OIDC authentication
- Click “Policies” in the navigation menu
- Use the interface to:
- View the list of all policies
- Click “New Policy” to create a policy
- Click “Edit” next to a policy to modify it
- Click “View” to see detailed permissions
- Click “Delete” to remove a policy
Editor Features:
- JSON syntax highlighting using Prism.js
- Real-time validation before saving
- Principal permissions breakdown view
- Automatic policy cache refresh after changes
- Direct filesystem integration (changes persist to policies/ directory)
Policy Troubleshooter
Access: Navigate to /admin/policy_troubleshooter in your browser after authenticating.
The Policy Troubleshooter helps debug authorization issues by simulating policy evaluation without making actual S3 requests.
How to Use:
- Log in to the admin UI at /admin
- Click “Policy Troubleshooter” in the navigation menu
- Fill in the evaluation form:
- User: Principal username (e.g., “alice”)
- Action: S3 action from dropdown (e.g., “s3:GetObject”)
- Bucket: Bucket name
- Key: Object key (optional, for object-level actions)
- Policy Name: Specific policy to test (optional, tests all policies if empty)
- Click “Test Policy” to see the result
Output:
- Decision: Allow, Deny, or NotApplicable
- Matched Statements: Which policy statements applied
- Evaluation Context: Detailed information about the evaluation
Use Cases:
- Debug why a user can’t access a resource
- Verify policy changes before deploying to production
- Understand which policies are granting/denying access
- Test new policies before creating credentials
Example:
To test if user “alice” can read bucket1/test.txt:
- User: alice
- Action: s3:GetObject
- Bucket: bucket1
- Key: test.txt
The troubleshooter will show whether the request would be allowed based on loaded policies and which policy statements matched.
Development
All changes must pass cargo clippy, which is configured fairly aggressively, as well as cargo fmt and cargo test.
There are manual/integration tests which use the AWS CLI to test “real world” usage (./manual_test.sh and scripts/integration/*.sh).
Database Design
Overview
Crabcakes uses SQLite for storing metadata, sessions, and temporary credentials. The database is located at {config_dir}/crabcakes.sqlite3 (default: ./config/crabcakes.sqlite3) and is automatically created on first startup.
Database migrations are managed using SeaORM’s migration framework and run automatically on server startup.
Entity Relationship Diagram
erDiagram
object_tags {
INTEGER id PK
TEXT bucket
TEXT key
TEXT tag_key
TEXT tag_value
DATETIME created_at
}
bucket_website_configs {
TEXT bucket PK
TEXT index_document_suffix
TEXT error_document_key
DATETIME created_at
DATETIME updated_at
}
oauth_pkce_state {
TEXT state PK
TEXT code_verifier
TEXT nonce
TEXT pkce_challenge
TEXT redirect_uri
DATETIME expires_at
DATETIME created_at
}
temporary_credentials {
TEXT access_key_id PK
TEXT secret_access_key
TEXT session_id FK
TEXT user_email
TEXT user_id
DATETIME expires_at
DATETIME created_at
}
tower_sessions {
TEXT id PK
BLOB data
DATETIME expiry_date
}
temporary_credentials ||--o| tower_sessions : "session_id"
Tables
object_tags
Stores S3 object tags with validation and indexing for efficient lookups.
classDiagram
class object_tags {
+INTEGER id
+TEXT bucket
+TEXT key
+TEXT tag_key
+TEXT tag_value
+DATETIME created_at
}
Constraints:
- Primary key: id
- Unique index: (bucket, key, tag_key) - ensures one value per tag key per object
- Lookup index: (bucket, key) - optimizes tag retrieval for objects
Validation:
- Maximum 10 tags per object
- Tag keys: maximum 128 characters
- Tag values: maximum 256 characters
Purpose: Supports S3 tagging operations (PutObjectTagging, GetObjectTagging, DeleteObjectTagging)
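The three validation limits above translate directly into a pre-flight check. The sketch below is illustrative only: `TagError` and `validate_tags` are hypothetical names, and measuring length in characters rather than bytes is an assumption the source does not settle.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum TagError {
    TooManyTags(usize),
    KeyTooLong(String),
    /// Carries the key whose value exceeded the limit.
    ValueTooLong(String),
}

/// Applies the documented limits: at most 10 tags per object,
/// keys up to 128 characters, values up to 256 characters.
pub fn validate_tags(tags: &[(String, String)]) -> Result<(), TagError> {
    if tags.len() > 10 {
        return Err(TagError::TooManyTags(tags.len()));
    }
    for (key, value) in tags {
        // Assumption: limits are counted in characters, not bytes.
        if key.chars().count() > 128 {
            return Err(TagError::KeyTooLong(key.clone()));
        }
        if value.chars().count() > 256 {
            return Err(TagError::ValueTooLong(key.clone()));
        }
    }
    Ok(())
}
```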
bucket_website_configs
Configuration for S3 static website hosting mode per bucket.
classDiagram
class bucket_website_configs {
+TEXT bucket
+TEXT index_document_suffix
+TEXT error_document_key
+DATETIME created_at
+DATETIME updated_at
}
Constraints:
- Primary key: bucket
- index_document_suffix is required (NOT NULL)
- error_document_key is optional (nullable)
Purpose:
- Enables S3-compatible static website hosting per bucket
- Configures index document suffix (e.g., “index.html”) for directory requests
- Optionally configures error document (e.g., “error.html”) for 404 responses
- Updated via PutBucketWebsite, GetBucketWebsite, DeleteBucketWebsite operations
Behavior:
- When configured, GET /bucket/ automatically serves bucket/index.html (or configured suffix)
- Directory paths ending with / append the index document suffix
- 404 errors automatically serve the error document if configured
- Error document served with 404 status code and proper headers
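The index document rule can be sketched as a small path-to-key mapping. This is a simplified illustration under stated assumptions: `resolve_website_key` is a hypothetical name, the path is taken relative to the bucket, and error document handling is left out.

```rust
/// Maps a request path inside a bucket to the object key to serve.
/// An empty path or one ending in '/' gets the index suffix appended;
/// any other path is used as the key unchanged.
pub fn resolve_website_key(path: &str, index_suffix: &str) -> String {
    // Assumption: `path` is relative to the bucket; leading slashes are dropped.
    let key = path.trim_start_matches('/');
    if key.is_empty() || key.ends_with('/') {
        format!("{key}{index_suffix}")
    } else {
        key.to_string()
    }
}
```

With a suffix of index.html, a request for docs/ resolves to docs/index.html while docs/page.html is served as-is.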
oauth_pkce_state
Temporary storage for OAuth 2.0 PKCE flow state during OIDC authentication.
classDiagram
class oauth_pkce_state {
+TEXT state
+TEXT code_verifier
+TEXT nonce
+TEXT pkce_challenge
+TEXT redirect_uri
+DATETIME expires_at
+DATETIME created_at
}
Constraints:
- Primary key: state (OAuth state parameter)
- Index: expires_at - optimizes cleanup operations
Purpose:
- Stores PKCE (Proof Key for Code Exchange) parameters during OAuth flow
- Validates callback requests from OIDC provider
- Automatically cleaned up by background task after expiration
temporary_credentials
AWS-style temporary credentials generated for authenticated web UI users.
classDiagram
class temporary_credentials {
+TEXT access_key_id
+TEXT secret_access_key
+TEXT session_id
+TEXT user_email
+TEXT user_id
+DATETIME expires_at
+DATETIME created_at
}
Constraints:
- Primary key: access_key_id
- Index: session_id - links to tower-sessions for session management
- Index: expires_at - optimizes cleanup operations
Purpose:
- Generated on successful OIDC login
- Allows web UI users to make S3 API calls
- Linked to user session for lifecycle management
- Automatically cleaned up after expiration
tower_sessions
Auto-managed session store for the admin web UI (created by tower-sessions library).
Purpose:
- Manages user sessions for admin UI
- Stores session data including authentication state
- Referenced by temporary_credentials.session_id
Database Operations
DBService API
The DBService struct (src/db/service.rs) provides all database operations:
Tag Operations:
put_tags(bucket: &str, key: &str, tags: &[(String, String)])
get_tags(bucket: &str, key: &str) -> Vec<(String, String)>
delete_tags(bucket: &str, key: &str)
Bucket Website Configuration Operations:
put_website_config(bucket: &str, index_suffix: &str, error_key: Option<&str>)
get_website_config(bucket: &str) -> Option<BucketWebsiteConfig>
delete_website_config(bucket: &str)
OAuth PKCE Operations:
store_pkce_state(state, code_verifier, nonce, pkce_challenge, redirect_uri, expires_at)
get_pkce_state(state: &str) -> Option<PkceState>
delete_pkce_state(state: &str)
cleanup_expired_pkce_states() -> u64
Temporary Credentials Operations:
store_temporary_credentials(access_key_id, secret_access_key, session_id, user_email, user_id, expires_at)
get_temporary_credentials(access_key_id: &str) -> Option<TemporaryCredential>
get_credentials_by_session(session_id: &str) -> Vec<TemporaryCredential>
delete_temporary_credentials(access_key_id: &str)
delete_credentials_by_session(session_id: &str)
cleanup_expired_credentials() -> u64
Background Cleanup
A background task (CleanupTask in src/cleanup.rs) runs every 5 minutes to remove expired data:
sequenceDiagram
participant Server
participant CleanupTask
participant DBService
participant SQLite
Server->>CleanupTask: spawn on startup
loop Every 5 minutes
CleanupTask->>DBService: cleanup_expired_pkce_states()
DBService->>SQLite: DELETE WHERE expires_at < NOW
SQLite-->>DBService: count
DBService-->>CleanupTask: records deleted
CleanupTask->>DBService: cleanup_expired_credentials()
DBService->>SQLite: DELETE WHERE expires_at < NOW
SQLite-->>DBService: count
DBService-->>CleanupTask: records deleted
end
Cleanup Operations:
- Removes expired OAuth PKCE states
- Removes expired temporary credentials
- Logs info messages when records are cleaned
- Continues running on errors (with error logging)
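The effect of each cleanup pass can be modelled in memory: rows whose expires_at is earlier than the current time are removed and the number removed is reported. The sketch below is a stand-in for the SQL DELETE, not the CleanupTask implementation; `ExpiringRow` and `cleanup_expired` are hypothetical names and timestamps are plain seconds for simplicity.

```rust
/// In-memory stand-in for a table row with an expires_at column.
/// (Hypothetical type for this sketch.)
#[derive(Debug, Clone, PartialEq)]
pub struct ExpiringRow {
    pub id: String,
    /// Expiry time as seconds since an arbitrary epoch.
    pub expires_at: u64,
}

/// Removes rows with expires_at < now and returns how many were removed,
/// mirroring "DELETE WHERE expires_at < NOW" followed by the row count.
pub fn cleanup_expired(rows: &mut Vec<ExpiringRow>, now: u64) -> u64 {
    let before = rows.len();
    rows.retain(|row| row.expires_at >= now);
    (before - rows.len()) as u64
}
```

Running the pass twice with the same timestamp removes nothing the second time, so a missed or repeated cycle is harmless.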
Migrations
Location
Migrations are stored in src/db/migration/:
- Format: mYYYYMMDD_HHMMSS_description.rs
- Each migration implements up() and down() methods
- Registered in src/db/migration/mod.rs
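A quick check of the naming format can catch a mistyped migration file before it is registered. The sketch below validates a file stem (the name without .rs) against mYYYYMMDD_HHMMSS_description. It only checks shape, not whether the date is a real calendar date, and `is_migration_name` is a hypothetical helper rather than part of the project.

```rust
/// Checks that a file stem follows mYYYYMMDD_HHMMSS_description:
/// an 'm' plus 8 digits, an underscore, 6 digits, an underscore,
/// and a non-empty description (which may itself contain underscores).
pub fn is_migration_name(stem: &str) -> bool {
    let mut parts = stem.splitn(3, '_');
    let (Some(date), Some(time), Some(description)) =
        (parts.next(), parts.next(), parts.next())
    else {
        return false;
    };
    let all_digits = |s: &str| !s.is_empty() && s.chars().all(|c| c.is_ascii_digit());
    date.len() == 9
        && date.starts_with('m')
        && all_digits(&date[1..])
        && time.len() == 6
        && all_digits(time)
        && !description.is_empty()
}
```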
Adding a New Migration
- Create migration file:
// src/db/migration/m20250119_000001_create_my_table.rs
use sea_orm_migration::prelude::*;
pub struct Migration;
impl MigrationName for Migration {
    fn name(&self) -> &str {
        "m20250119_000001_create_my_table"
    }
}
#[async_trait::async_trait]
impl MigrationTrait for Migration {
    async fn up(&self, manager: &SchemaManager) -> Result<(), DbErr> {
        manager.create_table(/* ... */).await
    }
    async fn down(&self, manager: &SchemaManager) -> Result<(), DbErr> {
        manager.drop_table(/* ... */).await
    }
}
- Register in src/db/migration/mod.rs:
vec![
    // ...existing migrations...
    Box::new(m20250119_000001_create_my_table::Migration),
]
- Migration runs automatically on next server startup
Existing Migrations
Current migrations create:
- object_tags table with indexes
- oauth_pkce_state table with indexes
- temporary_credentials table with indexes
- bucket_website_configs table
Testing Recommendations
Unit Tests:
- Use in-memory database: sqlite::memory:
- Fastest option for testing business logic
- No cleanup required
Integration Tests:
- Use temporary directory for database file
- Tests full migration and persistence
- Cleanup temp directory after test
Example:
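use sea_orm::Database;
use tempfile::tempdir; // assumes the tempfile crate is a dev-dependency
// In-memory database for unit tests
let db = Database::connect("sqlite::memory:").await?;
// Temporary directory for integration tests
let temp_dir = tempdir()?;
let db_path = temp_dir.path().join("test.sqlite3");
// mode=rwc asks SQLite to create the file if it does not exist yet
let db = Database::connect(format!("sqlite:{}?mode=rwc", db_path.display())).await?;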