Seed Data & Fixtures

Populate ObjectStack objects with bootstrap data, reference records, and demo fixtures using defineDataset()

defineDataset() is the canonical way to define seed data in ObjectStack. It provides compile-time type safety by inferring valid field keys directly from your object definition, so typos in record field names are caught before the code runs.

Use seed data for:

  • System bootstrap — default roles, admin users, system configuration
  • Reference data — countries, currencies, ISO codes, standard picklist values
  • Demo / test fixtures — realistic sample records for development and CI

Quick Start

import { defineDataset } from '@objectstack/spec/data';
import { Account } from './objects/account.object';

export const accountsSeed = defineDataset(Account, {
  externalId: 'name',      // field used as the upsert / idempotency key
  mode: 'upsert',          // create if new, update if found
  env: ['dev', 'test'],    // only load in dev and test environments
  records: [
    {
      name: 'Acme Corporation',
      type: 'customer',
      industry: 'technology',
      annual_revenue: 5000000,
    },
    {
      name: 'Globex Industries',
      type: 'prospect',
      industry: 'manufacturing',
      annual_revenue: 12000000,
    },
  ],
});

The first argument is the object definition (the exported constant from your object file), not a string. This lets TypeScript validate every field name in records against the object's fields map at compile time.


Import Modes

The mode field controls how the seed runner behaves when it encounters an existing record (matched by externalId).

| Mode | Behavior | Use Case |
| --- | --- | --- |
| upsert | Create if new, update if found | Default; idempotent for most data |
| insert | Create only, throw on duplicate | Append-only tables, audit logs |
| update | Update only, skip if not found | Migration patches on existing rows |
| ignore | Create if new, silently skip duplicates | Bootstrap data that must not overwrite user edits |
| replace | Delete ALL records, then insert | Cache / lookup tables rebuilt on each run |

defineDataset(Currency, {
  externalId: 'code',
  mode: 'upsert',
  records: [
    { code: 'USD', name: 'US Dollar',    symbol: '$' },
    { code: 'EUR', name: 'Euro',         symbol: '€' },
    { code: 'GBP', name: 'British Pound', symbol: '£' },
  ],
});

ignore — Bootstrap Without Overwriting

defineDataset(SystemRole, {
  externalId: 'code',
  mode: 'ignore',
  records: [
    { code: 'admin',  label: 'Administrator' },
    { code: 'viewer', label: 'Viewer' },
  ],
});

replace — Full Table Rebuild

// ⚠️ Deletes ALL records in the object before inserting.
// Only use for cache or lookup tables with no user-generated data.
defineDataset(ExchangeRateCache, {
  externalId: 'key',
  mode: 'replace',
  env: ['dev'],
  records: [
    { key: 'USD_EUR', rate: 0.92 },
    { key: 'USD_GBP', rate: 0.79 },
  ],
});
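
As a rough illustration of the mode table above, the following sketch models each mode against an in-memory Map. It is not the seed runner's implementation: applySeed, Row, and the choice to merge fields on update are assumptions made for this example only.

```typescript
// Hypothetical in-memory model of the five import modes; not the real seed runner.
type Mode = 'upsert' | 'insert' | 'update' | 'ignore' | 'replace';
type Row = Record<string, unknown>;

function applySeed(
  table: Map<string, Row>, // existing rows, keyed by their externalId value
  records: Row[],
  externalId: string,
  mode: Mode,
): Map<string, Row> {
  // replace starts from an empty table; every other mode keeps existing rows
  const next = mode === 'replace' ? new Map<string, Row>() : new Map(table);

  for (const record of records) {
    const key = String(record[externalId]);
    const existing = next.get(key);

    switch (mode) {
      case 'upsert': // create if new, update if found (assumed: field-level merge)
        next.set(key, { ...existing, ...record });
        break;
      case 'replace': // table was cleared above, so this is a plain insert
        next.set(key, record);
        break;
      case 'insert': // create only, throw on duplicate
        if (existing) throw new Error(`Duplicate ${externalId}: ${key}`);
        next.set(key, record);
        break;
      case 'update': // update only, skip if not found
        if (existing) next.set(key, { ...existing, ...record });
        break;
      case 'ignore': // create if new, silently skip duplicates
        if (!existing) next.set(key, record);
        break;
    }
  }
  return next;
}
```

Running the same records through 'upsert' twice yields the same table, which is why it is the idempotent default, whereas 'insert' throws on the second run.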

Environment Scoping

The env array controls which deployment environments receive the records. The default is ['prod', 'dev', 'test'] — all environments.

// Reference data — safe for all environments (default)
defineDataset(Country, {
  // env omitted → defaults to ['prod', 'dev', 'test']
  records: [
    { code: 'US', name: 'United States' },
    { code: 'GB', name: 'United Kingdom' },
  ],
});

// Demo data — never reaches production
defineDataset(Account, {
  env: ['dev', 'test'],
  records: [
    { name: 'Demo Corp', type: 'customer' },
  ],
});

// Automated test fixtures — CI/CD only
defineDataset(TestUser, {
  env: ['test'],
  records: [
    { email: 'ci-admin@example.com', role: 'admin' },
  ],
});
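
The env rule described above can be sketched as a simple filter. This is an illustrative model under stated assumptions, not the runner's code: ScopedDataset and datasetsForEnv are hypothetical names, and only the documented default (all three environments when env is omitted) is taken from this guide.

```typescript
// Hypothetical model of env scoping; names are illustrative only.
type Env = 'prod' | 'dev' | 'test';

interface ScopedDataset {
  object: string;
  env?: Env[]; // omitted means "all environments"
}

const ALL_ENVS: Env[] = ['prod', 'dev', 'test'];

// Keep only the datasets whose env list includes the current environment
function datasetsForEnv<T extends ScopedDataset>(datasets: T[], current: Env): T[] {
  return datasets.filter((d) => (d.env ?? ALL_ENVS).includes(current));
}

const datasets: ScopedDataset[] = [
  { object: 'country' },                        // reference data, all envs
  { object: 'account', env: ['dev', 'test'] },  // demo data
  { object: 'test_user', env: ['test'] },       // CI fixtures
];

// In 'prod', only the country dataset survives the filter
console.log(datasetsForEnv(datasets, 'prod').map((d) => d.object));
```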

Type Safety

defineDataset() infers valid field keys from the object definition you pass as the first argument. If you reference a field that does not exist on the object, TypeScript reports an error immediately.

import { Account } from './objects/account.object';

defineDataset(Account, {
  records: [
    {
      name: 'Test Corp',
      typo_fild: 'value',
      //  ^^^^^^^^^
      //  TS Error: Object literal may only specify known properties,
      //  and 'typo_fild' does not exist in type 'Partial<Record<keyof ...>>'
    },
  ],
});

This is a major advantage over writing plain JSON. Prefer defineDataset() to calling DatasetSchema.parse() directly.


Relationship Fields

For lookup fields that reference another object, supply the natural key value of the related record (such as name, email, or code) — not its UUID. The seed runner resolves natural keys to database IDs automatically at load time.

// Step 1 — seed the parent object first
const accountsSeed = defineDataset(Account, {
  externalId: 'name',
  records: [
    { name: 'Acme Corporation', type: 'customer' },
  ],
});

// Step 2 — seed the child object, referencing the parent by natural key
const contactsSeed = defineDataset(Contact, {
  externalId: 'email',
  records: [
    {
      email: 'john.smith@acme.example.com',
      first_name: 'John',
      last_name: 'Smith',
      account: 'Acme Corporation',   // natural key, not a UUID
    },
  ],
});

// Export in dependency order — parents before children
export const SeedData = [accountsSeed, contactsSeed];
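
To make the natural-key resolution concrete, here is a minimal sketch of what such a lookup could look like. It is an assumption-laden illustration, not the seed runner's logic: StoredRow, resolveLookup, the 'acc_001' id, and the decision to throw on a missing parent are all hypothetical.

```typescript
// Hypothetical sketch: resolve a lookup value from natural key to database id.
interface StoredRow {
  id: string;
  [field: string]: unknown;
}

function resolveLookup(
  parents: StoredRow[],   // rows already loaded for the parent object
  parentKeyField: string, // the parent dataset's externalId, e.g. 'name'
  naturalKey: string,
): string {
  const match = parents.find((p) => p[parentKeyField] === naturalKey);
  if (!match) {
    // Assumed behaviour: an unresolved parent is an error, not a silent null
    throw new Error(`No parent with ${parentKeyField} = "${naturalKey}"; was it seeded first?`);
  }
  return match.id;
}

// Parent rows as they might look after accountsSeed has loaded (id is made up)
const accounts: StoredRow[] = [
  { id: 'acc_001', name: 'Acme Corporation', type: 'customer' },
];

const contact = {
  email: 'john.smith@acme.example.com',
  account: resolveLookup(accounts, 'name', 'Acme Corporation'),
};
```

The sketch also shows why load order matters: the lookup can only succeed if the parent row already exists when the child record is processed.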

Dynamic Values (CEL)

Any field value may be a CEL expression, evaluated at install time against a single now timestamp that is pinned once per load. This is the only correct way to author time-based or identity-derived seed values: a literal new Date() would ship the package author's clock to every customer and break build determinism.

import { defineDataset, cel } from '@objectstack/spec';

defineDataset(Opportunity, {
  records: [{
    name:            'Acme Q3 Renewal',
    close_date:      cel`daysFromNow(45)`,
    created_at:      cel`now()`,
    owner_id:        cel`os.user.id`,   // the seed identity
    organization_id: cel`os.org.id`,
  }],
});

Available in the seed CEL context:

  • Functions: now(), today(), daysFromNow(n), daysAgo(n), isBlank(v), coalesce(v, fallback)
  • Scope: os.user, os.org, os.env
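
The idea of a pinned now can be sketched in plain TypeScript. The helpers below are hypothetical stand-ins that mirror the names of the seed functions; they are not the CEL evaluator, and the real return types may differ.

```typescript
// Hypothetical stand-ins for the time helpers; the real CEL evaluator is not shown here.
const DAY_MS = 24 * 60 * 60 * 1000;

function createSeedClock(pinnedNow: Date) {
  return {
    // Every helper derives from the same pinned instant
    now: () => new Date(pinnedNow.getTime()),
    daysFromNow: (n: number) => new Date(pinnedNow.getTime() + n * DAY_MS),
    daysAgo: (n: number) => new Date(pinnedNow.getTime() - n * DAY_MS),
  };
}

// One clock per load: the timestamp is captured once, not per record
const clock = createSeedClock(new Date('2025-01-01T00:00:00Z'));
const createdAt = clock.now();
const closeDate = clock.daysFromNow(45);

// Both values come from the same instant, so the gap is exactly 45 days
console.log((closeDate.getTime() - createdAt.getTime()) / DAY_MS); // 45
```

Because every record in a load sees the same instant, relative dates stay consistent with each other no matter how long the load takes.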

Binding records to a user (os.user)

Many objects have a required owner lookup such as owner_id, created_by, or assigned_to. To seed such a record, bind it to a user with cel`os.user.id`. This is the single canonical convention; there is no currentUser(), @admin, or similar special syntax.

defineDataset(Project, {
  externalId: 'code',
  records: [{
    code:     'bootstrap',
    name:     'Bootstrap Project',
    owner_id: cel`os.user.id`,   // ← bound to the seed identity
  }],
});

Where does os.user come from? On a fresh boot there are no human users yet — seeding runs before the first sign-up. So the runtime provisions a deterministic, non-loginable system user (usr_system, role system) before any seed runs and binds it to os.user. It owns seeded data the way Salesforce's "Automated Process" user does — it has no credential and cannot sign in.

  • The human login admin is created separately (CLI sign-up / first-signup promotion) through better-auth and need not be the seed owner.
  • os.org.id resolves to the current organization; during a per-tenant replay it is that tenant's id, falling back to the load's organizationId.

This ordering guarantee means cel`os.user.id` and cel`os.org.id` always resolve at boot, so you never have to sequence seeds around user creation.

Failure is loud, not silent

If a record uses a CEL value that cannot be resolved (for example, cel`os.user.id` when the system identity could not be provisioned), the record is not silently dropped. The loader counts it as an error, marks the load unsuccessful, and logs an actionable message:

[SeedLoader] Cannot resolve dynamic seed values for project record #0:
  ... Records using cel`os.user.id` / cel`os.org.id` require a seed identity —
  ensure a system/admin user exists before seeding.

Write failures (e.g. a required field still missing after resolution) are surfaced the same way. Tooling should check result.success / result.errors.
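
A tooling check along those lines might look like the sketch below. Only the success and errors property names come from this guide; the SeedLoadResult interface, the shape of each error entry, and assertSeedSucceeded are assumptions for illustration.

```typescript
// Hypothetical result shape: only `success` and `errors` are named in this guide.
interface SeedLoadResult {
  success: boolean;
  errors: Array<{ message: string }>; // assumed element shape
}

// Fail the build or CI step when a seed load reports any problem
function assertSeedSucceeded(result: SeedLoadResult): void {
  if (result.success && result.errors.length === 0) return;
  const details = result.errors.map((e) => `  - ${e.message}`).join('\n');
  throw new Error(`Seed load failed with ${result.errors.length} error(s):\n${details}`);
}
```

Throwing here, rather than logging and continuing, keeps the "loud" behaviour intact all the way up to the CI job.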


Organising Multiple Datasets

For applications with several objects, co-locate seed files under src/data/ and export a single aggregate array.

src/
  data/
    index.ts              ← exports SeedData array in dependency order
    accounts.seed.ts
    contacts.seed.ts
    leads.seed.ts
    products.seed.ts

// src/data/index.ts
import { accountsSeed }  from './accounts.seed';
import { contactsSeed }  from './contacts.seed';
import { leadsSeed }     from './leads.seed';
import { productsSeed }  from './products.seed';

/** All seed datasets — order determines load sequence */
export const SeedData = [
  accountsSeed,   // no dependencies
  productsSeed,   // no dependencies
  contactsSeed,   // depends on accounts
  leadsSeed,      // no dependencies
];

Best Practices

Choose a stable externalId

The externalId field must be a stable natural key that does not change between environments. Avoid using the auto-generated id (UUID) because UUIDs differ between databases.

| Scenario | Recommended externalId |
| --- | --- |
| Named entities (countries, currencies) | 'code' or 'slug' |
| User records | 'email' |
| Generic named records | 'name' (default) |
| Externally sourced data | 'external_id' |
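
A natural key is only useful if it is unique within the dataset. The following sketch is a hypothetical pre-flight check, not part of ObjectStack: findDuplicateKeys is an invented helper that reports externalId values appearing more than once in a records array.

```typescript
// Hypothetical pre-flight check: report externalId values that repeat within one dataset.
function findDuplicateKeys(
  records: Array<Record<string, unknown>>,
  externalId: string,
): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const record of records) {
    const key = String(record[externalId]);
    if (seen.has(key)) duplicates.add(key);
    seen.add(key);
  }
  return Array.from(duplicates);
}

const currencies = [
  { code: 'USD', name: 'US Dollar' },
  { code: 'EUR', name: 'Euro' },
  { code: 'USD', name: 'United States Dollar' }, // same key, conflicting data
];

console.log(findDuplicateKeys(currencies, 'code')); // [ 'USD' ]
```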

Scope demo data with env

Keep demo and test-only records out of production by setting env: ['dev', 'test']. System bootstrap data that must exist in production should omit env (or explicitly set ['prod', 'dev', 'test']).

Use upsert by default

upsert is idempotent and the safest default. Only change the mode when the use case requires it — for example, ignore when you must not overwrite user edits to system defaults, or replace for ephemeral cache tables.

Seed in dependency order

Always export parent datasets before child datasets. If contact has a lookup to account, accountsSeed must appear before contactsSeed in the exported array.
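
The ordering rule can be checked mechanically. The sketch below assumes each dataset carries a dependsOn list, which is a hypothetical field invented for this example; nothing in this guide says the Dataset value exposes its dependencies.

```typescript
// Hypothetical ordering check; `dependsOn` is not a documented Dataset field.
interface OrderedDataset {
  object: string;
  dependsOn?: string[]; // objects that this dataset's lookup fields point at
}

// Returns one message per dataset that appears before a parent it depends on
function findOrderViolations(datasets: OrderedDataset[]): string[] {
  const seen = new Set<string>();
  const violations: string[] = [];
  for (const dataset of datasets) {
    for (const parent of dataset.dependsOn ?? []) {
      if (!seen.has(parent)) {
        violations.push(`${dataset.object} is listed before its parent ${parent}`);
      }
    }
    seen.add(dataset.object);
  }
  return violations;
}

const goodOrder: OrderedDataset[] = [
  { object: 'account' },
  { object: 'contact', dependsOn: ['account'] },
];
const badOrder: OrderedDataset[] = [goodOrder[1], goodOrder[0]];

console.log(findOrderViolations(goodOrder)); // []
console.log(findOrderViolations(badOrder));  // [ 'contact is listed before its parent account' ]
```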

Keep records realistic

Demo data appears in screenshots, documentation, and live demos. Use realistic company names, email addresses, and values — not foo, bar, or test123.

One file per object

Split large seed payloads into {object}.seed.ts files. A single index.ts that re-exports and orders them keeps the entry point clean.


defineDataset() API Reference

function defineDataset<
  const TObj extends { name: string; fields: Record<string, unknown> }
>(
  objectDef: TObj,
  config: {
    externalId?: string;                    // default: 'name'
    mode?: 'insert' | 'update' | 'upsert' | 'replace' | 'ignore'; // default: 'upsert'
    env?: Array<'prod' | 'dev' | 'test'>;  // default: ['prod','dev','test']
    records: Array<Partial<Record<keyof TObj['fields'], unknown>>>;
  }
): Dataset

The returned Dataset object is a plain serialisable value — pass it to your stack's seed runner or store it in an export array.
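
To show how the signature above constrains record keys and applies its defaults, here is a self-contained stand-in. It is a sketch under assumptions, not the library's implementation: defineDatasetSketch, DatasetSketch, and the returned object's property names are hypothetical, and the const modifier from the reference signature is left out because key inference does not depend on it.

```typescript
// Hypothetical stand-in for defineDataset(); the real return shape is not documented here.
type Mode = 'insert' | 'update' | 'upsert' | 'replace' | 'ignore';
type Env = 'prod' | 'dev' | 'test';

interface ObjectDef {
  name: string;
  fields: Record<string, unknown>;
}

// Record keys are limited to the keys of the object's `fields` map
type SeedRecord<TObj extends ObjectDef> = Partial<Record<keyof TObj['fields'], unknown>>;

interface DatasetSketch<TObj extends ObjectDef> {
  object: string;
  externalId: string;
  mode: Mode;
  env: Env[];
  records: Array<SeedRecord<TObj>>;
}

function defineDatasetSketch<TObj extends ObjectDef>(
  objectDef: TObj,
  config: {
    externalId?: string;
    mode?: Mode;
    env?: Env[];
    records: Array<SeedRecord<TObj>>;
  },
): DatasetSketch<TObj> {
  return {
    object: objectDef.name,
    externalId: config.externalId ?? 'name',    // documented default
    mode: config.mode ?? 'upsert',              // documented default
    env: config.env ?? ['prod', 'dev', 'test'], // documented default
    records: config.records,
  };
}

// A toy object definition; the field metadata is left empty on purpose
const Currency = {
  name: 'currency',
  fields: { code: {}, name: {}, symbol: {} },
};

const currencySeed = defineDatasetSketch(Currency, {
  externalId: 'code',
  records: [{ code: 'USD', name: 'US Dollar', symbol: '$' }],
});
```

Because the result is a plain object, it survives JSON.stringify unchanged, which matches the "plain serialisable value" description above.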


