Test Data Generator
Overview
Generate realistic, type-safe test data including fixtures, factory functions, seed datasets, and edge case values. Supports Faker.js, Factory Bot patterns, Fishery (TypeScript factories), pytest fixtures, and database seed scripts.
Prerequisites
- Data generation library installed (Faker.js via @faker-js/faker, Fishery, factory_boy for Python, or Java Faker)
- Database schema or TypeScript/Python type definitions for the data models
- Test framework with fixture support (Jest, pytest, JUnit)
- Seed management for reproducible random data (faker.seed())
- Database client for seed data insertion (if generating database fixtures)
Instructions
- Read the project's data models, TypeScript interfaces, database schemas, or ORM definitions to understand the shape of all entities.
- For each entity, create a factory function that produces a valid default instance:
- Use Faker methods matched to field semantics (e.g., faker.person.fullName() for names, faker.internet.email() for emails).
- Provide sensible defaults for required fields.
- Allow overrides via a partial parameter for test-specific customization.
- Set a deterministic seed for reproducibility (faker.seed(12345)).
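The factory step above can be sketched roughly as follows. To keep the example dependency-free, a small seeded pseudo-random generator and two name lists stand in for Faker; in a project that uses @faker-js/faker, faker.seed(12345), faker.person.fullName(), and faker.internet.email() would take their place. The User shape and the names createRng, pick, and createUserFactory are assumptions made for illustration.

```typescript
// Hypothetical entity shape; a real project would import this from its models.
interface User {
  id: number;
  name: string;
  email: string;
  isActive: boolean;
}

// Small seeded PRNG (mulberry32-style), standing in for faker.seed(...).
function createRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Illustrative value pools, standing in for Faker's name data.
const FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret"];
const LAST_NAMES = ["Lovelace", "Hopper", "Torvalds", "Hamilton"];

function pick<T>(rng: () => number, items: readonly T[]): T {
  return items[Math.floor(rng() * items.length)];
}

// Returns a factory whose output depends only on the seed and call order.
function createUserFactory(seed = 12345) {
  const rng = createRng(seed);
  let nextId = 1;
  return function buildUser(overrides: Partial<User> = {}): User {
    const first = pick(rng, FIRST_NAMES);
    const last = pick(rng, LAST_NAMES);
    const id = nextId++;
    return {
      id,
      name: `${first} ${last}`,
      email: `${first}.${last}.${id}@example.com`.toLowerCase(),
      isActive: true,
      ...overrides, // test-specific values win over the defaults
    };
  };
}

const buildUser = createUserFactory();
const defaultUser = buildUser();
const inactiveUser = buildUser({ isActive: false });
```

Spreading the overrides last lets a test name only the fields it cares about, while every other required field keeps a valid default.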
- Generate edge case data variants for each entity:
- Empty values: Empty strings, null, undefined, empty arrays.
- Boundary values: Maximum string length, integer overflow, zero, negative numbers.
- Unicode and i18n: Names with accents, CJK characters, RTL text, emoji.
- Adversarial inputs: SQL injection strings, XSS payloads, excessively long strings.
- Temporal edge cases: Leap years, timezone boundaries, epoch zero, far-future dates.
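One way to organize the edge case categories above is a small catalogue that is applied to a valid base record one field at a time, so each variant isolates a single unusual value. The Profile shape, the specific values, and the edgeCaseVariants helper are assumptions chosen for illustration; null and undefined variants only make sense for fields the model declares as nullable or optional, so they are left out of this sketch.

```typescript
// Hypothetical entity used only for this sketch.
interface Profile {
  displayName: string;
  age: number;
  createdAt: Date;
  tags: string[];
}

const edgeStrings: string[] = [
  "", // empty string
  "a".repeat(10000), // excessively long
  "Zoë Müller-Ångström", // accented Latin characters
  "山田太郎", // CJK
  "שלום עולם", // right-to-left text
  "👩‍💻🎉", // emoji, including a multi-code-point sequence
  "'; DROP TABLE users; --", // SQL injection string
  "<script>alert(1)</script>", // XSS payload
];

const edgeNumbers: number[] = [
  0,
  -1,
  2147483647, // largest signed 32-bit integer
  2147483648, // one past it, overflows a 32-bit column
  Number.MAX_SAFE_INTEGER,
];

const edgeDates: Date[] = [
  new Date(0), // epoch zero
  new Date("2024-02-29T12:00:00Z"), // leap day
  new Date("2024-12-31T23:59:59Z"), // Dec 31 in UTC, already Jan 1 east of UTC
  new Date("9999-12-31T23:59:59Z"), // far future
];

// Produces one variant per edge value, changing a single field each time.
function edgeCaseVariants(base: Profile): Profile[] {
  return [
    ...edgeStrings.map((displayName) => ({ ...base, displayName })),
    ...edgeNumbers.map((age) => ({ ...base, age })),
    ...edgeDates.map((createdAt) => ({ ...base, createdAt })),
    { ...base, tags: [] }, // empty array
  ];
}

const baseProfile: Profile = {
  displayName: "Test User",
  age: 30,
  createdAt: new Date("2020-01-01T00:00:00Z"),
  tags: ["example"],
};
const variants = edgeCaseVariants(baseProfile);
```

Changing one field per variant keeps a failing test easy to diagnose, because the offending value is the only difference from the known-good base.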
- Create relationship factories that build connected entity graphs:
- A user factory that also creates associated addresses and orders.
- Configurable depth to avoid infinite recursion.
- Lazy evaluation for optional relationships.
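A relationship factory along these lines might look like the sketch below. It assumes a hypothetical User model with a self-referential referredBy link (the kind of relation that can recurse without a limit), caps recursion with a depth parameter, and builds orders lazily through a memoized getter. All names and field values are illustrative.

```typescript
// Hypothetical models for this sketch.
interface Address {
  street: string;
  city: string;
}

interface Order {
  id: number;
  totalCents: number;
}

interface User {
  id: number;
  name: string;
  addresses: Address[];
  referredBy: User | null; // self-referential, so it needs a depth cap
  orders: Order[];
}

let nextId = 1;
let ordersBuilt = 0; // counts lazy order generation, for demonstration only

function buildUserGraph(depth = 1): User {
  const id = nextId++;
  let cachedOrders: Order[] | undefined;
  return {
    id,
    name: `User ${id}`,
    addresses: [{ street: `${id} Main St`, city: "Springfield" }],
    // Stop recursing once the depth budget is spent.
    referredBy: depth > 0 ? buildUserGraph(depth - 1) : null,
    // Orders are only generated when a test first reads them.
    get orders(): Order[] {
      if (cachedOrders === undefined) {
        ordersBuilt++;
        cachedOrders = [{ id: nextId++, totalCents: 1999 }];
      }
      return cachedOrders;
    },
  };
}

const graphRoot = buildUserGraph(2); // root plus two levels of referrers
```

A test that never touches graphRoot.orders pays no cost for building them, and the depth argument bounds how many linked users are created.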
- Generate database seed files for integration tests:
- SQL insert scripts or ORM seed functions.
- Idempotent operations (for example, ON CONFLICT DO NOTHING, or INSERT ... WHERE NOT EXISTS on databases without ON CONFLICT).
- Separate seed sets for different test scenarios (empty state, populated state, edge cases).
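As a rough illustration of the seed file step, the sketch below renders separate seed sets into idempotent SQL insert scripts. It assumes a hypothetical users table with id as its primary key and a database that accepts ON CONFLICT (id) DO NOTHING, such as PostgreSQL or SQLite. The inline quoting is meant only for trusted, static seed data; parameterized queries through the database client are the safer route for anything else.

```typescript
// Hypothetical row shape for a "users" table.
interface SeedUser {
  id: number;
  email: string;
  name: string;
}

// One seed set per test scenario.
const seedSets = {
  empty: [] as SeedUser[],
  populated: [
    { id: 1, email: "ada@example.com", name: "Ada Lovelace" },
    { id: 2, email: "grace@example.com", name: "Grace Hopper" },
  ] as SeedUser[],
  edgeCases: [
    { id: 100, email: "o'brien@example.com", name: "Zoë O'Brien" },
  ] as SeedUser[],
};

// Quotes a string literal by doubling embedded single quotes.
function sqlString(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

// Renders rows as insert statements that are safe to run repeatedly.
function toSeedSql(rows: SeedUser[]): string {
  return rows
    .map(
      (row) =>
        `INSERT INTO users (id, email, name) VALUES (${row.id}, ` +
        `${sqlString(row.email)}, ${sqlString(row.name)}) ` +
        `ON CONFLICT (id) DO NOTHING;`
    )
    .join("\n");
}

const populatedSql = toSeedSql(seedSets.populated);
const edgeCaseSql = toSeedSql(seedSets.edgeCases);
```

Because every statement skips rows whose id already exists, rerunning a script against a database that has already been seeded leaves it unchanged.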
- Write fixture files in JSON, YAML, or TypeScript for static test data:
- Group fixtures by test scenario.
- Include both valid and invalid data sets.
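A static TypeScript fixture file grouped by scenario could be laid out as in the sketch below. The signup scenario, the SignupFixture shape, and the error messages are hypothetical; the point is only that valid and invalid sets sit side by side and each invalid entry records what the test should expect.

```typescript
// Hypothetical fixture shape for a signup scenario.
interface SignupFixture {
  label: string;
  input: { email: string; password: string };
  expectedError?: string; // present only on invalid fixtures
}

const signupFixtures: { valid: SignupFixture[]; invalid: SignupFixture[] } = {
  valid: [
    {
      label: "typical user",
      input: { email: "ada@example.com", password: "correct-horse-battery" },
    },
    {
      label: "accented local part",
      input: { email: "zoë@example.com", password: "correct-horse-battery" },
    },
  ],
  invalid: [
    {
      label: "empty email",
      input: { email: "", password: "correct-horse-battery" },
      expectedError: "email is required",
    },
    {
      label: "short password",
      input: { email: "ada@example.com", password: "x" },
      expectedError: "password is too short",
    },
  ],
};
```

In a real project this object would typically be exported from its own module so that test files for the same scenario can share it.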
- Validate generated data against the schema to ensure factories remain in sync with model changes.
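The validation step can be approximated without any library, as in the sketch below: a schema is a map from field name to a runtime check, and the validator reports missing, invalid, and unexpected fields. In practice a schema library such as Zod or Ajv would usually fill this role; the userSchema, validate, and buildUser names here are assumptions for illustration.

```typescript
// Hypothetical model and factory under test.
interface User {
  id: number;
  email: string;
  isActive: boolean;
}

function buildUser(overrides: Partial<User> = {}): User {
  return { id: 1, email: "ada@example.com", isActive: true, ...overrides };
}

type FieldCheck = (value: unknown) => boolean;

// Runtime description of the model: one check per expected field.
const userSchema: Record<string, FieldCheck> = {
  id: (v) => typeof v === "number" && Number.isInteger(v) && v > 0,
  email: (v) => typeof v === "string" && /^[^@\s]+@[^@\s]+$/.test(v),
  isActive: (v) => typeof v === "boolean",
};

// Returns a list of problems; an empty list means the candidate matches.
function validate(schema: Record<string, FieldCheck>, candidate: object): string[] {
  const record = candidate as Record<string, unknown>;
  const errors: string[] = [];
  for (const key of Object.keys(schema)) {
    if (!(key in record)) {
      errors.push(`missing field: ${key}`);
    } else if (!schema[key](record[key])) {
      errors.push(`invalid value for field: ${key}`);
    }
  }
  for (const key of Object.keys(record)) {
    if (!(key in schema)) {
      errors.push(`unexpected field: ${key}`);
    }
  }
  return errors;
}

// A factory that is in sync with the schema produces no errors.
const inSyncErrors = validate(userSchema, buildUser());

// Output from a stale factory that still uses an old field name.
const staleErrors = validate(userSchema, { id: 1, email: "ada@example.com", active: true });
```

Running a check like this in a single test per factory tends to surface model drift early, since a renamed or removed field shows up as both a missing and an unexpected field.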
Output
- Factory function files (one per entity)