
Cron Jobs with Feature Flags: Safe Scheduled Task Rollouts
Deploy scheduled tasks confidently using feature flags, deduplication, and production-ready error handling. Learn battle-tested patterns for safe cron job rollouts.
You've built a beautiful feature that sends weekly digest emails to your users. It works perfectly in staging. You deploy to production, the cron job starts running, and within minutes you realize it's sending emails to all 10,000 users instead of the intended 100 beta testers.
There's no easy way to stop it without a full rollback.
This scenario plays out more often than you'd think. Scheduled tasks are particularly risky because they trigger automatically, often processing large batches of data without direct user interaction. A bug in production can affect thousands of records before you even notice.
In this guide, we'll explore production-ready patterns for deploying cron jobs safely using feature flags, preventing concurrent execution, managing state, and handling errors gracefully. These techniques are battle-tested in high-scale SaaS applications where scheduled tasks run 24/7.
The Scheduled Task Deployment Problem
Traditional deployment workflows break down with scheduled tasks:
- No gradual rollout: A cron job either runs or it doesn't—there's no "rollout to 5% of users"
- Hard to stop: Once triggered, canceling mid-execution requires code changes or server restarts
- State management: Failed jobs can leave your system in an inconsistent state
- Testing gaps: Local testing doesn't catch timezone issues, scale problems, or race conditions
- Blind spots: Scheduled tasks run without user interaction, so bugs can go unnoticed for hours
The solution isn't to avoid scheduled tasks—they're essential for SaaS applications. Instead, we need deployment patterns that give us the same safety and control we have with user-facing features.
Setting Up node-cron for Production
Let's start with a robust cron setup using node-cron, a lightweight, zero-dependency scheduler that runs entirely in your Node.js process.
Install the package:
npm install node-cron
npm install --save-dev @types/node-cron
Here's a basic structure for a production cron service:
// services/cron/weekly-digest.ts
import cron from 'node-cron';
import { logger } from '../lib/logger';
import { getActiveUsers, sendDigestEmail } from '../email'; // digest helpers, assumed to live in services/email
async function sendWeeklyDigests() {
const startTime = Date.now();
logger.info('Starting weekly digest job');
try {
// Job logic here
const users = await getActiveUsers();
for (const user of users) {
await sendDigestEmail(user);
}
const duration = Date.now() - startTime;
logger.info({ duration, userCount: users.length }, 'Weekly digest completed');
} catch (error) {
logger.error({ error }, 'Weekly digest failed');
throw error; // Re-throw for monitoring
}
}
export function initWeeklyDigestCron() {
// Run every Monday at 9 AM
cron.schedule('0 9 * * 1', async () => {
await sendWeeklyDigests();
});
logger.info('Weekly digest cron initialized');
}
Understanding Cron Expressions
The schedule '0 9 * * 1' uses standard cron syntax with five fields, read left to right: minute (0-59), hour (0-23), day of month (1-31), month (1-12), and day of week (0-7, where both 0 and 7 represent Sunday). An asterisk means "every value" for that field.
So '0 9 * * 1' translates to: minute 0, hour 9, any day of month, any month, Monday (day 1). In other words, every Monday at 9:00 AM.
Common patterns:
// Every 4 hours
'0 */4 * * *';
// Daily at midnight
'0 0 * * *';
// Every weekday at 2 PM
'0 14 * * 1-5';
// First day of every month at midnight
'0 0 1 * *';
// Every 15 minutes
'*/15 * * * *';
Use crontab.guru to validate and test your expressions.
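As a quick illustration of that field order, a small helper (purely illustrative, not part of node-cron) can split an expression into its five named fields:

```typescript
// Split a five-field cron expression into named fields (illustrative only).
function describeCronFields(expression: string): Record<string, string> {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error(`Expected 5 cron fields, got ${fields.length}`);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = fields;
  return { minute, hour, dayOfMonth, month, dayOfWeek };
}

// '0 9 * * 1' → minute '0', hour '9', dayOfWeek '1' (Monday)
const fields = describeCronFields('0 9 * * 1');
```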
Feature Flags: Deploy Without Risk
Here's where it gets interesting. Instead of deploying a cron job and hoping it works, we use feature flags to control execution without code changes.
Understanding Feature Flags: Purpose and Pitfalls
Before diving into implementation, it's crucial to understand what feature flags are—and what they're not. Martin Fowler's comprehensive guide defines four distinct categories:
- Release Toggles - Hide incomplete features during trunk-based development. These are transient (days to weeks) and should be removed once the feature is complete.
- Ops Toggles - Enable rapid feature degradation during outages ("kill switches"). Most are short-lived, though some may persist for graceful degradation.
- Experiment Toggles - Support A/B testing with per-request decisions based on user cohorts. They persist for hours to weeks during testing periods.
- Permissioning Toggles - Manage feature access for specific user groups (premium subscribers, beta testers). These are typically long-lived.
For cron jobs, we're primarily using Ops Toggles—emergency stops and controlled rollouts. This distinction matters because different toggle types have different lifecycles and management requirements.
Common Misuse Patterns
Feature flags become problematic when misused:
Mixing business logic with feature flags is the most common anti-pattern. A feature flag should control whether something runs, not how it runs. Business rules (pricing tiers, user permissions, workflow logic) belong in your domain model, not behind feature flags.
// ❌ BAD: Business logic hidden behind a flag
if (process.env.FF_PREMIUM_PRICING === 'active') {
price = calculatePremiumPrice(user);
} else {
price = calculateStandardPrice(user);
}
// ✅ GOOD: Business logic in domain, flag controls feature availability
const price = pricingService.calculatePrice(user); // Handles tiers internally
if (process.env.FF_NEW_CHECKOUT === 'active') {
await newCheckoutFlow(price);
} else {
await legacyCheckoutFlow(price);
}
Flag proliferation creates exponential testing complexity. If you have flags A, B, C, and D, you potentially have 16 different code paths to test. Fowler notes that you should "only test combinations which should reasonably be expected to happen in production"—but this requires clear documentation of valid flag states.
Stale flags are technical debt. Every abandoned flag is a time bomb. They consume processing power, complicate testing, and make code harder to understand. Teams at Uber found that manually removing obsolete flags was "time-intensive" work that distracted from feature development.
Scattered toggle points make maintenance painful. When flag checks are sprinkled throughout your codebase, changing a flag's behavior requires hunting down every check.
Best Practices for Flag Management
Treat flags as inventory with carrying costs. Savvy teams keep their flag count as low as possible. Some enforce maximum toggle quotas—requiring removal before adding new flags.
Set expiration dates. Include metadata with every flag documenting its purpose, owner, and expected removal date:
/**
* FF_WEEKLY_DIGEST
* Type: Ops Toggle (kill switch)
* Owner: Platform Team
* Created: 2024-01-15
* Expected removal: After 2 weeks in production
* Purpose: Emergency stop for weekly digest emails
*/
const FF_WEEKLY_DIGEST = process.env.FF_WEEKLY_DIGEST;
Centralize toggle decisions. Create a dedicated feature flags module rather than scattering checks throughout your code:
// lib/feature-flags.ts
export const featureFlags = {
isWeeklyDigestEnabled: () => process.env.FF_WEEKLY_DIGEST === 'active',
isDataCleanupEnabled: () => process.env.FF_DATA_CLEANUP === 'active',
// Centralized location makes auditing and removal easier
};
// Usage
if (featureFlags.isWeeklyDigestEnabled()) {
await sendWeeklyDigests();
}
Use inversion of control. Inject toggle decisions at construction time rather than having code query the toggle system directly—this reduces coupling and improves testability.
For more on managing feature flag complexity, see the Feature Toggles article on martinfowler.com and featureflags.io for community patterns.
Simple Environment Variable Approach
This example uses environment variables for feature flags. It's the simplest approach and works well for small teams, but it lacks real-time updates: you need to redeploy or restart your application to change flag values.
// services/cron/weekly-digest.ts
import cron from 'node-cron';
import { logger } from '../lib/logger';
import { getActiveUsers, sendDigestEmail } from '../email'; // digest helpers, assumed to live in services/email
/**
* Feature flag to control weekly digest emails.
* Values: 'active' | 'inactive'
*
* Set to 'inactive' to disable digest emails without code changes.
* Useful for emergency stops or gradual rollouts.
*/
const FF_WEEKLY_DIGEST = process.env.FF_WEEKLY_DIGEST;
async function sendWeeklyDigests() {
const startTime = Date.now();
logger.info('Weekly digest cron triggered');
// Check feature flag FIRST (read at call time so restarts and tests pick up changes)
const flagValue = process.env.FF_WEEKLY_DIGEST;
if (flagValue !== 'active') {
logger.info({ featureFlag: flagValue }, 'Weekly digest disabled by feature flag, skipping execution');
return; // Exit early
}
try {
const users = await getActiveUsers();
logger.info({ userCount: users.length }, 'Processing weekly digests');
let successCount = 0;
let failureCount = 0;
for (const user of users) {
try {
await sendDigestEmail(user);
successCount++;
} catch (error) {
logger.error({ error, userId: user.id }, 'Failed to send digest');
failureCount++;
}
}
const duration = Date.now() - startTime;
logger.info(
{ duration, total: users.length, success: successCount, failed: failureCount },
'Weekly digest completed',
);
} catch (error) {
logger.error({ error }, 'Fatal error in weekly digest');
throw error;
}
}
export function initWeeklyDigestCron() {
logger.info({ schedule: 'Every Monday at 9 AM', featureFlag: FF_WEEKLY_DIGEST }, 'Initializing weekly digest cron');
cron.schedule('0 9 * * 1', async () => {
await sendWeeklyDigests();
});
logger.info('Weekly digest cron initialized');
}
// Manual trigger for testing
export async function triggerDigestManually() {
logger.info('Manual trigger invoked');
await sendWeeklyDigests();
}
Now you can control the job via environment variables:
# Disable the job without redeploying
FF_WEEKLY_DIGEST=inactive
# Re-enable it
FF_WEEKLY_DIGEST=active
This simple pattern gives you an emergency stop button. If something goes wrong in production, you can disable the job instantly without rolling back your entire deployment.
Production Feature Flag Solutions
Environment variables work for simple cases, but production systems often need real-time flag updates, user targeting, and audit trails. Here are your options:
Azure App Configuration - Microsoft's managed feature flag service with real-time updates, targeting filters, and Azure Key Vault integration. Pricing starts at ~$1.20/day for the Standard tier, with a free tier available for development.
import { AppConfigurationClient } from '@azure/app-configuration';
import { DefaultAzureCredential } from '@azure/identity';
const client = new AppConfigurationClient(
process.env.AZURE_APP_CONFIG_ENDPOINT!, // assumes the endpoint is set in your environment
new DefaultAzureCredential()
);
async function isFeatureEnabled(featureName: string): Promise<boolean> {
const setting = await client.getConfigurationSetting({ key: `.appconfig.featureflag/${featureName}` });
const flag = JSON.parse(setting.value ?? '{}');
return flag.enabled === true;
}
LaunchDarkly - Industry-leading feature management platform with sophisticated targeting, experimentation, and analytics. Enterprise pricing, but excellent for large teams needing advanced rollout strategies.
Unleash - Open-source feature flag solution you can self-host. Free to run on your own infrastructure, with a paid cloud option. Great balance between control and cost.
Flagsmith - Open-source with both self-hosted and cloud options. Offers a generous free tier and straightforward pricing for scaling.
ConfigCat - Simple, developer-friendly service with a permanent free tier (10 feature flags, unlimited users). Good for small teams getting started.
Database-backed flags - Store flags in your existing database for zero additional cost. Query on each execution for real-time updates, but adds database dependency to your cron jobs.
For most teams, we recommend starting with environment variables, then moving to Azure App Configuration or a self-hosted Unleash instance as your needs grow.
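For the database-backed option, a short-lived cache keeps per-run lookups cheap while changes still propagate within the TTL. This is a sketch; loadFlag is an assumed function you'd back with a query against your own flags table:

```typescript
// Cache flag values for a short TTL so cron runs don't hammer the database,
// while flag changes still take effect within `ttlMs`.
type FlagLoader = (name: string) => Promise<boolean>;

function createFlagCache(loadFlag: FlagLoader, ttlMs = 30_000) {
  const cache = new Map<string, { value: boolean; expiresAt: number }>();
  return async function isEnabled(name: string): Promise<boolean> {
    const hit = cache.get(name);
    if (hit && hit.expiresAt > Date.now()) return hit.value;
    const value = await loadFlag(name);
    cache.set(name, { value, expiresAt: Date.now() + ttlMs });
    return value;
  };
}
```

Wire loadFlag to a SELECT against your flags table; a 30-second TTL is usually a good trade-off between freshness and query volume.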
Gradual Rollouts with Percentage Targeting
For even more control, implement percentage-based rollouts:
const FF_WEEKLY_DIGEST_PERCENT = parseInt(process.env.FF_WEEKLY_DIGEST_PERCENT ?? '0', 10);
async function sendWeeklyDigests() {
if (FF_WEEKLY_DIGEST !== 'active') {
logger.info('Weekly digest disabled by feature flag');
return;
}
const users = await getActiveUsers();
// Calculate subset based on percentage
const targetCount = Math.floor(users.length * (FF_WEEKLY_DIGEST_PERCENT / 100));
const selectedUsers = users.slice(0, targetCount);
logger.info(
{
totalUsers: users.length,
percentage: FF_WEEKLY_DIGEST_PERCENT,
selectedUsers: selectedUsers.length,
},
'Processing weekly digests',
);
// Process only the selected subset
for (const user of selectedUsers) {
await sendDigestEmail(user);
}
}
Now you can gradually roll out the feature:
# Start with 5% of users
FF_WEEKLY_DIGEST_PERCENT=5
# Increase to 25% after monitoring
FF_WEEKLY_DIGEST_PERCENT=25
# Full rollout
FF_WEEKLY_DIGEST_PERCENT=100
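One caveat with slicing the user list: the cohort shifts whenever the query order changes, so a user may drop in and out of the rollout between runs. Hashing the user ID into a stable bucket (a common sketch, using Node's built-in crypto) keeps cohorts sticky and makes each percentage increase a strict superset of the last:

```typescript
import { createHash } from 'node:crypto';

// Map a user ID to a stable bucket in [0, 100). The same ID always lands in
// the same bucket, so raising the percentage only ever adds users.
function rolloutBucket(userId: string): number {
  const digest = createHash('sha256').update(userId).digest();
  return digest.readUInt32BE(0) % 100;
}

function isInRollout(userId: string, percent: number): boolean {
  return rolloutBucket(userId) < percent;
}

// const selectedUsers = users.filter((u) => isInRollout(u.id, FF_WEEKLY_DIGEST_PERCENT));
```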
Preventing Concurrent Execution with Deduplication
One of the most dangerous cron job bugs is concurrent execution. Imagine this scenario:
- Weekly digest job starts at 9:00 AM, processing 10,000 users
- Job takes 15 minutes to complete
- But your deployment restarts the server at 9:10 AM
- A new instance starts, triggers the same cron job
- Now you have TWO jobs running, both sending emails to the same users
Users receive duplicate emails. Your email provider's rate limits get hit. Chaos ensues.
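Even on a single instance, a slow run can overlap the next scheduled tick. Before reaching for distributed coordination, an in-process guard (a minimal sketch; it does not protect across instances, which is what the database and Redis patterns in the next sections are for) prevents the same process from starting a second run:

```typescript
// Wrap an async job so overlapping invocations within one process are skipped.
function preventOverlap<T>(job: () => Promise<T>): () => Promise<T | 'skipped'> {
  let running = false;
  return async () => {
    if (running) return 'skipped';
    running = true;
    try {
      return await job();
    } finally {
      running = false;
    }
  };
}

// cron.schedule('0 9 * * 1', preventOverlap(sendWeeklyDigests));
```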
Database-Level Deduplication
The most robust solution uses your database to enforce uniqueness:
// Database schema (using Drizzle ORM as example)
import { sql } from 'drizzle-orm';
import { pgTable, serial, timestamp, varchar, uniqueIndex } from 'drizzle-orm/pg-core';
export const digestJobs = pgTable(
'digest_jobs',
{
id: serial('id').primaryKey(),
scheduledFor: timestamp('scheduled_for').notNull(),
status: varchar('status', { length: 20 }).notNull().default('processing'),
startedAt: timestamp('started_at').notNull().defaultNow(),
completedAt: timestamp('completed_at'),
errorMessage: varchar('error_message', { length: 500 }),
},
(table) => ({
// Unique partial index: only one 'processing' job per scheduled time.
// It must be uniqueIndex (not index) so onConflictDoNothing can detect duplicates.
uniqueProcessing: uniqueIndex('idx_unique_processing_digest')
.on(table.scheduledFor, table.status)
.where(sql`${table.status} = 'processing'`),
}),
);
Now use atomic operations to prevent duplicates:
import { eq, and } from 'drizzle-orm';
import { db } from '../db/client';
import { digestJobs } from '../db/schema/digest-jobs';
async function sendWeeklyDigests() {
const scheduledFor = new Date();
// Round to the hour to group concurrent attempts
scheduledFor.setMinutes(0, 0, 0);
logger.info({ scheduledFor }, 'Starting weekly digest');
// Atomic insert with conflict detection
let job;
try {
[job] = await db
.insert(digestJobs)
.values({
scheduledFor,
status: 'processing',
})
.onConflictDoNothing() // Uses unique partial index
.returning({ id: digestJobs.id });
if (!job) {
// Conflict occurred - another instance is already running
logger.warn({ scheduledFor }, 'Digest job already in progress, skipping');
return;
}
} catch (error) {
logger.error({ error }, 'Failed to create digest job record');
return;
}
logger.info({ jobId: job.id }, 'Digest job started');
try {
// Process digests
const users = await getActiveUsers();
for (const user of users) {
await sendDigestEmail(user);
}
// Mark as completed
await db
.update(digestJobs)
.set({
status: 'completed',
completedAt: new Date(),
})
.where(eq(digestJobs.id, job.id));
logger.info({ jobId: job.id }, 'Digest job completed');
} catch (error) {
// Mark as failed
await db
.update(digestJobs)
.set({
status: 'failed',
errorMessage: error.message.substring(0, 500),
})
.where(eq(digestJobs.id, job.id));
logger.error({ error, jobId: job.id }, 'Digest job failed');
throw error;
}
}
This pattern guarantees that only one instance can process digests for a given scheduled time, even if you have 10 application servers running the same cron job.
Redis-Based Distributed Locks
For applications already using Redis, distributed locks provide another deduplication option:
import Redis from 'ioredis';
const redis = new Redis(process.env.REDIS_URL);
async function sendWeeklyDigests() {
const lockKey = 'cron:weekly-digest:lock';
const lockValue = `${process.pid}-${Date.now()}`;
const lockTTL = 3600; // 1 hour max execution time
// Try to acquire lock
const acquired = await redis.set(lockKey, lockValue, 'EX', lockTTL, 'NX');
if (!acquired) {
logger.warn('Weekly digest already running on another instance');
return;
}
logger.info({ lockValue }, 'Acquired digest lock');
try {
// Process digests
const users = await getActiveUsers();
for (const user of users) {
await sendDigestEmail(user);
}
logger.info('Digest completed successfully');
} finally {
// Release lock only if we still own it
const script = `
if redis.call("get", KEYS[1]) == ARGV[1] then
return redis.call("del", KEYS[1])
else
return 0
end
`;
await redis.eval(script, 1, lockKey, lockValue);
logger.info('Released digest lock');
}
}
Choose database deduplication for critical jobs where you need audit trails. Choose Redis locks for simpler cases where speed matters more.
State Management: Tracking Execution History
Production cron jobs need to remember what they've done. This prevents re-processing data and helps with debugging when things go wrong.
Tracking Last Successful Execution
Store execution metadata in your database:
import { pgTable, serial, timestamp, integer, varchar } from 'drizzle-orm/pg-core';
export const cronExecutions = pgTable('cron_executions', {
id: serial('id').primaryKey(),
jobName: varchar('job_name', { length: 100 }).notNull(),
startedAt: timestamp('started_at').notNull().defaultNow(),
completedAt: timestamp('completed_at'),
status: varchar('status', { length: 20 }).notNull().default('running'),
recordsProcessed: integer('records_processed').default(0),
errorMessage: varchar('error_message', { length: 1000 }),
});
Use this to implement incremental processing:
import { and, desc, eq, gt } from 'drizzle-orm';
import { db } from '../db/client';
import { cronExecutions } from '../db/schema/cron-executions';
import { users } from '../db/schema/users';
async function sendWeeklyDigests() {
const jobName = 'weekly-digest';
// Find last successful execution
const [lastExecution] = await db
.select()
.from(cronExecutions)
.where(and(eq(cronExecutions.jobName, jobName), eq(cronExecutions.status, 'completed')))
.orderBy(desc(cronExecutions.completedAt))
.limit(1);
const lastRunTime = lastExecution?.completedAt || new Date('2000-01-01');
logger.info({ lastRunTime }, 'Processing users since last run');
// Create execution record
const [execution] = await db
.insert(cronExecutions)
.values({ jobName, status: 'running' })
.returning({ id: cronExecutions.id });
try {
// Only process users who signed up since last run
const newUsers = await db.select().from(users).where(gt(users.createdAt, lastRunTime));
logger.info({ userCount: newUsers.length }, 'Found new users');
for (const user of newUsers) {
await sendDigestEmail(user);
}
// Mark as completed
await db
.update(cronExecutions)
.set({
status: 'completed',
completedAt: new Date(),
recordsProcessed: newUsers.length,
})
.where(eq(cronExecutions.id, execution.id));
logger.info({ executionId: execution.id }, 'Digest completed');
} catch (error) {
await db
.update(cronExecutions)
.set({
status: 'failed',
errorMessage: error.message.substring(0, 1000),
})
.where(eq(cronExecutions.id, execution.id));
throw error;
}
}
This pattern is especially valuable for data processing jobs that shouldn't re-process the same records repeatedly.
Error Handling and Alerting
Cron jobs fail silently. There's no user clicking a button to notice something's broken. You need proactive monitoring and alerts.
Structured Error Handling
Wrap your cron logic with comprehensive error tracking:
async function sendWeeklyDigests() {
const startTime = Date.now();
const context = {
jobName: 'weekly-digest',
timestamp: new Date().toISOString(),
};
try {
logger.info(context, 'Starting weekly digest');
const users = await getActiveUsers();
let successCount = 0;
let failureCount = 0;
const errors: Array<{ userId: string; error: string }> = [];
for (const user of users) {
try {
await sendDigestEmail(user);
successCount++;
} catch (error) {
failureCount++;
errors.push({
userId: user.id,
error: error.message,
});
// Log individual failure but continue processing
logger.error({ ...context, userId: user.id, error }, 'Failed to send digest to user');
}
}
const duration = Date.now() - startTime;
const summary = {
...context,
duration,
totalUsers: users.length,
successCount,
failureCount,
errorRate: users.length > 0 ? failureCount / users.length : 0,
};
// Alert if error rate is high
if (summary.errorRate > 0.1) {
logger.error({ ...summary, sampleErrors: errors.slice(0, 5) }, 'High error rate in weekly digest');
// Send alert to your monitoring system
await sendAlert({
severity: 'high',
message: `Weekly digest error rate: ${(summary.errorRate * 100).toFixed(1)}%`,
metadata: summary,
});
} else {
logger.info(summary, 'Weekly digest completed');
}
} catch (error) {
// Fatal error - job couldn't even start
const duration = Date.now() - startTime;
logger.error({ ...context, error, duration }, 'Fatal error in weekly digest');
await sendAlert({
severity: 'critical',
message: 'Weekly digest job failed completely',
error: error.message,
metadata: context,
});
throw error;
}
}
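Transient failures (rate limits, network blips) often succeed on a second attempt. A small backoff helper (a sketch; the attempt count and delays are arbitrary choices) can wrap sendDigestEmail before a failure is counted at all:

```typescript
// Retry an async operation with exponential backoff before giving up.
async function withRetry<T>(
  operation: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < attempts - 1) {
        // Delays grow as 500ms, 1000ms, 2000ms, ...
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}

// await withRetry(() => sendDigestEmail(user));
```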
Integration with Monitoring Services
Connect your cron jobs to real-time monitoring:
// Using a monitoring service like Sentry or custom webhook
import * as Sentry from '@sentry/node';
async function sendAlert(alert: {
severity: 'low' | 'medium' | 'high' | 'critical';
message: string;
error?: string;
metadata?: Record<string, any>;
}) {
// Send to Sentry
if (process.env.SENTRY_DSN) {
Sentry.captureException(new Error(alert.message), {
level: alert.severity === 'critical' ? 'error' : 'warning',
extra: alert.metadata,
});
}
// Send to Slack webhook
if (process.env.SLACK_WEBHOOK_URL && alert.severity === 'critical') {
await fetch(process.env.SLACK_WEBHOOK_URL, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
text: `🚨 Cron Job Alert`,
blocks: [
{
type: 'section',
text: {
type: 'mrkdwn',
text: `*${alert.message}*\n\`\`\`${JSON.stringify(alert.metadata, null, 2)}\`\`\``,
},
},
],
}),
});
}
// Send to email for critical alerts
if (alert.severity === 'critical') {
await sendEmail({
to: process.env.ALERT_EMAIL,
subject: `Critical Cron Alert: ${alert.message}`,
body: JSON.stringify(alert, null, 2),
});
}
}
This multi-channel approach ensures critical failures never go unnoticed.
Graceful Startup: Waiting for Dependencies
Cron jobs often need external services (database, Redis, APIs) to be ready. Starting jobs before dependencies are available causes cascading failures.
Dependency Health Checks
Implement health checks before scheduling cron jobs:
// services/cron/index.ts
import { sql } from 'drizzle-orm';
import { logger } from '../lib/logger';
import { initWeeklyDigestCron } from './weekly-digest';
import { initMonthlyReportCron } from './monthly-report';
import { db } from '../db/client';
import { redis } from '../lib/redis'; // assumed shared ioredis client
// Generic retry helper - reusable for any async health check
async function waitForDependency(
name: string,
healthCheck: () => Promise<unknown>,
maxRetries = 10,
delayMs = 5000
) {
for (let i = 0; i < maxRetries; i++) {
try {
await healthCheck();
logger.info(`${name} connection verified`);
return true;
} catch (error) {
logger.warn({ attempt: i + 1, maxRetries, error: error.message }, `${name} not ready, retrying...`);
await new Promise((resolve) => setTimeout(resolve, delayMs));
}
}
throw new Error(`${name} connection failed after max retries`);
}
export async function initializeCronJobs() {
logger.info('Initializing cron jobs...');
try {
// Wait for all dependencies using the generic helper
await Promise.all([
waitForDependency('Database', () => db.execute(sql`SELECT 1`)),
waitForDependency('Redis', () => redis.ping()),
]);
logger.info('All dependencies ready, starting cron jobs');
// Initialize all cron jobs
initWeeklyDigestCron();
initMonthlyReportCron();
// ... additional cron jobs follow same pattern
logger.info('All cron jobs initialized successfully');
} catch (error) {
logger.error({ error }, 'Failed to initialize cron jobs');
if (process.env.NODE_ENV === 'production') {
process.exit(1);
}
}
}
Call this from your application startup:
// server.ts
import express from 'express';
import { initializeCronJobs } from './services/cron';
const app = express();
// Start HTTP server immediately
const server = app.listen(3000, () => {
console.log('Server running on port 3000');
});
// Initialize cron jobs asynchronously
initializeCronJobs().catch((error) => {
console.error('Cron initialization failed:', error);
});
This pattern allows your application to start serving HTTP requests immediately while cron jobs initialize in the background.
Testing Cron Jobs Locally
Testing scheduled tasks requires special patterns since you can't wait hours for the actual schedule.
Manual Triggers for Development
Expose manual trigger endpoints for testing:
// services/cron/weekly-digest.ts
// Export the core logic
export async function sendWeeklyDigests() {
// ... implementation
}
// Production cron schedule
export function initWeeklyDigestCron() {
if (process.env.NODE_ENV === 'production') {
cron.schedule('0 9 * * 1', async () => {
await sendWeeklyDigests();
});
} else {
logger.info('Skipping cron schedule in non-production environment');
}
}
// Manual trigger for development/testing
export async function triggerWeeklyDigestManually() {
logger.info('Manual trigger invoked');
await sendWeeklyDigests();
}
Create a development-only API endpoint:
// routes/dev/cron.ts
import { Router } from 'express';
import { triggerWeeklyDigestManually } from '../../services/cron/weekly-digest';
const router = Router();
// Only enable in development
if (process.env.NODE_ENV !== 'production') {
router.post('/trigger/weekly-digest', async (req, res) => {
try {
await triggerWeeklyDigestManually();
res.json({ success: true, message: 'Weekly digest triggered' });
} catch (error) {
res.status(500).json({ success: false, error: error.message });
}
});
}
export default router;
Now you can test locally:
# Trigger the job manually
curl -X POST http://localhost:3000/dev/cron/trigger/weekly-digest
Fast Schedules for Integration Testing
For integration tests, use faster schedules:
export function initWeeklyDigestCron() {
// Use environment variable to override schedule
const schedule = process.env.CRON_DIGEST_SCHEDULE || '0 9 * * 1';
cron.schedule(schedule, async () => {
await sendWeeklyDigests();
});
logger.info({ schedule }, 'Weekly digest cron initialized');
}
In your test environment:
# Run every minute for testing
CRON_DIGEST_SCHEDULE='*/1 * * * *'
Unit Testing Cron Logic
Test the business logic independently of the schedule:
// services/cron/__tests__/weekly-digest.test.ts
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { sendWeeklyDigests } from '../weekly-digest';
import { getActiveUsers, sendDigestEmail } from '../../email';
// Mock dependencies
vi.mock('../../email');
describe('Weekly Digest Cron', () => {
beforeEach(() => {
vi.clearAllMocks();
});
it('should process all active users', async () => {
const mockUsers = [
{ id: '1', email: 'user1@example.com' },
{ id: '2', email: 'user2@example.com' },
];
vi.mocked(getActiveUsers).mockResolvedValue(mockUsers);
vi.mocked(sendDigestEmail).mockResolvedValue(undefined);
await sendWeeklyDigests();
expect(getActiveUsers).toHaveBeenCalledTimes(1);
expect(sendDigestEmail).toHaveBeenCalledTimes(2);
expect(sendDigestEmail).toHaveBeenCalledWith(mockUsers[0]);
expect(sendDigestEmail).toHaveBeenCalledWith(mockUsers[1]);
});
it('should respect feature flag', async () => {
vi.stubEnv('FF_WEEKLY_DIGEST', 'inactive'); // requires the flag to be read at call time, not at module load
await sendWeeklyDigests();
expect(getActiveUsers).not.toHaveBeenCalled();
expect(sendDigestEmail).not.toHaveBeenCalled();
});
it('should continue processing even if one email fails', async () => {
const mockUsers = [
{ id: '1', email: 'user1@example.com' },
{ id: '2', email: 'user2@example.com' },
{ id: '3', email: 'user3@example.com' },
];
vi.mocked(getActiveUsers).mockResolvedValue(mockUsers);
vi.mocked(sendDigestEmail)
.mockResolvedValueOnce(undefined) // Success
.mockRejectedValueOnce(new Error('Email failed')) // Failure
.mockResolvedValueOnce(undefined); // Success
await sendWeeklyDigests();
// Should have attempted all three
expect(sendDigestEmail).toHaveBeenCalledTimes(3);
});
});
This isolates your business logic from scheduling concerns, making tests fast and reliable.
Complete Production-Ready Example
Let's put it all together in a comprehensive example:
// services/cron/data-cleanup.ts
import cron from 'node-cron';
import { eq, lt } from 'drizzle-orm';
import { db } from '../db/client';
import { cronExecutions } from '../db/schema/cron-executions';
import { sessions } from '../db/schema/sessions';
import { logger } from '../lib/logger';
const FF_DATA_CLEANUP = process.env.FF_DATA_CLEANUP;
// Helper to update job status - reduces repetitive db.update calls
async function updateJobStatus(
executionId: number,
status: 'completed' | 'failed',
extra: { recordsProcessed?: number; errorMessage?: string } = {}
) {
await db
.update(cronExecutions)
.set({ status, completedAt: new Date(), ...extra })
.where(eq(cronExecutions.id, executionId));
}
async function cleanupExpiredData() {
const jobName = 'data-cleanup';
const startTime = Date.now();
logger.info({ jobName }, 'Starting data cleanup job');
if (FF_DATA_CLEANUP !== 'active') {
logger.info({ featureFlag: FF_DATA_CLEANUP }, 'Data cleanup disabled');
return;
}
// Create execution record for tracking (same pattern as previous examples)
const [execution] = await db
.insert(cronExecutions)
.values({ jobName, status: 'running' })
.returning({ id: cronExecutions.id });
try {
const thirtyDaysAgo = new Date();
thirtyDaysAgo.setDate(thirtyDaysAgo.getDate() - 30);
const deletedSessions = await db
.delete(sessions)
.where(lt(sessions.expiresAt, thirtyDaysAgo))
.returning({ id: sessions.id });
// ... additional cleanup tasks follow same pattern
await updateJobStatus(execution.id, 'completed', { recordsProcessed: deletedSessions.length });
logger.info({ executionId: execution.id, duration: Date.now() - startTime, recordsDeleted: deletedSessions.length }, 'Data cleanup completed');
} catch (error) {
await updateJobStatus(execution.id, 'failed', { errorMessage: error.message.substring(0, 1000) });
logger.error({ error, executionId: execution.id, duration: Date.now() - startTime }, 'Data cleanup failed');
// Alert using sendAlert from previous examples
await sendAlert({ severity: 'high', message: 'Data cleanup job failed', error: error.message, metadata: { executionId: execution.id } });
throw error;
}
}
export function initDataCleanupCron() {
const schedule = process.env.CRON_CLEANUP_SCHEDULE || '0 3 * * *'; // default: daily at 3 AM
logger.info({ schedule, featureFlag: FF_DATA_CLEANUP }, 'Initializing data cleanup cron');
cron.schedule(schedule, () => cleanupExpiredData());
}
// Manual trigger follows same pattern as weekly digest
export const triggerCleanupManually = () => cleanupExpiredData();
Key Takeaways
Deploying scheduled tasks safely requires more than just writing a cron schedule. Here's what we've covered:
- Feature flags give you an emergency stop button without code changes
- Deduplication prevents disasters from concurrent execution across multiple servers
- State management ensures jobs don't re-process data and provides audit trails
- Structured error handling makes silent failures impossible with comprehensive logging and alerts
- Graceful startup prevents cascade failures by waiting for dependencies
- Testing patterns enable rapid local development and reliable automated tests
Production cron jobs are not fire-and-forget. They're critical infrastructure that requires the same engineering rigor as your user-facing features. With these patterns, you can deploy scheduled tasks confidently, knowing you have safety mechanisms in place.
Ready to level up your Node.js skills? Check out our guide on building resilient background jobs or explore production-ready logging strategies.
Further Reading
- Feature Toggles (aka Feature Flags) - Martin Fowler's canonical guide on toggle categories, best practices, and anti-patterns
- featureflags.io - Community hub for feature flag patterns and literature
- node-cron documentation - Official package docs
- crontab.guru - Cron expression validator and explainer
- Drizzle ORM - Type-safe SQL query builder used in examples
- Redis distributed locks - Redlock algorithm for distributed locking