Advanced Features

Retries and Timeout

Configure automatic retries and a timeout for each job:

$schedule->command('unstable:task')->maxRetries(3)->timeout(60); // Retries up to 3 times, 60s timeout
  • maxRetries(int) sets how many times the job will be retried if it fails.

  • timeout(int) sets the maximum execution time in seconds (enforced at the job logic level).

Hooks: Before and After Job Execution

Override the beforeJob and afterJob methods in a custom JobRunner:

class MyJobRunner extends \Daycry\CronJob\JobRunner {
    protected function beforeJob($job) { /* ... */ }
    protected function afterJob($job, $result, $error) { /* ... */ }
}

Dependency Validation

The scheduler validates that all dependencies exist and that there are no circular dependencies:

$schedule->validateDependencies();
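To illustrate the kind of check described here, the following is a minimal, self-contained sketch of dependency validation using a depth-first search. It is not the library's implementation: the function name findDependencyProblem and the plain-array input (job name => list of dependency names) are assumptions made for this example only.

```php
<?php
/**
 * Hypothetical helper (not part of daycry/cronjob).
 * Returns null when the graph is valid, otherwise a short description
 * of the first problem found.
 *
 * @param array<string, list<string>> $deps job name => names it depends on
 */
function findDependencyProblem(array $deps): ?string
{
    $state = []; // job name => 1 (currently being visited) or 2 (fully checked)

    $visit = function (string $job) use (&$visit, &$state, $deps): ?string {
        if (($state[$job] ?? 0) === 2) {
            return null; // already verified on an earlier path
        }
        if (($state[$job] ?? 0) === 1) {
            return "circular dependency involving '{$job}'";
        }
        $state[$job] = 1;
        foreach ($deps[$job] as $dep) {
            if (! array_key_exists($dep, $deps)) {
                return "'{$job}' depends on unknown job '{$dep}'";
            }
            $problem = $visit($dep);
            if ($problem !== null) {
                return $problem;
            }
        }
        $state[$job] = 2;

        return null;
    };

    foreach (array_keys($deps) as $job) {
        $problem = $visit((string) $job);
        if ($problem !== null) {
            return $problem;
        }
    }

    return null;
}
```

A graph such as ['a' => ['b'], 'b' => ['a']] would be reported as circular, while ['a' => ['x']] would be reported as referencing an unknown job.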

Utility Methods for Scheduler

  • removeTaskByName($name): Remove a job by name.

  • hasTask($name): Check if a job exists by name.

  • getTaskNames(): Get all job names.

Configuration Options (New)

Below are additional configuration properties you can set in Config\CronJob.

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| maxOutputLength | int or null | null | Truncates stored output (and error) to this many characters, appending a truncation marker. Null = unlimited. |
| notificationMode | string | always | One of always, on_error, never. Controls email sending when notification=true. (Current implementation behaves as always; on_error/never reserved for future logic.) |
| defaultTimeout | int or null | null | Global fallback timeout (seconds) applied when a job has no explicit timeout(). |
| retryBackoffStrategy | string | none | One of none, fixed, exponential. Governs delay between failed attempts. |
| retryBackoffBase | int | 5 | Base delay (seconds) used for fixed and as the starting point for exponential. |
| retryBackoffMultiplier | float | 2.0 | Exponential growth factor: delay = base * multiplier^(attempt-2). |
| retryBackoffMax | int | 300 | Maximum cap (seconds) for any computed backoff delay. |
| retryBackoffJitter | bool | true | Adds ±15% random jitter to the computed delay (helps avoid thundering herd). |
| enableEvents | bool | true | Enables internal lifecycle events (cronjob.*). Disable for absolute minimal overhead. |
| lockPath | string | WRITEPATH . 'cronJob/locks/' | Directory where single-run lock files are stored. |
| lockTTL | int or null | 3600 | Seconds before a lock is considered stale and eligible for reclaim. Null disables TTL expiration. |
| enableSignals | bool | true | Enables POSIX signal handling (SIGTERM/SIGINT/SIGQUIT) to allow graceful shutdown mid-run (requires CLI + ext-pcntl). |

Backoff Example

// app/Config/CronJob.php
public string $retryBackoffStrategy = 'exponential';
public int $retryBackoffBase = 3;           // first retry waits ~3s
public float $retryBackoffMultiplier = 2.0; // next ~6s, then ~12s ... capped
public int $retryBackoffMax = 60;           // cap at 60s
public bool $retryBackoffJitter = true;     // add jitter

// In your schedule definition
$schedule->command('fragile:sync')->maxRetries(4); // up to 4 retries with backoff
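The arithmetic behind these settings can be sketched as a small standalone function. This is only an illustration of the documented formula, delay = base * multiplier^(attempt-2); the function name backoffDelay and the order of operations (jitter first, then the cap) are assumptions, not the library's actual code.

```php
<?php
/**
 * Hypothetical helper mirroring the documented backoff formula.
 * $attempt is the number of the upcoming attempt (2 = first retry).
 */
function backoffDelay(
    string $strategy,
    int $attempt,
    int $base = 5,
    float $multiplier = 2.0,
    int $max = 300,
    bool $jitter = false
): float {
    $delay = match ($strategy) {
        'fixed'       => (float) $base,
        'exponential' => $base * $multiplier ** max(0, $attempt - 2),
        default       => 0.0, // 'none': retry immediately
    };

    if ($jitter) {
        // Random factor in [0.85, 1.15], i.e. ±15%.
        $delay *= 1 + (mt_rand(-150, 150) / 1000);
    }

    // Assumption: the cap is applied last so the result never exceeds $max.
    return min($delay, (float) $max);
}
```

With the example values above (base 3, multiplier 2.0, cap 60) and jitter disabled, successive retries would wait 3, 6, 12 and 24 seconds; the 60-second cap would first apply at the sixth retry (96 seconds uncapped).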

Output Truncation Example

// app/Config/CronJob.php
public ?int $maxOutputLength = 2000; // keep logs light

Stored output beyond 2000 characters will end with:

[truncated 5234 -> 2000 chars]
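As a rough sketch of how such truncation could work, the function below cuts the output and appends a marker in the format shown above. The function name truncateOutput is hypothetical, and two details are assumptions: lengths are counted in bytes with strlen(), and the marker is appended on its own line after the retained text.

```php
<?php
/**
 * Hypothetical helper: truncate $output to $maxLength and append a marker.
 * A null $maxLength means "unlimited".
 */
function truncateOutput(string $output, ?int $maxLength): string
{
    $length = strlen($output); // assumption: byte length, not multibyte characters

    if ($maxLength === null || $length <= $maxLength) {
        return $output;
    }

    return substr($output, 0, $maxLength)
        . "\n[truncated {$length} -> {$maxLength} chars]";
}
```

For a 5234-character output and a limit of 2000, the stored text would consist of the first 2000 characters followed by the marker line.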

Timeout Behavior

Timeouts are "soft": the runner measures the job's runtime and, if it exceeds the configured number of seconds, throws a TimeoutException after the execution block has completed. A running job is never interrupted. Hard termination would require running the job in an external process, which is not implemented yet.

$schedule->command('reports:build')->timeout(120); // overrides defaultTimeout

If neither a per-job timeout() nor defaultTimeout is set, no timeout check is applied.
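The soft-timeout behavior can be pictured with the following sketch. It is not the library's code: runWithSoftTimeout and SoftTimeoutException are hypothetical names used here so the example stays self-contained.

```php
<?php
/** Hypothetical stand-in for the library's TimeoutException. */
final class SoftTimeoutException extends RuntimeException
{
}

/**
 * Runs $job to completion, then compares the elapsed time with the limit.
 * The per-job $timeout wins over $defaultTimeout; if both are null,
 * no check is performed.
 */
function runWithSoftTimeout(callable $job, ?int $timeout, ?int $defaultTimeout = null): mixed
{
    $limit = $timeout ?? $defaultTimeout;

    $start   = microtime(true);
    $result  = $job(); // the job is never interrupted
    $elapsed = microtime(true) - $start;

    if ($limit !== null && $elapsed > $limit) {
        throw new SoftTimeoutException(
            sprintf('Job took %.2fs, limit was %ds', $elapsed, $limit)
        );
    }

    return $result;
}
```

Because the check happens only after the job returns, a job that hangs forever would never trigger the exception in this model, which is why a hard timeout would need an external process.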

Disabling Events

public bool $enableEvents = false; // turn off all cronjob.* dispatches

This is useful in high-throughput or test scenarios where you need the absolute minimum overhead.

Enhanced Locking & Single-Run Jobs

Single-run jobs (those whose getRunType() returns single) use a file lock to prevent concurrent execution. The locking mechanism now stores JSON metadata inside each lock file:

{
    "flag": true,
    "time": "2025-09-15 12:34:56",
    "job": "my:example:job",
    "pid": 12345,
    "heartbeat": "2025-09-15T12:34:56+00:00",
    "stolen": true
}

Fields:

  • flag: Internal marker (true while held).

  • time: Acquisition timestamp (server local time).

  • job: Resolved job name that owns the lock.

  • pid: Process ID that created the lock. On Windows it is not used for liveness probing, but it is still recorded if available.

  • heartbeat: ISO-8601 timestamp written at acquisition time (future versions may periodically refresh it to detect hung processes more precisely).

  • stolen (optional): Present and set to true when a new process reclaimed an expired or dead lock.

Reclaim Logic

When a process tries to acquire a lock that is already held, it inspects the existing lock file. The lock is considered reclaimable when ANY of the following is true:

  1. TTL Expired: Current time - file modification time > lockTTL.

  2. Dead PID: On POSIX systems (non-Windows) if the stored PID does not respond to posix_kill($pid, 0).

  3. Heartbeat Stale: (Reserved) If heartbeat age > lockTTL (currently equivalent to TTL since heartbeat is only written once).

If reclaimable, the runner force-acquires the lock, truncates the file, and writes new metadata including stolen: true.
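The first two rules above can be sketched as a standalone check. This is an illustration only, not the library's implementation; the function name isLockReclaimable and the optional $now parameter (added to make the logic easy to test) are assumptions.

```php
<?php
/**
 * Hypothetical check: is the lock file eligible for reclaim?
 * Rule 1: TTL expired (skipped when $lockTTL is null).
 * Rule 2: stored PID is no longer alive (POSIX systems only).
 */
function isLockReclaimable(string $lockFile, ?int $lockTTL, ?int $now = null): bool
{
    $now ??= time();

    clearstatcache(true, $lockFile);
    $mtime = filemtime($lockFile);

    if ($lockTTL !== null && $mtime !== false && ($now - $mtime) > $lockTTL) {
        return true; // TTL expired
    }

    $meta = json_decode((string) file_get_contents($lockFile), true);
    $pid  = is_array($meta) ? ($meta['pid'] ?? null) : null;

    if (
        is_int($pid) && $pid > 0
        && PHP_OS_FAMILY !== 'Windows'
        && function_exists('posix_kill')
    ) {
        // Signal 0 only probes the process. Caveat: this also returns false
        // when the process exists but belongs to another user (EPERM).
        if (! posix_kill($pid, 0)) {
            return true; // dead PID
        }
    }

    return false;
}
```

A lock written by the current process with a 3600-second TTL would not be reclaimable now, but would become reclaimable once more than an hour has passed since the file was last modified.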

Operational Guidance

  • Keep lockTTL comfortably larger than your longest expected execution time to avoid premature steals.

  • Set lockTTL to null to disable automatic stealing (a crashed process may then require manual cleanup).

  • You can introspect locks by viewing the JSON files under lockPath.

Graceful Shutdown

Long-running batches can now exit cleanly on demand.

Two mechanisms set an internal stop flag checked between jobs:

  1. Signals (if enableSignals=true, CLI + ext-pcntl): SIGTERM, SIGINT, SIGQUIT.

  2. Programmatic call: $runner->requestStop(); from user code (tests, admin command, etc.).

When triggered:

  • The current job continues until completion (no hard kill).

  • No further jobs from the schedule are started.

  • Event cronjob.shutdown is fired with payload: when (DateTimeImmutable) and executed (array of job names already run).

Example:

$runner = new \Daycry\CronJob\JobRunner();
$runner->run(); // Press Ctrl+C or send SIGTERM to stop after current job

Or programmatically:

$runner = new \Daycry\CronJob\JobRunner();
// elsewhere, e.g. from a test hook or an admin command
$runner->requestStop();

To disable signal handling completely (no handlers are registered), set the following in app/Config/CronJob.php:

public bool $enableSignals = false;

Future enhancements may include per-job cooperative cancellation checks.
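The stop-flag mechanism can be modeled with a small standalone class. This is a sketch of the general technique (a flag checked between jobs, optionally set by signal handlers), not the JobRunner's real code; the class name StoppableLoop and its run() signature are hypothetical.

```php
<?php
/** Hypothetical model of a runner that stops between jobs. */
final class StoppableLoop
{
    private bool $stopRequested = false;

    /** @var list<string> names of jobs that ran to completion */
    private array $executed = [];

    public function requestStop(): void
    {
        $this->stopRequested = true;
    }

    /**
     * @param array<string, callable> $jobs job name => callable
     * @return list<string> names of the jobs that were executed
     */
    public function run(array $jobs): array
    {
        // Register signal handlers only when ext-pcntl is available.
        if (function_exists('pcntl_async_signals')) {
            pcntl_async_signals(true);
            foreach ([SIGTERM, SIGINT, SIGQUIT] as $signal) {
                pcntl_signal($signal, fn () => $this->requestStop());
            }
        }

        foreach ($jobs as $name => $job) {
            if ($this->stopRequested) {
                break; // no further jobs are started
            }
            $job(); // the current job always runs to completion
            $this->executed[] = $name;
        }

        return $this->executed;
    }
}
```

If the second of three jobs calls requestStop(), the loop finishes that job and returns without starting the third, which matches the behavior described above.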

Event System

When enableEvents is true, the runner dispatches lifecycle events through CodeIgniter’s Events system. Each event receives a single associative array payload.

| Event Name | Fired When | Payload Keys |
| --- | --- | --- |
| cronjob.beforeJob | Before each attempt | job, attempt |
| cronjob.afterJob | After an attempt finishes (success or failure) | job, result, error, attempt, duration |
| cronjob.retryScheduled | A retry will occur after backoff | job, attempt, delay |
| cronjob.failed | Final failure (no retries left) | job, exception, attempts |
| cronjob.skipped | Job filtered out (env/frequency/only filter) | job, reason |
| cronjob.timeout | Timeout exceeded (soft) | job, timeoutSeconds |
| cronjob.metrics.flush | After all jobs executed | metrics, generatedAt |
| cronjob.shutdown | Graceful stop requested (after loop ends) | when, executed |

Listener Examples

use CodeIgniter\Events\Events;

Events::on('cronjob.beforeJob', static function(array $data) {
    log_message('debug', 'Starting job ' . $data['job']->getName() . ' attempt=' . $data['attempt']);
});

Events::on('cronjob.afterJob', static function(array $data) {
    $status = $data['error'] ? 'FAILED' : 'OK';
    log_message('info', sprintf(
        'Finished %s status=%s duration=%.4fs attempt=%d',
        $data['job']->getName(),
        $status,
        $data['duration'] ?? 0,
        $data['attempt']
    ));
});

Events::on('cronjob.retryScheduled', static function(array $data) {
    log_message('warning', 'Retrying ' . $data['job']->getName() . ' in ' . $data['delay'] . 's (attempt ' . $data['attempt'] . ')');
});

Events::on('cronjob.failed', static function(array $data) {
    log_message('error', 'Job failed: ' . $data['job']->getName() . ' attempts=' . $data['attempts'] . ' error=' . $data['exception']->getMessage());
});

Metrics Payload Structure

metrics is an associative array: jobName => [runDurationAttempt1, runDurationAttempt2, ...] (each element is seconds as float). You can aggregate averages, percentiles, or export to an external monitoring system.
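As one possible way to aggregate this payload, the sketch below computes the attempt count, average, maximum, and a nearest-rank 95th percentile per job. The function name summarizeDurations is hypothetical and the choice of statistics is only an example.

```php
<?php
/**
 * Hypothetical helper: summarize a metrics array of the form
 * jobName => [durationAttempt1, durationAttempt2, ...] (seconds as float).
 *
 * @param array<string, list<float>> $metrics
 * @return array<string, array{attempts: int, avg: float, max: float, p95: float}>
 */
function summarizeDurations(array $metrics): array
{
    $summary = [];

    foreach ($metrics as $job => $durations) {
        if ($durations === []) {
            continue; // nothing recorded for this job
        }

        sort($durations);
        $count = count($durations);

        // Nearest-rank percentile: smallest value with at least 95% of samples at or below it.
        $p95Index = (int) ceil(0.95 * $count) - 1;

        $summary[$job] = [
            'attempts' => $count,
            'avg'      => array_sum($durations) / $count,
            'max'      => $durations[$count - 1],
            'p95'      => $durations[$p95Index],
        ];
    }

    return $summary;
}
```

Inside a cronjob.metrics.flush listener, the metrics key of the payload could be passed to a function like this before forwarding the result to a monitoring system.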

Notes

  • Listeners should be fast; offload heavy processing to queues.

  • Exceptions inside listeners are caught and logged (they will not break the scheduler).

  • Disable events (enableEvents=false) if micro‑optimizing throughput or running in a constrained environment.

Metrics Exporting

The scheduler can record per-attempt execution metrics via a pluggable exporter implementing Daycry\CronJob\Interfaces\MetricsExporterInterface.

Interface

interface MetricsExporterInterface
{
    public function recordAttempt(string $jobName, bool $success, float $duration, int $attempt, bool $final): void;
    public function flush(): mixed; // publish / snapshot
}

Built-in In-Memory Exporter

For testing or debugging you can use Daycry\CronJob\Metrics\InMemoryExporter:

$exporter = new \Daycry\CronJob\Metrics\InMemoryExporter();
$runner   = (new \Daycry\CronJob\JobRunner())
    ->withMetricsExporter($exporter);
$runner->run();
$snapshot = $exporter->flush();
// $snapshot structure:
// [
//   'jobName' => [
//       'attempts' => 2,
//       'successes' => 1,
//       'failures' => 1,
//       'total_duration' => 0.1534,
//       'attempts_rows' => [
//           ['success' => false, 'duration' => 0.05, 'attempt' => 1, 'final' => false],
//           ['success' => true,  'duration' => 0.10, 'attempt' => 2, 'final' => true],
//       ],
//   ],
// ]

Each attempt (including failures and retries) is captured with:

  • success: Whether the attempt ended without exception.

  • duration: Seconds (float) taken by the attempt.

  • attempt: 1-based attempt number.

  • final: True if this attempt ends the retry cycle (success or max retries reached).
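A snapshot with the structure shown above lends itself to simple assertions in tests or CI. The following sketch flags jobs whose failure rate exceeds a threshold; the function name jobsExceedingFailureRate is hypothetical, and it operates on a plain array so it does not depend on the exporter class itself.

```php
<?php
/**
 * Hypothetical helper: return jobs whose failures/attempts ratio exceeds $maxRate.
 *
 * @param array<string, array{attempts?: int, failures?: int}> $snapshot
 * @return array<string, float> job name => failure rate
 */
function jobsExceedingFailureRate(array $snapshot, float $maxRate): array
{
    $offenders = [];

    foreach ($snapshot as $job => $stats) {
        $attempts = (int) ($stats['attempts'] ?? 0);
        if ($attempts === 0) {
            continue; // avoid division by zero for jobs that never ran
        }

        $rate = ($stats['failures'] ?? 0) / $attempts;
        if ($rate > $maxRate) {
            $offenders[$job] = $rate;
        }
    }

    return $offenders;
}
```

For the sample snapshot above (2 attempts, 1 failure), a threshold of 0.25 would flag jobName with a rate of 0.5.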

Events vs Exporter

The legacy cronjob.metrics.flush event still delivers an array of raw durations per job. The exporter is richer (captures per-attempt success flags) and is ideal for structured backends.

Prometheus (Optional)

If you install promphp/prometheus_client_php (see composer suggest) you can implement a custom exporter that maps:

  • Counter: total attempts by job & result

  • Counter: failures by job

  • Histogram/Summary: execution duration seconds

Skeleton:

use Daycry\CronJob\Interfaces\MetricsExporterInterface;
use Prometheus\CollectorRegistry;
use Prometheus\Storage\InMemory;

class PrometheusExporter implements MetricsExporterInterface
{
    private CollectorRegistry $registry;
    private $attemptCounter;
    private $durationHistogram;

    public function __construct(CollectorRegistry $registry)
    {
        $this->registry = $registry;
        $this->attemptCounter = $registry->getOrRegisterCounter('cronjob', 'attempts_total', 'Job attempts', ['job','result']);
        $this->durationHistogram = $registry->getOrRegisterHistogram('cronjob', 'duration_seconds', 'Job attempt duration', ['job']);
    }

    public function recordAttempt(string $jobName, bool $success, float $duration, int $attempt, bool $final): void
    {
        $this->attemptCounter->inc([$jobName, $success ? 'success' : 'failure']);
        $this->durationHistogram->observe($duration, [$jobName]);
    }

    public function flush(): mixed
    {
        // Expose metrics endpoint elsewhere (e.g. HTTP controller)
        return null;
    }
}

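Then wire it (this example uses the Prometheus client's in-memory storage; choose a persistent storage adapter for production):

$registry = new \Prometheus\CollectorRegistry(new \Prometheus\Storage\InMemory());
$runner = (new \Daycry\CronJob\JobRunner())
    ->withMetricsExporter(new PrometheusExporter($registry));
$runner->run();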

Choosing an Approach

| Use Case | Recommendation |
| --- | --- |
| Local debugging | InMemoryExporter |
| Unit / CI assertions | InMemoryExporter snapshot |
| Production monitoring | Custom Prometheus / external exporter |
| Minimal overhead | Disable events + lightweight exporter or none |

If no exporter is provided, the system still logs the average duration and fires the metrics event, so existing setups keep working without changes.

Troubleshooting (Locks & Execution)

| Problem | Symptoms | Likely Cause | Solution |
| --- | --- | --- | --- |
| Lock not released | Single-run job never re-executes | Previous process died and lockTTL = null | Remove the file under lockPath or configure a lockTTL. |
| Frequent stolen locks (stolen:true) | Logs show many steals | lockTTL too short for real duration | Increase lockTTL (e.g. 3600 -> 10800). |
| PID present but process gone | Lock persists with dead PID | Abrupt restart / kill -9 | Enable TTL so it can be reclaimed. |
| Unexpected timeouts | TimeoutException thrown | defaultTimeout too low | Raise defaultTimeout or per-job timeout(). |
| Too many retries | Many retryScheduled entries | Very low backoff base with exponential | Increase retryBackoffBase or reduce maxRetries. |
| Slow events | Total run length grows | Heavy listener logic | Move heavy work to queue or disable events. |

Additional tips:

  • Keep lockPath on fast local storage (avoid high-latency network mounts when possible).

  • In ephemeral containers, ensure lockPath is persisted to avoid parallel re-runs after restarts.

  • For debugging, open the JSON file and inspect time, pid, stolen.

Shutdown Listener Example (cronjob.shutdown)

use CodeIgniter\Events\Events;

Events::on('cronjob.shutdown', static function(array $data) {
    log_message(
        'info',
        'Scheduler clean stop at ' . $data['when']->format('c') .
        ' (jobs executed: ' . implode(', ', $data['executed']) . ')'
    );
});

CLI Lock Inspection Modes

Command: php spark cronjob:locks

Flags:

  • --force: Run even if the CronJob system is currently disabled (bypasses status check).

  • --silent: Suppress human-readable table output; only returns data to the caller (useful in tests).

  • --json: Return a structured array (and suppress table) containing:

    • When no locks: ['message' => 'No active locks.', 'locks' => []]

    • When locks exist: ['locks' => [ {job,file,pid,stolen,age_s,heartbeat,time}, ... ], 'count' => N]

Example output (table mode):

+-----+-------------------------------+------+--------+-------+-----------+----------+
| Job | File                          | PID  | Stolen | Age(s)| Heartbeat | Acquired |
+-----+-------------------------------+------+--------+-------+-----------+----------+
| foo | 851abaafd3a69.lock            | 1234 | no     | 2     | 2025-09.. | 2025-09..|
+-----+-------------------------------+------+--------+-------+-----------+----------+

JSON mode (programmatic):

$result = command('cronjob:locks --json --force');
// $result example:
// [
//   'locks' => [
//       [
//           'job' => 'foo',
//           'file' => '851abaafd3a69.lock',
//           'pid' => 1234,
//           'stolen' => 'no',
//           'age_s' => 2,
//           'heartbeat' => '2025-09-22T10:11:12+00:00',
//           'time' => '2025-09-22 10:11:12'
//       ]
//   ],
//   'count' => 1
// ]
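The structured result can be post-processed in ordinary PHP. As a small sketch, the function below lists jobs whose locks are older than a given age; the name staleLocks is hypothetical, and it relies only on the locks, job and age_s keys shown in the example above.

```php
<?php
/**
 * Hypothetical helper: names of jobs whose lock age exceeds $maxAgeSeconds.
 *
 * @param array{locks?: list<array<string, mixed>>} $result result of the --json mode
 * @return list<string>
 */
function staleLocks(array $result, int $maxAgeSeconds): array
{
    $stale = [];

    foreach ($result['locks'] ?? [] as $lock) {
        if (($lock['age_s'] ?? 0) > $maxAgeSeconds) {
            $stale[] = (string) ($lock['job'] ?? '(unknown)');
        }
    }

    return $stale;
}
```

A monitoring script could call this with a threshold slightly below lockTTL to warn about locks that are close to being reclaimed.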