# PDF Spooler v2.0
In the name of Allah, the Most Gracious, the Most Merciful.
## Overview

Node.js Express service with an internal queue for HTML-to-PDF conversion using the Chrome DevTools Protocol.
## Architecture

```text
Client Application
    ↓ POST {html, filename}
Node.js Spooler (port 3030)
    ↓ queue
Internal Queue (max 5 concurrent)
    ↓ process
PDF Generator (Chrome CDP, port 42020)
    ↓ save
data/pdfs/(unknown).pdf
```
## Features

- HTTP API for PDF generation (no file watching)
- Internal queue with max 5 concurrent jobs
- Max 100 jobs in queue
- In-memory job tracking (auto-cleanup after 60 min)
- Chrome crash detection & restart (max 3 attempts)
- Comprehensive logging (info, error, metrics)
- Automated cleanup with dry-run mode
- Admin dashboard for monitoring
- Manual error review required (see `data/error/`)
## API Endpoints

### POST /api/pdf/generate

Generate PDF from HTML content.

Request:

```json
{
  "html": "<html>...</html>",
  "filename": "1234567890.pdf"
}
```

Response (success):

```json
{
  "success": true,
  "jobId": "job_1738603845123_abc123xyz",
  "status": "queued",
  "message": "Job added to queue"
}
```

Response (error):

```json
{
  "success": false,
  "error": "Queue is full, please try again later"
}
```
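As a minimal client sketch, a Node 18+ script (with the global `fetch` API) could submit a job like this. The `SPOOLER_URL` constant and `buildGenerateRequest` helper are illustrative, not part of the service:

```javascript
// Illustrative client for POST /api/pdf/generate.
// Assumes Node 18+ (global fetch); SPOOLER_URL is an assumption, not a documented constant.
const SPOOLER_URL = "http://localhost:3030";

// Validate inputs before sending; the service expects {html, filename}.
function buildGenerateRequest(html, filename) {
  if (!html || !filename || !filename.endsWith(".pdf")) {
    throw new Error("html is required and filename must end in .pdf");
  }
  return { html, filename };
}

async function submitJob(html, filename) {
  const res = await fetch(`${SPOOLER_URL}/api/pdf/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGenerateRequest(html, filename)),
  });
  const data = await res.json();
  if (!data.success) throw new Error(data.error); // e.g. "Queue is full, please try again later"
  return data.jobId;
}
```

The returned `jobId` is what you pass to the status endpoint below.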
### GET /api/pdf/status/:jobId

Check job status.

Response (queued/processing):

```json
{
  "success": true,
  "jobId": "job_1738603845123_abc123xyz",
  "status": "queued|processing",
  "progress": 0|50,
  "pdfUrl": null,
  "error": null
}
```

Response (completed):

```json
{
  "success": true,
  "jobId": "job_1738603845123_abc123xyz",
  "status": "completed",
  "progress": 100,
  "pdfUrl": "/node_spooler/data/pdfs/1234567890.pdf",
  "error": null
}
```

Response (job error):

```json
{
  "success": true,
  "jobId": "job_1738603845123_abc123xyz",
  "status": "error",
  "progress": 0,
  "pdfUrl": null,
  "error": "Chrome timeout"
}
```
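A client typically polls this endpoint until the job reaches a terminal state (`completed` or `error`). A minimal sketch, assuming Node 18+ with global `fetch`; the `SPOOLER_URL` constant and the polling interval/timeout are illustrative choices:

```javascript
// Illustrative poller for GET /api/pdf/status/:jobId.
const SPOOLER_URL = "http://localhost:3030"; // assumption, not a documented constant

// "completed" and "error" are the terminal statuses shown above.
function isTerminal(status) {
  return status === "completed" || status === "error";
}

async function waitForJob(jobId, { intervalMs = 1000, timeoutMs = 60000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const res = await fetch(`${SPOOLER_URL}/api/pdf/status/${jobId}`);
    const job = await res.json();
    if (isTerminal(job.status)) return job; // inspect job.pdfUrl or job.error
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Timed out waiting for ${jobId}`);
}
```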
### GET /api/queue/stats

Queue statistics.

```json
{
  "success": true,
  "queueSize": 12,
  "processing": 3,
  "completed": 45,
  "errors": 2,
  "avgProcessingTime": 0.82,
  "maxQueueSize": 100
}
```
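These fields lend themselves to simple health checks. An illustrative sketch (the 80% threshold and the `queueHealth` helper are assumptions, not part of the service):

```javascript
// Derive health flags from a /api/queue/stats response.
// The 80% near-capacity threshold is an illustrative choice.
function queueHealth(stats) {
  return {
    nearCapacity: stats.queueSize >= stats.maxQueueSize * 0.8,
    // Fraction of finished jobs that ended in error.
    errorRate: stats.errors / Math.max(1, stats.completed + stats.errors),
  };
}
```

With the sample response above (`queueSize: 12`, `errors: 2`, `completed: 45`), the queue is well under capacity and the error rate is roughly 4%.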
## Error Handling

### Chrome Crash Handling

1. Chrome crash detected (CDP connection lost or timeout)
2. Stop processing current jobs
3. Move in-flight jobs back to "queued" status
4. Attempt to restart Chrome (max 3 attempts)
5. Resume processing
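The recovery sequence can be sketched as a small retry loop. This is an illustration of the policy, not the spooler's actual internals; `restartChrome` and `requeueInFlightJobs` are hypothetical hooks:

```javascript
// Sketch of the crash-recovery policy: requeue in-flight jobs,
// then retry the Chrome restart up to maxAttempts times.
// restartChrome and requeueInFlightJobs are hypothetical callbacks.
async function recoverFromCrash(restartChrome, requeueInFlightJobs, maxAttempts = 3) {
  requeueInFlightJobs(); // move current jobs back to "queued"
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await restartChrome();
      return attempt; // restarted; caller resumes processing
    } catch (err) {
      if (attempt === maxAttempts) throw err; // give up after the last attempt
    }
  }
}
```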
### Failed Jobs

- Failed jobs are logged to `data/error/{jobId}.json`
- Never auto-deleted (manual review required)
- Review `logs/errors.log` for details
- The error JSON contains full job details, including the error message
## Cleanup

### Manual Execution

```shell
# Test cleanup (dry-run)
npm run cleanup:dry-run

# Execute cleanup
npm run cleanup
```
### Retention Policy

| Directory | Retention | Action |
|---|---|---|
| `data/pdfs/` | 7 days | Move to archive |
| `data/archive/YYYYMM/` | 45 days | Delete |
| `data/error/` | Manual | Never delete |
| `logs/` | 30 days | Delete (compress after 7 days) |
### Cleanup Tasks

- Archive PDFs older than 7 days to `data/archive/YYYYMM/`
- Delete archived PDFs older than 45 days
- Compress log files older than 7 days
- Delete log files older than 30 days
- Check disk space (alert if > 80%)
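The retention rules above can be expressed as a small decision helper. This is an illustrative sketch, not the actual `cleanup.js` logic; the directory keys and return values are assumptions:

```javascript
// Map (directory, file age in days) to the cleanup action implied by
// the retention policy table. Illustrative sketch only.
function retentionAction(dir, ageDays) {
  if (dir === "data/pdfs") return ageDays > 7 ? "archive" : "keep";
  if (dir === "data/archive") return ageDays > 45 ? "delete" : "keep";
  if (dir === "data/error") return "keep"; // manual review only, never auto-deleted
  if (dir === "logs") {
    if (ageDays > 30) return "delete";
    return ageDays > 7 ? "compress" : "keep";
  }
  return "keep"; // unknown directories are left alone
}
```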
## Monitoring

### Admin Dashboard

Open `admin.html` in a browser for:

- Real-time queue statistics
- Processing metrics
- Error file list
- Disk space visualization

URL: http://localhost:3030/admin.html
### Key Metrics
- Average PDF time: < 2 seconds
- Success rate: > 95%
- Queue size: < 100 jobs
- Disk usage: < 80%
### Log Files

- `logs/spooler.log` - All API events (info, warn, error)
- `logs/errors.log` - PDF generation errors only
- `logs/metrics.log` - Performance stats (per job)
- `logs/cleanup.log` - Cleanup execution logs
## Troubleshooting

### Spooler Not Starting

- Check if Chrome is running on port 42020
- Check logs: `logs/spooler.log`
- Verify directories exist: `data/pdfs`, `data/archive`, `data/error`, `logs`
- Check Node.js version: `node --version` (need 14+)
- Verify dependencies are installed: `npm install`
Start Chrome manually:

```shell
"C:/Program Files/Google/Chrome/Application/chrome.exe" --headless --disable-gpu --remote-debugging-port=42020
```
### PDF Not Generated

- Check job status via API: `GET /api/pdf/status/{jobId}`
- Review error logs: `logs/errors.log`
- Verify the Chrome connection: check logs for CDP connection errors
- Check the HTML content: ensure it is valid HTML
### Queue Full

- Wait for current jobs to complete
- Check the admin dashboard for queue size
- Increase `maxQueueSize` in `spooler.js` (default: 100)
- Check if jobs are stuck (processing too long)
### Chrome Crashes Repeatedly

- Check system RAM (a minimum of 2 GB must be available)
- Reduce `maxConcurrent` in `spooler.js` (default: 5)
- Check for memory leaks in Chrome
- Restart Chrome manually and monitor
- Check system resources: Task Manager > Performance
### High Disk Usage

- Run cleanup: `npm run cleanup`
- Check `data/archive/` for old folders
- Check `logs/` for old logs
- Check `data/pdfs/` for large files
- Consider reducing the PDF retention time in `cleanup-config.json`
## Deployment

### Quick Start

```shell
# 1. Create directories
cd node_spooler
mkdir -p logs data/pdfs data/archive data/error

# 2. Install dependencies
npm install

# 3. Start Chrome (if not running)
"C:/Program Files/Google/Chrome/Application/chrome.exe" --headless --disable-gpu --remote-debugging-port=42020

# 4. Start spooler
npm start

# 5. Test API
curl -X POST http://localhost:3030/api/pdf/generate \
  -H "Content-Type: application/json" \
  -d "{\"html\":\"<html><body>Test</body></html>\",\"filename\":\"test.pdf\"}"

# 6. Open admin dashboard
# http://localhost:3030/admin.html
```
### Production Setup

1. Create a batch file wrapper (`spooler-start.bat`):

```bat
@echo off
cd /d D:\data\www\gdc_cmod\node_spooler
C:\node\node.exe spooler.js
```

2. Create a Windows service (note: `sc` expects a service-aware executable, so a plain batch file typically needs a service wrapper such as NSSM to run this way):

```shell
sc create PDFSpooler binPath= "D:\data\www\gdc_cmod\node_spooler\spooler-start.bat" start= auto
sc start PDFSpooler
```

3. Create scheduled tasks for cleanup:

```shell
schtasks /create /tn "PDF Cleanup Daily" /tr "C:\node\node.exe D:\data\www\gdc_cmod\node_spooler\cleanup.js" /sc daily /st 01:00
schtasks /create /tn "PDF Cleanup Weekly" /tr "C:\node\node.exe D:\data\www\gdc_cmod\node_spooler\cleanup.js weekly" /sc weekly /d MON /st 01:00
```
## Version History

- **2.0.0** (2025-02-03): Migrated from file watching to an HTTP API queue
  - Removed file watching (chokidar)
  - Added Express HTTP API
  - Internal queue with max 5 concurrent jobs
  - Max 100 jobs in queue
  - Job auto-cleanup after 60 minutes
  - Enhanced error handling with Chrome restart
  - Admin dashboard for monitoring
  - Automated cleanup system
## License
Internal use only.