Chapterize

Production

Data & Intelligence

A document processing engine that converts static PDFs into structured, web-ready HTML chapters. Chapterize uses a rules-based engine with strict regex patterns and DOM analysis to detect logical breaks, clean content, and make legacy documentation accessible, searchable, and mobile-friendly.
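As a rough illustration of rules-based chapter detection with a strict regex pattern, the sketch below splits extracted text at heading lines. The pattern, the `Chapter` type, and `split_chapters` are hypothetical names chosen for this example; they are not taken from Chapterize's actual rule set, and the DOM analysis step is not covered.

```python
import re
from dataclasses import dataclass

# Hypothetical rule: a line such as "Chapter 3: Installation" or "CHAPTER 12".
# Chapterize's real patterns are not documented here.
CHAPTER_HEADING = re.compile(r"^chapter\s+(\d+)\s*[:.\-]?\s*(.*)$", re.IGNORECASE)


@dataclass
class Chapter:
    number: int
    title: str
    body: str


def split_chapters(text: str) -> list[Chapter]:
    """Split plain text into chapters at lines matching CHAPTER_HEADING."""
    chapters: list[Chapter] = []
    current = None
    lines: list[str] = []
    for line in text.splitlines():
        match = CHAPTER_HEADING.match(line.strip())
        if match:
            # A new heading closes the chapter collected so far.
            if current is not None:
                current.body = "\n".join(lines).strip()
                chapters.append(current)
            current = Chapter(int(match.group(1)), match.group(2).strip(), "")
            lines = []
        elif current is not None:
            lines.append(line)
        # Text before the first heading (front matter) is ignored in this sketch.
    if current is not None:
        current.body = "\n".join(lines).strip()
        chapters.append(current)
    return chapters


sample = "Preface\nChapter 1: Setup\nInstall it.\n\nChapter 2: Usage\nRun it."
chapters = split_chapters(sample)
```

A strict pattern anchored to the whole line keeps false positives down, but a prose line that happens to start with "Chapter 5" would still match; a production rule set would likely combine several such signals.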

Key Features

Rules-Based Chapter & TOC Detection with Regex Patterns
Rich HTML & Text Export with Image Retention
Regex-Driven Content Cleaning & Transformation
Geometric Feature-Based Table Detection & Scoring
Accessible, Distraction-free Reading Mode
CLI for Batch Processing & Automation
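To make the regex-driven cleaning feature more concrete, here is a minimal sketch of an ordered rule list applied to text extracted from a PDF. The specific rules and the `clean_text` helper are assumptions for illustration, not Chapterize's shipped configuration.

```python
import re

# Hypothetical cleaning rules, applied in order as (pattern, replacement).
CLEANING_RULES = [
    # Standalone page footers such as "Page 3" or "Page 3 of 10".
    (re.compile(r"^[ \t]*Page \d+(?: of \d+)?[ \t]*$", re.MULTILINE), ""),
    # Words hyphenated across a line break.
    (re.compile(r"(\w)-\n(\w)"), r"\1\2"),
    # Runs of spaces and tabs.
    (re.compile(r"[ \t]+"), " "),
    # Three or more consecutive newlines collapse to one blank line.
    (re.compile(r"\n{3,}"), "\n\n"),
]


def clean_text(text: str) -> str:
    """Apply each cleaning rule in sequence and trim the result."""
    for pattern, replacement in CLEANING_RULES:
        text = pattern.sub(replacement, text)
    return text.strip()


raw = "Legacy docu-\nmentation   is hard\n\nPage 3 of 10\n\n\nto search."
cleaned = clean_text(raw)
```

Rule order matters in this sketch: footers are removed first so that the blank lines they leave behind are collapsed by the final rule.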

API Endpoints

Method	Path	Description
POST	`/api/upload`	Upload a PDF document for processing
GET	`/api/books/{id}/chapters`	Retrieve structured chapters for a book
POST	`/api/export/html`	Export a book as a navigable HTML package
GET	`/api/jobs/{id}`	Check the status of a document processing job

Usage Example

python

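import requests

# Upload a PDF for processing. The multipart field name "file" and the
# local path "manual.pdf" are assumptions; adjust them to your deployment.
with open("manual.pdf", "rb") as pdf:
    response = requests.post(
        url="https://api.arcore.internal/api/upload",
        headers={"Authorization": "Bearer <token>"},
        files={"file": pdf},
        timeout=30,
    )
response.raise_for_status()
print(response.json())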
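Processing runs as a job, so a client typically checks `GET /api/jobs/{id}` after uploading. The sketch below shows one way to poll that endpoint using only the standard library. The response shape (a `status` field with the values `completed` and `failed`) and the helper names are assumptions, since the job schema is not documented here.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://api.arcore.internal"  # host from the upload example


def fetch_job(job_id: str, token: str) -> dict:
    """GET /api/jobs/{id} and decode the JSON body."""
    request = urllib.request.Request(
        f"{BASE_URL}/api/jobs/{job_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_job(fetch: Callable[[], dict], interval: float = 2.0,
                 max_attempts: int = 30) -> dict:
    """Poll until the job reports a terminal status (field names assumed)."""
    for _ in range(max_attempts):
        job = fetch()
        if job.get("status") in {"completed", "failed"}:  # hypothetical values
            return job
        time.sleep(interval)
    raise TimeoutError("job did not finish in time")


# Offline demonstration with canned responses instead of a live server.
canned = iter([{"status": "processing"}, {"status": "completed"}])
result = wait_for_job(lambda: next(canned), interval=0)
```

Against a real deployment, pass `lambda: fetch_job(job_id, token)` as `fetch`. Capping the number of attempts keeps a stuck job from blocking the caller indefinitely.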

Tech Stack

Python
Django
Next.js/React
Celery
Redis
PostgreSQL
PDFMiner

Authentication

• **Header:** `Authorization: Bearer <token>`
• **Scopes:** RBAC is enforced at the object level via `ArcoreCodex` policies.
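A small helper can keep the bearer header in one place. The sketch below builds a standard-library request with the `Authorization` header described above; the helper name `authorized_request` and the book ID `42` are hypothetical.

```python
import urllib.request


def authorized_request(url: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build a request carrying the bearer token in the Authorization header."""
    if not token:
        raise ValueError("a bearer token is required")
    return urllib.request.Request(
        url,
        method=method,
        headers={"Authorization": f"Bearer {token}"},
    )


# Hypothetical book ID; nothing is sent until the request is opened.
request = authorized_request(
    "https://api.arcore.internal/api/books/42/chapters", token="example-token"
)
```

Because RBAC is enforced per object, a token that works for one book may still be rejected for another; how the API reports that denial is not specified here.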

Compliance & Security

Compliance

✓ Data: Ephemeral storage for processing
✓ Access: Upload limits and throttling

Security

✓ Access: Upload limits and throttling

Coming Soon

LMS Package Generation (SCORM/xAPI)
Target: Q2 2025

Related Products

Production

Arcore Maestro

Arcore Maestro is a hybrid, agent-based orchestration conductor for AI and data workflows. It intelligently routes tasks to efficient local LLMs or secure, sandboxed worker tools, reserving large external models for planning, creative generation, and self-healing analysis. It serves as the central nervous system for autonomous agents within the Arcore ecosystem.

Data & Intelligence

Production

local-llm-server

A robust, production-ready API for managing and serving local language models with comprehensive performance monitoring. It provides an OpenAI-compatible API layer over local inference engines (llama.cpp, etc.), enabling secure, air-gapped AI capabilities for the enterprise.

Data & Intelligence

Production

Arcore Career

A career knowledge graph system that treats individual career data (skills, achievements, roles) as a queryable database. Arcore Career uses rules-based parsing to extract job details, generates resumes dynamically, and supports structured career planning.

Data & Intelligence