Chapterize

Production · Data & Intelligence

An intelligent document processing engine that converts static PDFs into structured, web-ready HTML chapters. Chapterize uses AI to detect logical breaks, clean content, and make legacy documentation accessible, searchable, and mobile-friendly.

Key Features

  • AI-driven Chapter & TOC Detection
  • Rich HTML & Text Export with Image Retention
  • LMS-ready Packages (SCORM/xAPI Compatible)
  • Regex-Driven Content Cleaning & Transformation
  • Accessible, Distraction-free Reading Mode
  • CLI for Batch Processing & Automation
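The regex-driven cleaning listed above can be pictured as an ordered list of pattern/replacement rules applied to extracted text. The following is a minimal sketch under that assumption; the rule set, `CLEANING_RULES`, and `clean_text` are hypothetical names for illustration and do not reflect Chapterize's actual configuration.

```python
import re

# Hypothetical cleaning rules, applied in order as (pattern, replacement) pairs.
# They illustrate the idea only and are not Chapterize's real rule set.
CLEANING_RULES = [
    # Rejoin words hyphenated across a line break: "docu-\nment" -> "document"
    (re.compile(r"(\w)-\n(\w)"), r"\1\2"),
    # Drop lines that hold only a page number, e.g. "12" or "Page 12"
    (re.compile(r"^[ \t]*(?:Page[ \t]+)?\d+[ \t]*$\n?", re.MULTILINE), ""),
    # Collapse runs of spaces and tabs into a single space
    (re.compile(r"[ \t]+"), " "),
]


def clean_text(text: str) -> str:
    """Apply each cleaning rule in sequence, then trim the result."""
    for pattern, replacement in CLEANING_RULES:
        text = pattern.sub(replacement, text)
    return text.strip()


sample = "Legacy docu-\nmentation   often\nPage 12\ncontains  noise."
print(clean_text(sample))
```

Keeping the rules in a plain list makes the order explicit, which matters here: hyphenated words are rejoined before whitespace is collapsed.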

API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/upload` | Upload a PDF document for processing |
| GET | `/api/books/{id}/chapters` | Retrieve structured chapters for a book |
| POST | `/api/export/html` | Export a book as a navigable HTML package |
| GET | `/api/jobs/{id}` | Check the status of a document processing job |
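Because processing is tracked through a job endpoint, a client will typically poll `/api/jobs/{id}` until the job finishes and only then request the chapters. The sketch below shows one way to do that. The response schema is not documented here, so the `status` field, its `completed` and `failed` values, and the helper names (`job_url`, `chapters_url`, `wait_for_job`) are all assumptions made for illustration.

```python
import time

# Host taken from the usage example in this document.
BASE_URL = "https://api.arcore.internal"


def job_url(job_id: str) -> str:
    """Build the job status URL for a given job id."""
    return f"{BASE_URL}/api/jobs/{job_id}"


def chapters_url(book_id: str) -> str:
    """Build the chapters URL for a given book id."""
    return f"{BASE_URL}/api/books/{book_id}/chapters"


def wait_for_job(fetch_json, job_id, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll the job endpoint until it reports a terminal status.

    `fetch_json` is any callable that takes a URL and returns parsed JSON.
    The "status" key and its values are assumed, not documented.
    """
    for _ in range(max_attempts):
        job = fetch_json(job_url(job_id))
        if job.get("status") in ("completed", "failed"):
            return job
        sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")


# Demo with a stubbed fetcher so the sketch runs without network access.
_responses = iter([{"status": "processing"}, {"status": "completed"}])
result = wait_for_job(lambda url: next(_responses), "job-1", sleep=lambda s: None)
print(result["status"])
```

In real use, `fetch_json` could wrap an authenticated HTTP GET that returns the decoded JSON body; injecting it keeps the polling logic easy to test.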

Usage Example

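```python
import requests

# Example interaction: upload a PDF for processing.
# The file name and the multipart field name "file" are assumptions.
with open("document.pdf", "rb") as pdf:
    response = requests.post(
        url="https://api.arcore.internal/api/upload",
        headers={"Authorization": "Bearer <token>"},
        files={"file": pdf},
        timeout=30,
    )
response.raise_for_status()
print(response.json())
```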

Tech Stack

Python, Django, React, Celery, Redis, PostgreSQL, PDFMiner

Authentication

  • **Header:** `Authorization: Bearer <token>`
  • **Scopes:** RBAC is enforced at the object level via `ArcoreCodex` policies.

Compliance & Security

Compliance

  • Data: Ephemeral storage for processing

Security

  • Access: Upload limits and throttling
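
Since uploads are throttled, clients may want to retry with a growing delay rather than fail on the first rejected request. The sketch below assumes the service signals throttling with HTTP 429 and may send a `Retry-After` header in seconds; neither is confirmed by this page, and `backoff_delay` and `send_with_retry` are hypothetical helper names.

```python
import time


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before the next try (attempt is 0-based).

    Honors a numeric Retry-After value when present, otherwise uses
    exponential backoff capped at `cap`.
    """
    if retry_after is not None and str(retry_after).isdigit():
        return min(float(retry_after), cap)
    return min(cap, base * (2 ** attempt))


def send_with_retry(send, max_retries=5, sleep=time.sleep):
    """Call `send()` until it stops returning 429 or retries run out.

    `send` is any callable returning (status_code, headers).
    Treating 429 as the throttling signal is an assumption.
    """
    for attempt in range(max_retries + 1):
        status, headers = send()
        if status != 429:
            return status
        if attempt < max_retries:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    raise RuntimeError("request still throttled after retries")


# Demo with stubbed responses so the sketch runs without network access.
_replies = iter([(429, {}), (429, {"Retry-After": "3"}), (200, {})])
waits = []
final_status = send_with_retry(lambda: next(_replies), sleep=waits.append)
print(final_status, waits)
```

Capping the delay keeps a long outage from producing unbounded waits, and honoring `Retry-After` lets the server dictate the pace when it provides one.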

Related Services