ArcoreConduit
A robust, scalable data ingestion and ETL (Extract, Transform, Load) platform that serves as the central "conduit" for the Arcore ecosystem. ArcoreConduit extracts data from diverse external sources (REST APIs, websites, RSS feeds), stores raw data, and transforms it into normalized, structured formats for downstream consumption. It combines traditional ETL capabilities with modern AI/LLM processing for intelligent data transformation.
Key Features
- Multi-Source Data Extraction (REST API, Web Scraping, JavaScript Rendering, RSS/Atom)
- AI-Powered Data Transformation (Gemini, GPT, Claude)
- Automated Scheduling with Celery Beat (Cron-based)
- Health Monitoring & Change Detection
- Pre-configured Template Library (50+ data sources)
- Real-Time Task Queue Processing
- Dynamic Schema Support & Parsing Rules Engine
- Alert System (Email, Slack, Discord, Webhooks)
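The feature list does not say how change detection is implemented. One common approach, shown here as a minimal sketch rather than ArcoreConduit's actual method, is to fingerprint each extracted payload and compare it with the fingerprint from the previous run. The names `content_fingerprint` and `has_changed` are hypothetical.

```python
import hashlib
import json
from typing import Optional


def content_fingerprint(payload: dict) -> str:
    """Return a stable SHA-256 hex digest for a JSON-serializable payload."""
    # sort_keys and fixed separators make the digest independent of key order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: Optional[str], payload: dict) -> bool:
    """True when no previous fingerprint exists or the payload differs from it."""
    return previous_fingerprint != content_fingerprint(payload)
```

Storing only the digest alongside each raw record keeps the comparison cheap, at the cost of not knowing which field changed.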
API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | `/api/datasources/` | List all data sources |
| POST | `/api/datasources/` | Create new data source |
| POST | `/api/datasources/{id}/run_now/` | Trigger immediate extraction |
| GET | `/api/datasources/active/` | List active sources only |
| GET | `/api/normalized/crypto-prices/` | Cryptocurrency price data |
| GET | `/api/normalized/stock-quotes/` | Stock market data |
| GET | `/api/normalized/llm-outputs/` | LLM processing results |
| GET | `/api/monitoring/job-logs/` | Task execution history |
| GET | `/api/monitoring/health-metrics/` | Aggregated health metrics |
| GET | `/api/discovery/templates/` | Data source templates |
| POST | `/api/discovery/test-template/{id}/` | Test template connectivity |
| GET | `/api/health/` | System health check endpoint |
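Paths in the table that contain `{id}` need a concrete identifier before they can be called. The sketch below fills in such placeholders and joins the result with a base URL. The helper name `endpoint_url` is hypothetical, and the default host is the one used in this document's usage example; adjust it for your deployment.

```python
from urllib.parse import urljoin

# Assumption: the host from the usage example in this document.
BASE_URL = "https://api.arcore.internal"


def endpoint_url(path_template: str, base_url: str = BASE_URL, **params) -> str:
    """Fill {placeholders} in a path from the table and join it with the base URL."""
    # str.format raises KeyError if a placeholder such as {id} is not supplied.
    path = path_template.format(**params)
    return urljoin(base_url, path)


print(endpoint_url("/api/datasources/{id}/run_now/", id=123))
print(endpoint_url("/api/health/"))
```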
Usage Example
```python
import requests

# Create a data source that polls CoinGecko every 15 minutes
response = requests.post(
    url="https://api.arcore.internal/api/datasources/",
    headers={"Authorization": "Bearer <token>"},  # replace <token> with a valid token
    json={
        "name": "CoinGecko BTC Price",
        "source_type": "API",
        "endpoint": "https://api.coingecko.com/api/v3/simple/price?ids=bitcoin&vs_currencies=usd",
        "parser_function": "parsers.crypto_parser",
        "schedule_cron": "*/15 * * * *",
    },
    timeout=30,
)
response.raise_for_status()  # fail loudly on 4xx/5xx instead of printing an error body
print(response.json())

# Trigger an immediate extraction for data source 123
response = requests.post(
    url="https://api.arcore.internal/api/datasources/123/run_now/",
    headers={"Authorization": "Bearer <token>"},
    timeout=30,
)
response.raise_for_status()
print(response.json())
```
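The `schedule_cron` value in the example, `*/15 * * * *`, is a standard five-field cron expression whose minute field `*/15` matches minutes 0, 15, 30, and 45 of every hour. The sketch below expands a minute field to make that concrete. It is an illustration only, `expand_minute_field` is a hypothetical helper, and it is not how Celery Beat parses schedules.

```python
from typing import List


def expand_minute_field(field: str) -> List[int]:
    """Expand a cron minute field ('*', '*/n', 'a-b', 'a-b/n', 'a', comma lists)."""
    minutes = set()
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = 0, 59
        elif "-" in base:
            low, high = base.split("-")
            start, end = int(low), int(high)
        else:
            start = end = int(base)
        minutes.update(range(start, end + 1, step))
    return sorted(minutes)


print(expand_minute_field("*/15"))  # [0, 15, 30, 45]
```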
Authentication
- **Header:** `Authorization: Bearer <token>`
- **Scopes:** RBAC is enforced at the object level via `ArcoreCodex` policies.
- **Infisical Integration:** External API keys are stored in the Infisical vault.
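To keep the bearer token out of source code, a client can read it from the environment and build the header described above. This is a minimal sketch: the variable name `ARCORE_API_TOKEN` and the helper `auth_headers` are assumptions, and this document does not specify how tokens are issued.

```python
import os


def auth_headers(env_var: str = "ARCORE_API_TOKEN") -> dict:
    """Build the Authorization header from a token stored in the environment."""
    # ARCORE_API_TOKEN is a hypothetical variable name, not one defined by the platform.
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to a valid token")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the `headers` argument of `requests.post` in place of the hard-coded placeholder.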
Compliance & Security
Compliance
- ✓ Authentication: Django session-based with CSRF protection
- ✓ Secrets Management: Infisical vault integration for API credentials
- ✓ Data Integrity: Raw data preserved in JSONB before transformation (no data loss)
- ✓ Health Monitoring: Automated health checks with alert rules
- ✓ AI Guardrails: LLM processing with retry logic and token tracking
- ✓ Audit Logging: Complete task execution history with Celery task IDs
Security
- ✓ Encryption: TLS 1.3 in transit, AES-256 at rest
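The "AI Guardrails" item above mentions retry logic and token tracking without further detail. The sketch below shows one way the two could fit together, using exponential backoff. Everything in it is an assumption: `LLMCallStats`, `call_with_retry`, and the convention that `llm_call` returns a `(text, tokens_used)` pair are hypothetical and do not describe any provider's SDK.

```python
import time
from dataclasses import dataclass


@dataclass
class LLMCallStats:
    attempts: int = 0      # how many times the call was tried
    total_tokens: int = 0  # tokens reported by the successful call


def call_with_retry(llm_call, prompt, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call llm_call(prompt) -> (text, tokens_used), retrying with exponential backoff."""
    stats = LLMCallStats()
    last_error = None
    for attempt in range(1, max_attempts + 1):
        stats.attempts = attempt
        try:
            text, tokens_used = llm_call(prompt)
        except Exception as exc:  # a real client would catch only transient errors
            last_error = exc
            if attempt < max_attempts:
                sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
            continue
        stats.total_tokens += tokens_used
        return text, stats
    raise RuntimeError(f"LLM call failed after {max_attempts} attempts") from last_error
```

Passing `sleep` as a parameter lets tests replace the real delay with a recorder, and the returned stats can be written to the job log next to the Celery task ID.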
Related Products
Arcore Maestro
Arcore Maestro is a hybrid, agent-based orchestration conductor for AI and data workflows. It intelligently routes tasks to efficient local LLMs or secure, sandboxed worker tools, reserving large external models for planning, creative generation, and self-healing analysis. It serves as the central nervous system for autonomous agents within the Arcore ecosystem.
local-llm-server
A robust, production-ready API for managing and serving local language models with comprehensive performance monitoring. It provides an OpenAI-compatible API layer over local inference engines (llama.cpp, etc.), enabling secure, air-gapped AI capabilities for the enterprise.
Chapterize
A document processing engine that converts static PDFs into structured, web-ready HTML chapters. Chapterize uses a rules-based engine with strict regex patterns and DOM analysis to detect logical breaks, clean content, and make legacy documentation accessible, searchable, and mobile-friendly.