
System Overview – R2 Mechanics

A detailed technical overview describing the architecture, objectives, workflows, and downloadable documentation of the R2 Mechanics offline transcription system.

Additional Details

R2 Mechanics is designed with strict data protection principles in mind. The entire processing workflow operates fully offline, ensuring that no sensitive audio materials are uploaded to any cloud service. All steps are audit-ready and can be transparently documented for institutional review, meeting GDPR and other privacy requirements.

This documentation describes the full system architecture and methodology in detail to establish transparency and priority protection. While the operational code remains private to ensure data security, all processing stages are verifiably implemented and available for review or institutional pilot projects.

R2 Mechanics follows a modular, methodical approach that ensures each stage of transcription, structuring, annotation, and visualization is transparent and logically separated. The system is intended to provide clear provenance, documentary-quality outputs, and a rigorous framework for long-term archival preservation.

R2 Mechanics is particularly suitable for oral history projects, cultural heritage archives, academic research collections, journalistic investigations, and any institutional context where data privacy and transparent documentation are essential. The system also supports public administration and parliamentary archives needing GDPR-compliant, locally controlled workflows.

Outputs are generated as clean, human-readable HTML files that require no proprietary software to view, ensuring accessibility decades into the future. This format supports sustainable digital preservation strategies and is suitable for static site hosting, institutional repositories, or offline archival storage.

R2 Mechanics supports fully offline workflows and is available under NDA upon request for sensitive or unpublished materials.

Project Resources & Downloads

R2 M – Technical Whitepaper v1.0 – 2025 (EN, PDF)

R2 M – Technisches Whitepaper v1.0 – 2025  (DE, PDF)

All downloads and documentation are intended for transparency and auditability. No personal data is collected or processed via this site.

R2-Mechanics Architecture


R2 Mechanics is a modular, fully offline system for structured, privacy-compliant transcription, annotation, and interactive visualization of audio recordings. It is designed for research institutions, archives, and cultural organizations that require transparent, audit-ready, and locally controlled processing.


Objectives and Principles

The system was developed to support fully local, license-compliant, and audit-ready processing of sensitive or historical audio content. Goals include 100% offline operation without cloud dependencies, full auditability of all processing steps, and human-readable, long-term archivable HTML outputs.


Use Cases and Target Groups

Typical use cases include oral history and witness interviews, cultural heritage preservation in archives and museums, university research collections, media documentation, investigative journalism, public administration, and any institution with high data protection requirements.


Example Live Demos

Explore structured, audit-ready HTML outputs:
JFK Moon Speech (1962) – Single-speaker archival transcription with chapter headings.
Apollo 11 Press Conference – Multi-speaker technical documentation with structured chapters.
UAP Congressional Hearing (2024) – Highly segmented, multi-hour hearing with speaker separation and AI-generated notes.


Contact & Cooperation

For pilot projects or cooperation inquiries:
David Thiry
📧 office@r2-mechanics.com
🌐 Project Website

Technical Architecture


R2 Mechanics is designed as a fully offline, modular pipeline that runs locally without any cloud dependencies. Each stage is carefully separated for transparency and auditability.


Input Formats

The system accepts standard audio files such as MP3 or WAV, with optional metadata for speaker lists, timecodes, or annotations. This flexible approach ensures compatibility with archival materials, oral history recordings, and institutional collections.
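As an illustration of how such inputs might be collected, the following minimal sketch scans a folder for MP3 and WAV files and pairs each one with an optional metadata file. The sidecar convention (a JSON file with the same base name) and the function name find_inputs are assumptions made for this example; the actual R2 Mechanics metadata format is not published here.

```python
from pathlib import Path

# Audio formats named in the text above; matched case-insensitively.
AUDIO_SUFFIXES = {".mp3", ".wav"}


def find_inputs(folder):
    """Return (audio_path, metadata_path_or_None) pairs, sorted by file name.

    Assumption: optional metadata lives in a JSON file that shares the
    audio file's base name, e.g. interview.wav + interview.json.
    """
    pairs = []
    for audio in sorted(Path(folder).iterdir()):
        if audio.suffix.lower() not in AUDIO_SUFFIXES:
            continue  # skip anything that is not MP3 or WAV
        sidecar = audio.with_suffix(".json")
        pairs.append((audio, sidecar if sidecar.exists() else None))
    return pairs
```

Calling find_inputs("input_audio") would then yield one pair per recording, with None in the second position wherever no metadata file was supplied.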


Processing Steps

The pipeline includes GPU-accelerated transcription using WhisperX with built-in speaker diarization for GDPR-compliant local processing. Structuring is achieved through the generation or parsing of chapter files that include titles, timestamps, and descriptions. An LLM-assisted analysis step produces detailed chapter summaries and precise semantic notes, fully aligned with timecodes. Visual enrichment is handled via local Stable Diffusion generation, creating chapter-based illustrations for better contextual understanding.
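The structuring step can be sketched in a few lines. The example below parses a chapter file and looks up which chapter a given timecode belongs to, which is the basic operation needed to align summaries and notes with chapters. The line format "HH:MM:SS | Title | Description" is an assumption chosen for illustration, as are all names in the code; the transcription, LLM, and image-generation stages are deliberately left out.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Chapter:
    start: float      # seconds from the beginning of the recording
    title: str
    description: str


def parse_timestamp(text):
    """Convert 'HH:MM:SS' or 'MM:SS' into seconds."""
    seconds = 0.0
    for part in text.strip().split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_chapters(text):
    """Parse lines of the assumed form 'HH:MM:SS | Title | Description'.

    Blank lines and lines starting with '#' are ignored. Every other line
    must contain all three fields.
    """
    chapters = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        stamp, title, description = (field.strip() for field in line.split("|", 2))
        chapters.append(Chapter(parse_timestamp(stamp), title, description))
    return sorted(chapters, key=lambda chapter: chapter.start)


def chapter_for(chapters, second):
    """Return the chapter containing the given time, or None before the first one."""
    starts = [chapter.start for chapter in chapters]
    index = bisect_right(starts, second) - 1
    return chapters[index] if index >= 0 else None
```

With the chapters sorted by start time, a transcript segment that begins at second 45 is assigned to whichever chapter starts last at or before that point.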


Processing Logs & Offline Audit Trail

R2 Mechanics includes an integrated offline protocol to document all processing steps with time-stamped logs for each transcription project. This ensures full local traceability and supports archival standards for audit-ready documentation.
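One possible shape for such a log is an append-only text file with one JSON record per processing step. The sketch below is an assumption about how this could look, not the system's actual protocol: the field names, the JSON Lines layout, and the use of a SHA-256 checksum to tie each entry to its input file are all illustrative choices.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Hash a file in blocks so large audio files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def log_step(log_path, step, input_file, **details):
    """Append one time-stamped record to a local JSON Lines log and return it."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # UTC, ISO 8601
        "step": step,
        "input": Path(input_file).name,
        "sha256": sha256_of(input_file),
        "details": details,  # free-form, hypothetical extra fields
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry
```

Because the file is only ever appended to and each record carries a checksum of the input, a reviewer can later confirm which file a step ran on without any network access.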


Output Format

The result is a fully interactive HTML transcript featuring an embedded audio player with jump links, an auto-generated table of contents, clear chapter headings with optional SD-generated images, time-aligned summaries and notes, and precise speaker segmentation. It can also be exported as a static website for long-term hosting or archiving, including GitHub Pages compatibility.
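To make the structure of such a page concrete, here is a minimal sketch that renders a static HTML file with an audio player, a table of contents, and one jump button per chapter. It uses only the Python standard library. The function name, the tuple layout of the chapter data, and the markup itself are assumptions for illustration and do not reproduce the actual R2 Mechanics templates.

```python
from html import escape

# Inline script: each button seeks the shared audio player to its data-seek value.
SEEK_SCRIPT = (
    "document.querySelectorAll('button[data-seek]').forEach(function (button) {"
    " button.addEventListener('click', function () {"
    " var player = document.getElementById('player');"
    " player.currentTime = Number(button.dataset.seek);"
    " player.play();"
    " });"
    "});"
)


def render_transcript(title, audio_src, chapters):
    """Build one static HTML page as a string.

    chapters is a list of (start_seconds, heading, text) tuples.
    """
    toc_items, sections = [], []
    for number, (start, heading, text) in enumerate(chapters, start=1):
        anchor = f"chapter-{number}"
        seconds = int(start)  # whole seconds are enough for a jump link
        toc_items.append(f'<li><a href="#{anchor}">{escape(heading)}</a></li>')
        sections.append(
            f'<section id="{anchor}">\n'
            f"<h2>{escape(heading)}</h2>\n"
            f'<button type="button" data-seek="{seconds}">Play from {seconds} s</button>\n'
            f"<p>{escape(text)}</p>\n"
            "</section>"
        )
    return "\n".join([
        "<!DOCTYPE html>",
        '<html lang="en">',
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>',
        "<body>",
        f"<h1>{escape(title)}</h1>",
        f'<audio id="player" controls src="{escape(audio_src)}"></audio>',
        "<nav><ol>",
        *toc_items,
        "</ol></nav>",
        *sections,
        f"<script>{SEEK_SCRIPT}</script>",
        "</body>",
        "</html>",
    ])
```

The returned string can be written to a file such as output_html/project_transcript.html. Since the page references the audio by relative path and loads no external resources, it can be opened from local storage or served from any static host.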


Typical Directory Layout

A typical project structure includes folders for original audio files (input_audio), generated interactive transcripts (output_html) such as project_transcript.html, chapter structure files (chapters/project_chapters.txt), generated summaries (summaries), semantic LLM-based notes (notes), locally generated illustrations (images), and private processing modules or scripts (scripts).
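The layout described above could be created with a short helper like the one below. The folder names are taken from the description; the function name create_project and the idea of scaffolding a project this way are assumptions for illustration.

```python
from pathlib import Path

# Folder names as listed in the directory layout description.
PROJECT_FOLDERS = (
    "input_audio",   # original audio files
    "output_html",   # generated interactive transcripts
    "chapters",      # chapter structure files
    "summaries",     # generated summaries
    "notes",         # semantic LLM-based notes
    "images",        # locally generated illustrations
    "scripts",       # private processing modules or scripts
)


def create_project(root):
    """Create the project folders under root and return their names, sorted."""
    root = Path(root)
    for name in PROJECT_FOLDERS:
        # exist_ok makes the helper safe to run again on an existing project
        (root / name).mkdir(parents=True, exist_ok=True)
    return sorted(path.name for path in root.iterdir() if path.is_dir())
```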


Key Features at a Glance

R2 Mechanics offers fully local GPU-accelerated transcription with WhisperX, speaker diarization with precise timestamps, automated LLM-based semantic annotations, Stable Diffusion–generated chapter images, and structured HTML export complete with audio playback and intuitive navigation. All processing is 100% offline, ensuring auditability and long-term preservation.


Further Documentation

For detailed documentation and technical whitepapers, please see:
Whitepaper (EN, PDF)
Whitepaper (DE, PDF)
README (EN, Markdown)