What This Is
SSCA (Structured Semantic Compression Algorithm) is a multi-layer, lossless text compression platform built on semantic lookup tables — designed to cut data infrastructure costs at scale while preserving perfect semantic fidelity.
The inventor, R. Claude Armstrong, is an 80-year-old independent engineer from Everett, Washington, with 75+ years of pattern recognition experience across industrial crane systems, wastewater treatment, welding processes, and medical equipment maintenance. SSCA grew from observing how mission-critical systems achieved remarkable efficiency through symbolic abstraction rather than raw compute power.
The 8-Layer Compression Stack
Layers 1–4 form the open-source foundation. Layers 5–7 are proprietary. Layer 8 is an optional specialty module for literary/marketing content.
| Layer | Name | Mechanism | Status |
|---|---|---|---|
| L1 | Symbol Substitution | 550-entry word/phrase → §symbol dictionary. 20–30% on general text. | ✓ Production |
| L2 | Contextual Compression | Bigrams, trigrams, collocations (~200 patterns). +7–12% additional. | ✓ Production |
| L3 | Hierarchical Abstraction | Hypernym substitution, redundant modifier removal. +10–15%. Semantically lossless. | ⚠ Bug (see §04) |
| L4 | Predictive Inference | Omission-map: drops highly predictable words, stores position map. +8–15%. | ✓ Production |
| L5 | Data-Driven Dictionary | Learns optimal symbols from corpus. Domain detection built in. +10–25%. | ✓ Production |
| L6 | Template Repetition | Detects structural templates (logs, forms). Stores template once + variables. | ✓ Production |
| L7 | Cross-Reference | Replaces repeated long strings with short §RN reference IDs. +10–25%. | ✓ Production |
| L8 | Metaphor Compression | 25+ metaphor families (Lakoff/Johnson). Useful for literary/marketing content only. | Optional module |
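To make the Layer 1 mechanism concrete, here is a minimal sketch of dictionary-based symbol substitution with a lossless round trip. The three dictionary entries, the `§<number>` symbol format, and the `§§` escape for literal section signs are assumptions made for illustration; they are not taken from the SSCA dictionary or its actual encoding.

```python
import re

# Hypothetical three-entry table; the L1 dictionary described above has about 550 entries.
SYMBOLS = {"information": "§1", "because": "§2", "for example": "§3"}
WORDS = {symbol: phrase for phrase, symbol in SYMBOLS.items()}

# Longest phrases first so multi-word entries win over any shorter overlap.
_phrases = sorted(SYMBOLS, key=len, reverse=True)
_ENCODE = re.compile("§|" + "|".join(rf"\b{re.escape(p)}\b" for p in _phrases))
_DECODE = re.compile(r"§(§|\d+)")


def encode(text: str) -> str:
    """Replace dictionary phrases with symbols and escape any literal '§' as '§§'."""
    return _ENCODE.sub(lambda m: "§§" if m.group() == "§" else SYMBOLS[m.group()], text)


def decode(text: str) -> str:
    """Invert encode(): '§§' becomes '§', '§<digits>' becomes the stored phrase."""
    return _DECODE.sub(lambda m: "§" if m.group(1) == "§" else WORDS[m.group()], text)


sample = "I need information because, for example, §1 matters."
packed = encode(sample)  # "I need §1 §2, §3, §§1 matters."
assert decode(packed) == sample
```

The sketch matches only exact lowercase whole words, so "Because" passes through untouched. A real dictionary would also need a policy for capitalization and inflected forms, which this example deliberately leaves out.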
Compression Claims — Corrected
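The original claims added the per-layer percentages together. Each layer only compresses what the previous layers leave behind, so the gains compound multiplicatively. Corrected figures: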
| Scope | Claimed (additive) | Realistic (multiplicative) |
|---|---|---|
| L1–L4 General text | 50–60% | ~34–48% |
| L1–L5 Domain-specific | 65–75% | ~41–61% |
| L1–L6 Logs / structured | 80–90% | ~70–85% |
| L1–L7 Legal / technical | 80–95% | ~65–80% |
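The difference between adding and compounding per-layer gains can be checked with a short calculation. This sketch assumes each "+X%" figure in the layer table means a reduction of whatever size the earlier layers left; that reading is an assumption, and `combined_reduction` is an illustrative helper, not part of SSCA. The results are not expected to match the Realistic column exactly, since that column may reflect measurements or adjustments not described here.

```python
from math import prod


def combined_reduction(layer_reductions):
    """Total size reduction when each layer shrinks what the previous layers left."""
    remaining = prod(1.0 - r for r in layer_reductions)
    return 1.0 - remaining


# Lower and upper ends of the L1-L4 ranges from the layer table.
low = [0.20, 0.07, 0.10, 0.08]
high = [0.30, 0.12, 0.15, 0.15]

for label, layers in (("low", low), ("high", high)):
    additive = sum(layers)
    compounded = combined_reduction(layers)
    print(f"{label}: additive {additive:.1%}, compounded {compounded:.1%}")
```

Compounding always gives a smaller total than simple addition, because every later layer works on an input that has already shrunk.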
Current Technical State
Solid
Layers 1, 2, 4, 5, 6, and 7 run correctly. Cascade integration is straightforward, and losslessness has been verified. L6 performs especially well on logs.
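As an illustration of why template repetition suits logs, the sketch below stores each distinct line shape once and keeps only the variable fields per line. Treating runs of digits as the only variable fields, and reserving the NUL character as the slot marker, are simplifying assumptions; SSCA's actual template detection is not described here.

```python
import re

VAR = re.compile(r"\d+")
SLOT = "\x00"  # slot marker, assumed never to occur in the input


def compress(lines):
    templates = []  # each distinct template is stored once
    index = {}      # template text -> position in `templates`
    records = []    # one (template_id, variables) pair per input line
    for line in lines:
        if SLOT in line:
            raise ValueError("input contains the reserved slot character")
        variables = VAR.findall(line)
        template = VAR.sub(SLOT, line)
        tid = index.setdefault(template, len(templates))
        if tid == len(templates):  # first time this template is seen
            templates.append(template)
        records.append((tid, variables))
    return templates, records


def decompress(templates, records):
    lines = []
    for tid, variables in records:
        parts = templates[tid].split(SLOT)
        line = parts[0]
        # Interleave the stored variables with the literal pieces of the template.
        for value, literal in zip(variables, parts[1:]):
            line += value + literal
        lines.append(line)
    return lines


logs = ["GET /item/42 took 17ms", "GET /item/7 took 203ms", "startup complete"]
templates, records = compress(logs)
assert decompress(templates, records) == logs
```

Variables are kept as strings, so values such as `007` survive the round trip unchanged.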
Needs Fix
Layer 3 symbol format bug: the token §H:building (11 chars) is longer than "building" (8 chars), so the substitution expands the text instead of compressing it (negative compression).
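One possible guard against this class of bug is to emit a token only when it is strictly smaller than the text it replaces. The helper names and the short `§H7` token below are hypothetical and shown only to illustrate the check; this is not the project's actual fix. Sizes are measured in UTF-8 bytes, where `§` (U+00A7) takes two bytes, so the real overhead is slightly larger than a character count suggests.

```python
def encoded_size(s: str) -> int:
    """Size in UTF-8 bytes; '§' takes two bytes, so this can exceed len(s)."""
    return len(s.encode("utf-8"))


def choose_token(original: str, candidate: str) -> str:
    """Return the candidate token only if it is strictly smaller than the original."""
    return candidate if encoded_size(candidate) < encoded_size(original) else original


# The token from the bug report is rejected because it is longer than the word.
assert choose_token("building", "§H:building") == "building"

# A shorter, hypothetical token format would pass the same check.
assert choose_token("skyscraper", "§H7") == "§H7"
```

A check like this keeps any single substitution from growing the output, though it says nothing about whether the shorter token can be decoded back to the original word.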
Needs Repositioning
Layer 8 was mislabeled as a core layer; it should be positioned as an optional specialty module.
Not Yet Built
DNA/P³ Router, OCR/PDF pre-processing pipeline.