An Architectural Design Space for Internal Ethical Counterweights in AI Systems

Janer Tittarelli, Javier Ignacio

doi:10.5281/zenodo.18508163

External regulation of AI systems addresses behavior at the output layer: what systems produce, not how they produce it. This paper maps the architectural design space for internal ethical counterweights, that is, mechanisms embedded within AI systems that constrain, redirect, or flag outputs before they reach the external interface. The analysis draws on control theory, organizational design, and institutional architecture to identify the structural positions where counterweight mechanisms can be placed, the failure modes to which each position is susceptible, and the conditions under which internal constraints complement rather than substitute for external governance.