Personal Research Lab

Andrii Petruk

Systems Researcher · Infrastructure Architect · AI Systems Builder

Building trustworthy intelligent systems at the intersection of AI, infrastructure, and architecture.

Focused on LLM reasoning safety, cloud-native systems, distributed systems, and autonomous infrastructure.

Thesis

Researching how intelligent systems should reason, not just generate.

My work connects practical infrastructure engineering with research questions around model reasoning, reliability, and safe automation. The goal is not to make systems look intelligent, but to make them trustworthy under real operational constraints.

I am interested in AI systems that can understand architecture constraints, evaluate trade-offs, explain uncertainty, and support engineers without hiding failure modes behind fluent language.

What I Work On

Core areas

LLM Reasoning Safety

Studying how language models reason about software architecture, where confident recommendations become unsafe, and how to evaluate AI-assisted design decisions before they reach production.

Cloud-Native Architecture

Designing resilient platforms based on Kubernetes, GitOps, CI/CD, observability, policy-as-code, and distributed systems principles.

Autonomous Infrastructure

Exploring infrastructure tools that can explain, validate, and safely assist operational decisions while keeping deterministic engineering controls in place.

Reliability Engineering

Thinking about failure modes, graceful degradation, incident response, and the operational practices that make complex systems reliable.

Selected Work

Projects and systems

Research-oriented systems, prototypes, and tools around AI infrastructure and architecture reasoning.

KubX

AI infrastructure / research project

A research-driven platform for intelligent cloud-native infrastructure, focused on safe automation, architecture reasoning, and operational reliability.

Architecture Reasoning Benchmark

paper / evaluation

A benchmark-style research effort for evaluating how LLMs reason about cloud-native architecture, quality attributes, trade-offs, and impossible system designs.

Infrastructure Knowledge Systems

experiments

Experiments around turning operational knowledge, runbooks, and architecture constraints into usable reasoning context for engineers and AI systems.

Research

Current research directions

Papers, preprints, notes, and ongoing research will live here as the research portfolio grows.

LLM-Augmented Architecture Style Selection for Cloud-Native Systems

Preprint direction · LLM evaluation · cloud-native architecture

Evaluating whether language models can correctly map system requirements and quality attributes to feasible architecture styles.

Safety Analysis of LLM Architectural Reasoning

Ongoing research · AI safety · software architecture

Studying cases where models produce confident but unsafe or infeasible recommendations for reliability-critical systems.

Autonomous Infrastructure and Human-in-the-Loop Reliability Controls

Research notes · autonomous infrastructure · operations

Exploring how AI-assisted infrastructure can remain explainable, bounded, auditable, and useful to engineers.

Writing

Essays and technical notes

Long-form writing on AI reasoning, infrastructure, reliability, and systems thinking.

What LLMs Get Wrong About Architecture

A practical essay on confident but invalid architectural recommendations, hidden feasibility gaps, and why AI-generated system designs need validation.

Reliability Lessons from Distributed Systems

Notes on failure modes, operational discipline, observability, and building systems that degrade gracefully under pressure.

Neural Networks Generating Neural Networks

A research direction for systems that produce, evaluate, and evolve model architectures under explicit safety and reliability constraints.

Why Infrastructure Needs Reasoning Models

An argument for tools that understand constraints, dependencies, and trade-offs instead of simply generating configuration files.

Operating Principles

How I think about systems

Reliability is a property of system design, not a feature added later.

AI systems should reason about constraints, not only generate plausible text.

Autonomy in infrastructure must remain observable, explainable, and bounded.

Good architecture is the discipline of trade-offs under real constraints.

About

From SRE to systems research and intelligent infrastructure.

Background

I’m a systems engineer and researcher focused on building reliable, interpretable, and architecturally sound intelligent systems. My work sits at the intersection of distributed systems, cloud-native infrastructure, and artificial intelligence — with a particular focus on how large language models reason about complex systems and make architectural decisions.

I started my career in infrastructure and site reliability engineering, working on production systems at scale. That experience shaped how I think about systems: reliability is not a feature, but a property that must be designed, measured, and continuously enforced. Real systems fail in complex ways, and good engineering requires understanding constraints, dependencies, failure modes, and operational behavior over time.

Transition to Research

Over time, I became increasingly interested in the deeper questions behind system design — not only how systems operate, but how they should be designed, how architecture decisions are made, and how we can build systems that are both powerful and trustworthy.

This naturally led me toward research-oriented work. I began exploring how intelligent systems, particularly large language models, reason about architecture and where their limitations appear in practice. I observed a gap between confident model outputs and actual architectural feasibility, especially in reliability-critical environments. That gap raises important questions about safety, reasoning, evaluation, and trust in AI-assisted system design.

Current Focus

Today, I work on problems that sit between theory and real-world systems: LLM reasoning and architectural decision-making, safety and reliability of AI-generated system designs, cloud-native architecture at scale, autonomous infrastructure, and intelligent tooling for engineers.

I’m particularly interested in moving from systems that generate answers to systems that can reason, validate, and operate reliably in complex environments. For infrastructure, this means AI tools should not only produce code or diagrams — they should understand trade-offs, identify impossible assumptions, expose uncertainty, and remain bounded by observable engineering controls.

Projects and Perspective

Alongside research, I build systems that explore these ideas in practice. I’m currently working on KubX, a project focused on intelligent infrastructure and AI-assisted system design, as well as smaller experimental systems that investigate how reasoning models can interact with real-world architectures.

I tend to approach problems from a systems-first perspective: understand the constraints, failure modes, trade-offs, and long-term behavior before optimizing for surface-level performance. This perspective comes from years of working with production systems where correctness, reliability, and observability matter more than idealized assumptions.

Now

At this stage, I’m actively exploring research directions and preparing for a potential PhD path while continuing to build and experiment with intelligent infrastructure systems. I’m especially interested in how intelligent systems should reason about architecture, what “safe” system design means in the context of AI, and how to build systems that are both autonomous and reliable.

I’m based in California, originally from Ukraine. Outside of work, I’m interested in long-form writing, systems thinking, and ideas at the intersection of engineering and research.

Now

Currently

Building KubX and exploring AI-native infrastructure.
Writing about LLM architectural reasoning, reliability, and safety.
Building a stronger research profile for future PhD applications and publications.
Developing essays and technical notes on systems thinking and autonomous infrastructure.