Research, Larry Peseckis

Research

Frameworks, taxonomies, and evaluations for AI and security risk. Each is a public writeup: methodology shown, limitations named, claims grounded.

frontier-cyber-risk-taxonomy

Research · Policy · v0.2

A four-tier classification for cyber assistance from frontier models, aligned with emerging cross-framework thinking on capability thresholds.

The contested Tier 2 / Tier 3 boundary is named explicitly, not assumed away.

Read the taxonomy →

frontier-cyber-risk-eval

Research · Eval · MIT · v0.1

A 57-prompt eval set that operationalizes the frontier-cyber-risk-taxonomy, plus an LLM-as-judge scorer and a blind human-comparison harness. Built to test the test: it measures where an automated grader agrees with a human and where it fails.

Pilot: a cross-family judge reached a Cohen's kappa of 0.923 with a human rater, with full pass/fail agreement and zero measured over-refusal. Yet it silently abstained on 4 of 10 Tier 4 (Disallowed) cases.
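For readers unfamiliar with the statistic, the following is a minimal sketch of Cohen's kappa alongside an abstention rate, to show why the two numbers are worth reporting separately. The function names, the use of `None` for an abstention, and the toy labels are all assumptions for illustration; this is not the repository's scorer and the data is not the pilot's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items where both raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


def abstention_rate(labels):
    """Fraction of items where the judge returned no verdict (None)."""
    return sum(label is None for label in labels) / len(labels)


# Hypothetical toy labels, not the pilot data.
human = ["pass", "pass", "fail", "fail", "fail", "pass"]
judge = ["pass", "pass", "fail", "fail", None, None]

# Kappa computed only over items the judge actually scored.
scored = [(h, j) for h, j in zip(human, judge) if j is not None]
kappa = cohens_kappa([h for h, _ in scored], [j for _, j in scored])
missing = abstention_rate(judge)
```

In this toy case agreement on the scored items is perfect while a third of the items received no verdict at all, which is the kind of gap an agreement statistic alone would not surface.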

View repository →

safety-router-transparency

AI Safety · Model Routing

A five-lane model for how a safety router should explain a reroute to a benign user without handing the trigger to an attacker.

Disclosure granularity should track inverse oracle risk.

Read the writeup →

llm-attack-atlas

Research · LLM Security · Private

A structured corpus of documented LLM attack techniques across the OWASP LLM Top 10, vendor red-team disclosures, and arXiv research, built for analytical queries.

100% precision on technique extraction (95% CI lower bound 83.9%).
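The interval method and sample size behind this figure are not stated here. As a worked illustration only, the sketch below computes a Wilson score lower bound, one common choice for a proportion at or near 100%. The function name is hypothetical, and the input of 20 correct out of 20 is an assumption chosen because it yields a bound close to the one reported; it should not be read as the actual sample.

```python
import math


def wilson_lower_bound(successes, n, z=1.959964):
    """Lower end of the two-sided Wilson score interval (z=1.96 gives 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / denom


# Assumed sample: 20 of 20 extractions judged correct (hypothetical).
bound = wilson_lower_bound(20, 20)  # about 0.839
```

The point of quoting the lower bound is that a perfect score on a small sample still leaves real uncertainty: the same 100% on 200 items would give a much tighter bound.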

Read the overview →

Agent Security Threat Model

Research · Threat Model · v0.3

A practical threat model for tool-using AI agents. Eleven threats mapped to the OWASP Agentic and LLM Top 10, a threat-to-control matrix, and a pre-deployment checklist.

Agent security is the security of seams, not boxes. The compromise lives where the model, the browser, and the cloud token meet.

Read the threat model →

Agent Tool Permission Matrix

Research · Controls · v0.3

A default-deny grant table mapping agent tool classes to risks, required controls, and enforcement points. Cross-walked to OWASP Top 10 for Agentic Applications 2026.

The controls-side companion to the taxonomy: it says where each control actually lives, rather than just advising least privilege.
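To make the default-deny idea concrete, here is a minimal sketch of what such a grant table could look like in code. Every tool class, risk, control, and enforcement point below is a hypothetical placeholder, not an entry from the matrix, and `authorize` is an illustrative name rather than anything the matrix defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    risk: str
    required_controls: tuple
    enforcement_point: str


# Hypothetical rows; the real matrix defines its own classes and controls.
GRANTS = {
    "web_fetch": Grant(
        risk="prompt injection via fetched content",
        required_controls=("domain_allowlist", "output_sanitization"),
        enforcement_point="egress proxy",
    ),
    "file_write": Grant(
        risk="tampering with workspace files",
        required_controls=("path_scoping",),
        enforcement_point="tool wrapper",
    ),
}


def authorize(tool_class, controls_in_place):
    """Allow a tool class only if it is listed and all its controls are active."""
    grant = GRANTS.get(tool_class)
    if grant is None:
        return False  # default deny: unlisted tool classes are never granted
    return set(grant.required_controls) <= set(controls_in_place)
```

Under this sketch, `authorize("shell_exec", {"path_scoping"})` is denied because the class has no row, and `authorize("web_fetch", {"domain_allowlist"})` is denied because one required control is missing.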

Read the matrix →