AI safety research from Australia

We take the possibility of transformative AI seriously, and conduct research to understand and prepare for the most consequential risks.

Our Mission

Grow Australia’s capacity for technical AI safety research. Provide talented researchers with the resources, direction, and institutional support to do work that matters.

What We’re Working On

Offensive Cyber Time Horizons

Measuring the rate at which AI offensive cybersecurity capability is increasing, using item response theory (IRT) methodology across seven benchmarks with professional human baselines.
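As a rough illustration of the kind of model IRT provides (a generic two-parameter logistic item response curve, not this project's actual fitting pipeline), the probability that a model of ability theta solves an item of difficulty b with discrimination a is:

```python
import math

def p_correct(theta, a, b):
    """Two-parameter logistic (2PL) IRT item response curve.

    theta: latent ability of the model (or human) being evaluated
    a: item discrimination (how sharply success depends on ability)
    b: item difficulty
    Returns the probability of solving the item.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# By construction, when ability equals difficulty the success
# probability is exactly 0.5.
print(p_correct(1.0, 1.2, 1.0))  # 0.5
```

Fitting such curves jointly across benchmarks puts AI systems and human baselines on a shared ability scale, which is what makes rate-of-increase measurements comparable across tasks.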

Covert Capability Evaluation

Developing structured methods to measure how effectively AI models can pursue covert objectives while evading detection, drawing on AI control research and human-grounded performance comparisons.

Interpretability via Activation Oracles

Exploring cycle consistency as an unsupervised training method to make the internal representations of language models legible, removing the need for labelled activation datasets.
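To illustrate the cycle-consistency idea in its simplest form (a toy sketch with hypothetical linear maps, not the project's actual oracle architecture): an encoder maps an activation to a legible code, a decoder maps it back, and training minimises the reconstruction error of the round trip, with no labelled activations required.

```python
def encode(x):
    # Hypothetical map from an activation vector to a legible code.
    return [2.0 * v + 1.0 for v in x]

def decode(code):
    # Hypothetical inverse map; cycle-consistency training pushes
    # decode(encode(x)) back towards x.
    return [(v - 1.0) / 2.0 for v in code]

def cycle_loss(x):
    # Squared reconstruction error after a full encode/decode cycle.
    # This is the unsupervised signal: it needs only activations,
    # never labels for what the activations mean.
    x_hat = decode(encode(x))
    return sum((u - v) ** 2 for u, v in zip(x, x_hat))

print(cycle_loss([0.5, -1.0, 3.0]))  # 0.0 for a perfectly consistent pair
```

Here the pair is consistent by construction, so the loss is zero; in practice the maps are learned, and driving this loss down is what makes the intermediate code a faithful, readable stand-in for the activation.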