We are announcing new grants for research into systemic AI safety.
Initially backed by up to £8.5 million, this programme will fund researchers to advance the science underpinning AI safety.
Read more:
Jade Leung (our CTO) and @geoffreyirving (our Research Director) have been nominated in the @TIME top 100 most influential people in AI 2024.
We're incredibly proud of this team. They're proof of the immense impact technologists can have by joining the government.
📢 We're hosting a conference in November on frontier AI safety commitments.
With @GovAI_, we'll bring together AI companies and researchers from all over the world in San Francisco to discuss the safer development of AI.
We're opening an office in San Francisco!
This will enable us to hire more top talent, collaborate closely with the US AI Safety Institute, and engage even more with the wider AI research community.
Read more:
Safety cases (clear, evidenced arguments for how new models are safe) can help build confidence in the safe application of these fast-moving technologies.
@geoffreyirving explains how we're building safety cases into our work with developers.
What’s more important than a free lunch? 🍔
Our Chief Scientist, @GeoffreyIrving, on why he joined the UK AI Safety Institute and why other technical folk should do the same 👇
A common technique for quickly assessing an AI model's capabilities is prompting it to answer hundreds of questions, then automatically scoring its answers.
Here are our key insights from a year of using this technique at AISI.
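The technique above can be sketched in a few lines. This is a generic illustration of the pattern, not AISI's actual pipeline: `ask_model` is a stand-in for a real model API call, the dataset is a toy example, and the scorer is the simplest possible one (normalised exact match).

```python
# Benchmark-style evaluation: prompt a model with many questions,
# then score its answers automatically and report an aggregate metric.

def ask_model(question: str) -> str:
    # Placeholder model: a real implementation would call an LLM API here.
    canned = {"What is 2 + 2?": "4", "Capital of France?": "Paris"}
    return canned.get(question, "I don't know")

def score(answer: str, target: str) -> bool:
    # Simplest automatic scorer: normalised exact match. Real suites also
    # use multiple-choice parsing or model-graded scoring.
    return answer.strip().lower() == target.strip().lower()

def evaluate(dataset: list[tuple[str, str]]) -> float:
    # Accuracy over (question, target) pairs.
    results = [score(ask_model(q), t) for q, t in dataset]
    return sum(results) / len(results)

dataset = [
    ("What is 2 + 2?", "4"),
    ("Capital of France?", "Paris"),
    ("Who wrote Hamlet?", "Shakespeare"),
]
print(f"accuracy: {evaluate(dataset):.2f}")  # → accuracy: 0.67
```

The aggregate number hides a lot, which is part of why automatic scoring needs care: whether "0.67" is meaningful depends entirely on how well the scorer recognises correct answers phrased in unexpected ways.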
1 year ago today, we started out as the Foundation Model Taskforce with £100m investment from the UK government.
Today, we're the @AISafetyInst.
In the last year, we've built our team, evaluated new models & enhanced global AI safety.
The UK-US agreement on AI safety is a significant moment for the AI Safety Institute and for the development of global safety standards on AI.
Read below to find out more about how we will collaborate.
(1/4)
The UK and US have signed a landmark agreement on AI safety.
This will see the UK's @AISafetyInst join forces with the US AI Safety Institute on testing models and sharing research 🇬🇧🇺🇸
Find out more 👇
Should AI systems behave like people?
AI systems are getting better at interacting with us naturally. Humanlike AI systems could be more engaging, but they pose safety risks and raise ethical questions.
Our new study asks the UK public what they think.
1/
We have published our latest progress report.
Read more below on our fast-paced work delivering on our mission to evaluate AI models and advance the science of AI risk.
We open-sourced Inspect, our framework for large language model evaluations. We're excited to see the research community use and build upon this work!
(1/3)
This partnership is a key moment in the development of an international network of AI safety institutes.
Together, we will:
▪️ collaborate on model evaluation
▪️ share information and resources
▪️ catalyse new research on AI safety
The United Kingdom and Canada have signed an agreement to work closely together on AI safety 🇬🇧🇨🇦
The @AISafetyInst will collaborate closely with its Canadian counterpart as part of the growing network of AI safety institutes following the first AI Safety Summit at Bletchley.
AISI is partnering with the @BritishAcademy_ to support researchers working on technical and governance approaches to AI safety.
UK-based individuals at universities and independent research orgs can apply for Innovation Fellowships.
Find out more:
The legacy of Bletchley Park will continue with two days of talks in May at the AI Seoul Summit.
AI safety is a shared global challenge, and these continued discussions will ensure we can deliver a safe, responsible approach to AI development.
The next edition of the AI Safety Summit, the AI Seoul Summit, will be taking place 21-22 May 🇬🇧 🇰🇷
More on how this will build on the legacy of November's summit at Bletchley Park 👇🏻
By working together, the UK and US can minimise the risks of AI and harness its potential to help everyone live happier, healthier and more productive lives.
Find out more:
(4/4)
Our historic AI safety alliance has been strengthened by @SecRaimondo today as the US AI Safety Institute grows its team.
We look forward to continuing to share expertise and insights to lead the safe development of AI across the globe.
Together, we will develop shared frameworks for testing advanced AI.
This will help establish international standards and best practices that other countries and organisations can adopt.
(3/4)
If you work at the intersection of AI and security and are passionate about the safety of leading-edge AI systems, you should consider bringing your research talent to AISI's cyber and safeguards analysis teams!
More here:
5/5
Trustworthy Multi-Modal Models & AI Agents
Agentic Markets
Models of Human Feedback for AI Alignment
Humans, Algorithmic Decision-Making & Society
Gen AI + Law
The US and UK AI Safety Institutes will jointly test advanced AI models.
We will share research insights, share model access, and enable expert secondments between the Institutes.
(2/4)
Before joining us as CTO, Jade Leung worked at @FHIOxford and led the Governance team at @OpenAI.
"I’ve been really inspired by my time at AISI so far. We are building a unique organisation that is purely public interest motivated, with an important role in frontier AI safety."
We’re hiring ML research scientists & engineers, a technical lead/programme manager, cybersecurity researchers & more. We may also soon open roles in operations & policy. Message us to express general interest and ask any questions.
AISI is hiring across seniority levels for ML research scientists and engineers to drive our cybersecurity and safeguards evals, as well as cybersecurity researchers. Our role specs are not prescriptive – we'd love to talk even if you're looking for something slightly different from what's written down!
(5/6)
Sharing Inspect through open source means our approach to AI safety evaluations is now available for anyone to use and improve, raising the quality of evaluations across the board and boosting collaboration on AI safety testing.
(3/3)
Inspect enables researchers to easily create simple benchmark-style evaluations, scale up to more sophisticated evaluations, and build interactive workflows.
(2/3)
Geoffrey Irving worked at @OpenAI and @GoogleDeepMind before joining us as Chief Scientist.
“I moved to AISI because the salience of AI risks increased among the public and governments, and the UK has been uniquely proactive in leading the conversation.”
The three-day challenge will ask hackers to find failure modes of @allenai_org's latest large language model, OLMo, using Inspect, AISI's open-source evaluations framework, and a slick interface developed by @dreadnode.
(2/6)
We'll be presenting at How far are we from AGI? (9:45-10am), Generative and Experimental Perspectives for Biomolecular Design (10:45-11am), Privacy Reg & Protection in ML (2-3pm), and Reliable and Responsible Foundation Models (time tbd).
@fly_upside_down gave a Q&A on Inspect, which was used by participants in this year's generative red-teaming challenge to evaluate @allenai_org's new OLMo model.
Find Inspect on GitHub:
2/5
The GRT with @aivillage_dc was a huge hit, surfacing many of the difficulties faced in assessing and reporting on the failure modes of large language models.
3/5
We'll also be attending the workshops on LLM agents, Secure and Trustworthy LLMs, and Data Problems for Foundation Models and the socials for ML safety and Women in ML. Hope to see everyone there!
If you're at DEF CON or in Las Vegas for Black Hat and want to talk security of AI or cyber evaluations of large language models, be sure to reach out to the folks going!
You can message them at @alxndrdavies @stochastictalk @alexandrasouly @yaringal
(4/6)
4/
These views on humanlike AI help ensure that what counts as “safe” AI behaviour isn’t decided by researchers or policymakers alone.
This is key as we work with the wider AI community to minimise potential harm to the public from AI.
Read more:
3/
Here are our findings:
➔ Most people think AI should disclose that it is not human
➔ Most don't want AI to express emotions, except in idioms like “I'm happy to help”
➔ Most people do not think people can form personal relationships with AI systems