Conjecture

@ConjectureAI

3,453
Followers
3
Following
14
Media
100
Statuses

We are building Cognitive Emulation, an alternative AI paradigm that is controllable, transparent, and safe. Get in touch at hello@conjecture.dev.

London
Joined June 2022
@ConjectureAI
Conjecture
2 years
Large transformer language models are extremely poorly understood. Even the most basic statistics about their internal weights and activations are unknown. In this research, we harvest some of the extremely low hanging fruit.
1
11
98
@ConjectureAI
Conjecture
1 year
We surveyed the Conjecture team about advanced AI:
- Respondents estimated a 70% chance of human extinction from advanced AI getting out of control.
- Respondents estimated an 80% chance of human extinction from advanced AI in general (loss of control + misuse risk).
16
13
83
@ConjectureAI
Conjecture
2 years
Direct interpretability of transformer weights, especially MLP layers, has yielded poor results until now: we found the principal directions of weights of MLP layers and output of attention layers in transformers are highly interpretable and monosemantic.
2
13
86
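A minimal sketch of what reading off a "principal direction" of a weight matrix can look like. Everything here is a toy assumption, not the method or data from the research: the matrix `toy_weights`, the vocabulary `toy_unembedding`, and the helper names are hypothetical, and power iteration stands in for a full singular value decomposition. The idea is to find the top output-side singular vector of a weight matrix and score it against token directions to see which token it most promotes.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def normalize(vector):
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector]

def top_left_singular(weights, steps=50):
    """Power iteration on W W^T: returns (top singular value, output direction)."""
    weights_t = transpose(weights)
    u = normalize([1.0] * len(weights))
    for _ in range(steps):
        u = normalize(matvec(weights, matvec(weights_t, u)))
    sigma = math.sqrt(sum(v * v for v in matvec(weights_t, u)))
    return sigma, u

# Hypothetical 4x3 weight matrix (output dim 4, hidden dim 3),
# built by hand so its singular values are 5 and 1.
toy_weights = [
    [3.0, 4.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

# Hypothetical token directions in the 4-dimensional output space.
toy_unembedding = {
    "fire": [0.9, 0.1, 0.0, 0.0],
    "water": [-0.8, 0.2, 0.0, 0.0],
    "the": [0.0, 0.0, 1.0, 0.0],
}

sigma, direction = top_left_singular(toy_weights)
scores = {
    token: sum(a * b for a, b in zip(vec, direction))
    for token, vec in toy_unembedding.items()
}
top_token = max(scores, key=scores.get)
print(sigma, top_token)
```

In a real model the same scoring step would use the model's own weight and unembedding matrices, and one would inspect the top few tokens for each of many singular directions rather than a single one.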
@ConjectureAI
Conjecture
11 months
Our CEO Connor Leahy [ @NPCollapse ] gave his 'AGI in Sight' talk at @CogX_Festival . He spoke about how AGI companies are in a multi-way standoff - racing towards the precipice. But it doesn't have to be this way. We can just stop, and build the institutions for a good future.
9
18
81
@ConjectureAI
Conjecture
2 years
Our research also considers issues in the context of self-supervised learning. Early alignment theory focused on optimized agents, but this frame seems confused when applied to modern LLMs. @repligate proposed simulators as an alternative frame.
0
5
48
@ConjectureAI
Conjecture
2 years
In our second basic facts about LLMs post, we extend our analysis to the distributions of weights, activations, & gradients through training based on the open-source checkpoints of the @AiEleuther Pythia model suite.
2
3
39
@ConjectureAI
Conjecture
1 year
Conjecture and @ArayaPress are co-hosting the first Japan AI Alignment Conference, in Tokyo this weekend on March 11 and 12. We’ll be discussing interpretability, conceptual alignment, AI applications, strategy, and field building. See the agenda here:
0
5
28
@ConjectureAI
Conjecture
10 months
We've changed our look! Check out our new website here:
0
4
25
@ConjectureAI
Conjecture
7 months
2 years ago, a small group of hackers came together to form Conjecture, and we've been busy! We are incredibly excited for the progress we have been making on our CoEm agenda and think 2024 will be an exciting year to say the least. Read our short update post here:
3
2
24
@ConjectureAI
Conjecture
2 years
Last year, we focused on improving our models of ML systems and what’s blocking us from reliably understanding and controlling them. One central obstacle to understanding neural nets is polysemanticity – the phenomenon of individual neurons representing multiple concepts.
1
2
22
@ConjectureAI
Conjecture
10 months
CEO Connor Leahy ( @NPCollapse ) attended the AI Safety Summit at @BletchleyPark on behalf of @Conjecture . In an interview with @SciTechgovuk Connor spoke about how the US and China were now “addressing risks from AI as an international global priority.”
@SciTechgovuk
Department for Science, Innovation and Technology
10 months
“I think AI can do things that we are just dreaming now.” On Day 1 of the #AISafetySummit we asked some of the influential decisionmakers in attendance to share their thoughts on the Summit, Frontier AI & what the future could look like if we safely harness its power for good 👇
4
22
50
2
7
22
@ConjectureAI
Conjecture
2 years
Another obstacle to understanding neural nets is nonlinearity. We looked into LayerNorm and found that although it is supposed to just stabilize training, it can be used to implement non-trivial functions just like a typical non-linearity (e.g. ReLU).
1
0
20
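A small check, as a sketch, that LayerNorm is not a linear map. It assumes the plain form of LayerNorm without learned scale and bias, and the vectors `a` and `b` are arbitrary illustrative inputs. A linear function would satisfy f(a + b) = f(a) + f(b) and f(10a) = 10 f(a); LayerNorm breaks the first and instead leaves its output almost unchanged under rescaling.

```python
import math

def layer_norm(x, eps=1e-5):
    """LayerNorm without learned gain or bias: center, then divide by the std."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

a = [1.0, 2.0, 4.0]
b = [3.0, -1.0, 0.5]

# Additivity test: compare LN(a + b) with LN(a) + LN(b).
ln_of_sum = layer_norm([p + q for p, q in zip(a, b)])
sum_of_ln = [p + q for p, q in zip(layer_norm(a), layer_norm(b))]
additivity_gap = max(abs(p - q) for p, q in zip(ln_of_sum, sum_of_ln))

# Scaling test: LN(10 * a) is nearly identical to LN(a), not 10 * LN(a).
ln_scaled = layer_norm([10.0 * v for v in a])
scale_gap = max(abs(p - q) for p, q in zip(ln_scaled, layer_norm(a)))

print(additivity_gap, scale_gap)
```

The additivity gap is clearly non-zero, while the scaling gap is tiny and comes only from `eps`. The division by the standard deviation is what makes the operation non-linear.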
@ConjectureAI
Conjecture
1 year
Respondents expected AGI to be developed in 2030 (7-year timelines), one year sooner than Metaculus users predicted. Predictions ranged from 2027 to 2035.
2
0
19
@ConjectureAI
Conjecture
2 years
We’ve been heads-down in research for a while, and are excited to share some updates about our past work and where we’re headed in 2023
0
0
18
@ConjectureAI
Conjecture
1 year
Our CEO @NPCollapse will give his talk "AGI in Sight" at 2023's @CogX_Festival next week. Alongside other great speakers including @harari_yuval , Professor Stuart Russell, Jaan Tallinn, and more. Book your tickets here:
1
2
16
@ConjectureAI
Conjecture
2 years
We find many interesting behaviors emerge during training and study how the model shifts from its Gaussian initialisation to its final, quite different, endpoint.
1
2
14
@ConjectureAI
Conjecture
1 year
. @NPCollapse added: "They'd create an intergovernmental project, fund it, and have the best minds come together to work on the hard safety problem until we're sure we've solved it.
1
2
16
@ConjectureAI
Conjecture
2 years
LLMs thus shift their representations from Gaussian initialisation to closely match the basic distribution of the input data and maintain this throughout the deep network.
0
0
11
@ConjectureAI
Conjecture
2 years
We're still far from reliably interpretable AI. Unaligned AI may have incentives to make its thoughts difficult for us to interpret, and there are many ways sufficiently capable AIs could circumvent interpretability methods, as we wrote about last June.
0
1
11
@ConjectureAI
Conjecture
2 years
An especially interesting result is that the spectrum of the weight matrices is clearly power-law, with clear structure across blocks that precisely matches the statistics of the data distribution.
1
0
10
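One common way to check whether a spectrum looks power-law is to fit a straight line to log(value) against log(rank); a power law s_k ∝ k^(-α) appears as a line with slope -α. The sketch below assumes a synthetic spectrum (`synthetic_spectrum`, exponent 0.8, both made up for illustration) rather than singular values from any real model, and the helper `loglog_slope` is hypothetical.

```python
import math

def loglog_slope(values):
    """Least-squares slope of log(value) against log(rank), ranks starting at 1."""
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Synthetic, exactly power-law spectrum: s_k = k^(-0.8) for k = 1..200.
synthetic_spectrum = [k ** -0.8 for k in range(1, 201)]

slope = loglog_slope(synthetic_spectrum)
print(slope)
```

With real singular values the points will not lie exactly on a line, so the fitted slope is only a rough summary and it is worth plotting the log-log curve as well.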
@ConjectureAI
Conjecture
1 year
There were 22 survey respondents for the probability of extinction questions and 18 respondents for time-to-AGI questions (the vast majority of our team). The estimates above are medians.
1
1
10
@ConjectureAI
Conjecture
1 year
Our CEO, Connor Leahy, ( @NPCollapse ) at the @UKHouseofLords last week discussing the risks from the development of artificial general intelligence and the legislation needed to address them.
@NPCollapse
Connor Leahy
1 year
I had a great time addressing the House of Lords about extinction risk from AGI. They were attentive and discussed some parallels between where we are now and non-nuclear proliferation efforts during the Cold War. It certainly provided me with some food for thought, and some
298
105
816
0
1
8
@ConjectureAI
Conjecture
1 year
Thanks to @sala_maris for running the survey and analyzing the results!
0
0
8
@ConjectureAI
Conjecture
2 years
We compile and analyze the distributions of weights, activations, gradients, & spectra of GPT2 series and make a number of novel discoveries including the prevailing power law nature of weight and data distribution, and consistent patterns of norms across depth and model-sizes.
1
0
6
@ConjectureAI
Conjecture
2 years
This means we can provide a kind of static analysis of transformer weights to determine the principal action of each block. Moreover, we show that we can use these directions to provide targeted semantic edits of specific concepts.
0
0
5
@ConjectureAI
Conjecture
2 years
We found polytopes are less polysemantic than individual neurons, and there are more polytope boundaries between semantically different regions than between semantically similar ones, suggesting that neural networks use polytope boundaries to define semantic boundaries.
0
0
5
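A rough illustration of the polytope picture, assuming the usual view that a ReLU layer splits its input space into regions that share the same on/off activation pattern. The layer `toy_weights`/`toy_biases` and the three input points are hypothetical. For a single layer, the number of pattern bits that differ between two inputs equals the number of boundaries a straight line between them crosses.

```python
def activation_pattern(weights, biases, x):
    """1 where the ReLU unit is active (pre-activation > 0), else 0."""
    pattern = []
    for row, bias in zip(weights, biases):
        pre_activation = sum(w * v for w, v in zip(row, x)) + bias
        pattern.append(1 if pre_activation > 0 else 0)
    return tuple(pattern)

def boundaries_crossed(pattern_a, pattern_b):
    """Hamming distance between two activation patterns."""
    return sum(p != q for p, q in zip(pattern_a, pattern_b))

# Hypothetical single ReLU layer with three units on 2-D inputs.
toy_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_biases = [0.0, 0.0, -1.0]

near_a = activation_pattern(toy_weights, toy_biases, [0.2, 0.3])
near_b = activation_pattern(toy_weights, toy_biases, [0.3, 0.1])
far = activation_pattern(toy_weights, toy_biases, [0.9, 0.8])

same_polytope = near_a == near_b
crossed = boundaries_crossed(near_a, far)
print(near_a, far, same_polytope, crossed)
```

Counting boundaries between pairs of inputs in this way is one possible route to comparing "semantically similar" and "semantically different" inputs; in deep networks the pattern would be concatenated across layers.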
@ConjectureAI
Conjecture
2 years
We first show the evolution of the weights, activations, and gradients across training. We demonstrate that the distribution shifts from a Gaussian initialisation to an extremely heavy-tailed distribution. This happens early in training and often appears to be a rapid phase transition.
1
0
5
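Heavy-tailedness can be summarized with excess kurtosis, which is 0 for a Gaussian and positive for distributions with heavier tails. The sketch below is only an illustration on synthetic samples: a Laplace distribution stands in for "heavy-tailed" values, and none of the numbers come from an actual model checkpoint.

```python
import random

def excess_kurtosis(samples):
    """Fourth standardized moment minus 3 (0 for a Gaussian)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / (m2 * m2) - 3.0

rng = random.Random(0)  # fixed seed so the run is repeatable

# Stand-in for weights at a Gaussian initialisation.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]

# Stand-in for heavy-tailed values: the difference of two independent
# exponential draws follows a Laplace distribution (excess kurtosis 3).
laplace = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(20000)]

gaussian_kurtosis = excess_kurtosis(gaussian)
laplace_kurtosis = excess_kurtosis(laplace)
print(gaussian_kurtosis, laplace_kurtosis)
```

Tracking this statistic over saved checkpoints would be one simple way to see when a distribution leaves its Gaussian starting point.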
@ConjectureAI
Conjecture
10 months
Cubed automates the ticket writing and research process for developers, freeing up hours of time that can be spent on coding instead.
0
1
5
@ConjectureAI
Conjecture
2 years
We also investigate the formation of representations in the embedding across training. We observe a clear and rapid emergence of structure in the first 256 dimensions (corresponding to the ASCII characters) of the correlation of the embedding matrix across training.
0
0
3
@ConjectureAI
Conjecture
1 year
Our Head of Strategy and Governance @_andreamiotti has published his Priorities for the UK Foundation Models Taskforce. This institution will be tasked with 'the greatest governance challenge of our times: keeping humanity in control of a future with increasingly powerful AIs.'
@_andreamiotti
Andrea Miotti
1 year
The new UK's Foundation Models Taskforce, led by @soundboy , has a chance to shape bold domestic AI policy & set examples to be emulated abroad. The enormous AI challenge needs ambitious policymaking! Here are a few ideas about how to make it a success:
2
14
53
1
1
4
@ConjectureAI
Conjecture
2 years
We also find phase transitions and regular patterns in weight and residual norms that are highly consistent over an order of magnitude in model scale.
1
0
3
@ConjectureAI
Conjecture
10 months
We’re working on charting a path to useful, controllable AI with Cognitive Emulation: a research agenda that makes use of task-specific models that do not require thousands of H100s to train.
0
0
3
@ConjectureAI
Conjecture
2 years
Given this, ignoring LayerNorm in interpretability can cause us to miss important parts of how a model solves a problem.
1
0
3
@ConjectureAI
Conjecture
1 year
Read more here:
0
2
2
@ConjectureAI
Conjecture
2 years
In some toy examples, we're able to understand what LayerNorm is doing via geometric visuals, and this could be promising for interpreting non-linearities in general.
0
0
2
@ConjectureAI
Conjecture
10 months
But understanding what the problem is is only the first step. Now we must correctly remedy the problem. While the Summit provided a good starting point in terms of international coordination, it provided few concrete policy steps to put humanity on a safer long-term trajectory.
1
0
1