Raza Habib

@RazRazcle

5,141
Followers
1,297
Following
150
Media
2,261
Statuses

CEO @humanloop (YC S20) | Unbelievably excited about the future of AI. Follow me for updates on LLMs and how to build products with them.

San Francisco
Joined October 2016
@RazRazcle
Raza Habib
2 years
The ReAct paper is next-level prompt engineering. If you understand how it works, you can start building LLM apps that are way more factual than ChatGPT and can use external APIs and tools. Check out the example at the end. To understand ReAct, let's think step-by-step:
Tweet media one
49
234
2K
@RazRazcle
Raza Habib
2 years
Is LLM finetuning worth it? If you know what you're doing, finetuned models can be 30x smaller❗️without losing performance. This can unlock applications that would otherwise be too expensive or slow. STaR is a great example of doing it right. (DIY instructions at the end)
Tweet media one
10
124
879
@RazRazcle
Raza Habib
2 years
Overheard in SF: "a few years from now the LLMs will be telling humans to think step-by-step"
20
66
542
@RazRazcle
Raza Habib
1 year
GPT-4's coding ability is just insane. I feel like I have super powers.
23
27
511
@RazRazcle
Raza Habib
2 years
The Microsoft/OpenAI and Google/Anthropic investment partnerships are a comedy of sorts. They're basically just handing over loads of cash that they then get back through cloud compute.
30
25
420
@RazRazcle
Raza Habib
1 year
The Toolformer paper is underwhelming. Teaching an LLM to use tools is exciting, but the tools considered are disappointing. So much effort to choose between Wikipedia and a calculator?! There are a few compelling insights though! So, what should you take away?
Tweet media one
10
59
405
@RazRazcle
Raza Habib
2 years
The rate of progress in AI is so fast right now that many companies have a late mover advantage.
10
12
302
@RazRazcle
Raza Habib
1 year
This graph from the GPT-4 tech report is a much bigger deal than most people seem to have realised. I think it allows us to predict with reasonably high confidence that the problem of LLMs making things up will be quite easy to solve. Here's why:
Tweet media one
13
35
315
@RazRazcle
Raza Habib
9 months
@deliprao Meanwhile LLMs:
Tweet media one
17
13
260
@RazRazcle
Raza Habib
1 year
"No GPUs before PMF" should be the mantra of most applied AI start-ups. The cloud LLMs are so good now, most people should start here and optimise later.
7
16
250
@RazRazcle
Raza Habib
1 year
It feels like academia is just a few months behind LLM twitter now... Should I still do a summary of this paper?
Tweet media one
14
18
237
@RazRazcle
Raza Habib
9 months
The main difference between safety folks and accelerationists is that the safety people actually believe AGI is possible soon. E/acc is actually the pessimistic position, mostly held by people who were surprised by recent AI progress or pivoted from crypto.
21
11
203
@RazRazcle
Raza Habib
2 years
Had a bit of a play with GPT-3 and @LangChainAI yesterday. With a bit of prompt magic from @humanloop and access to SerpAPI, it can make a decent stab at writing sales emails. @gojira what do you think? Green is GPT-3, blue is Google search.
13
24
200
@RazRazcle
Raza Habib
2 years
I really like ChatGPT but I want it to have the context of my recent thoughts/writing and access to things like my calendar and email.
23
1
180
@RazRazcle
Raza Habib
11 days
London in the summer is the best place on earth
27
15
184
@RazRazcle
Raza Habib
8 months
I received a great cold email today for a job application. The candidate had 1) tried Humanloop, 2) come up with an idea for an improvement, and 3) created a Loom of a mock solution. It probably only took 1-2 hours but was better than 95% of the applications I receive.
6
7
160
@RazRazcle
Raza Habib
2 years
Surprisingly, the latest OpenAI models, ChatGPT and text-davinci-003, are actually finetuned from the code-generation models, not pure text generation. There's a lot more detail on which base models are used in the model index for researchers:
Tweet media one
4
15
155
@RazRazcle
Raza Habib
1 year
I'm making a list of different LLM apps and companies to feature in a blog post. What are the most interesting applications you've seen?
46
6
138
@RazRazcle
Raza Habib
2 years
An interesting takeaway from the HELM benchmark is that the @CohereAI base models outperform most other base-models (GPT-3 davinci-1 etc). The models that beat Cohere are instruction tuned. Very curious to see the evaluations that also include Cohere instruct models!
Tweet media one
6
19
138
@RazRazcle
Raza Habib
2 years
LLMs and AI mean that you need less capital than ever to build a compelling product and start a company. The next decade is going to see many more WhatsApp-style companies. Tiny teams creating enormous value.
8
9
129
@RazRazcle
Raza Habib
1 year
Contrary to popular opinion, I see a lot of companies finetuning LLMs. They just tend to do it after things are working. Start with prompt engineering, solve a real customer need and then optimise.
8
13
127
@RazRazcle
Raza Habib
1 year
How can you tell if an LLM app is working well? Trad software relies on unit tests and trad machine learning uses held-out datasets. With LLMs, neither of these is enough. There's a fantastic example in the HELM benchmark showing why (and what you might do instead):
Tweet media one
2
15
114
@RazRazcle
Raza Habib
2 years
A lot of people believe that LLMs aren't agents but I think this is a mistake. 1/n
17
11
111
@RazRazcle
Raza Habib
1 year
Is there any evidence that governments might have their own secret LLM efforts at a scale that could rival GPT-3/4? It feels like it would show up in hiring and compute.
29
3
103
@RazRazcle
Raza Habib
2 years
Chain of thought prompting was really necessary for GPT-3 (text-davinci-002) but GPT-3.5 (text-davinci-003) seems to be able to do many of these tasks zero-shot:
Tweet media one
Tweet media two
Tweet media three
5
6
100
@RazRazcle
Raza Habib
1 year
Super interesting insights into building LLM apps in the Copilot Explorer blog. The engineering effort that goes into choosing the right context, fine-tuning and telemetry is immense! Some details:
1
11
91
@RazRazcle
Raza Habib
2 years
The key point here is that GPT-3 doesn't just draft an email. It first does research on google and then uses that info to draft the email.
@RazRazcle
Raza Habib
2 years
Had a bit of a play with GPT-3 and @LangChainAI yesterday. With a bit of prompt magic from @humanloop and access to SerpAPI, it can make a decent stab at writing sales emails. @gojira what do you think? Green is GPT-3, blue is Google search.
13
24
200
3
10
88
@RazRazcle
Raza Habib
2 years
Claude from Anthropic is only marginally better.
Tweet media one
3
6
76
@RazRazcle
Raza Habib
2 years
The OSS replication begins! "We are not going to stop at replicating ChatGPT. We want to build the assistant of the future, able to not only write email and cover letters, but do meaningful work, use APIs, dynamically research information, and much more"
1
8
79
@RazRazcle
Raza Habib
2 years
2 years after doing @ycombinator remotely, I finally made it to the offices! I'm often asked if YC was worth it and I can say unequivocally yes! The network, advice, partners and friends have undoubtedly changed @humanloop 's trajectory. If you're considering it, go for it!
Tweet media one
1
5
76
@RazRazcle
Raza Habib
1 year
We used @Replit as the backend for our discord bot. The combination of GPT-4 + replit made it possible to spin up a working bot insanely fast.
@humanloop
Humanloop
1 year
You can now talk to GPT-4 in the Humanloop discord! The OpenAI live demo inspired us, so we used GPT-4 to create a GPT-4 bot! With the help of GPT-4 it only took @jordnb about 20 minutes to code this from scratch!
Tweet media one
0
4
24
4
9
76
@RazRazcle
Raza Habib
2 years
In early versions of Anthropic's Claude model they provided two outputs for each answer and asked you to choose which you preferred. In just the past few months I've seen it improve phenomenally. Data flywheels are real.
Tweet media one
4
7
74
@RazRazcle
Raza Habib
6 months
Happy to share that I've officially moved to San Francisco! Today's my first day in our new SF office. Alongside @jordnb , I'll be growing @humanloop 's US team. Would love to meet more people in the city so please DM me!
7
1
74
@RazRazcle
Raza Habib
2 years
5/ Here's a concrete example of a ReAct-style prompt I used to build an automatic sales email generator. You need to execute the "search" actions from the LLM and append the results from Google. @humanloop tools make this super easy.
Tweet media one
11
4
73
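The concrete prompt is in the attached screenshot, but as a rough illustration of the format: a ReAct-style prompt interleaves Thought / Action / Observation steps. A minimal sketch of what such a prompt might look like (the wording, the "search" tool name and the {company} placeholder are assumptions for illustration, not the exact prompt from the screenshot):

```python
# Hypothetical reconstruction of a ReAct-style prompt template.
# The exact prompt from the screenshot isn't reproduced here; this only
# illustrates the Thought/Action/Observation structure described above.
REACT_PROMPT = """You write short, factual sales emails and can use a search tool.
Use the following format:

Thought: reason about what information you still need
Action: search: <query to send to Google>
Observation: <the search results will be appended here>
... (Thought/Action/Observation can repeat) ...
Final Answer: the finished sales email

Task: Write a sales email introducing Humanloop to {company}.
Thought:"""
```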
@RazRazcle
Raza Habib
2 years
. @Aleph__Alpha is one of the most underrated generative AI start-ups I've come across. They offer one of the only public APIs to a multimodal model that can understand both images and language.
Tweet media one
5
9
71
@RazRazcle
Raza Habib
2 years
Very neat project! Using GPT-3 to build a chat QA interface for LangChain's docs. I think this could be a pretty cool generic product/feature in one of the doc platforms.
3
8
71
@RazRazcle
Raza Habib
10 months
@shaunmmaguire Do not frame this as a conflict between all Muslims and all Jews. You are exacerbating the problem.
3
0
68
@RazRazcle
Raza Habib
1 year
We'll solve AGI before we solve package management for Python ...
7
2
65
@RazRazcle
Raza Habib
1 year
These three steps are an emerging pattern for LLM self-improvement used in all these recent papers: ‣ Toolformer ‣ STaR — LLMs that are better at reasoning ‣ Constitutional AI — harmless AI from AI feedback. And the whole process can be repeated multiple times!
Tweet media one
3
10
62
@RazRazcle
Raza Habib
2 years
At our hackathon last week we used @Humanloop and GPT-3 to build an Obsidian ( @obsdmd ) plugin called ThoughtPartner.
Tweet media one
3
3
60
@RazRazcle
Raza Habib
2 years
I've heard people say that @OpenAI are great at research but not product. I disagree. Both ChatGPT and the Playground were fantastic UX innovations that helped many more people realise what's possible. They're innovating in both fundamental research and UX.
4
2
58
@RazRazcle
Raza Habib
7 months
@garrytan @outerbase We've had other yc co's do this to us
5
2
57
@RazRazcle
Raza Habib
1 year
YC batch 1 had the founders of Reddit, OpenAI and Twitch! The early vintage is always extraordinary because it's not a credential yet.
3
2
55
@RazRazcle
Raza Habib
2 years
One of the big deficiencies of LLM chatbots is that they don't ask questions. They're not really conversing with you, they're just predicting one output at a time. I think it might be better if they were fine-tuned on longer dialogues.
13
7
55
@RazRazcle
Raza Habib
2 years
4/ The simple but powerful idea in ReAct is to combine action prompts with chain-of-thought. Taken together, this helps the model reason about what actions to take. It's much more powerful:
Tweet media one
1
7
57
@RazRazcle
Raza Habib
2 years
On Tuesday I successfully defended my PhD thesis at the UCL AI Centre ( @ai_ucl ) and became Dr. Raza Habib! 🥳
Tweet media one
9
1
55
@RazRazcle
Raza Habib
2 years
Can't believe it's been two months already! It's been amazing to see what people are building with LLMs. If you're building an LLM product come join the fun!
@humanloop
Humanloop
2 years
Today we're excited to announce public access to Humanloop for Large Language Models! We're making it easier than ever to build incredible products with GPT-3 Sign-up at
Tweet media one
7
24
263
0
2
50
@RazRazcle
Raza Habib
2 years
LLMs don't need much or any annotated data but to be truly useful, they do need access to non-web data. LLM products are supercharged when they know your personal context: a ChatGPT that's read your emails, calendar and docs. Doing this whilst preserving privacy is critical.
4
2
52
@RazRazcle
Raza Habib
2 years
Great slide from @chrmanning showing the crazy rate of NLP progress. This is just the beginning.
Tweet media one
0
12
49
@RazRazcle
Raza Habib
8 months
In office >> Remote. Setting up the new @humanloop office! We started as fully remote because of the pandemic. Remote definitely has benefits, but moving back to the office this year has felt amazing.
Tweet media one
Tweet media two
4
3
49
@RazRazcle
Raza Habib
1 year
I wish journalists would stop using Gary Marcus as a knee-jerk way to balance articles on AI. He seems impervious to evidence and there are many more interesting critics if you really need one.
6
0
49
@RazRazcle
Raza Habib
4 months
@rivkahbrown I support the protests but surely you can't think this is acceptable. A man peacefully standing wearing a yarmulke should feel safe even if he's a counter-protester you disagree with.
7
1
48
@RazRazcle
Raza Habib
1 year
Burning the midnight oil with the @Humanloop team. Gearing up for an exciting release this week!
Tweet media one
2
0
48
@RazRazcle
Raza Habib
1 year
Just ask the model for its confidence and ignore low-confidence answers.
5
4
46
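As a rough sketch of that idea, assuming a generic `call_llm` helper (a stand-in for whatever model client you use, not a specific API) and an arbitrary 0.8 threshold:

```python
def call_llm(prompt: str) -> str:
    """Stand-in for your LLM client; replace with a real completion call."""
    raise NotImplementedError

def answer_with_confidence(question: str, threshold: float = 0.8):
    # Ask the model to report a confidence score alongside its answer.
    prompt = (
        "Answer the question and rate your confidence from 0 to 1.\n"
        f"Question: {question}\n"
        "Reply as: ANSWER: <answer> | CONFIDENCE: <number>"
    )
    reply = call_llm(prompt)
    answer_part, conf_part = reply.rsplit("| CONFIDENCE:", 1)
    confidence = float(conf_part.strip())
    answer = answer_part.replace("ANSWER:", "").strip()
    # Ignore low-confidence answers rather than risk a made-up one.
    return answer if confidence >= threshold else None
```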
@RazRazcle
Raza Habib
2 years
You can expect the cost of serving LLMs to drop dramatically as we figure out better ways to quantise and prune the models. A cool recent example: SparseGPT. They're able to prune 50% of the weights in OPT-175B with minimal performance loss, on one GPU in 4 hours!
2
2
46
@RazRazcle
Raza Habib
5 months
@packyM I guess because the moment it turns 12 noon (the second after) it's now post-meridiem and the second after 12 midnight it's the morning or ante-meridiem.
3
1
45
@RazRazcle
Raza Habib
1 year
A perfectly calibrated model will get 10% of the answers correct when it has 10% confidence, 20% at 20% confidence etc. So on the graph above, perfect calibration corresponds to a straight line. GPT-4 is very well-calibrated! It knows what it doesn't know!
1
4
45
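One way to see what that means in code: bucket answers by stated confidence and compare each bucket's average confidence to its observed accuracy. A toy sketch (the binning scheme is one common choice, not the exact method from the report):

```python
from collections import defaultdict

def calibration_table(confidences, correct, n_bins=10):
    """Group (confidence, is_correct) pairs into bins and compare mean
    confidence with observed accuracy per bin; for a perfectly calibrated
    model the two numbers match in every bin."""
    bins = defaultdict(list)
    for conf, ok in zip(confidences, correct):
        bins[min(int(conf * n_bins), n_bins - 1)].append((conf, ok))
    rows = []
    for b in sorted(bins):
        pairs = bins[b]
        mean_conf = sum(c for c, _ in pairs) / len(pairs)
        accuracy = sum(ok for _, ok in pairs) / len(pairs)
        rows.append((mean_conf, accuracy))
    return rows
```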
@RazRazcle
Raza Habib
2 years
In SF for the next week or so. If you're building with LLMs and would like to meet, dm me :)
4
0
44
@RazRazcle
Raza Habib
4 years
I used to feel frustrated by what I saw as the @DeepMind hype machine but reading @balajis 's piece on the purpose of technology () has made me realise that evangelising technological progress and building strong narratives is itself a valuable pursuit.
@demishassabis
Demis Hassabis
4 years
This year has been an incredible one for science. So it's a real honour for #AlphaFold to be included in @ScienceMagazine ’s top 10 breakthroughs of the year, among so many other significant discoveries.
16
88
585
2
5
43
@RazRazcle
Raza Habib
2 years
@paulg @stem_feed It's just condensed notation. The 'm' here isn't actually the rest mass but the moving (relativistic) mass. The rest mass is written with a subscript 0: E = mc^2 = \gamma * m_0 * c^2, where \gamma = 1/sqrt(1 - v^2/c^2).
1
0
42
@RazRazcle
Raza Habib
1 year
One of the things that most excites me about LLMs is that you no longer need to be an ML expert to build really delightful AI-first products. Promptable focusses on JS devs and so will help open up access to many more developers. Excited to see the ecosystem of tools growing!
@PromptableAI
Promptable.ai
1 year
It's here. The world's first library for building AI apps in Typescript. 🔥Promptable.js 🔥 Use the full power of LLMs and Embeddings in your apps: Prompt🪄 Search 🔍 Chain ⛓️ Trace ➿ Get started -> npm i promptable Repo Docs
Tweet media one
23
74
475
1
5
42
@RazRazcle
Raza Habib
1 year
Some questionable Claude performance here 😂
Tweet media one
9
3
41
@RazRazcle
Raza Habib
9 months
Sam's taking the whole "next Steve Jobs" thing a bit literally.
3
1
41
@RazRazcle
Raza Habib
2 years
ChatGPT isn't a product and isn't a challenge to the existing GPT-3 companies. It's a demo of what's possible but it won't be useful for 90% of people. The real value comes from embedding AI into applications and workflows.
@frantzfries
Chris Frantz
2 years
So uh how are all the GPT3 companies doing now that you can do everything for free, faster and at a higher quality level with ChatGPT?
53
29
758
5
2
42
@RazRazcle
Raza Habib
1 year
Believing AGI is possible and chasing it doesn't mean you belong to a cult. Cynics get to sound smart and optimists actually build things. It wasn't that long ago that people were being made fun of for working on neural networks.
@sarahookr
Sara Hooker
1 year
How often do you use the term AGI a week? And if you use it more than 5x, have you ever pondered whether you are inadvertently part of a cult rather than a scientific community?
50
7
97
2
3
41
@RazRazcle
Raza Habib
1 year
@deliprao Rather than sneering, can you provide your reasons for why you're so sure they're wrong? Let's learn together.
6
0
41
@RazRazcle
Raza Habib
2 years
2/ You can improve factual accuracy by including external knowledge or APIs in the prompt. You can even let the model specify what extra info it needs through a very simple "domain specific language". If the model outputs "google: a query", you append the results of that query.
Tweet media one
2
2
40
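Concretely, the orchestration loop for that mini-DSL might look roughly like this (`call_llm` and `google_search` are placeholders for your model client and search API, not specific libraries; the step limit is arbitrary):

```python
def call_llm(prompt: str) -> str:
    """Placeholder for your LLM completion call."""
    raise NotImplementedError

def google_search(query: str) -> str:
    """Placeholder for a search API (e.g. SerpAPI) returning a text snippet."""
    raise NotImplementedError

def run_with_search(prompt: str, max_steps: int = 5) -> str:
    completion = ""
    for _ in range(max_steps):
        completion = call_llm(prompt)
        prompt += completion
        if "google:" in completion:
            # The model asked for a search: run it and append the results
            # so the next completion can condition on them.
            query = completion.split("google:", 1)[1].splitlines()[0].strip()
            prompt += f"\nresults: {google_search(query)}\n"
        else:
            return completion  # no tool call, so this is the final answer
    return completion
```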
@RazRazcle
Raza Habib
2 years
Grammarly is the OG generative AI company.
Tweet media one
3
0
39
@RazRazcle
Raza Habib
2 years
3/ Action-prompting is ok for simple questions but if the question requires reasoning then it tends to fail. "Chain of Thought" prompting lets you overcome this. In your prompt you include explicit reasoning and this gets the model to do the same. This increases accuracy:
Tweet media one
2
1
39
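A minimal illustration of the difference (my own wording, using the tennis-ball example popularised by the chain-of-thought paper): the only change is that the few-shot examples spell out the reasoning before the answer, so the model imitates it.

```python
# Standard few-shot prompt: the example gives only the final answer.
STANDARD_PROMPT = """Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. How many balls does he have now?
A: 11
Q: {question}
A:"""

# Chain-of-thought prompt: the same example, but with explicit reasoning.
COT_PROMPT = """Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. How many balls does he have now?
A: Roger starts with 5 balls. 2 cans of 3 balls is 6 balls. 5 + 6 = 11. The answer is 11.
Q: {question}
A:"""
```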
@RazRazcle
Raza Habib
1 year
I think this is exactly the wrong take. The longer you've been working in AI, the more impressive the generality of these models seems. If you still think they're spicy auto-complete, you're not paying attention.
@Sugarsteroni
Robin Allenson
1 year
The longer you've been working in AI the further along you are.
1
0
8
2
0
39
@RazRazcle
Raza Habib
2 years
If you're starting to work on an AI-first product, make sure you take the rate of progress seriously. Build your product assuming that the AI models will be vastly more capable in the near future than they are now. As Sam says, trust the exponential.
@sama
Sam Altman
2 years
interesting to me how many of the ChatGPT takes are either "this is AGI" (obviously not close, lol) or "this approach can't really go that much further". trust the exponential. flat looking backwards, vertical looking forwards.
232
812
9K
1
2
39
@RazRazcle
Raza Habib
1 year
@Meaningness Started reading this. It's appallingly out of date. These criticisms might have seemed valid a decade ago, but many of the claims are just laughably wrong now.
2
0
35
@RazRazcle
Raza Habib
1 year
5/ The key insight in the paper is to use the LLM to generate its own training data! It's a three-step process: 1. Generate — use GPT-J to annotate questions with possible API calls 2. Filter — keep only the examples that improve prediction 3. Finetune — retrain the model
Tweet media one
1
1
36
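In pseudocode, that generate → filter → finetune loop looks roughly like this (a sketch only; the three helper functions are stand-ins for the paper's actual components, not real library calls):

```python
def annotate_with_api_calls(model, example):
    """Stand-in: sample candidate API-call annotations for one example."""
    raise NotImplementedError

def improves_prediction(model, annotated_example) -> bool:
    """Stand-in: does including the API call make the continuation easier to predict?"""
    raise NotImplementedError

def finetune(model, examples):
    """Stand-in: finetune the model on its own filtered annotations."""
    raise NotImplementedError

def self_train(model, dataset, rounds: int = 1):
    for _ in range(rounds):
        candidates = [annotate_with_api_calls(model, ex) for ex in dataset]  # 1. Generate
        kept = [c for c in candidates if improves_prediction(model, c)]      # 2. Filter
        model = finetune(model, kept)                                        # 3. Finetune
    return model
```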
@RazRazcle
Raza Habib
2 years
The secret sauce behind ChatGPT is RLHF and fine-tuning. If you want to go beyond cool demos and build differentiated products @humanloop can help you do this for your own applications.
@humanloop
Humanloop
2 years
RLHF – Reinforcement Learning from Human Preferences. Models are fine tuned using RL from human feedback. They become more helpful, less harmful and they show a huge leap in performance. An RLHF model was preferred over a 100x larger base GPT-3 model.
Tweet media one
3
15
82
0
7
38
@RazRazcle
Raza Habib
1 year
I think the wider pattern of models generating their own training data is the most interesting aspect of the Toolformer paper and will likely become a common framework for continuously improving LLMs.
5
2
36
@RazRazcle
Raza Habib
2 months
I'm launching a new podcast! The first episode is out tomorrow and I wanted to share a sneak peek. Why do we need another podcast for AI? Over the last year at Humanloop, I've worked with a lot of different engineering leaders, CTOs and founders who are building AI products
4
7
38
@RazRazcle
Raza Habib
6 months
@paulg checks out
Tweet media one
1
0
37
@RazRazcle
Raza Habib
3 years
I continue to be amazed by how little of academic ML research looks at how we collect and label data, given that for almost any real application this is the biggest factor in performance.
2
5
37
@RazRazcle
Raza Habib
2 years
Claude from a few months ago was willing to chat in Urdu. Claude today refuses the same query. @AnthropicAI is this a result of RLHF?
Tweet media one
Tweet media two
3
2
36
@RazRazcle
Raza Habib
8 months
@RichardHanania "we lost our first amendment freedoms with respect to islam" he says whilst criticising islam 🤔
1
1
36
@RazRazcle
Raza Habib
1 year
I think there's at least a 50% chance of AGI that can do anything a human does at a computer better than the average (trained) human by 2030. If you disagree, I'd be curious to know the simplest task you think AI won't be able to achieve by that date.
13
2
35
@RazRazcle
Raza Habib
2 years
"Constitutional AI" is a new research paper from Anthropic AI and is a step towards building AI systems that have more transparent and controllable values. 1/
Tweet media one
2
5
34
@RazRazcle
Raza Habib
2 years
If anyone at AWS is listening, @humanloop stands ready to spend hundreds of millions of your $ too
@RazRazcle
Raza Habib
2 years
The Microsoft/OpenAI and Google/Anthropic investment partnerships are a comedy of sorts. They're basically just handing over loads of cash that they then get back through cloud compute.
30
25
420
4
0
35
@RazRazcle
Raza Habib
1 year
Since GPT-4 has well-calibrated confidence we can use its confidence estimates to decide when to trust the model. If the calibration is good, then we don't need to worry about models making things up.
1
1
34
@RazRazcle
Raza Habib
2 months
. @sourcegraph has built the most popular open-source AI coding tool in the Fortune 500! A few weeks ago I sat down with @beyang Liu, their CTO and co-founder, to find out how they did it. We dive into:
3
8
31
@RazRazcle
Raza Habib
2 years
More competition at the model/API layer can only be good for builders. This + OSS effort should bring the cost of the raw models way down over time.
@sharpshoot
Sumon Sadhu 🌏 🐯
2 years
OpenAI’s API has some competition.
Tweet media one
0
3
8
1
3
32
@RazRazcle
Raza Habib
1 year
After the model is RLHF fine-tuned (made "safe"), the calibration is worse and the model is generally under-confident. Why is this a big deal?
4
0
34
@RazRazcle
Raza Habib
9 months
@Peter_0_0_g Because they've been in the field for more than 5 minutes and are aware of the rate of progress
2
1
33
@RazRazcle
Raza Habib
1 year
The graph shows GPT-4's "calibration" before and after RLHF. A well-calibrated model can accurately say how confident it is. The researchers compare GPT-4's probability for the answer to the fraction of the time it's correct.
1
0
32
@RazRazcle
Raza Habib
2 years
Something else I've found anecdotally is that Claude is much better at Urdu than ChatGPT. I'm pretty amazed by its ability to handle Urdu transliterations.
Tweet media one
6
4
30
@RazRazcle
Raza Habib
2 years
Level unlocked!
Tweet media one
1
0
31
@RazRazcle
Raza Habib
5 months
I heard first-hand testimony from British doctors who recently visited Gaza that aligns with the reporting here. Very sad. We must not forget our shared humanity.
@DalrympleWill
William Dalrymple
5 months
Further evidence of hideous war crimes and torture by the Most Moral Army™
27
565
1K
0
10
29
@RazRazcle
Raza Habib
1 year
GPT-4 is an incredible learning tool! I've never done much front-end but always wanted to find time to build apps end-to-end. I prompted GPT-4 to be a coding teacher and then worked with it to start building a simple GPT-4 chat app. Here's the first React app I've ever built!
3
1
31
@RazRazcle
Raza Habib
2 years
Awesome to meet so many people interested in building with LLMs today!
Tweet media one
2
0
29
@RazRazcle
Raza Habib
2 years
Has anyone done a comparison of retrieval augmented models (like RETRO) to hybrid methods that use embeddings to create a few-shot prompt?
5
0
29
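For reference, the "hybrid" approach meant here is roughly: embed the corpus, retrieve the nearest examples to the query, and paste them into a few-shot prompt. A toy sketch (`embed` and `call_llm` are placeholders, not a specific API):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder for an embedding model call."""
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM completion call."""
    raise NotImplementedError

def retrieve_and_prompt(query: str, corpus: list[str], k: int = 3) -> str:
    # Embed the corpus and the query, take the k nearest documents by cosine
    # similarity, and prepend them to the prompt as context.
    corpus_vecs = np.stack([embed(doc) for doc in corpus])
    q = embed(query)
    sims = corpus_vecs @ q / (np.linalg.norm(corpus_vecs, axis=1) * np.linalg.norm(q))
    top = [corpus[i] for i in np.argsort(-sims)[:k]]
    prompt = "\n\n".join(top) + f"\n\nQuestion: {query}\nAnswer:"
    return call_llm(prompt)
```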
@RazRazcle
Raza Habib
2 years
4/ After a few rounds of self-improvement finetuning they show that a 6Bn parameter LLM can match the 175Bn parameter GPT-3!
Tweet media one
1
2
29
@RazRazcle
Raza Habib
2 years
@rebelemerald @Bossmustangfan @isabelleboemeke I don't think you understand what Nuclear power is. Nuclear is not digging stuff out of the ground and burning it. It's transmuting one material into another and in the process releasing millions of times more energy per gram than could possibly be released by any "burning"
2
0
27
@RazRazcle
Raza Habib
9 months
Never waste a good crisis 😂
@aidangomez
Aidan Gomez
9 months
Cohere is hiring Machine Learning Members of Technical Staff
18
31
373
1
0
29
@RazRazcle
Raza Habib
2 years
What's the best natural language (GPT-3) -> SQL query product?
18
2
30