Rohan

@clusteredbytes

3,149 Followers
261 Following
217 Media
577 Statuses

Professional AI Engineer. Sharing what I'm currently learning, mostly about AI, LLMs, RAG, building AI-powered software, AI automation, etc.

Joined January 2021
@clusteredbytes
Rohan
11 months
Multi-Modal AI is rapidly taking over 🔥🚀 It’s truly amazing how fast @llama_index incorporated a robust pipeline for multi-modal RAG capabilities. Here’s a beginner-friendly guide to get started with multi-modal RAG using LlamaIndex 👇🧵
Tweet media one
12
106
636
@clusteredbytes
Rohan
2 years
Using #ChatGPT to easily create ChatGPT plugins 🔥 #OpenAI #GPT4 #AI #python #fastapi A ChatGPT plugin consists of 3 things: 1. An HTTP server 2. An OpenAPI spec 3. A manifest file Steps to create a plugin from OpenAI's tutorial 👇
23
92
558
@clusteredbytes
Rohan
1 year
Multi Document Agent architecture (v0) in @llama_index , a step beyond naive top-k RAG. It allows answering a broader set of questions over multiple documents, which wasn't possible with basic RAG. Let's break down the agent architecture and see how it works 👇🧵
Tweet media one
17
83
484
@clusteredbytes
Rohan
1 year
Previously we've seen how to improve retrieval by finetuning an embedding model. @llama_index also supports finetuning an adapter on top of existing models, which lets us improve retrieval without updating our existing embeddings. 🚀 Let's see how it works 👇🧵
Tweet media one
11
89
464
@clusteredbytes
Rohan
1 year
We've seen that smaller chunks are good for capturing semantic meaning and larger ones are good for providing better context. @llama_index AutoMergingRetriever takes it one step further by keeping the chunks in a tree structure and dynamically choosing the chunk length. 🧵👇
Tweet media one
8
67
412
@clusteredbytes
Rohan
1 year
While splitting the raw text for Retrieval Augmented Generation (RAG), what should be the ideal length of each chunk? What’s the sweet spot? Strike a balance between small vs large chunks using @LangChainAI ParentDocumentRetriever Let's see how to use it 👇🧵
Tweet media one
18
66
371
@clusteredbytes
Rohan
10 months
Let's talk about FLARE - Forward Looking Active RAG and how to implement it using @llama_index FLAREInstructQueryEngine. Instead of doing retrieval once at the beginning, FLARE retrieves information dynamically multiple times during token generation 🚀 Details below 🧵👇
Tweet media one
5
59
329
@clusteredbytes
Rohan
10 months
Ingestion Pipeline is a new and improved way to ingest and manage documents in @llama_index It supports: - applying a series of transformations on documents - caching those transformations - managing ever-changing documents etc. Let's see how to use it 👇🧵
Tweet media one
4
55
313
@clusteredbytes
Rohan
1 year
Finetuning the embedding model can allow for more meaningful embedding representations, leading to better retrieval performance. @llama_index has an abstraction for finetuning sentence-transformers embedding models that makes this process quite seamless. Let's see how it works 👇
Tweet media one
8
34
303
@clusteredbytes
Rohan
6 months
Open source AI Diagram Generator 🔥 Uses @llama_index Pydantic program with partial JSON parsing and @vercel AI SDK to send intermediate diagrams during generation for improved UX 🚀 Repo: Full tutorial under 2.5 minutes 👇
3
41
313
@clusteredbytes
Rohan
1 year
Extract tables from documents using @llama_index UnstructuredElementParser and then use RecursiveRetriever to enable hybrid tabular/semantic queries and also comparisons over multiple docs. Let's see how to use this advanced RAG technique 🧵👇
Tweet media one
11
57
293
@clusteredbytes
Rohan
8 months
New Open Source, Full Stack RAG project 🔥🚀 Bootstrapped with @llama_index create-llama 🔥 It uses loads of amazing LlamaIndex goodies e.g. Ingestion Pipeline, multi-documents agents, custom callback handler, transformations and more. Repo: Demo 👇
12
39
268
@clusteredbytes
Rohan
1 year
Lost in the middle problem in RAG and how @LangChainAI LongContextReorder addresses it. In RAG, for really long contexts (10+ retrieved docs), it turns out it's not best to just plug in the docs in descending order of vector similarity score.
Tweet media one
8
35
247
@clusteredbytes
Rohan
2 years
Created FootballGPT using GPT-4 @LangChainAI and @ApiFootball . Taught GPT-4 how to use a complex API with many endpoints and numerous parameters. This is where GPT-4's advanced reasoning capability came in handy. Here's a demo 👇
21
30
224
@clusteredbytes
Rohan
1 year
One issue of using embeddings to retrieve relevant documents is that the results might vary with the slightest change in the wording of the query. @LangChainAI MultiQueryRetriever tries to address this issue with the help of LLMs. Let's see how to use it 👇🧵
Tweet media one
6
46
226
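The idea behind MultiQueryRetriever can be sketched in a few lines of plain Python. This is a toy illustration of the concept, not LangChain's actual API; `rephrase` and `retrieve` are hypothetical stand-ins for the LLM paraphraser and the base retriever:

```python
def multi_query_retrieve(query, rephrase, retrieve):
    """Ask for paraphrases of the query, retrieve for each variant,
    and take the union of unique results (original query included)."""
    variants = [query] + rephrase(query)
    seen, results = set(), []
    for variant in variants:
        for doc in retrieve(variant):
            if doc not in seen:      # de-duplicate across variants
                seen.add(doc)
                results.append(doc)
    return results
```

Because the union is taken over several wordings, a document missed by one phrasing can still be recovered by another.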
@clusteredbytes
Rohan
1 year
Previously we've seen @LangChainAI ParentDocumentRetriever that creates smaller chunks from a document and links them back to the initial documents during retrieval. MultiVectorRetriever is a more customizable version of that. Let's see how to use it 🧵👇
Tweet media one
8
35
217
@clusteredbytes
Rohan
7 months
Fully local, open source chat-with-pdf app tutorial under 2.5 minutes 🔥🚀 Stack used: @llama_index Typescript for RAG @ollama @nextjs with server actions Phi2 and @nomic_ai models using Ollama Detailed tutorial: GitHub repo:
8
41
214
@clusteredbytes
Rohan
9 months
The "Dense X Retriever" paper shows that it significantly outperforms the traditional chunk-based retriever. @LoganMarkewich created an awesome LlamaPack that lets you get started with this proposition-based retriever in no time using @llama_index 🔥 Let's see how it works 👇🧵
Tweet media one
5
46
204
@clusteredbytes
Rohan
9 months
Introducing LlamaBot 🔥🚀 An open-source Discord bot that listens to your conversations, remembers them and answers your questions across a discord server, created using @llama_index (inspired by @seldo 's LlamaBot for Slack) Stack used: LlamaIndex, Gemini Pro, @qdrant_engine
8
33
192
@clusteredbytes
Rohan
11 months
LlamaPacks by @llama_index are out 🚀🔥 In this speedrun 🏃‍♂️, I wanted to demonstrate how fast and easy it is to create a gmail agent for your inbox using LlamaPacks. Spoiler Alert⚠️ It took only 54.86 seconds 🚀 to get to the chat interface, with only 4-5 lines of code 🔥🤯
4
29
187
@clusteredbytes
Rohan
7 months
Streaming intermediate events in RAG is crucial for best user experience 🚀 Let's see how to use @llama_index and @vercel AI SDK to properly stream intermediate events to the frontend. Full tutorial under 3 minutes 🔥
5
39
184
@clusteredbytes
Rohan
10 months
Previously I've talked about the amazing Ingestion Pipeline from @llama_index . Here's how to use Redis ( @Redisinc ) as the docstore, vectorstore and cache for the pipeline. LlamaIndex abstractions make it really easy to just use Redis for the entire pipeline 🔥👇
Tweet media one
3
24
136
@clusteredbytes
Rohan
2 years
Created NewsBuddy 📰 using @LangChainAI and the ChatGPT API. NewsBuddy is your personal news assistant. Had a lot of fun building this little project overnight. Learnt loads of new stuff about LangChain and Prompt Engineering. Here's a demo of NewsBuddy 👇
10
13
103
@clusteredbytes
Rohan
9 months
Check out this new OSS repo by @seldo that contains detailed, step-by-step instructions on how to build a Slack bot completely from scratch using @llama_index The bot listens to conversations and answers questions about them 🔥 Here's the high-level architecture of the bot 👇
Tweet media one
1
17
73
@clusteredbytes
Rohan
2 years
Using #ChatGPT to easily create Chrome extensions from scratch in 15 minutes 🔥 Full Step-by-Step Tutorial with prompts. #OpenAI #AI We'll use ChatGPT to create a simple extension, QuikNote, that takes quick daily notes right from the browser. Here are the steps required 👇
Tweet media one
6
5
32
@clusteredbytes
Rohan
1 year
The issue: - smaller chunks reflect more accurate semantic meaning after creating embeddings - but they sometimes lose the bigger picture and might sound out of context, making it difficult for the LLM to properly answer the user's query with limited context per chunk.
1
0
24
@clusteredbytes
Rohan
1 year
Here's a short demo of how this Multi Document Agent architecture would work:
1
2
20
@clusteredbytes
Rohan
10 months
@llama_index Here's a nice animation by the authors demonstrating how FLARE blends generation and retrieval by dynamically incorporating relevant and up-to-date information. Source: LlamaIndex FLAREInstructQueryEngine:
1
2
20
@clusteredbytes
Rohan
8 months
Within 24 hours, OpenAI's Sora has dazzled with some stunning videos🌟 Introducing FlixAI, a one-stop hub where I've compiled all the videos by Sora so far, alongside their prompts. It supports semantic search, suggests similar videos, and covers other models like Pika, Runway etc.
4
3
19
@clusteredbytes
Rohan
1 year
Architecture: - For each document, a VectorIndex is created for semantic search, and a SummaryIndex is created for summarization - Then we create QueryEngine for both these Indices - Next the QueryEngines are converted to QueryTools
Tweet media one
2
0
14
@clusteredbytes
Rohan
1 year
@LangChainAI LongContextReorder addresses this issue by re-ordering the documents after retrieval. It puts the most similar ones at the top, and then the next few ones at the end, and the least similar ones in the middle.
Tweet media one
3
3
14
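The reordering can be sketched in plain Python. This is a toy version of the idea (most similar at the two ends, least similar in the middle); LangChain's internals may differ:

```python
def lost_in_the_middle_reorder(docs):
    """Given docs sorted most-relevant-first, alternate placing them at
    the front and back so the least relevant land in the middle."""
    reordered = []
    for i, doc in enumerate(reversed(docs)):  # walk least→most relevant
        if i % 2 == 1:
            reordered.append(doc)       # odd steps go to the back
        else:
            reordered.insert(0, doc)    # even steps go to the front
    return reordered
```

For five docs ranked d1 (best) to d5 (worst), the two best end up at the edges of the context and the worst sits in the middle, exactly where the LLM pays the least attention.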
@clusteredbytes
Rohan
1 year
The Issue: In the context window of the LLM prompt, we put the most similar documents at the top, and the least similar ones at the bottom. But LLMs tend to ignore documents in the middle of their context. Hence, that middle is where we should put the least similar ones, not at the bottom.
4
1
13
@clusteredbytes
Rohan
1 year
Recent research shows that: - Performance is often highest when document containing answer to user's question occurs at the beginning or at the end of the context
Tweet media one
1
2
13
@clusteredbytes
Rohan
1 year
@LangChainAI ParentDocumentRetriever addresses this issue by creating embedding from the smaller chunks only as they capture better semantic meaning. But while plugging into the LLM input, it uses the larger chunks with better context.
1
0
12
@clusteredbytes
Rohan
1 year
Thanks for reading. I write about AI, LLMs, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #AI #RAG
3
0
10
@clusteredbytes
Rohan
2 years
1. First create the API server using ChatGPT - tell ChatGPT your requirements - and also which library to use for the server (FastAPI in this case)
1
0
9
@clusteredbytes
Rohan
2 years
Unreal Engine is changing the photorealistic animation game with their upcoming "MetaHuman Animator". You can use your iPhone to shoot and then reproduce facial-expression animation with insane detail and fidelity, all within minutes. #AI #UnrealEngine #MetaHuman
2
2
10
@clusteredbytes
Rohan
1 year
These Tools are passed to OpenAIAgent. This is the document agent. Each document has an agent like this that chooses to perform summarization or semantic search within each document.
Tweet media one
1
0
10
@clusteredbytes
Rohan
1 year
Next we have a top-level Retriever-Enabled Agent. This boss agent orchestrates across different document agents. First it retrieves the document agents relevant to the question, then passes the input to those agents only and crafts the response from those agent outputs.
Tweet media one
1
0
9
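The orchestration above can be sketched as a toy routing function. The callables are hypothetical stand-ins for the real document agents and retriever; this is not the actual LlamaIndex OpenAIAgent API:

```python
def boss_agent(question, doc_agents, pick_relevant, combine):
    """Top-level agent: pick the document agents relevant to the
    question, query only those, then craft one response from
    their outputs."""
    relevant = pick_relevant(question, list(doc_agents))
    outputs = [doc_agents[name](question) for name in relevant]
    return combine(outputs)
```

The key property is that irrelevant documents never get queried at all, which is what lets this scale past naive top-k RAG.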
@clusteredbytes
Rohan
1 year
@llama_index Full guide with benchmarks in the official documentation:
1
2
8
@clusteredbytes
Rohan
11 months
@llama_index First let’s start with some simple stuff. We just want to ask questions about our images. OpenAIMultiModal is a wrapper around OpenAI’s latest vision model that lets us do exactly that.
Tweet media one
2
0
9
@clusteredbytes
Rohan
1 year
Meet GlowGPT. Upload a photo and get instant feedback and suggestions from AI. Created using the @OpenAI ChatGPT API, @LangChainAI , @Gradio , @huggingface transformers and BLIP models. Here's a demo 👇
2
1
8
@clusteredbytes
Rohan
1 year
We're gonna need two splitters instead of one. - One for creating the larger chunks - Another one for creating the smaller chunks
Tweet media one
1
1
9
@clusteredbytes
Rohan
9 months
@LoganMarkewich @llama_index Thanks to @LoganMarkewich , there's already a LlamaPack for "Dense X Retriever" that handles: - generating the propositions - creating the vector index - creating the retriever (RecursiveRetriever in this case) and the query engine Here's how to use the pack 👇
Tweet media one
1
0
9
@clusteredbytes
Rohan
9 months
@llama_index @seldo @qdrant_engine Features: - We can ask LlamaBot questions about what's going on across the server - We can tell LlamaBot to start/stop listening to conversations. - We can check current listening status, or ask the bot to forget everything from the server.
Tweet media one
1
1
8
@clusteredbytes
Rohan
9 months
@LoganMarkewich @llama_index The paper also shows how to create these propositions 👇 First GPT4 is prompted properly to generate some propositions. Then Flan-T5-Large model is finetuned with the generated propositions. The finetuned model is called "The Proposition-izer"
Tweet media one
1
0
8
@clusteredbytes
Rohan
11 months
@llama_index Thanks for reading. I write about AI, LLMs, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #AI #RAG
0
1
8
@clusteredbytes
Rohan
1 year
Let’s walk through the example code from LangChain’s website on ParentDocumentRetriever 🧑‍💻 👇
1
0
8
@clusteredbytes
Rohan
1 year
@llama_index Details about it on the official documentation:
1
1
8
@clusteredbytes
Rohan
9 months
@LoganMarkewich @llama_index A proposition is an atomic, self-contained text encapsulating a distinct factoid, written in a simple natural-language format. A single proposition encapsulates only one contextualized atomic fact. It cannot be further split into separate propositions.
Tweet media one
1
1
7
@clusteredbytes
Rohan
2 years
2. Then create the plugin manifest. A plugin manifest is a JSON file with: - simple metadata about the plugin - how to show the plugin to a human - how to describe it to the language model
Tweet media one
1
0
7
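A minimal sketch of such a manifest, built in Python. Field names follow OpenAI's published ai-plugin.json example from the plugin docs; every value here is a placeholder for illustration:

```python
import json

# Sketch of an ai-plugin.json manifest; all values are placeholders.
manifest = {
    "schema_version": "v1",
    "name_for_human": "TODO Plugin",          # shown to the user
    "name_for_model": "todo",                 # shown to the model
    "description_for_human": "Manage your TODO list.",
    "description_for_model": "Plugin for managing a user's TODO list.",
    "auth": {"type": "none"},
    "api": {"type": "openapi", "url": "https://example.com/openapi.yaml"},
    "logo_url": "https://example.com/logo.png",
    "contact_email": "support@example.com",
    "legal_info_url": "https://example.com/legal",
}

manifest_json = json.dumps(manifest, indent=2)
```

The `description_for_model` field is effectively a prompt: it is what the language model reads to decide when and how to call the plugin.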
@clusteredbytes
Rohan
2 years
🚀 Github Copilot JUST got way better, with the help of GPT-4. 🔥 GitHub just announced Copilot X with stunning new features like: - Chat and voice support - Copilot for terminal - Answering questions from docs - Generate Pull requests 1/6 #AI #ChatGPT #GPT4 #Github #Copilot
1
0
7
@clusteredbytes
Rohan
10 months
@llama_index FLARE addresses this issue by dynamically adapting to the evolving context while it's being generated. During generation, when low confidence tokens are generated (possible hallucination), FLARE actively performs retrieval.
2
0
6
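The retrieve-when-uncertain loop can be sketched as toy logic in Python, with hypothetical callables standing in for the LLM step, the confidence estimate, and the retriever; FLARE's actual implementation is more involved:

```python
def flare_generate(question, generate_step, confidence, retrieve, max_steps=5):
    """Toy FLARE loop: generate sentence by sentence; whenever a step
    comes back with low confidence, retrieve more context and redo it."""
    context, answer = [], []
    for _ in range(max_steps):
        sentence = generate_step(question, context, answer)
        if sentence is None:                    # generation finished
            break
        if confidence(sentence) < 0.5:          # possible hallucination
            context.append(retrieve(sentence))  # active retrieval
            sentence = generate_step(question, context, answer)
        answer.append(sentence)
    return " ".join(answer)
```

The contrast with standard RAG is in the loop: retrieval happens inside generation, triggered by uncertainty, instead of once up front.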
@clusteredbytes
Rohan
1 year
Thus we use small chunks (with better semantic meaning) for vector similarity matching and return their corresponding larger chunks that have the bigger picture and more context.
2
0
7
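The small-chunks-for-matching, large-chunks-for-context idea can be sketched in plain Python. Word overlap stands in for vector similarity here; this is an illustration of the concept, not the LangChain implementation:

```python
def split_words(text, n):
    """Split text into chunks of n words each."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(0, len(words), n)]

def build_stores(documents, parent_words=100, child_words=20):
    """Large parent chunks go in a docstore; each small child chunk
    remembers its parent's id for lookup at retrieval time."""
    docstore, children = {}, []
    for d, doc in enumerate(documents):
        for p, parent in enumerate(split_words(doc, parent_words)):
            docstore[(d, p)] = parent
            children += [((d, p), c) for c in split_words(parent, child_words)]
    return docstore, children

def retrieve(query, docstore, children):
    """Match the query against the small chunks (toy word-overlap score
    stands in for vector similarity), but return the larger parent."""
    q = set(query.lower().split())
    best_parent, _ = max(children, key=lambda c: len(q & set(c[1].lower().split())))
    return docstore[best_parent]
```

Matching happens on the semantically tight child, but the LLM receives the parent with the surrounding context.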
@clusteredbytes
Rohan
6 months
Update: I've added streaming partial objects feature to the built-in @llama_index OpenAIPydanticProgram (Thanks @_nerdai_ for the review) So you can just call the 'stream_partial_objects' method of the built-in class now. The project repo has been updated accordingly as well.
2
2
8
@clusteredbytes
Rohan
1 year
To address this issue, we can just re-order the retrieved documents ourselves so that the least relevant ones are in the middle. Or we can use LongContextReorder from LangChain, which does it automatically.
1
0
7
@clusteredbytes
Rohan
10 months
@llama_index These are the transformations we can use: 1. TextSplitter 2. NodeParser 3. MetadataExtractor 4. Any embedding model We can also create custom transformations. Guide on this is coming soon. Output of one transformation is the input to the next one.
1
0
5
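The chaining behavior, output of one transformation feeding the next, can be sketched like this. The `split` and `annotate` transformations are hypothetical stand-ins for a text splitter and a metadata extractor:

```python
def run_pipeline(nodes, transformations):
    """Each transformation maps a list of nodes to a new list of nodes;
    the output of one is fed as input to the next."""
    for transform in transformations:
        nodes = transform(nodes)
    return nodes

# Hypothetical stand-ins: a sentence splitter and a metadata annotator.
split = lambda nodes: [chunk for n in nodes for chunk in n.split(". ") if chunk]
annotate = lambda nodes: [{"text": n, "length": len(n)} for n in nodes]
```

Because every stage has the same list-in/list-out shape, stages can be reordered, swapped, or cached independently.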
@clusteredbytes
Rohan
1 year
Thanks for reading. I write about AI, ChatGPT, LangChain, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #ChatGPT #LangChain
2
0
7
@clusteredbytes
Rohan
2 years
Don't miss out on these amazing new ChatGPT powered chrome extensions 🚀🔥 1. ParagraphAI - Perfectly curated writing 2. Glasp - YouTube summary 3. Merlin - ChatGPT Plus, on all sites 4. Glarity - summarize Google/Bing results Make the most out of these AI tools.
1
1
6
@clusteredbytes
Rohan
1 year
More on this retriever and the details for evaluation results on LlamaIndex documentation:
1
0
7
@clusteredbytes
Rohan
1 year
Storing the chunks - As we're creating embeddings for the small chunks only, we'll use a vectorstore to store those. - The larger chunks are stored in an InMemoryStore, an in-memory key-value store that lives only while the program is running.
Tweet media one
1
1
7
@clusteredbytes
Rohan
10 months
@llama_index FLARE Instruct: This mode prompts the LLM to identify and insert search queries during generation through few-shot prompting. e.g. Donald Trump attended [Search(which college did Donald Trump attend?)]
Tweet media one
1
0
5
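Extracting those inline search instructions can be sketched with a small regex. This illustrates the marker format from the example above; it is not LlamaIndex code:

```python
import re

def extract_search_queries(text):
    """Pull out the [Search(...)] markers that FLARE Instruct asks the
    LLM to emit mid-generation."""
    return re.findall(r"\[Search\((.*?)\)\]", text)
```

Each extracted query is then answered by the retriever, and the marker is replaced with the retrieved answer before generation continues.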
@clusteredbytes
Rohan
1 year
This is version 0, and there's still room for improvement. Next steps are parallel query planning, reducing latency and more 🚀 Full guide here:
1
1
6
@clusteredbytes
Rohan
2 years
Found this amazing GPT-4 powered chrome extension - Taxy AI It automates repetitive browser actions by sending parts of the DOM and the user prompt to GPT-4. Then GPT-4 performs that action for you 🔥 Here's how it performs various repetitive tasks from a one-line user prompt 👇
1
3
6
@clusteredbytes
Rohan
1 year
After filling in, we try merging parent nodes. The hypothesis is that if the ratio of a parent's retrieved children to its total children is above a threshold (which we can adjust), then we might as well return the larger parent for better context.
2
0
7
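That merging rule can be sketched in plain Python. This is a toy version with an adjustable threshold, not the actual AutoMergingRetriever code:

```python
def auto_merge(retrieved_leaves, parent_children, threshold=0.5):
    """If the fraction of a parent's children that were retrieved meets
    the threshold, replace those children with the parent itself."""
    merged, absorbed = [], set()
    for parent, children in parent_children.items():
        hits = [c for c in children if c in retrieved_leaves]
        if hits and len(hits) / len(children) >= threshold:
            merged.append(parent)       # return the bigger chunk instead
            absorbed.update(hits)
    # Leaves whose parent didn't qualify are kept as-is.
    merged += [leaf for leaf in retrieved_leaves if leaf not in absorbed]
    return merged
```

Raising the threshold makes merging rarer, so results stay small and precise; lowering it favors larger, more contextual chunks.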
@clusteredbytes
Rohan
11 months
@llama_index LlamaIndex has MultiModalVectorStoreIndex which creates embedding for both image and text nodes and stores them in vector stores. For image nodes it uses 'clip' and for text nodes it uses 'ada' for getting the embedding (customizable). Let’s create the multi-modal index
Tweet media one
1
0
6
@clusteredbytes
Rohan
1 year
The first step here is parsing via the HierarchicalNodeParser. It stores the node in a tree structure, where deeper nodes are smaller chunks and shallow nodes are larger chunks. We can specify how many layers of nodes we want and the splitter size for each layer.
2
0
6
@clusteredbytes
Rohan
1 year
Next we pass these documents to an instance of LongContextReorder() and get the re-ordered docs where the least relevant ones are in the middle.
Tweet media one
1
0
6
@clusteredbytes
Rohan
1 year
@LangChainAI ParentDocumentRetriever automatically creates the small chunks and links their parent document id. If we want to create some additional vectors for each documents, other than smaller chunks, we can do that and then retrieve those using MultiVectorRetriever.
1
0
6
@clusteredbytes
Rohan
1 year
All nodes are stored in a docstore and only the leaf nodes are stored in a vectorstore. At first, the vectorstore retriever is called to get the initial leaf nodes. From here we try to auto-merge parents to find parent with the correct chunk size.
Tweet media one
Tweet media two
1
0
6
@clusteredbytes
Rohan
9 months
After receiving some feedback from you guys (which I really appreciate), I've made some updates to LlamaBot: - Use GPT4 or Cohere - Remember user mentions - Refine prompt etc. If you encounter any issues while using the bot feel free to let me know or open an issue on GitHub
Tweet media one
0
0
6
@clusteredbytes
Rohan
1 year
@llama_index This parser: - extracts tables from data - converts those tables to DataFrames - for each of those tables, it creates 2 nodes - one Table Node that contains the DataFrame as a string - another IndexNode that stores the summary of that table and a reference to that Table Node
Tweet media one
Tweet media two
2
0
6
@clusteredbytes
Rohan
10 months
@llama_index Thanks for reading. I write about AI, LLMs, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #AI #RAG
0
0
6
@clusteredbytes
Rohan
1 year
@llama_index . @llama_index has guides on how to finetune embeddings in different ways: - finetune the embedding model itself (only sentence transformers) - finetune an adapter over any black-box embedding model (stay tuned for this one 🔥)
1
1
5
@clusteredbytes
Rohan
10 months
@llama_index Thanks for reading. I write about AI, LLMs, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #AI #RAG
2
0
5
@clusteredbytes
Rohan
1 year
@llama_index Next we partition the nodes using this built-in function of the Unstructured parser. Here BaseNodes contain the regular nodes and the IndexNodes (not the Table Nodes) NodeMapping contains {id->Node} mapping for those remaining Table Nodes.
Tweet media one
2
0
5
@clusteredbytes
Rohan
2 years
Thanks for reading. I write about AI, CloudNative, Kubernetes, System Design etc. and try to make complex topics as easy as possible. Stay tuned for more.
3
1
5
@clusteredbytes
Rohan
1 year
AI won't steal your girl, but someone using FlirtGPT definitely will 😎 Don't use cheesy pick-up lines anymore 🚫 Just upload a pic of your crush and let ChatGPT generate amazing and personalized pick-up lines for you 😍 Built using @LangChainAI @Gradio & BLIP models. Demo 👇
1
0
5
@clusteredbytes
Rohan
1 year
We create any retriever as usual. And then get the relevant documents using the get_relevant_documents() method of that retriever. This returns the documents in the descending order of their similarity score.
Tweet media one
1
0
5
@clusteredbytes
Rohan
11 months
@llama_index Just like text based RAG, where we were limited by the context length, here we’re also limited by how many images we pass. Hence, we would only want to pass the images that are related to our query. How do you find images related to your query?? Yep, via vector embedding 🚀
1
0
5
@clusteredbytes
Rohan
1 year
@llama_index 3 steps for finetuning embeddings: 1. Prepare the data via generate_qa_embeddings_pairs() 2. Finetune the model via SentenceTransformersFinetuneEngine 3. Evaluate the model
1
0
5
@clusteredbytes
Rohan
1 year
@llama_index The linear adapter: The query embedding is updated using this linear transformation of the adapter: updated_q = W*q + b We train the linear adapter on the training corpus to find the best value for the weight and bias, W and b.
1
1
2
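The transformation is small enough to sketch directly, in plain-Python matrix-vector math for illustration:

```python
def adapt_query(q, W, b):
    """updated_q = W*q + b — a learned linear map applied to the query
    embedding only, so stored document embeddings never change."""
    return [sum(W[i][j] * q[j] for j in range(len(q))) + b[i]
            for i in range(len(W))]
```

Only W and b are trained; the document index is untouched, which is exactly why this approach avoids re-embedding the corpus.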
@clusteredbytes
Rohan
1 year
#AI PROJECTS MEGA-THREAD Thought of curating all my AI related projects and experiments in one thread so it's easier to find. Will be updating this thread with all the AI projects I build in the future. So stay tuned 🔥 Projects were built using @OpenAI @LangChainAI 🧵 👇
1
1
5
@clusteredbytes
Rohan
2 years
Learnt a lot about prompt engineering and how LangChain works under the hood. Really enjoying playing with LangChain. Created a custom chat agent for this one via extending LangChain's ConversationalChatAgent. Also had to cut the delays in the demo as GPT-4 was quite slow.
3
0
5
@clusteredbytes
Rohan
10 months
@llama_index . @llama_index has a FLAREInstructQueryEngine that makes it really easy to work with FLARE. It currently implements the FLARE Instruct mode, which tells the LLM to generate retrieval instructions.
1
0
4
@clusteredbytes
Rohan
9 months
@LoganMarkewich @llama_index Thanks for reading. I write about AI, LLMs, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #AI #RAG
3
0
5
@clusteredbytes
Rohan
1 year
Create the ParentDocumentRetriever object. We pass the vectorstore, docstore, and parent and child splitters to the constructor.
Tweet media one
1
1
5
@clusteredbytes
Rohan
2 years
Learn how to create chrome extensions using ChatGPT. Full guide with prompts.
1
0
5
@clusteredbytes
Rohan
11 months
@llama_index Told you it was easy. LlamaIndex handles all the underlying logic for converting those image_documents to compatible format for the multi-modal llm. But there’s an issue !! 👇
1
0
5
@clusteredbytes
Rohan
1 year
Thanks for reading. I write about AI, LLMs, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #AI #RAG
2
0
5
@clusteredbytes
Rohan
1 year
LlamaIndex shows from the pairwise comparison evaluation results that, when asked, GPT-4 preferred the results produced using AutoMergingRetriever over the baseline retriever 65% of the time, which is above average.
1
0
5
@clusteredbytes
Rohan
2 years
3. Then deploy the server and manifest json file. 4. After deploying the plugin server, add the plugin to ChatGPT - Provide the domain where the plugin is hosted - Provide auth token if needed
1
0
5
@clusteredbytes
Rohan
1 year
Thanks to LlamaIndex, creating an AutoMergingRetriever is quite straightforward. We just need to pass the base retriever and the storage context containing the docstore of hierarchical nodes to its constructor. Then we can use it like any other retriever.
Tweet media one
1
1
5
@clusteredbytes
Rohan
10 months
@llama_index Transformations are the building blocks of Ingestion Pipeline. Each transformation takes a list of nodes, and returns another list of nodes after making the desired modifications to them. We define the transformations while instantiating the pipeline itself.
Tweet media one
2
0
5