Rohan

@clusteredbytes

3,149 Followers
261 Following
217 Media
577 Statuses

Professional AI Engineer. Sharing what I'm currently learning, mostly about AI, LLMs, RAG, building AI-powered software, AI automation, etc.

Joined January 2021
@clusteredbytes
Rohan
11 months
Multi-Modal AI is rapidly taking over 🔥🚀 It’s truly amazing how fast @llama_index incorporated a robust pipeline for multi-modal RAG capabilities. Here’s a beginner-friendly guide to get started with multi-modal RAG using LlamaIndex 👇🧵
Tweet media one
12
106
636
@clusteredbytes
Rohan
2 years
Using #ChatGPT to easily create ChatGPT plugins 🔥 #OpenAI #GPT4 #AI #python #fastapi A ChatGPT plugin consists of 3 things: 1. An HTTP server 2. An OpenAPI spec 3. A manifest file Steps to create a plugin from OpenAI's tutorial 👇
23
92
558
@clusteredbytes
Rohan
1 year
Multi Document Agent architecture (v0) in @llama_index , a step beyond naive top-k RAG. It allows answering a broader set of questions over multiple documents, which wasn't possible with basic RAG. Let's break down the agent architecture and see how it works 👇🧵
Tweet media one
17
83
484
@clusteredbytes
Rohan
1 year
Previously we've seen how to improve retrieval by finetuning an embedding model. @llama_index also supports finetuning an adapter on top of existing models, which lets us improve retrieval without updating our existing embeddings. 🚀 Let's see how it works 👇🧵
Tweet media one
11
89
464
@clusteredbytes
Rohan
1 year
We've seen that smaller chunks are good for capturing semantic meaning and larger ones are good for providing better context. @llama_index AutoMergingRetriever takes it one step further by keeping the chunks in a tree structure and dynamically choosing the chunk length. 🧵👇
Tweet media one
8
67
412
@clusteredbytes
Rohan
1 year
While splitting the raw text for Retrieval Augmented Generation (RAG), what should be the ideal length of each chunk? What’s the sweet spot? Strike a balance between small vs large chunks using @LangChainAI ParentDocumentRetriever Let's see how to use it 👇🧵
Tweet media one
18
66
371
@clusteredbytes
Rohan
10 months
Let's talk about FLARE - Forward Looking Active RAG and how to implement it using @llama_index FLAREInstructQueryEngine. Instead of doing retrieval once at the beginning, FLARE retrieves information dynamically multiple times during token generation 🚀 Details below 🧵👇
Tweet media one
5
59
329
@clusteredbytes
Rohan
10 months
Ingestion Pipeline is a new and improved way to ingest and manage documents in @llama_index It supports: - applying a series of transformations on documents - caching those transformations - managing ever-changing documents etc. Let's see how to use it 👇🧵
Tweet media one
4
55
313
@clusteredbytes
Rohan
1 year
Finetuning the embedding model can allow for more meaningful embedding representations, leading to better retrieval performance. @llama_index has an abstraction for finetuning sentence-transformers embedding models that makes this process quite seamless. Let's see how it works 👇
Tweet media one
8
34
303
@clusteredbytes
Rohan
6 months
Open source AI Diagram Generator 🔥 Uses @llama_index Pydantic program with partial JSON parsing and @vercel AI SDK to send intermediate diagrams during generation for improved UX 🚀 Repo: Full tutorial under 2.5 minutes 👇
3
41
313
@clusteredbytes
Rohan
1 year
Extract tables from documents using @llama_index UnstructuredElementParser and then use RecursiveRetriever to enable hybrid tabular/semantic queries and also comparisons over multiple docs. Let's see how to use this advanced RAG technique 🧵👇
Tweet media one
11
57
293
@clusteredbytes
Rohan
8 months
New Open Source, Full Stack RAG project 🔥🚀 Bootstrapped with @llama_index create-llama 🔥 It uses loads of amazing LlamaIndex goodies e.g. Ingestion Pipeline, multi-documents agents, custom callback handler, transformations and more. Repo: Demo 👇
12
39
268
@clusteredbytes
Rohan
1 year
Lost in the middle problem in RAG and how @LangChainAI LongContextReorder addresses it. In RAG, for really long contexts (10+ retrieved docs), it turns out it's not best to just plug in the docs in descending order of vector similarity score.
Tweet media one
8
35
247
@clusteredbytes
Rohan
2 years
Created FootballGPT using GPT-4 @LangChainAI and @ApiFootball . Taught GPT-4 how to use a complex API with many endpoints and numerous parameters. This is where GPT-4's advanced reasoning capability came in handy. Here's a demo 👇
21
30
224
@clusteredbytes
Rohan
1 year
One issue of using embeddings to retrieve relevant documents is that the results might vary with the slightest change in the wording of the query. @LangChainAI MultiQueryRetriever tries to address this issue with the help of LLMs. Let's see how to use it 👇🧵
Tweet media one
6
46
226
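The idea behind MultiQueryRetriever can be sketched in a few lines of plain Python. This is a toy illustration of the concept, not LangChain's actual API; `rephrase` and `retrieve` are hypothetical stand-ins for the LLM paraphraser and the base retriever:

```python
def multi_query_retrieve(query, rephrase, retrieve):
    """Ask for paraphrases of the query, retrieve for each variant,
    and take the union of unique results (original query included)."""
    variants = [query] + rephrase(query)
    seen, results = set(), []
    for variant in variants:
        for doc in retrieve(variant):
            if doc not in seen:      # de-duplicate across variants
                seen.add(doc)
                results.append(doc)
    return results
```

Because the union is taken over several wordings, a document missed by one phrasing can still be recovered by another.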
@clusteredbytes
Rohan
1 year
Previously we've seen @LangChainAI ParentDocumentRetriever that creates smaller chunks from a document and links them back to the initial documents during retrieval. MultiVectorRetriever is a more customizable version of that. Let's see how to use it 🧵👇
Tweet media one
8
35
217
@clusteredbytes
Rohan
7 months
Fully local, open source chat-with-pdf app tutorial under 2.5 minutes 🔥🚀 Stack used: @llama_index Typescript for RAG @ollama @nextjs with server actions Phi2 and @nomic_ai models using Ollama Detailed tutorial: GitHub repo:
8
41
214
@clusteredbytes
Rohan
9 months
The "Dense X Retriever" paper shows that it significantly outperforms the traditional chunk-based retriever. @LoganMarkewich created an awesome LlamaPack that lets you get started with this proposition-based retriever in no time using @llama_index 🔥 Let's see how it works 👇🧵
Tweet media one
5
46
204
@clusteredbytes
Rohan
9 months
Introducing LlamaBot 🔥🚀 An open-source Discord bot that listens to your conversations, remembers them and answers your questions across a discord server, created using @llama_index (inspired by @seldo 's LlamaBot for Slack) Stack used: LlamaIndex, Gemini Pro, @qdrant_engine
8
33
192
@clusteredbytes
Rohan
11 months
LlamaPacks by @llama_index are out 🚀🔥 In this speedrun 🏃‍♂️, I wanted to demonstrate how fast and easy it is to create a gmail agent for your inbox using LlamaPacks. Spoiler Alert⚠️ It took only 54.86 seconds 🚀 to get to the chat interface, with only 4-5 lines of code 🔥🤯
4
29
187
@clusteredbytes
Rohan
7 months
Streaming intermediate events in RAG is crucial for best user experience 🚀 Let's see how to use @llama_index and @vercel AI SDK to properly stream intermediate events to the frontend. Full tutorial under 3 minutes 🔥
5
39
184
@clusteredbytes
Rohan
10 months
Previously I've talked about the amazing Ingestion Pipeline from @llama_index . Here's how to use Redis ( @Redisinc ) as the docstore, vectorstore and cache for the pipeline. LlamaIndex abstractions make it really easy to just use Redis for the entire pipeline 🔥👇
Tweet media one
3
24
136
@clusteredbytes
Rohan
2 years
Created NewsBuddy 📰 using @LangChainAI and the ChatGPT API. NewsBuddy is your personal news assistant. Had a lot of fun building this little project overnight. Learnt loads of new stuff about LangChain and Prompt Engineering. Here's a demo of NewsBuddy 👇
10
13
103
@clusteredbytes
Rohan
9 months
Check out this new OSS repo by @seldo that contains detailed, step-by-step instructions on how to build a Slack bot completely from scratch using @llama_index The bot listens to conversations and answers questions about them 🔥 Here's the high-level architecture of the bot 👇
Tweet media one
1
17
73
@clusteredbytes
Rohan
2 years
Using #ChatGPT to easily create Chrome extensions from scratch in 15 minutes 🔥 Full Step-by-Step Tutorial with prompts. #OpenAI #AI We'll use ChatGPT to create a simple extension, QuikNote, that takes quick daily notes right from the browser. Here are the steps required 👇
Tweet media one
6
5
32
@clusteredbytes
Rohan
1 year
The issue: - smaller chunks reflect more accurate semantic meaning after creating embeddings - but they sometimes lose the bigger picture and might sound out of context, making it difficult for the LLM to properly answer the user's query with limited context per chunk.
1
0
24
@clusteredbytes
Rohan
1 year
Here's a short demo of how this Multi Document Agent architecture would work:
1
2
20
@clusteredbytes
Rohan
10 months
@llama_index Here's a nice animation by the authors demonstrating how FLARE blends generation and retrieval by dynamically incorporating relevant and up-to-date information. Source: LlamaIndex FLAREInstructQueryEngine:
1
2
20
@clusteredbytes
Rohan
8 months
Within 24 hours, OpenAI's Sora has dazzled with some stunning videos🌟 Introducing FlixAI, a one-stop hub where I've compiled all the videos by Sora so far, alongside their prompts. It supports semantic search, suggests similar videos, and covers other models like Pika, Runway etc.
4
3
19
@clusteredbytes
Rohan
1 year
Architecture: - For each document, a VectorIndex is created for semantic search, and a SummaryIndex is created for summarization - Then we create QueryEngine for both these Indices - Next the QueryEngines are converted to QueryTools
Tweet media one
2
0
14
@clusteredbytes
Rohan
1 year
@LangChainAI LongContextReorder addresses this issue by re-ordering the documents after retrieval. It puts the most similar ones at the top, and then the next few ones at the end, and the least similar ones in the middle.
Tweet media one
3
3
14
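The reordering can be sketched in plain Python. This is a toy version of the idea (most similar at the two ends, least similar in the middle); LangChain's internals may differ:

```python
def lost_in_the_middle_reorder(docs):
    """Given docs sorted most-relevant-first, alternate placing them at
    the front and back so the least relevant land in the middle."""
    reordered = []
    for i, doc in enumerate(reversed(docs)):  # walk least→most relevant
        if i % 2 == 1:
            reordered.append(doc)       # odd steps go to the back
        else:
            reordered.insert(0, doc)    # even steps go to the front
    return reordered
```

For five docs ranked d1 (best) to d5 (worst), the two best end up at the edges of the context and the worst sits in the middle, exactly where the LLM pays the least attention.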
@clusteredbytes
Rohan
1 year
The Issue: In the context window of the LLM prompt, we put the most similar documents at the top, and the least similar ones at the bottom. But LLMs tend to ignore documents in the middle of their context. Hence, that middle is where we should put the least similar ones, not at the bottom.
4
1
13
@clusteredbytes
Rohan
1 year
Recent research shows that: - Performance is often highest when document containing answer to user's question occurs at the beginning or at the end of the context
Tweet media one
1
2
13
@clusteredbytes
Rohan
1 year
@LangChainAI ParentDocumentRetriever addresses this issue by creating embedding from the smaller chunks only as they capture better semantic meaning. But while plugging into the LLM input, it uses the larger chunks with better context.
1
0
12
@clusteredbytes
Rohan
1 year
Thanks for reading. I write about AI, LLMs, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #AI #RAG
3
0
10
@clusteredbytes
Rohan
2 years
1. First create the API server using ChatGPT - tell ChatGPT your requirements - and also which library to use for the server (FastAPI in this case)
1
0
9
@clusteredbytes
Rohan
2 years
Unreal Engine is changing the photorealistic animation game with their upcoming "MetaHuman Animator". You can use your iPhone to shoot and then reproduce facial-expression animation with insane detail and fidelity, all within minutes. #AI #UnrealEngine #MetaHuman
2
2
10
@clusteredbytes
Rohan
1 year
These Tools are passed to OpenAIAgent. This is the document agent. Each document has an agent like this that chooses to perform summarization or semantic search within each document.
Tweet media one
1
0
10
@clusteredbytes
Rohan
1 year
Next we have a top-level Retriever-Enabled Agent. This boss agent orchestrates across different document agents. First it retrieves the document agents relevant to the question, then passes the input to those agents only and crafts the response from those agent outputs.
Tweet media one
1
0
9
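The orchestration above can be sketched as a toy routing function. The callables are hypothetical stand-ins for the real document agents and retriever; this is not the actual LlamaIndex OpenAIAgent API:

```python
def boss_agent(question, doc_agents, pick_relevant, combine):
    """Top-level agent: pick the document agents relevant to the
    question, query only those, then craft one response from
    their outputs."""
    relevant = pick_relevant(question, list(doc_agents))
    outputs = [doc_agents[name](question) for name in relevant]
    return combine(outputs)
```

The key property is that irrelevant documents never get queried at all, which is what lets this scale past naive top-k RAG.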
@clusteredbytes
Rohan
1 year
@llama_index Full guide with benchmarks in the official documentation:
1
2
8
@clusteredbytes
Rohan
11 months
@llama_index First let’s start with some simple stuff. We just want to ask questions about our images. OpenAIMultiModal is a wrapper around OpenAI’s latest vision model that lets us do exactly that.
Tweet media one
2
0
9
@clusteredbytes
Rohan
1 year
Meet GlowGPT. Upload a photo and get instant feedback and suggestions from AI. Created using the @OpenAI ChatGPT API, @LangChainAI , @Gradio , @huggingface transformers and BLIP models. Here's a demo 👇
2
1
8
@clusteredbytes
Rohan
1 year
We're gonna need two splitters instead of one. - One for creating the larger chunks - Another one for creating the smaller chunks
Tweet media one
1
1
9
@clusteredbytes
Rohan
9 months
@LoganMarkewich @llama_index Thanks to @LoganMarkewich , there's already a LlamaPack for "Dense X Retriever" that handles: - generating the propositions - creating the vector index - creating the retriever (RecursiveRetriever in this case) and the query engine Here's how to use the pack 👇
Tweet media one
1
0
9
@clusteredbytes
Rohan
9 months
@llama_index @seldo @qdrant_engine Features: - We can ask LlamaBot questions about what's going on across the server - We can tell LlamaBot to start/stop listening to conversations. - We can check current listening status, or ask the bot to forget everything from the server.
Tweet media one
1
1
8
@clusteredbytes
Rohan
9 months
@LoganMarkewich @llama_index The paper also shows how to create these propositions 👇 First GPT4 is prompted properly to generate some propositions. Then Flan-T5-Large model is finetuned with the generated propositions. The finetuned model is called "The Proposition-izer"
Tweet media one
1
0
8
@clusteredbytes
Rohan
11 months
@llama_index Thanks for reading. I write about AI, LLMs, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #AI #RAG
0
1
8
@clusteredbytes
Rohan
1 year
Let’s walk through the example code from LangChain’s website on ParentDocumentRetriever 🧑‍💻 👇
1
0
8
@clusteredbytes
Rohan
1 year
@llama_index Details about it on the official documentation:
1
1
8
@clusteredbytes
Rohan
9 months
@LoganMarkewich @llama_index A proposition is an atomic, self-contained text encapsulating a distinct factoid, written in a simple natural-language format. A single proposition encapsulates only one contextualized atomic fact. It cannot be further split into separate propositions.
Tweet media one
1
1
7
@clusteredbytes
Rohan
2 years
2. Then create the plugin manifest. A plugin manifest is a JSON file with: - simple metadata about the plugin - how to show the plugin to a human - how to describe it to the language model
Tweet media one
1
0
7
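A minimal sketch of such a manifest, built in Python. Field names follow OpenAI's published ai-plugin.json example from the plugin docs; every value here is a placeholder for illustration:

```python
import json

# Sketch of an ai-plugin.json manifest; all values are placeholders.
manifest = {
    "schema_version": "v1",
    "name_for_human": "TODO Plugin",          # shown to the user
    "name_for_model": "todo",                 # shown to the model
    "description_for_human": "Manage your TODO list.",
    "description_for_model": "Plugin for managing a user's TODO list.",
    "auth": {"type": "none"},
    "api": {"type": "openapi", "url": "https://example.com/openapi.yaml"},
    "logo_url": "https://example.com/logo.png",
    "contact_email": "support@example.com",
    "legal_info_url": "https://example.com/legal",
}

manifest_json = json.dumps(manifest, indent=2)
```

The `description_for_model` field is effectively a prompt: it is what the language model reads to decide when and how to call the plugin.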
@clusteredbytes
Rohan
2 years
🚀 Github Copilot JUST got way better, with the help of GPT-4. 🔥 GitHub just announced Copilot X with stunning new features like: - Chat and voice support - Copilot for terminal - Answering questions from docs - Generate Pull requests 1/6 #AI #ChatGPT #GPT4 #Github #Copilot
1
0
7
@clusteredbytes
Rohan
10 months
@llama_index FLARE addresses this issue by dynamically adapting to the evolving context while it's being generated. During generation, when low confidence tokens are generated (possible hallucination), FLARE actively performs retrieval.
2
0
6
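The retrieve-when-uncertain loop can be sketched as toy logic in Python, with hypothetical callables standing in for the LLM step, the confidence estimate, and the retriever; FLARE's actual implementation is more involved:

```python
def flare_generate(question, generate_step, confidence, retrieve, max_steps=5):
    """Toy FLARE loop: generate sentence by sentence; whenever a step
    comes back with low confidence, retrieve more context and redo it."""
    context, answer = [], []
    for _ in range(max_steps):
        sentence = generate_step(question, context, answer)
        if sentence is None:                    # generation finished
            break
        if confidence(sentence) < 0.5:          # possible hallucination
            context.append(retrieve(sentence))  # active retrieval
            sentence = generate_step(question, context, answer)
        answer.append(sentence)
    return " ".join(answer)
```

The contrast with standard RAG is in the loop: retrieval happens inside generation, triggered by uncertainty, instead of once up front.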
@clusteredbytes
Rohan
1 year
Thus we use small chunks (with better semantic meaning) for vector similarity matching and return their corresponding larger chunks that have the bigger picture and more context.
2
0
7
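The small-chunks-for-matching, large-chunks-for-context idea can be sketched in plain Python. Word overlap stands in for vector similarity here; this is an illustration of the concept, not the LangChain implementation:

```python
def split_words(text, n):
    """Split text into chunks of n words each."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(0, len(words), n)]

def build_stores(documents, parent_words=100, child_words=20):
    """Large parent chunks go in a docstore; each small child chunk
    remembers its parent's id for lookup at retrieval time."""
    docstore, children = {}, []
    for d, doc in enumerate(documents):
        for p, parent in enumerate(split_words(doc, parent_words)):
            docstore[(d, p)] = parent
            children += [((d, p), c) for c in split_words(parent, child_words)]
    return docstore, children

def retrieve(query, docstore, children):
    """Match the query against the small chunks (toy word-overlap score
    stands in for vector similarity), but return the larger parent."""
    q = set(query.lower().split())
    best_parent, _ = max(children, key=lambda c: len(q & set(c[1].lower().split())))
    return docstore[best_parent]
```

Matching happens on the semantically tight child, but the LLM receives the parent with the surrounding context.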
@clusteredbytes
Rohan
6 months
Update: I've added streaming partial objects feature to the built-in @llama_index OpenAIPydanticProgram (Thanks @_nerdai_ for the review) So you can just call the 'stream_partial_objects' method of the built-in class now. The project repo has been updated accordingly as well.
2
2
8
@clusteredbytes
Rohan
1 year
To address this issue, we can just re-order the retrieved documents ourselves so that the least relevant ones are in the middle. Or we can use LongContextReorder from LangChain, which does it automatically.
1
0
7
@clusteredbytes
Rohan
10 months
@llama_index These are the transformations we can use: 1. TextSplitter 2. NodeParser 3. MetadataExtractor 4. Any embedding model We can also create custom transformations. Guide on this is coming soon. Output of one transformation is the input to the next one.
1
0
5
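The chaining behavior, output of one transformation feeding the next, can be sketched like this. The `split` and `annotate` transformations are hypothetical stand-ins for a text splitter and a metadata extractor:

```python
def run_pipeline(nodes, transformations):
    """Each transformation maps a list of nodes to a new list of nodes;
    the output of one is fed as input to the next."""
    for transform in transformations:
        nodes = transform(nodes)
    return nodes

# Hypothetical stand-ins: a sentence splitter and a metadata annotator.
split = lambda nodes: [chunk for n in nodes for chunk in n.split(". ") if chunk]
annotate = lambda nodes: [{"text": n, "length": len(n)} for n in nodes]
```

Because every stage has the same list-in/list-out shape, stages can be reordered, swapped, or cached independently.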
@clusteredbytes
Rohan
1 year
Thanks for reading. I write about AI, ChatGPT, LangChain, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #ChatGPT #LangChain
2
0
7
@clusteredbytes
Rohan
2 years
Don't miss out on these amazing new ChatGPT powered chrome extensions 🚀🔥 1. ParagraphAI - Perfectly curated writing 2. Glasp - YouTube summary 3. Merlin - ChatGPT Plus, on all sites 4. Glarity - summarize Google/Bing results Make the most out of these AI tools.
1
1
6
@clusteredbytes
Rohan
1 year
More on this retriever and the details for evaluation results on LlamaIndex documentation:
1
0
7
@clusteredbytes
Rohan
1 year
Storing the chunks - As we're creating embeddings for the small chunks only, we'll use a vectorstore to store those. - The larger chunks are stored in an InMemoryStore, an in-memory key-value store that lives only while the program is running.
Tweet media one
1
1
7
@clusteredbytes
Rohan
10 months
@llama_index FLARE Instruct: This mode prompts the LLM to identify and insert search queries during generation through few-shot prompting. e.g. Donald Trump attended [Search(which college did Donald Trump attend?)]
Tweet media one
1
0
5
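Extracting those inline search instructions can be sketched with a small regex. This illustrates the marker format from the example above; it is not LlamaIndex code:

```python
import re

def extract_search_queries(text):
    """Pull out the [Search(...)] markers that FLARE Instruct asks the
    LLM to emit mid-generation."""
    return re.findall(r"\[Search\((.*?)\)\]", text)
```

Each extracted query is then answered by the retriever, and the marker is replaced with the retrieved answer before generation continues.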
@clusteredbytes
Rohan
1 year
This is version 0, and there's still room for improvement. Next steps are parallel query planning, reducing latency and more 🚀 Full guide here:
1
1
6
@clusteredbytes
Rohan
2 years
Found this amazing GPT-4 powered chrome extension - Taxy AI It automates repetitive browser actions by sending parts of the DOM and the user prompt to GPT-4. Then GPT-4 performs that action for you 🔥 Here's how it performs various repetitive tasks from a one-line user prompt 👇
1
3
6
@clusteredbytes
Rohan
1 year
After filling in, we try merging parent nodes. The hypothesis is that if the ratio of a parent's retrieved children to its total children is above a threshold (which we can adjust), then we might as well return the larger parent for better context.
2
0
7
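That merging rule can be sketched in plain Python. This is a toy version with an adjustable threshold, not the actual AutoMergingRetriever code:

```python
def auto_merge(retrieved_leaves, parent_children, threshold=0.5):
    """If the fraction of a parent's children that were retrieved meets
    the threshold, replace those children with the parent itself."""
    merged, absorbed = [], set()
    for parent, children in parent_children.items():
        hits = [c for c in children if c in retrieved_leaves]
        if hits and len(hits) / len(children) >= threshold:
            merged.append(parent)       # return the bigger chunk instead
            absorbed.update(hits)
    # Leaves whose parent didn't qualify are kept as-is.
    merged += [leaf for leaf in retrieved_leaves if leaf not in absorbed]
    return merged
```

Raising the threshold makes merging rarer, so results stay small and precise; lowering it favors larger, more contextual chunks.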
@clusteredbytes
Rohan
11 months
@llama_index LlamaIndex has MultiModalVectorStoreIndex which creates embedding for both image and text nodes and stores them in vector stores. For image nodes it uses 'clip' and for text nodes it uses 'ada' for getting the embedding (customizable). Let’s create the multi-modal index
Tweet media one
1
0
6
@clusteredbytes
Rohan
1 year
The first step here is parsing via the HierarchicalNodeParser. It stores the node in a tree structure, where deeper nodes are smaller chunks and shallow nodes are larger chunks. We can specify how many layers of nodes we want and the splitter size for each layer.
2
0
6
@clusteredbytes
Rohan
1 year
Next we pass these documents to an instance of LongContextReorder() and get the re-ordered docs where the least relevant ones are in the middle.
Tweet media one
1
0
6
@clusteredbytes
Rohan
1 year
@LangChainAI ParentDocumentRetriever automatically creates the small chunks and links their parent document id. If we want to create some additional vectors for each documents, other than smaller chunks, we can do that and then retrieve those using MultiVectorRetriever.
1
0
6
@clusteredbytes
Rohan
1 year
All nodes are stored in a docstore and only the leaf nodes are stored in a vectorstore. At first, the vectorstore retriever is called to get the initial leaf nodes. From here we try to auto-merge parents to find parent with the correct chunk size.
Tweet media one
Tweet media two
1
0
6
@clusteredbytes
Rohan
9 months
After receiving some feedback from you guys (which I really appreciate), I've made some updates to LlamaBot: - Use GPT4 or Cohere - Remember user mentions - Refine prompt etc. If you encounter any issues while using the bot feel free to let me know or open an issue on GitHub
Tweet media one
0
0
6
@clusteredbytes
Rohan
1 year
@llama_index This parser: - extracts tables from data - converts those tables to DataFrames - for each of those tables, it creates 2 nodes - one Table Node that contains the DataFrame as a string - another IndexNode that stores the summary of that table and a reference to that Table Node
Tweet media one
Tweet media two
2
0
6
@clusteredbytes
Rohan
10 months
@llama_index Thanks for reading. I write about AI, LLMs, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #AI #RAG
0
0
6
@clusteredbytes
Rohan
1 year
@llama_index . @llama_index has guides on how to finetune embeddings in different ways: - finetune the embedding model itself (only sentence transformers) - finetune an adapter over any black-box embedding model (stay tuned for this one 🔥)
1
1
5
@clusteredbytes
Rohan
10 months
@llama_index Thanks for reading. I write about AI, LLMs, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #AI #RAG
2
0
5
@clusteredbytes
Rohan
1 year
@llama_index Next we partition the nodes using this built-in function of the Unstructured parser. Here BaseNodes contain the regular nodes and the IndexNodes (not the Table Nodes) NodeMapping contains {id->Node} mapping for those remaining Table Nodes.
Tweet media one
2
0
5
@clusteredbytes
Rohan
2 years
Thanks for reading. I write about AI, CloudNative, Kubernetes, System Design etc. and try to make complex topics as easy as possible. Stay tuned for more.
3
1
5
@clusteredbytes
Rohan
1 year
AI won't steal your girl, but someone using FlirtGPT definitely will 😎 Don't use cheesy pick-up lines anymore 🚫 Just upload a pic of your crush and let ChatGPT generate amazing and personalized pick-up lines for you 😍 Built using @LangChainAI @Gradio & BLIP models. Demo 👇
1
0
5
@clusteredbytes
Rohan
1 year
We create any retriever as usual. And then get the relevant documents using the get_relevant_documents() method of that retriever. This returns the documents in the descending order of their similarity score.
Tweet media one
1
0
5
@clusteredbytes
Rohan
11 months
@llama_index Just like text based RAG, where we were limited by the context length, here we’re also limited by how many images we pass. Hence, we would only want to pass the images that are related to our query. How do you find images related to your query?? Yep, via vector embedding 🚀
1
0
5
@clusteredbytes
Rohan
1 year
@llama_index 3 steps for finetuning embeddings: 1. Prepare the data via generate_qa_embeddings_pairs() 2. Finetune the model via SentenceTransformersFinetuneEngine 3. Evaluate the model
1
0
5
@clusteredbytes
Rohan
1 year
@llama_index The linear adapter: The query embedding is updated using this linear transformation of the adapter: updated_q = W*q + b We train the linear adapter on the training corpus to find the best value for the weight and bias, W and b.
1
1
2
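The transformation is small enough to sketch directly, in plain-Python matrix-vector math for illustration:

```python
def adapt_query(q, W, b):
    """updated_q = W*q + b — a learned linear map applied to the query
    embedding only, so stored document embeddings never change."""
    return [sum(W[i][j] * q[j] for j in range(len(q))) + b[i]
            for i in range(len(W))]
```

Only W and b are trained; the document index is untouched, which is exactly why this approach avoids re-embedding the corpus.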
@clusteredbytes
Rohan
1 year
#AI PROJECTS MEGA-THREAD Thought of curating all my AI related projects and experiments in one thread so it's easier to find. Will be updating this thread with all the AI projects I build in the future. So stay tuned 🔥 Projects were built using @OpenAI @LangChainAI 🧵 👇
1
1
5
@clusteredbytes
Rohan
2 years
Learnt a lot about prompt engineering and how LangChain works under the hood. Really enjoying playing with LangChain. Created a custom chat agent for this one via extending LangChain's ConversationalChatAgent. Also had to cut the delays in the demo as GPT-4 was quite slow.
3
0
5
@clusteredbytes
Rohan
10 months
@llama_index . @llama_index has a FLAREInstructQueryEngine that makes it really easy to work with FLARE. It currently implements the FLARE Instruct mode, which tells the LLM to generate retrieval instructions.
1
0
4
@clusteredbytes
Rohan
9 months
@LoganMarkewich @llama_index Thanks for reading. I write about AI, LLMs, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #AI #RAG
3
0
5
@clusteredbytes
Rohan
1 year
Create the ParentDocumentRetriever object. We pass the vectorstore, docstore, and parent and child splitters to the constructor.
Tweet media one
1
1
5
@clusteredbytes
Rohan
2 years
Learn how to create chrome extensions using ChatGPT. Full guide with prompts.
1
0
5
@clusteredbytes
Rohan
11 months
@llama_index Told you it was easy. LlamaIndex handles all the underlying logic for converting those image_documents to compatible format for the multi-modal llm. But there’s an issue !! 👇
1
0
5
@clusteredbytes
Rohan
1 year
Thanks for reading. I write about AI, LLMs, RAG etc. and try to make complex topics as easy as possible. Stay tuned for more ! 🔥 #AI #RAG
2
0
5
@clusteredbytes
Rohan
1 year
LlamaIndex shows from the pairwise comparison evaluation results that, when asked, GPT-4 preferred the results produced using AutoMergingRetriever over the baseline retriever 65% of the time, which is above average.
1
0
5
@clusteredbytes
Rohan
2 years
3. Then deploy the server and manifest json file. 4. After deploying the plugin server, add the plugin to ChatGPT - Provide the domain where the plugin is hosted - Provide auth token if needed
1
0
5
@clusteredbytes
Rohan
1 year
Thanks to LlamaIndex, creating an AutoMergingRetriever is quite straightforward. We just need to pass the base retriever and the storage context containing the docstore of hierarchical nodes to its constructor. Then we can use it like any other retriever.
Tweet media one
1
1
5
@clusteredbytes
Rohan
10 months
@llama_index Transformations are the building blocks of Ingestion Pipeline. Each transformation takes a list of nodes, and returns another list of nodes after making the desired modifications to them. We define the transformations while instantiating the pipeline itself.
Tweet media one
2
0
5