virat

@virattt

12,968 Followers
84 Following
501 Media
5,528 Statuses

playing with gen ai + finance

New York, NY
Joined April 2010
Pinned Tweet
@virattt
virat
2 months
I built a stock market API in 42 days. All during @_buildspace s5. What is the API? Today, it lets you pull financials for 16,000 tickers going back 30+ years. Why did I build the API? Because the big providers have: • poor API design • poor documentation • expensive sales
30
59
608
@virattt
virat
3 months
It's happening. Last night, I started downloading financial data from the SEC. • income statements • balance sheets • cash flow statements 10,000+ public companies. 3 million rows in total. I’m using multiple worker nodes to pull, parse, and clean the data. The
105
145
2K
@virattt
virat
2 months
It's live. I launched my stock market API for AI financial agents today. • 16,857 tickers • 30+ years of fundamental data Everything is usage-based, meaning no contracts or subscriptions. We have 3 endpoints today: • GET income statements • GET balance sheets • GET cash
49
74
931
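For a sense of what calling a usage-based fundamentals API like this looks like, here is a minimal sketch. The host, path, header name, and query parameters are invented for illustration and are not the API's documented interface:

```python
import requests

# Hypothetical endpoint and auth header, for illustration only.
BASE_URL = "https://api.example-stock-data.com"
headers = {"X-API-KEY": "your-api-key"}

resp = requests.get(
    f"{BASE_URL}/income-statements",
    params={"ticker": "AAPL", "period": "annual", "limit": 30},  # assumed params
    headers=headers,
    timeout=30,
)
resp.raise_for_status()
for statement in resp.json()[:3]:
    print(statement)
```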
@virattt
virat
2 months
I’ve been building in public for 1 year now. Last year, I promised to build and tweet every day for a year. Today, I fulfilled that promise. Year 1 review: • grew twitter from 110 to 11.7K • built 400 member discord community • got 350 stars on a github project • earned
Tweet media one
49
18
741
@virattt
virat
4 months
I just trained a 124M param LLM from scratch. It took ~26 seconds in @GoogleColab Training details: • 5145 tokens in training set • 1024 tokens in context window • 256 tokens per batch • 10 epochs total The model went from generating gibberish to full sentences. Extremely
Tweet media one
21
128
871
@virattt
virat
1 year
I've seen multiple questions about how to build a Chatbot that: • Retrieves data from PDFs and • Has conversational memory Turns out, it's really simple to do with @LangChainAI . So, I wrote a quick tutorial with a real-world example for you all. Code below.
27
91
772
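The shape of that tutorial, sketched with the classic LangChain API of that era (PyPDFLoader, FAISS, ConversationalRetrievalChain); the tutorial's actual code may differ:

```python
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI

# Load the PDF and index its pages for retrieval.
pages = PyPDFLoader("report.pdf").load_and_split()
vectorstore = FAISS.from_documents(pages, OpenAIEmbeddings())

# Conversational memory so follow-up questions keep context.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(temperature=0),
    retriever=vectorstore.as_retriever(),
    memory=memory,
)
print(chain({"question": "What was revenue last year?"})["answer"])
```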
@virattt
virat
3 months
I finally looked at the Apple Intelligence architecture today. Main thing that stands out is the orchestration of multiple models. Especially between on-device and server models. 1 • Small LM (SLM) that is 3B param and runs on your device. Vocab size of 49K. 2 • Large LM
Tweet media one
15
133
767
@virattt
virat
3 months
It's going live. My stock market API now has coverage for all S&P 500 tickers. • income statements • balance sheets • cash flow statements 30+ years of data. No API limits. You can connect your AI financial agents to this data. This open beta will run for ~1 week. Main
36
60
764
@virattt
virat
4 months
I finally understand how GPT generates text. Really helps to code it from scratch in Python. There are 5 components: • token embeddings • positional embeddings • transformer blocks • layer normalization • output head It sounds complex, but grokking GPT is simple. Token
Tweet media one
8
130
736
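A compact sketch of those five components in PyTorch. Sizes mirror GPT-2 small, and the stock encoder layer stands in for proper causal decoder blocks:

```python
import torch
import torch.nn as nn

class MiniGPT(nn.Module):
    def __init__(self, vocab=50257, ctx=1024, dim=768, layers=12, heads=12):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab, dim)           # token embeddings
        self.pos_emb = nn.Embedding(ctx, dim)             # positional embeddings
        block = nn.TransformerEncoderLayer(dim, heads, 4 * dim, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, layers)  # transformer blocks
        self.ln = nn.LayerNorm(dim)                       # final layer normalization
        self.head = nn.Linear(dim, vocab, bias=False)     # output head -> logits

    def forward(self, idx):
        pos = torch.arange(idx.size(1), device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        x = self.blocks(x)  # stand-in: a real GPT uses causal (masked) blocks
        return self.head(self.ln(x))

logits = MiniGPT()(torch.randint(0, 50257, (1, 16)))
print(logits.shape)  # (1, 16, 50257): next-token logits per position
```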
@virattt
virat
6 months
Fine-tuning a Warren Buffett LLM 🧠 I started this workstream today. Overall goal: fine-tune an LLM to analyze companies like Mr. Buffett does. My initial setup: • use mistral 7b instruct • use single gpu in colab • use QLoRA for fast fine-tune • use small dataset to prove
Tweet media one
25
93
737
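A sketch of that setup with Hugging Face transformers + peft. The rank, alpha, and target modules below are typical QLoRA defaults, not necessarily the values used here:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mistralai/Mistral-7B-Instruct-v0.1"

# Load the base model in 4-bit so it fits on a single Colab GPU.
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach small LoRA adapters; only these weights are trained.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # a tiny fraction of the 7B params
```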
@virattt
virat
6 months
It’s finally Friday. Time for another LLM cost vs. performance showdown. The results from today’s tests indicate the emergence of 3 distinct LLM tiers: • throughput tier • workhorse tier • intelligence tier Throughput tier: Unreal tokens / sec. Only groq mixtral 8x7b at the
Tweet media one
19
122
683
@virattt
virat
7 months
Open Source SEC Filing Reader 📊 A cool and exciting update today. I finally extracted income statements from a 10-K using Mistral-7B. The output was cleanly formatted JSON. High-level implementation: • download and chunk SEC filing • store chunks in a vector db •
Tweet media one
34
78
672
@virattt
virat
3 months
I finetuned an LLM for spam detection. It reuses gpt-2 weights and has 124M params. Total times: • finetuning took 59 seconds • predictions took 0.02 seconds Everything was done in @GoogleColab for free. Since it's a tiny model, finetuning and inference are extremely fast.
Tweet media one
25
86
662
@virattt
virat
1 year
Alright, I finally understand @LangChainAI agents and tools. I can now create a custom: 1. Tool that "reads" annual reports 2. Agent that answers queries via the tool For my example, I am using $META's 2022 annual report. Code is below. Happy learning 🙂
14
57
657
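The wiring of a custom Tool plus agent in the classic LangChain API looks roughly like this; the file path and the naive "retrieval" are placeholders for whatever the original code did:

```python
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.chat_models import ChatOpenAI

def read_annual_report(query: str) -> str:
    # Hypothetical local copy of the report; a real tool would do retrieval.
    with open("meta_2022_10k.txt") as f:
        return f.read()[:2000]

tools = [Tool(
    name="AnnualReportReader",
    func=read_annual_report,
    description="Reads META's 2022 annual report to answer questions about it.",
)]

agent = initialize_agent(tools, ChatOpenAI(temperature=0),
                         agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("What was META's revenue in 2022?")
```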
@virattt
virat
1 month
I fixed up our Warren Buffett financial agent. • added few-shot prompting • use sonnet 3.5 as main LLM This increased answer correctness from 53.3% to 86.6% Small tweaks, big gains.
Tweet media one
15
86
657
@virattt
virat
5 months
Llama 3 crushed my financial metrics tests. I tested both the 70B and 8B models. Both aced the metric calculation tasks. The results from today’s tests indicate the emergence of 4 distinct LLM tiers: • throughput tier • workhorse tier • intelligence tier • groq tier Groq
Tweet media one
14
129
628
@virattt
virat
6 months
My current RAG stack 🥞 • cohere embeddings • cohere command r+ tool calling • cohere rerank 3 • weaviate vector db • opus final output layer Why @cohere : intuitive api, fast inference with high quality for cheap. Why @weaviate_io : easy setup, solid retrieval, helpful
33
53
628
@virattt
virat
1 month
I am building a financial agent from scratch. Inspired by my investing hero, Warren Buffett. All of my code will be open source. Current tools: • get financials • calculate owner earnings • calculate intrinsic value • calculate ROE, ROIC, etc. I am creating the
Tweet media one
27
76
622
@virattt
virat
6 months
LLM Pricing vs. Speed 💰 I ran experiments comparing inference cost vs. speed Task: Text Generation. Given Item 1 (Business) from latest 10-K, explain Nvidia's business model. Experiment setup: • 10 runs per model • 1000 max output tokens • calculate cost per run •
Tweet media one
15
113
487
@virattt
virat
4 months
I finetuned my first LLM today. It was super easy using the code from @DeepLearningAI My initial setup: • 70M param model • 900 row dataset from SEC filings • 1000 training steps • run on google colab This was a test run to see how it all works. My finetuned LLM is far
16
49
495
@virattt
virat
5 months
Llama 3 on @GroqInc is incredible The 70b model beat opus on my financial RAG tests. Llama 3 RAG results: • speed: 2.59s • correctness: 81.33% This is the highest score I have seen on financial RAG. • 7 secs faster than opus • 4% more correct than opus With insane
Tweet media one
15
59
488
@virattt
virat
5 months
Can LLMs understand long documents? A microsoft team just tackled this question. To answer, they fine-tuned Mistral-7B-Instruct with a synthetic dataset. How they created the dataset: • used realnewslike corpus • split corpus by 128 token segments • used gpt-4 to generate
Tweet media one
8
71
481
@virattt
virat
2 years
Playing around with @LangChainAI this morning. It's a step-function change in how we'll "fine tune" ChatGPT moving forward. To test it out, I uploaded Airbnb's latest 10K via LangChain's document uploader. Then, I asked questions about the report
Tweet media one
12
40
474
@virattt
virat
8 months
Faster RAG re-ranking with ColBERT After re-ranking using GPT-4 yesterday, I tested out ColBERT for re-ranking today. Test: • Re-ranking Airbnb's 10-K, like before. Results: • ColBERT and GPT-4 were identical in ranking quality However, ColBERT was lightning-fast.
Tweet media one
8
66
464
@virattt
virat
3 months
If you are looking to understand: • how to build an LLM • how to do pretraining • how to do finetuning …all from scratch, then @rasbt book is the best resource I have found. Each chapter is hands-on and written in an easy-to-follow style. Truly a masterpiece on technical
@rasbt
Sebastian Raschka
3 months
If you are looking for something to read this weekend, I am happy to share that Chapter 7 on instruction finetuning LLMs is now finally live on the Manning website: This is the longest chapter in the book and takes a from-scratch approach to implementing
Tweet media one
22
271
2K
5
80
443
@virattt
virat
4 months
I’m building an open source financial agent for fun. Goal is to explore generative UI. Under the hood: • uses @LangChainAI agent + tools • uses @vercel ai sdk • uses @polygon_io financials The code is live on github. Thanks to the great @SullyOmarr for the starter code.
13
34
387
@virattt
virat
5 months
I just migrated my financial RAG evals to LangSmith. Previously, I was doing evals by hand. Now, LangSmith takes care of: • managing datasets • evaluating correctness • measuring latency • visualizing prediction vs. answer These features come built-in. My financial RAG
Tweet media one
16
50
426
@virattt
virat
7 months
I am blown away by RAGAS With 10 lines of code, I created a question + answer dataset of Airbnb's latest annual report (10-K). The dataset has 3 parts: • questions • contexts • ground truth answers Next step: Evaluate how well various LLMs perform RAG on financial
Tweet media one
12
62
425
@virattt
virat
6 months
Can LLMs have infinite context? Researchers from Google say yes. A new paper proposed Infini-attention, which lets LLMs have infinite context. How Infini-attention works: • has local attention like any transformer • has global attention via compression • combines local and
Tweet media one
21
56
417
@virattt
virat
7 months
Reading SEC filings with Instructor Can an LLM read an SEC filing and output structured data? Yes and it is really easy with Instructor. Entire Setup: • store 10-K in vector DB • define @pydantic model • pass 10-K and model to instructor • call LLM using instructor With
Tweet media one
11
45
417
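A minimal sketch of that flow. The field names in the pydantic model are invented for illustration, and the 10-K excerpt is a placeholder for chunks retrieved from the vector DB:

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

class IncomeStatement(BaseModel):
    revenue: float
    net_income: float
    fiscal_year: int

client = instructor.from_openai(OpenAI())
filing_text = "...relevant 10-K excerpt retrieved from the vector DB..."

# instructor forces the LLM output to validate against the pydantic model.
statement = client.chat.completions.create(
    model="gpt-4",
    response_model=IncomeStatement,
    messages=[{"role": "user",
               "content": f"Extract the income statement fields:\n{filing_text}"}],
)
print(statement.model_dump())
```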
@virattt
virat
3 months
The beta version of my stock market API is live. To start, you can fetch: • income statements • balance sheets • cash flow statements In the beta, you can access up to 10 tickers. Once I set up auth and API keys, I'll expand coverage to 10,000+ stocks. All data goes back
Tweet media one
35
31
417
@virattt
virat
3 months
I finetuned a 124M param LLM for sentiment classification today. Given financial news article, detect if sentiment is positive or negative. Model setup: • gpt-2 pretrained weights • 12 transformer blocks • 124M trainable params • 1024 context window I only used 1,208 rows
Tweet media one
32
47
414
@virattt
virat
5 months
Exploring LLM Pricing 💰 I updated my table to include llama 3. The table now has 4 pricing tiers: • tier 1 starts at $0.25 • tier 2 starts at $12.00 • tier 3 starts at $24.00 • tier 4 starts at $42.00 To get "total cost", I combine input cost and output cost. The
Tweet media one
21
71
404
@virattt
virat
6 months
Open source financial agent 🤖 The github repo is live. You can now run the agent in your browser via LangServe. Things the agent can do: • Get prices for stocks • Get financials for stocks • Get market news for stocks I will be adding more features to the repo over the
12
50
401
@virattt
virat
7 months
Exploring LLM Pricing With so many new LLMs, how do API costs compare? I delved into cost comparisons of models that I would use in production. Main takeaways: • cohere leads with cost-effective model • gpt-3.5 remains excellent value • mistral cost higher than anticipated
Tweet media one
23
74
397
@virattt
virat
3 months
I launched API docs for my stock market API today. If you are building: • ai financial agents • stock analysis tools • quant trading models ..then this API is for you. The API offers: • income statements • balance sheets • cash flow statements You can actually call the
33
28
389
@virattt
virat
8 months
Cohere reranking is seriously good. Today, I expanded the RAG reranking tests that I'm running to include Cohere. Overall Test: • Reranking Airbnb's 10-K, as before Reranking Speeds: • 0.24 secs for Cohere • 1.04 secs for ColBERT • 25.47 secs for GPT-4 Turbo • 50.94
Tweet media one
19
60
388
@virattt
virat
8 months
Corrective RAG (CRAG) What happens when the RAG retrieval step performs poorly? A recent paper proposed CRAG, which improves the robustness of RAG systems. CRAG uses T5 to calculate the relevance score of retrieved documents. Relevance scores: • Correct • Incorrect •
Tweet media one
8
77
387
@virattt
virat
9 months
Perplexity AI's team is brilliant. I've been impressed with how fast and helpful their Copilot is. The Copilot is fast because Perplexity was using a fine-tuned GPT-3.5 as of Aug 2023. The Copilot is helpful because it's constantly fine-tuned and has real-world knowledge. In
Tweet media one
13
31
375
@virattt
virat
6 months
Friday is LLM battle day. I added DBRX to the financial metrics challenge. Overall, very impressed with DBRX. Main takeaways: • correctly calculated metrics • ranked top 4 fastest models • competitive pricing DBRX was +50% cheaper and +100% faster than models in its tier.
Tweet media one
14
76
375
@virattt
virat
3 months
I am exploring LLM finetuning this week Two approaches I'm interested in: 1 • finetuning the entire LLM 2 • finetuning only part of the LLM Finetuning the entire LLM is exactly what you think. You update all of the weights in the transformer blocks. Finetuning only part of
Tweet media one
11
46
374
@virattt
virat
9 months
Earlier today, @LangChainAI announced LangGraph. LangGraph lets us build language agents as graphs. The interface is pretty clean. And I just used LangGraph to build a financial agent graph. My graph has two tools: • extract ticker from user query • get latest price for the
Tweet media one
10
49
373
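A sketch of a two-tool graph like the one described, using LangGraph's StateGraph. The node bodies are stubs; real usage would call an LLM to parse the query and a market-data API for the price:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class AgentState(TypedDict):
    query: str
    ticker: str
    price: float

def extract_ticker(state: AgentState) -> dict:
    return {"ticker": "NVDA"}   # stub: an LLM call would parse state["query"]

def get_latest_price(state: AgentState) -> dict:
    return {"price": 123.45}    # stub: a market-data API call

graph = StateGraph(AgentState)
graph.add_node("extract_ticker", extract_ticker)
graph.add_node("get_latest_price", get_latest_price)
graph.set_entry_point("extract_ticker")
graph.add_edge("extract_ticker", "get_latest_price")
graph.add_edge("get_latest_price", END)

app = graph.compile()
print(app.invoke({"query": "What is NVDA trading at?"}))
```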
@virattt
virat
1 year
This morning, I spent some more time playing around with @LangChainAI and @pinecone . This time, I did question answering over Airbnb's last 3 annual reports (PDFs). • Less than 50 lines of code • All in Python • Code linked below Exploration was inspired by @mayowaoshin
7
30
362
@virattt
virat
7 months
Exploring LLM Pricing New models have come out since I last shared my pricing table. New models: • command-r (cohere) • mixtral 8x7B (groq) • claude 3 (anthropic) Main takeaways: • cohere offers excellent value for cost • groq mixtral cheaper than mistral mixtral • opus
Tweet media one
12
61
365
@virattt
virat
4 months
I am finetuning llama 3 (8b) on SEC filings Goal: ace financial Q&A and launch in production. Step 1 is data collection: • pick a ticker (eg $NVDA) • grab its SEC filings • generate question + answer dataset • upload dataset to huggingface for reuse So far, I have created
Tweet media one
14
54
359
@virattt
virat
6 months
I never enjoyed parsing SEC filings for LLM use. Now, I never need to manually parse them again. I just came across edgartools, which is an open source library for easily accessing SEC EDGAR. With edgartools, we can: • query filings for a ticker • extract items from filings
Tweet media one
9
36
342
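The basic edgartools pattern, per the library's README at the time (verify method names against current docs):

```python
from edgar import Company, set_identity

# The SEC requires a contact identity on EDGAR requests.
set_identity("Jane Doe jane.doe@example.com")

company = Company("AAPL")
filings = company.get_filings(form="10-K")  # query filings for a ticker
latest = filings.latest()
tenk = latest.obj()                         # parse into a structured 10-K object
print(tenk["Item 1"][:500])                 # extract an item, e.g. Business
```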
@virattt
virat
5 months
I found a new tool calling champion Llama3 70b on @GroqInc Challenge: given user query, extract financial quarters and years. Example: "How did revenue change between Q4 2023 and year before that?" The 70b model: • passed the task • was very fast • had best pricing I
Tweet media one
22
34
334
@virattt
virat
4 months
I'm training a small LLM this weekend. Found some cool Llama 2 facts while researching. Time to train in GPU hours: • 7B param took 184,320 GPU hours • 13B param took 368,640 GPU hours • 34B param took 1,038,336 GPU hours • 70B param took 1,720,320 GPU hours How about in
Tweet media one
11
59
329
@virattt
virat
8 months
Cohere is excellent at query rewriting Today, I added Cohere and Mistral to my RAG query rewriting explorations. Four models in total: • Cohere • GPT-3.5 • GPT-4 • mistral-medium The input query was ambiguous: "What's up with Airbnb's numbers"? Rewritten query results
Tweet media one
12
41
319
@virattt
virat
6 months
Open Source SEC Filing Reader 📊 The code is now more production-ready. Implementation updates: • used XBRL instead of raw 10-K text • used gpt-4 for better extractions I first downloaded XBRL financial data from EDGAR. Then extracted financial statements from XBRL using
Tweet media one
12
34
310
@virattt
virat
2 months
I am rebuilding my AI financial agent. Fully open source. Runnable locally. Today's change adds stock price charts. How it works: • ask a question • agent selects best tool • agent renders UI components • agent answers question @LangChainAI is perfect for this project:
24
29
315
@virattt
virat
7 months
Financial RAG Evaluation 🕵️ Which LLM can answer financial questions quickly and correctly in a RAG pipeline? This morning, I compared 3 models: • haiku (anthropic) • gpt-3.5 turbo (openai) • command-r (cohere) Overall, haiku was fastest while command-r was most correct.
Tweet media one
15
58
309
@virattt
virat
6 months
Introducing financial-datasets In 5 lines of code, generate datasets from SEC filings. The financial datasets are useful for: • LLM evaluation • LLM fine-tuning • and more The repo is live and fully open source. pip install financial-datasets to get started.
Tweet media one
5
35
301
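Roughly what those 5 lines look like. The class and method names below follow my reading of the repo's README and should be treated as approximate rather than authoritative:

```python
from financial_datasets.generator import DatasetGenerator  # name per the repo README

generator = DatasetGenerator(model="gpt-4-turbo", api_key="your-openai-key")
dataset = generator.generate_from_10K(
    ticker="ABNB",
    year=2023,
    max_questions=10,
)
for item in dataset.items[:3]:
    print(item.question, "->", item.answer)
```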
@virattt
virat
6 months
My financial RAG dataset is live 🗃️ You all asked me to share the 100 question dataset. So, here it is. Dataset details: • 100 questions on Airbnb 2023 10-K • synthetically generated via opus The dataset is tiny right now, but I will continue expanding it. Eventually, I
Tweet media one
10
30
296
@virattt
virat
6 months
Open Source SEC Filing Reader 📊 Another fun update today. I used mistral-7b to extract all financial statements from a 10-K. Extracted statements: • income statement • balance sheet • cash flow statement This was really cool since mistral-7b is free and open source. I
Tweet media one
12
38
276
@virattt
virat
1 year
So, I'm learning how to build LLM-powered chat apps that are more production-ready, from scratch. I've created an open-source repo that contains my ongoing explorations. The stack: • Django backend • React frontend • @LangChainAI agents • Websocket protocol GitHub below.
Tweet media one
Tweet media two
15
30
284
@virattt
virat
5 months
I am diving into LLM fine-tuning. There is a lack of deep tech content on fine-tuning: • how it works • why it works • what it does to an LLM, etc. There is a ton of high-level stuff, however. I want to grok the first principles of fine-tuning. If you have an excellent
18
24
287
@virattt
virat
5 months
LLM Pricing Tiers 💰 I just updated my pricing table. The table now indicates 3 pricing tiers: • tier 1 is $1 to $7 • tier 2 is $12 to $24 • tier 3 is $42 to $120 To calculate "total cost", I combined input cost and output cost. I use a 3:1 ratio, assuming there are 3
Tweet media one
7
46
284
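The blended-cost formula is simple to check by hand: with a 3:1 input:output token ratio, total cost = 3 × input price + 1 × output price, per 1M tokens each. Prices below are illustrative:

```python
def total_cost(input_price_per_m: float, output_price_per_m: float) -> float:
    """Blended cost assuming 3 input tokens for every 1 output token."""
    return 3 * input_price_per_m + output_price_per_m

# Illustrative checks against the tiers above:
print(total_cost(0.25, 1.25))  # 2.0   -> lands in tier 1 ($1 to $7)
print(total_cost(15.0, 75.0))  # 120.0 -> the top of tier 3
```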
@virattt
virat
3 months
I rewrote my open source + gen ui financial agent today. The new stack: • python backend • nextjs frontend All powered by LangGraph from @LangChainAI . You can run the full-stack application locally on your machine. So far, I've added a tool for charting stock prices. Next
Tweet media one
14
32
276
@virattt
virat
6 months
Open source financial agent 🤖 Our agent can now do basic valuation. I added a tool for calculating intrinsic value via discounted cash flow analysis. We now have 5 tools: • get intrinsic value for ticker • get latest price for ticker • get latest news for ticker • get
Tweet media one
16
35
279
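A minimal discounted-cash-flow sketch of what such an intrinsic-value tool could compute. The growth, discount, and terminal assumptions are placeholders, not the agent's actual parameters:

```python
def intrinsic_value(fcf: float, growth: float = 0.05, discount: float = 0.10,
                    terminal_growth: float = 0.02, years: int = 10) -> float:
    value = 0.0
    cash_flow = fcf
    for year in range(1, years + 1):
        cash_flow *= 1 + growth
        value += cash_flow / (1 + discount) ** year       # discount each year's FCF
    terminal = cash_flow * (1 + terminal_growth) / (discount - terminal_growth)
    value += terminal / (1 + discount) ** years           # discounted terminal value
    return value

print(round(intrinsic_value(fcf=100.0), 2))  # value per 100 units of current FCF
```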
@virattt
virat
6 months
LLM Pricing + Speed + Quality 💰 I ran tests comparing inference cost, quality, and speed today. Task: Financial Metrics Calculation. Given JSON of financial statements, calculate financial metrics. Experiment setup: • 3 financial calculations • 10 iterations per model •
Tweet media one
5
39
276
@virattt
virat
8 months
I have found my RAG rerankers I spent the past few days testing rerankers and there are two that I'll use going forward. • Cohere • ColBERT Both performed as well as GPT-4 in reranking quality and are lightning fast. Cohere's avg inference time was ~200ms. It's
18
29
274
@virattt
virat
5 months
I added arena elo to my LLM pricing table The score is pulled from @huggingface Initial takeaways: • llama 3 70b is game changing • haiku remains excellent value • gemini 1.5 pro is exceptional • gpt-4 turbo reigns supreme My table is sorted by arena elo, desc. Happy to
Tweet media one
15
39
270
@virattt
virat
6 months
Financial RAG eval just got spicier 🌶️ Cohere launched Command R+ today. I tested the LLM and it scored a 70.12% on financial RAG. That is the highest score I have seen on this eval to date. Excellent work from the @cohere team. Command R+ has over 100B parameters, so
Tweet media one
10
37
270
@virattt
virat
7 months
Open source financial agent 🤖 I just added a new tool to @LangChainAI and am super excited for it. This tool fetches daily prices for a ticker. We now have 4 tools in total: • get latest price • get latest news • get financials • get historical prices The possibilities
Tweet media one
12
33
255
@virattt
virat
8 months
ColBERT reranking continues to impress Previously, I compared reranking speeds of ColBERT and GPT-4 Turbo. Today, I added mistral-medium to the mix. Overall Test: • Reranking Airbnb's 10-K, like before. Reranking Speeds: • 1.04 secs for ColBERT • 25.47 secs for GPT-4 Turbo
Tweet media one
10
44
264
@virattt
virat
6 months
I am fine-tuning my Warren Buffett LLM The toughest part is creating datasets. Not anymore. In 1 line of code, financial-datasets now creates datasets from Buffett’s letters. This works for any PDF. How to use: • set PDF url • set max questions • generate dataset I can
Tweet media one
8
34
250
@virattt
virat
8 months
Exploring Corrective RAG in code A few days ago, @LangChainAI released an excellent cookbook on implementing CRAG. I reused the cookbook to implement a simple financial assistant. My setup: • use vector db for SEC filings • use Tavily for web search To test the CRAG flow, I
Tweet media one
4
48
253
@virattt
virat
7 months
My open source financial agent 🤖 This is a new side project that I'm building for fun. It'll begin in colab, so you can run all of my code as I implement it. Two tools to start: • latest price for ticker • latest news for ticker Right now, it can answer: "What is the
Tweet media one
17
36
249
@virattt
virat
3 months
I finally read up on LoRA last night. LoRA can reduce finetuning params by 10,000 times. High-level implementation: • we freeze original LLM weights • we create small, low-rank matrices • we only train small matrices • we adjust LLM output w/ small matrices Instead of
Tweet media one
5
43
252
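The core idea in a few lines of PyTorch: freeze W, train only the small low-rank matrices A and B, and add their product to the frozen layer's output. Sizes here are illustrative:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, linear: nn.Linear, rank: int = 8, alpha: int = 16):
        super().__init__()
        self.linear = linear
        self.linear.weight.requires_grad_(False)   # freeze original LLM weights
        self.A = nn.Parameter(torch.randn(linear.in_features, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, linear.out_features))
        self.scale = alpha / rank

    def forward(self, x):
        # Frozen path plus the trained low-rank adjustment.
        return self.linear(x) + (x @ self.A @ self.B) * self.scale

layer = LoRALinear(nn.Linear(768, 768, bias=False))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 12,288 trainable vs 589,824 frozen in this layer
```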
@virattt
virat
10 days
@sakethkotamraju Missed opportunity to call it Scammers will love this
1
3
251
@virattt
virat
7 months
Cohere's command-r is solid. The model launched today and is optimized for RAG. I ran it through my financial RAG evaluation pipeline vs. gpt-3.5 turbo. command-r won. Financial RAG Eval Setup: • naive RAG (no reranking, etc.) • 100 questions on Airbnb's 2023 10-K •
Tweet media one
7
40
250
@virattt
virat
5 months
I studied word embeddings today. Mainly, how LLMs like GPT-4 convert input text into input embeddings. It’s simpler than I expected. There are five key steps: 1. Convert input text to input tokens. 2. Map tokens to token IDs. Common vocab size is ~50K tokens. 3. Create
Tweet media one
6
27
247
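The first steps, sketched with tiktoken and a toy embedding table. The embedding layer here is randomly initialized (not a real model's weights), and the 768 dimension is GPT-2 small's, chosen for illustration:

```python
import tiktoken
import torch
import torch.nn as nn

text = "Attention is all you need"
enc = tiktoken.get_encoding("gpt2")          # ~50K vocab, matching the note above
token_ids = enc.encode(text)                 # steps 1-2: text -> tokens -> token IDs
print(token_ids)

embedding = nn.Embedding(enc.n_vocab, 768)   # step 3: token IDs -> learned vectors
vectors = embedding(torch.tensor(token_ids))
print(vectors.shape)                         # (num_tokens, 768)
```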
@virattt
virat
7 months
Open source financial agent update 🤖 I just added a tool that lets us retrieve financials. Our agent can now get: • income statements • balance sheets • cash flow statements We can ask questions like: "What is $ABNB's latest net income?" Upcoming tools: • get
Tweet media one
7
32
245
@virattt
virat
6 months
Cmd R+ beats Sonnet at financial RAG I initially assumed these models were equivalent due to pricing. However, command r+ was both faster and 5% more correct than Claude Sonnet on financial RAG evals. Financial RAG pipeline: • openai embeddings • cosine similarity retrieval
Tweet media one
10
37
244
@virattt
virat
6 months
Financial RAG Evaluation 🕵️ I added reranking to the pipeline today. As expected, command-r performed even better. Main takeaways: • command-r excels at RAG • cohere reranking is seriously fast • gpt-3.5 slow at reranking, fine without Experiment setup: • included
Tweet media one
9
31
245
@virattt
virat
1 year
@burrytracker “This makes up 93% of his portfolio.” It actually doesn’t. His 13-F holdings, which are what you’re looking at, don’t contain his cash and non-US positions.
12
1
235
@virattt
virat
3 months
Our OSS financial agent now has a python backend. I migrated the agent code to use the latest @LangChainAI gen ui framework • set up langgraph • set up langserve • set up fastapi • set up agent tools Super excited about this project. We'll have a true client + server app
Tweet media one
8
28
239
@virattt
virat
3 months
My stock market API landing page is live. Initial focus is fundamentals data. • starting with 10,000 stocks • optimized for LLMs and AI agents • no subscriptions or contracts • simple and clean API The waitlist is now live 🙏 I am setting aggressive goals for myself.
Tweet media one
25
21
237
@virattt
virat
4 months
My fine-tuning journey begins today I am training llama 3 8b to create high quality datasets for financial Q&A. Fine-tuning approach: • create high quality datasets via gpt-4o • fine-tune llama 3 on datasets • evaluate performance I am using my financial-datasets library to
13
21
236
@virattt
virat
7 months
Using an LLM to evaluate an LLM Yesterday, I shared initial thoughts on LLM evaluation. One method was LLM-as-judge. Turns out, there is an excellent paper on it from Jan 2024: "Leveraging Large Language Models for NLG Evaluation: A Survey" My 3 favorite techniques: •
Tweet media one
14
41
231
@virattt
virat
6 months
I am pumped about today's update 🧪 In 1 line of code, financial-datasets lets you create Q&A datasets from a 10-K. Just specify: • ticker • year • max questions And financial-datasets takes care of the rest. No need to manually download, parse, and chunk SEC filings ever
Tweet media one
6
24
226
@virattt
virat
6 months
Fine-tuning a Warren Buffett LLM 🧠 Exciting update today. I generated a question + answer dataset using Berkshire's 2023 annual letter. Dataset schema: • question • answer • context The synthetic dataset contains 110 generated questions. Next step is to generate
Tweet media one
15
31
214
@virattt
virat
4 months
I’m learning how to build an LLM from scratch. Found some fun facts today. As we know, GPT-3 has 175B params. To train GPT-3 from scratch: • takes 355 years with single V100 • takes 665 years with single RTX 8000 The V100 is a data center GPU and would cost ~$4.6M. The
8
25
206
@virattt
virat
6 months
Financial RAG Evaluation🕵️ Haiku got lots of excitement yesterday. I ran it through my financial RAG eval pipeline today. Haiku was fast, but struggled on correctness versus similar models. Cmd-r remained financial RAG champ. Main takeaways: • haiku faster than gpt-3.5 •
Tweet media one
7
34
202
@virattt
virat
1 year
Lots of excitement around BabyAGI and AutoGPT. Meanwhile, I’m still trying to understand how @LangChainAI Agents and Tools work on a deeper level. Creating a simple tutorial on Agents this weekend. I’m curious as to what use cases folks would find helpful in the tutorial.
21
8
198
@virattt
virat
4 months
Our open source financial agent can now show price charts for multiple stocks 📈 On the fly. Using generative UI. Only took 10 mins to add thanks to @LangChainAI tools and @vercel ai sdk. Current agent tools: • show price charts • show latest news • show current price One
Tweet media one
14
24
204
@virattt
virat
5 months
Understanding LLM attention is tough. I will simplify how it works. The attention mechanism has 3 steps: 1 • compute attention scores 2 • compute attention weights 3 • compute context vectors Main goal of self-attention is step 3, computing context vectors. What are
Tweet media one
5
37
198
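The three steps for a single head, simplified to the case where queries, keys, and values are the inputs themselves (no trained weight matrices):

```python
import torch

torch.manual_seed(0)
inputs = torch.randn(6, 4)                   # 6 tokens, 4-dim embeddings

scores = inputs @ inputs.T                   # 1: attention scores (dot products)
weights = torch.softmax(scores / inputs.shape[-1] ** 0.5, dim=-1)  # 2: normalize
context = weights @ inputs                   # 3: context vectors

print(weights[0].sum())  # each row of attention weights sums to 1
print(context.shape)     # (6, 4): one context vector per token
```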
@virattt
virat
5 months
How GPT-4o predicts the next token 🎭 I have covered: • simple attention (linked) • trainable attention (linked) Next is causal attention. Causal attention is a fancy term for masking future tokens. It builds on top of trainable attention. The main change is applying a
Tweet media one
5
26
197
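The masking step in isolation: future positions are set to -inf before the softmax, so each token can only attend to itself and earlier tokens:

```python
import torch

torch.manual_seed(0)
scores = torch.randn(6, 6)                              # raw attention scores

mask = torch.triu(torch.ones(6, 6), diagonal=1).bool()  # True above the diagonal
scores = scores.masked_fill(mask, float("-inf"))        # hide future tokens
weights = torch.softmax(scores, dim=-1)

print(weights[0])  # token 0 attends only to itself
print(weights[3])  # token 3 attends to tokens 0-3; positions 4-5 are zero
```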
@virattt
virat
5 months
I tried LangSmith evaluation for financial RAG today. Pleased to report it does a bunch of heavy lifting. • loading dataset • creating RAG pipeline • running evaluator My favorite is that eval results are automatically displayed in real-time UI. Before, I was tracking
Tweet media one
6
26
193
@virattt
virat
3 months
Anthropic launched claude 3.5 sonnet today. In the release, agentic coding evals caught my attention. How agentic coding eval works: • claude reads an open source codebase • claude gets instruction (fix bug, etc.) • claude creates action plan • claude implements required
Tweet media one
3
18
168
@virattt
virat
6 months
Open source financial agent 🤖 We are running on LangServe. This means that we can chat with the agent in our browser. Once I have access to Hosted LangServe, I will deploy the agent to production. Current agent tools: • get latest price for ticker • get latest news for
14
24
171
@virattt
virat
5 months
@gaganbiyani “The app is fairly useless for language learning” Disagree. Learning is what you make of it. Duolingo is a great starting point for grokking initial conversation + dialogue, which can very well lead to further language learning.
5
3
176
@virattt
virat
1 year
Code is here: I'm not using @OpenAI 's GPT-4 for this, but an older model. Please let me know if you have any feedback! My goal is to make learning about LLMs as accessible as possible for everyone 🙂
5
14
173
@virattt
virat
4 months
I just loaded pretrained GPT-2 weights into my own custom LLM. Why I think this is cool: • our model has a great starting point • we can run the model for free • we can finetune the model easily • we can customize the model architecture • we ultimately own the model By
Tweet media one
4
26
170
@virattt
virat
6 months
I've been mesmerized by generative UI So, I decided to figure out how it works. Turns out, rendering agent output in UI components is easier than expected. Main steps: • define tools that your agent can use • map each tool to a UI component • maintain agent state (eg.
6
22
169
@virattt
virat
5 months
Query transformation via tool calling I am trying to create perfect queries for my vector DB. Challenge: given user query, extract financial quarters and years. Query: "How did revenue change between Q4 2023 and year before that?" • years: [2023, 2022] • quarters: [4, 4]
Tweet media one
8
24
166
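A sketch of that extraction challenge with OpenAI tool calling. The schema, tool name, and model choice are assumptions for illustration, not necessarily the original setup:

```python
import json
from openai import OpenAI

client = OpenAI()
tools = [{
    "type": "function",
    "function": {
        "name": "set_query_filters",
        "description": "Record the fiscal years and quarters a query refers to.",
        "parameters": {
            "type": "object",
            "properties": {
                "years": {"type": "array", "items": {"type": "integer"}},
                "quarters": {"type": "array", "items": {"type": "integer"}},
            },
            "required": ["years", "quarters"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user",
               "content": "How did revenue change between Q4 2023 and year before that?"}],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "set_query_filters"}},
)
print(json.loads(resp.choices[0].message.tool_calls[0].function.arguments))
# expected: {"years": [2023, 2022], "quarters": [4, 4]}
```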
@virattt
virat
8 months
Listwise Reranking with LLMs I came across this paper that proposes Listwise reranking of retrieved documents for RAG. Two reranking approaches: • pointwise reranking • listwise reranking Pointwise reranking Given list of documents, we feed query + each document individually
Tweet media one
3
26
159
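The contrast between the two approaches, shown as prompt construction only (the paper's exact templates aren't reproduced here):

```python
query = "What drives Airbnb's revenue?"
docs = ["doc A ...", "doc B ...", "doc C ..."]

# Pointwise: one prompt per document, each scored independently.
pointwise_prompts = [
    f"Query: {query}\nDocument: {d}\nRate relevance from 0 to 10:" for d in docs
]

# Listwise: a single prompt over all documents, ranked jointly.
numbered = "\n".join(f"[{i}] {d}" for i, d in enumerate(docs))
listwise_prompt = (
    f"Query: {query}\n{numbered}\n"
    "Return the document indices ordered from most to least relevant:"
)
print(pointwise_prompts[0])
print(listwise_prompt)
```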
@virattt
virat
3 months
I added a classification head to my 124M param LLM today. Goal is to finetune the LLM for binary classification. Current architecture: • 1024 context window • 12 transformer blocks • 124M parameters • 1 output head for binary classification How does this differ from a
Tweet media one
11
20
159
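The head swap in isolation, with a stand-in module for the 124M-param GPT (only the output layer change matters here):

```python
import torch
import torch.nn as nn

class TinyGPT(nn.Module):
    def __init__(self, vocab=50257, dim=768):
        super().__init__()
        self.backbone = nn.Identity()          # placeholder for the 12 blocks
        self.out_head = nn.Linear(dim, vocab)  # original next-token head

model = TinyGPT()
model.out_head = nn.Linear(768, 2)  # swap in a 2-class head for binary classification

# Classification typically uses the last token's logits.
hidden = torch.randn(1, 1024, 768)             # (batch, context window, dim)
logits = model.out_head(model.backbone(hidden))[:, -1, :]
print(logits.shape)                            # (1, 2)
```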
@virattt
virat
1 month
I trained a 1.5B param LLM on 10-Ks All from scratch. It took ~60 seconds on an A100. Training details: • 50,000 tokens in data set • 1600 embedding dimensions • 1024 context window • 48 transformer blocks • 25 attention heads • 10 epochs total I previously
Tweet media one
5
19
155