Ivan Leo

@ivanleomk

2,293 Followers · 900 Following · 408 Media · 3,021 Statuses

I work on applied AI projects and spend my spare time tinkering with models for fun. I write longer tweets at

Singapore
Joined March 2011
Pinned Tweet
@ivanleomk
Ivan Leo
26 days
Using Whisper is so 2023. Just use Gemini: pass in the raw audio and prompt the model directly with all the questions you have. With instructor, we can get - the exact mispronounced word - the timestamp when we said it - advice on how to do better Flash truly is the
1
0
24
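The workflow in this tweet can be sketched as a structured response model; the field names below are my own guesses for illustration, not instructor's or the tweet's exact schema:

```python
from pydantic import BaseModel


class PronunciationFeedback(BaseModel):
    """Structured feedback we want the model to return for an audio clip.
    Field names are hypothetical, chosen to mirror the three items in the tweet."""
    mispronounced_word: str
    timestamp: str  # e.g. "00:42"
    advice: str


# With instructor, a Gemini client patched via something like
# instructor.from_gemini(...) would be asked for this model directly:
#   feedback = client.create(
#       response_model=PronunciationFeedback,
#       messages=[...audio part + questions...],
#   )
fb = PronunciationFeedback(
    mispronounced_word="nuclear",
    timestamp="00:42",
    advice="Stress the first syllable",
)
```

The point of the pattern is that validation failures on the model trigger automatic re-asks instead of silently malformed JSON.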
@ivanleomk
Ivan Leo
9 months
I've had some people ask me for advice on ML, so I compiled some of my thoughts into 3 tips that I wish I had thought of/done more of when I first started. I'm honestly still a beginner, but I figured I'd offer some thoughts on how someone can juggle this with a full-time job
9
34
280
@ivanleomk
Ivan Leo
11 months
Literally got into machine learning because of @jeremyphoward 's FastAI course a few months ago. This is surreal. This just made my entire week.
Tweet media one
9
4
182
@ivanleomk
Ivan Leo
3 months
Save 50% on your OpenAI bill (even on fine-tuned models) with batch jobs using Instructor. Just use our new BatchJobs object :)
1
3
109
@ivanleomk
Ivan Leo
5 months
@NielsHoven This was my experience to some degree. I was using algebra to solve problems in the 5th grade but was penalised due to the education system requiring people to use models for math. I do think I would have discovered how much more fun math was if I had been able to explore/learn
3
2
82
@ivanleomk
Ivan Leo
11 months
We had some good results before, but with some careful data selection, our fine-tuned GPT-3.5 models managed to generate summaries an order of magnitude faster and better than GPT-4. Read here:
Tweet media one
7
6
81
@ivanleomk
Ivan Leo
5 months
Officially unemployed today - next step Bangkok for some Muay Thai and much needed rest before starting full time with @jxnlco soon. Time to speed run some distributed Rust for fun
7
2
80
@ivanleomk
Ivan Leo
1 month
1/ If you're building a RAG application, these problems probably sound familiar: 1. Irrelevant search results 2. Insufficient data to create a database index 3. Multiple data sources that are out of sync 4. Untested LLM agents How do these problems manifest?
3
7
71
@ivanleomk
Ivan Leo
10 months
tl;dr : Do give @magicpatterns a try if you've got some time. It's pretty handy. I managed to generate the following UI in about 1-2 hours of work using their product including setting up all the dependencies from scratch with @shadcn 's ui package. The design itself is based off
Tweet media one
5
4
65
@ivanleomk
Ivan Leo
9 months
I used >70 A10Gs for 4 hours on demand for batch jobs on @modal_labs. Insane to see that I only spent < 100 bucks lol.
3
3
65
@ivanleomk
Ivan Leo
7 months
If you’re using colab to prototype, I highly suggest using a @modal_labs instance instead. You’ll pay per second and get much better GPUs and performance. I was able to download all my Nvidia libraries within around 10s yesterday. Colab took almost 10 minutes.
5
3
56
@ivanleomk
Ivan Leo
20 days
Spent some time today working on a data annotation tool to clean some web data after seeing @HamelHusain 's AI Engineer talk on data annotation tools. Took me ~10 mins to spin up with O1 and cursor and now I've annotated around 50 different chunks to be used for evaluation.
Tweet media one
2
4
53
@ivanleomk
Ivan Leo
10 months
Using @modal_labs for batch jobs is like: damn, I found a weird bottleneck, time to just use .map and 50 GPUs to get a 500x speed up lmfao
1
4
52
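Modal's `.map` fan-out behaves like a distributed version of a local executor map: same call shape, but each invocation lands in its own container. A local stand-in using only the standard library:

```python
from concurrent.futures import ThreadPoolExecutor


def embed_chunk(chunk: str) -> int:
    # Stand-in for real GPU work; on Modal this body would run
    # inside its own container per input.
    return len(chunk.split())


chunks = ["some web data", "more web data", "even more"]

# Locally this fans out over threads; on Modal the equivalent is
# roughly fn.map(chunks), which fans the same call out over many
# containers instead of threads.
with ThreadPoolExecutor(max_workers=8) as pool:
    counts = list(pool.map(embed_chunk, chunks))
```

The appeal is that the parallelism knob (50 GPUs vs 8 threads) changes without touching the per-item function.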
@ivanleomk
Ivan Leo
6 months
@culturaltutor Singapore would not be where it is without air conditioning. Lee Kuan Yew famously said that one of the greatest inventions of the 20th century was the air conditioning unit haha
2
0
53
@ivanleomk
Ivan Leo
6 months
Can't wait to get started! We've got so many amazing projects in the pipeline for everyone :)
@jxnlco
jason liu
6 months
I'm officially boutique consulting firm now that @ivanleomk is joining me full time in a month!
7
1
92
8
1
51
@ivanleomk
Ivan Leo
6 months
Integrating Logfire with Instructor is dead simple. With just these 4 lines of code, you'll get full transparency into your entire application - from parsing errors of individual requests to the latency of individual functions down to the millisecond
Tweet media one
7
5
49
@ivanleomk
Ivan Leo
7 months
Reading the @deepseek_ai MoE paper and it's really interesting how they decided to implement shared experts which are always used. Would love to see a comparison of these shared experts/MoE using the token visualisation that Mixtral did
Tweet media one
1
6
48
@ivanleomk
Ivan Leo
1 month
Enjoy up to 90% cost savings with Anthropic's new prompt caching feature. It's still in beta at the moment, but using it in instructor takes just a few steps. First, let's initialise an Instructor instance
Tweet media one
3
3
48
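The caching feature works by marking the large, reused prefix of the prompt so subsequent calls can hit the cache. A hand-built sketch of that message shape, with a placeholder standing in for the real shared context:

```python
# Sketch of Anthropic-style prompt caching: the long, reused prefix
# block is tagged with cache_control so later requests can reuse it.
# The context text below is a placeholder, not real content.
system_blocks = [
    {
        "type": "text",
        "text": "<a very long shared context that every request reuses>",
        "cache_control": {"type": "ephemeral"},
    }
]

messages = [
    {"role": "user", "content": "Summarise the shared context in one line."}
]
```

Only the tagged prefix is cached; the short per-request question at the end stays cheap to vary.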
@ivanleomk
Ivan Leo
6 months
Just ran 110 experiments with different hyper-parameters to find the best ones for my model with @modal_labs . All it took was 20 bucks, 40 A10Gs and containers. god bless
3
3
46
@ivanleomk
Ivan Leo
28 days
I get asked every now and then about whether to use @LangChainAI or instructor My general advice normally boils down to 1. How familiar are you with either library 2. How long are you maintaining this code for This is kind of how I've been thinking about it.
4
2
44
@ivanleomk
Ivan Leo
4 months
My first Agent!
Tweet media one
3
1
44
@ivanleomk
Ivan Leo
10 months
Constraining LLMs to use APIs without hallucinating is hard. JSON mode is a tad unreliable and offering a $200 tip to GPT isn't feasible in this economy with inflation. Instead, just define a @pydantic class and force the model to output the parameters you want using
Tweet media one
4
3
42
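A minimal sketch of the "define a @pydantic class" approach; the API-call model below is invented for illustration, and the patched-client call is shown as a comment since it needs a live key:

```python
from pydantic import BaseModel, Field


class SearchAPICall(BaseModel):
    """Parameters the LLM must fill in, instead of emitting free-form JSON."""
    query: str = Field(description="Search keywords only, no full sentences")
    limit: int = Field(ge=1, le=50)  # out-of-range values fail validation


# With instructor, an OpenAI client patched via instructor.from_openai(...)
# would be called with response_model=SearchAPICall; validation errors on
# the constrained fields trigger automatic re-asks:
#   call = client.chat.completions.create(
#       model="gpt-4", response_model=SearchAPICall, messages=[...])
call = SearchAPICall(query="rust atomics", limit=10)
```

The `ge`/`le` bounds are the part doing the constraining: a hallucinated `limit` of 9000 never reaches your API.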
@ivanleomk
Ivan Leo
6 months
If you're using FastAPI and Instructor, Logfire is a perfect fit. We've just published a new guide with complex examples to show how to 1. Use streaming to consume extracted objects faster 2. Take advantage of asyncio to run multiple instructor calls in parallel
Tweet media one
1
3
38
@ivanleomk
Ivan Leo
8 months
@allgarbled I think it takes a certain level of courage to post about your learning process because it means opening it up to the world to critique and see. Much easier to just criticise without adding anything of value.
0
2
37
@ivanleomk
Ivan Leo
3 months
I am now a newfound convert of XML for prompting
10
1
37
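XML-style prompting mostly boils down to wrapping each section of the prompt in unambiguous tags so the model can't confuse instructions with data. A small helper (the tag names are arbitrary, not a required schema):

```python
def build_prompt(document: str, question: str) -> str:
    """Wrap prompt sections in XML-style tags so section boundaries
    are unambiguous to the model."""
    return (
        "<document>\n"
        f"{document}\n"
        "</document>\n"
        "<question>\n"
        f"{question}\n"
        "</question>\n"
        "Answer using only the document above."
    )


prompt = build_prompt(
    "Instructor wraps LLM clients to return validated Pydantic models.",
    "What does Instructor do?",
)
```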
@ivanleomk
Ivan Leo
5 months
Finished up more of the rust exercises in @algo_luca 's 100 exercises of rust. I'm pretty pumped to finally understand what a trait in rust is. Just three more modules and then I start on Rust Atomics and Locks recommended by @diptanu . Instructor-rs coming soon maybe?
Tweet media one
2
1
36
@ivanleomk
Ivan Leo
3 months
Playing with @cohere 's new Structured Extraction mode and it's significantly faster to use for structured extraction. Seems to be ~40% faster than using their normal chat completion and generating raw JSON
3
1
36
@ivanleomk
Ivan Leo
1 month
Want to give O1 a try over the weekend? Just use instructor's new JSON_O1 mode and you're good. Same old familiar API with a single line of code to change
Tweet media one
3
3
33
@ivanleomk
Ivan Leo
3 months
@netcapgirl Maybe the tokenizer is at fault here? Seems to lump 11 and 9 into single digits at the end
Tweet media one
5
1
31
@ivanleomk
Ivan Leo
2 years
Playing around with the new @supabase auth helpers and pleasantly surprised at how fast it was to set everything up - magic link auth in literally 10 minutes. That was including the time to provision the db god damn.
3
3
30
@ivanleomk
Ivan Leo
13 days
TIL that you should not be relying on your embeddings as a hash. Embedding the same sentence 200 times gives 8 different embeddings. Cosine similarity is at least 0.9993, but imagine this inconsistency multiplied across 1000s of embeddings lol. Use FTS peeps
Tweet media one
3
2
31
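Why near-identical embeddings still break hashing can be shown with plain cosine similarity; the vectors below are made up to mimic the float jitter the tweet describes:

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Two "embeddings" of the same sentence, differing only by float noise:
e1 = [0.12, 0.48, 0.31, 0.80]
e2 = [0.12000001, 0.48, 0.31, 0.79999999]

# Nearly identical by cosine similarity...
sim = cosine_similarity(e1, e2)
# ...but not bit-for-bit equal, so they fail as a hash / dedup key:
identical = e1 == e2
```

This is exactly why full-text search (or any exact key) is the safer dedup mechanism.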
@ivanleomk
Ivan Leo
6 months
Walking on a treadmill to burn calories and need some ML content? Here's a running thread which I'll keep up of things I liked that I've watched. Would appreciate any other recommendations too
3
4
30
@ivanleomk
Ivan Leo
9 months
1/ Show your work much earlier that you're comfortable with à la @swyx 's Learn In Public I'm a big fan of volume based outputs ever since @jxnlco introduced it to me and I think sharing my work in public has been the biggest difference for me. Early 2023 I was doing so but
4
2
30
@ivanleomk
Ivan Leo
9 months
2/ Start Top Down As @jeremyphoward says in Fast AI, you don't teach baseball by learning aerodynamics ( paraphrasing a bit here ) I think you shouldn't start looking into ML by finishing the entire linear algebra course from MIT and then learning pytorch before writing/using
1
4
30
@ivanleomk
Ivan Leo
9 months
Managed to implement a small scraper with GPT! I couldn't find a good HTML-to-Markdown parser in Python and had some edge cases with the HTML I was working with. Here's how I did it - first I defined a data model using @pydantic for the final result I wanted.
Tweet media one
1
2
29
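A rough sketch of that "define the data model first" step; the `Article` shape and the tags extracted are hypothetical, and the standard-library HTML parser stands in for whatever pre-cleaning ran before the LLM:

```python
from html.parser import HTMLParser

from pydantic import BaseModel


class Article(BaseModel):
    # Hypothetical target shape for the scraped page.
    title: str
    paragraphs: list[str]


class TextExtractor(HTMLParser):
    """Collects text from <h1> and <p> tags as a pre-processing step,
    before messy fragments get handed to the LLM for structuring."""

    def __init__(self):
        super().__init__()
        self._tag = None
        self.title = ""
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        self._tag = tag

    def handle_data(self, data):
        if self._tag == "h1":
            self.title += data
        elif self._tag == "p":
            self.paragraphs.append(data)

    def handle_endtag(self, tag):
        self._tag = None


parser = TextExtractor()
parser.feed("<h1>Hello</h1><p>World</p>")
article = Article(title=parser.title, paragraphs=parser.paragraphs)
```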
@ivanleomk
Ivan Leo
1 month
Curious about how instructor's response_model parameter works under the hood? I wrote up a short blog post walking you through a high-level overview of how we convert your pydantic model into a validated response. Link:
1
1
29
@ivanleomk
Ivan Leo
11 months
Forked out @abacaj 's replit inference script to create a small Copilot with CPU! These autocompletions run off the older @Replit v1-3b model and directly hook into VScode using a small extension I coded up. Next step is to upgrade context and figure out webviews
4
1
28
@ivanleomk
Ivan Leo
9 months
Woo! The article I worked on all December is finally out. This was an insanely fun one to write with @jxnlco running these massive batch jobs at scale.
@jxnlco
jason liu
9 months
We were able to embed all of Wikipedia in < 15 minutes on Modal Labs for ~ 17$. Here's what this actually means for an organization: 1. Unrestricted Rates: Eliminates the bottleneck of rate limits for large-scale operations. 2. Rapid Experimentation: Allows quick iterations of
24
115
1K
3
2
28
@ivanleomk
Ivan Leo
1 month
RAG doesn't have to be complex. 1. @TimescaleDB for embedding search and metadata filtering 2. @Pydantic w instructor to test your LLMs 3. LLMs to generate metadata at scale Curious to find out how? @jxnlco , @avthars and I wrote about how to do so in
3
3
27
@ivanleomk
Ivan Leo
9 months
@visakanv Actually very interesting point! Can't seem to find an exact source but I remember reading that in Japan for example, ox plows stopped being used for a while because human labor was cheaper. Innovations take flight when they're good but also when conditions are right.
1
0
25
@ivanleomk
Ivan Leo
11 months
Organising a small drinks session for the llm/ml folks in Singapore on 8 December. Come through for some fun conversations, sharing and a good time
2
5
26
@ivanleomk
Ivan Leo
9 months
Used OpenAI's new large model and I keep getting rate limited. I've spent 3 hours trying to slowly pace out my requests and it keeps throwing an error. @cohere on the other hand worked like a charm on the first try! Really highly recommend
@ivanleomk
Ivan Leo
9 months
@cohere ’s new embed job API is very good actually, prob around 5-10x faster than using their previous API. It took a bunch of sentences that I previously spent almost 45 minutes embedding and returned the final product in just under 5-10 minutes.
1
2
16
4
2
26
@ivanleomk
Ivan Leo
6 months
Spent some time today working on the @elicitorg machine learning list. Checked off - 3Blue1Brown videos on Deep Learning - Karpathy's 2 hour video on micrograd - The short introduction article on deep reinforcement learning Good productive Saturday, now time to get back to
5
2
25
@ivanleomk
Ivan Leo
9 months
Moderated a small discussion yesterday in our @latentspacepod paper club by @swyx on the hugging face Mixture Of Experts article with @eugeneyan and we took some notes for the first time! Thought I'd highlight some interesting things that stood out to me (1/n)
1
10
26
@ivanleomk
Ivan Leo
11 months
Spent some time in the evening playing around with @modal_labs 's new volume feature and it's definitely a huge step up. On @TheBlokeAI 's Open hermes model, I'm getting a speedup of around 10% in terms of raw inference speed for both Cold and Warm endpoints. Volumes also tend
Tweet media one
Tweet media two
2
2
25
@ivanleomk
Ivan Leo
1 year
Finally got around to trying @swyx 's smol-menubar and it's pretty solid. Worked out of the box for me and gave some good responses. Amazing to query 4 diff LLMs at the same time and get a response to your query.
1
3
25
@ivanleomk
Ivan Leo
9 months
Instead, work with a small problem in mind and then go backwards from that to learn what you need. Eg. Start Playing with hosted APIs, then use @modal_labs to see what u can do with OSS models, then go through @karpathy 's videos to learn more about the basics.
1
2
24
@ivanleomk
Ivan Leo
9 months
I'll add an additional tip here - find your people. It takes effort but I think it's worth it. I've been lucky enough to find friends like @aimuggle , @SriniN123 and @hrishioa who are equally interested in this space and doing great work. Twitter helps a lot. Discord groups are
2
2
22
@ivanleomk
Ivan Leo
9 months
Thought it was pretty cool that @LumaLabsAI promoted their internship and was like yeah join us we got 3200 A100 GPUs
Tweet media one
1
3
22
@ivanleomk
Ivan Leo
1 year
Took some time over the week to play around with @swyx 's menubar source code and added in support for @perplexity_ai 's new Llama-2 Hosted runtime. It's so god damn fast, so curious about the source code behind it. Hopefully PR gets merged soon :)
2
4
20
@ivanleomk
Ivan Leo
4 months
Doing good RAG is hard and full of tiny gotchas. Learn real, actionable tips that have worked for our clients deploying RAG systems at scale. We're also capping max sign-ups for this course so that everyone gets enough air time to ask @jxnlco and other
Tweet media one
1
1
21
@ivanleomk
Ivan Leo
6 months
We got an event venue! If you've ever wanted to find out more about RWKV, here's your chance. We've got @picocreator from @RWKV_AI and @recursal_AI sharing about it and more over a round of drinks at @AWSCloudSEAsia Singapore. Shout out to @gabchuayz for helping to pull
1
6
20
@ivanleomk
Ivan Leo
26 days
@BowTiedFox What I found useful was to 1. Take examples of copywriters you like 2. Get Sonnet to extract writing guidelines Rinse and repeat. Works decently well, I would say.
1
3
21
@ivanleomk
Ivan Leo
2 months
@NeelNanda5 @3blue1brown @senr For a moment I thought 3blue1brown was Neel Nanda and my two worlds collided
1
0
21
@ivanleomk
Ivan Leo
9 months
There are 3 ways to improve complex extractions using LLMs 1. use instructor w/ @pydantic 2. define relationships 3. incrementally update state Now also available on NPM at @instructor -ai/instructor
1
3
19
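The "define relationships" tip above can be sketched with nested Pydantic models that reference each other by id; the entity names here are invented for illustration:

```python
from pydantic import BaseModel, Field


class Character(BaseModel):
    id: int
    name: str
    # Relationship expressed as references to other extracted entities,
    # which the model fills in during extraction.
    friend_ids: list[int] = Field(default_factory=list)


class Extraction(BaseModel):
    characters: list[Character]


# The shape an LLM extraction would be validated against:
data = {
    "characters": [
        {"id": 1, "name": "Ivan", "friend_ids": [2]},
        {"id": 2, "name": "Jason"},
    ]
}
extraction = Extraction(**data)
```

Because the ids are plain integers, "incrementally update state" then becomes merging newly extracted entities into a dict keyed by id.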
@ivanleomk
Ivan Leo
4 months
I wrote up some of my thoughts after the AI Engineering Conference featuring some snippets from the talks I liked. Would love to get feedback on it! :)
2
1
20
@ivanleomk
Ivan Leo
28 days
TIL that Gemini's cache has an unlimited TTL, stores up to 1M tokens and supports caching video, audio and even text. Seems like I'll really be switching over quite a few workloads to gemini @OfficialLoganK
Tweet media one
2
1
20
@ivanleomk
Ivan Leo
2 years
@fullStackRacc Lol why use typescript when you can have millions of users testing your code for free in parallel in production. Only cowards use type safety
0
1
20
@ivanleomk
Ivan Leo
4 months
SF has left me incredibly energized and happy with good discussions and the pleasant weather. I've been amazed at the quality of discussions here about all things LLMs and can't wait for the week to come :)
2
1
19
@ivanleomk
Ivan Leo
7 months
Built a lil cli with @tiangolo 's typer to ingest data and run some queries using @lancedb with the help of @jxnlco . Excited to ship more things out of this little cli
3
2
18
@ivanleomk
Ivan Leo
7 months
Interested in learning more about RWKV and what @recursal_AI is doing with this potential transformer killer? @aimuggle , @gabchuayz and I are organising a fun night of drinks with @picocreator - CEO of Recursal AI coming up in 2 weeks! Come by for good vibes and a great
3
4
19
@ivanleomk
Ivan Leo
1 month
The 1.5 billion free tokens with gemini have really been useful while I run my tests to verify code changes. Fun stuff coming to instructor soon!
3
1
19
@ivanleomk
Ivan Leo
11 months
If you're experimenting with open source models, you should definitely look into working with Modal's volume functionality. I spent some time working with it and I'm quite impressed. Works straight out of the box and provides a consistently fast experience for cold and warm
@ivanleomk
Ivan Leo
11 months
Spent some time in the evening playing around with @modal_labs 's new volume feature and it's definitely a huge step up. On @TheBlokeAI 's Open hermes model, I'm getting a speedup of around 10% in terms of raw inference speed for both Cold and Warm endpoints. Volumes also tend
Tweet media one
Tweet media two
2
2
25
2
1
18
@ivanleomk
Ivan Leo
9 months
I wrote down some reflections on my experience teaching myself ML. I'm still a beginner with much more to learn but this is basically what I've told most people who've asked me for advice. I wrote a longer article here:
@ivanleomk
Ivan Leo
9 months
I've had some people ask me for advice on ML, so I compiled some of my thoughts into 3 tips that I wish I had thought of/done more of when I first started. I'm honestly still a beginner, but I figured I'd offer some thoughts on how someone can juggle this with a full-time job
9
34
280
1
1
17
@ivanleomk
Ivan Leo
6 months
Doing a presentation next week on a brief introduction to LLMs. I tried to condense some of the things I had learnt over the last ~6 months into a single article which can be read in under 15 mins
3
0
19
@ivanleomk
Ivan Leo
11 months
It’s a wrap on the meetup today organised with @eugeneyan , met a lot of new friends and caught up with old ones. Excited for what’s to come with another round of drinks and a paper club en route with @aimuggle and @SriniN123 in January
Tweet media one
5
3
16
@ivanleomk
Ivan Leo
1 year
Spent the day playing with @pydantic , Rich by @willmcgugan and selenium in order to build a small script to extract out all my saved google map restaurants. I'd say that ever since I started using Pydantic, input validation has never been easier
1
3
18
@ivanleomk
Ivan Leo
3 months
I'm thoroughly confused why there's vertex-ai and then there's the gemini api which has a completely different route.
9
0
19
@ivanleomk
Ivan Leo
4 months
The number of insanely interesting talks at the AI Engineering conference is incredible lol
2
0
18
@ivanleomk
Ivan Leo
8 months
Today I learnt that the C4 Dataset came from the Google T5 paper - that's pretty crazy ngl. This paper has so many gems honestly, if you're interested in learning more or discussing your thoughts on the paper, swing by the @latentspacepod discord later today for our weekly paper
Tweet media one
2
2
18
@ivanleomk
Ivan Leo
6 months
RNNs finally clicked thanks to @patloeber 's RNN tutorial. Next step is LSTMs, but this tutorial series really helped me with pytorch. Everything makes sense lol
Tweet media one
2
1
17
@ivanleomk
Ivan Leo
11 months
Spent some time watching @dontusethiscode 's old PyCon keynote "So You Want to Be a Python Expert" and it's phenomenal. I'd been avoiding generators and metaclasses for a while and I think this has convinced me to give them a second shot.
2
3
18
@ivanleomk
Ivan Leo
8 months
Reading the BERT paper for the first time for the asia paper club run inside the @latentspacepod discord and it's a good reminder of how far things have come. Pretty crazy that back then they were blown away by 110-340M parameters and now we're looking at models that are
Tweet media one
2
2
18
@ivanleomk
Ivan Leo
9 months
3/ Read papers - I think this was very daunting for me and I've been doing it for some time. The benefit of most LLM research is that it's a lot more digestible than say traditional ML. People spend months of their lives to generate results and insights but you can get it in
1
1
17
@ivanleomk
Ivan Leo
1 month
With instructor 1.4.1, using Gemini is easy, even with Vertex. 1. Declare a Pydantic model 2. Use Gemini's multimodal messages 3. Enjoy automatic retries, validation and even streaming out of the box Thanks to @sonalsaldanha for the hard work implementing this!
3
2
17
@ivanleomk
Ivan Leo
2 months
You can get better results from your LLM by 1. Performing the task manually by hand first 2. Generating a few samples for few shots 3. Getting claude to rewrite your prompt after your first few rounds 4. Looking at the data Reminded of all 4 today after working with
1
0
17
@ivanleomk
Ivan Leo
3 months
Generating large quantities of synthetic data at scale can get expensive. So we built out an easy integration for OpenAI's Batch Jobs using instructor
1
2
16
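Under the hood, OpenAI batch jobs consume a JSONL file of request objects; building one line with the standard library (the model name is just an example):

```python
import json


def batch_line(custom_id: str, prompt: str, model: str = "gpt-4o-mini") -> str:
    """One JSONL line in the shape the OpenAI Batch API expects:
    a custom_id to match results back up, plus the embedded request."""
    return json.dumps(
        {
            "custom_id": custom_id,
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
    )


prompts = ["hello", "world"]
jsonl = "\n".join(batch_line(f"req-{i}", p) for i, p in enumerate(prompts))
```

The integration's job is mostly generating these lines from your Pydantic model and parsing the results file back into validated objects.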
@ivanleomk
Ivan Leo
2 months
Man literally taught me how BERT and so many other models worked with his videos, so excited to see him on twitter
@hkproj
Umar Jamil
2 months
My new video on how to code a Multimodal (Vision) Language Model from scratch using only Python and PyTorch while explaining every single concept step by step. Link: #tutorial #pytorch #python #coding #fromscratch #llm #paligemma #gemma #visionmodel
21
155
925
0
0
15
@ivanleomk
Ivan Leo
4 months
First thing in the US is a big fat slice of pizza
Tweet media one
5
0
16
@ivanleomk
Ivan Leo
9 months
Fantastic turnout tonight at the AI Engineer Meetup! Thanks to everyone that came, was great to meet more people in the Singapore scene :)
Tweet media one
2
1
16
@ivanleomk
Ivan Leo
9 months
@cohere ’s new embed job api is very good actually, prob around 5-10x faster than using their previous API It took a bunch of sentences that I previously spent almost 45 minutes embedding and returned the final product in just under 5-10 minutes.
1
2
16
@ivanleomk
Ivan Leo
9 months
When you get rate limited by OpenAI and u crash ur entire batch job of 500,000 sentences even with 7x retries and backoffs of 3 mins
Tweet media one
3
0
16
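The 7x-retries-with-backoff setup mentioned here is the standard exponential backoff loop; a stdlib-only sketch, with `RuntimeError` standing in for a real rate-limit exception class:

```python
import time


def with_backoff(fn, retries=7, base_delay=1.0, sleep=time.sleep):
    """Retry fn with exponential backoff (1s, 2s, 4s, ...).
    As the tweet shows, this still fails when the provider-side limit
    is simply too low for the batch size."""
    for attempt in range(retries):
        try:
            return fn()
        except RuntimeError:  # stand-in for a rate-limit error class
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2**attempt)


# A fake flaky call that succeeds on the third try:
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429: rate limited")
    return "ok"


result = with_backoff(flaky, sleep=lambda s: None)  # skip real sleeping
```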
@ivanleomk
Ivan Leo
8 months
Just finished another great paper club session covering BERT and here are 3 things I found super interesting about the paper. Thanks again to @swyx , @woof_hs and @bryanblackbee for dropping by and sharing about your experience working with BERT
Tweet media one
2
0
16
@ivanleomk
Ivan Leo
9 months
Use abstractions first, then throw them away. I'd also add here that it's ok to not know everything. I think this was something I struggled with a lot at the start - you've got a billion things to know and there's more to know/do the more you learn. Just compare yourself to who
2
1
16
@ivanleomk
Ivan Leo
3 months
Save more than 50% on your OpenAI bill today using just Pydantic and Instructor by running your first Batch Job. Use the same familiar syntax in Instructor that you've grown to love. This works for fine-tuned models too :)
1
2
15
@ivanleomk
Ivan Leo
4 months
Thanks to @gabrielchua_ for having me! the talk can also be titled ‘Start with @lancedb full text search pls’ haha
@gabrielchua_
gabriel
4 months
@ivanleomk sharing on some advanced rag techniques maximising performance, moving beyond vibe checks, and how to make best use of @lancedb @the_builderclub
Tweet media one
Tweet media two
3
3
18
3
3
15
@ivanleomk
Ivan Leo
4 months
Probably read like 25 different papers just this week on prompting... the range is so wide. Some just say add this phrase, while others build entire pipelines just to optimize the example choice. It's quite fascinating
2
0
15
@ivanleomk
Ivan Leo
7 months
Managed to get flash attention working in a jupyter notebook on @modal_labs . Thanks to @akshat_b for providing a useful code snippet.
Tweet media one
1
0
15
@ivanleomk
Ivan Leo
9 months
Working on implementing a simplified numpy version of Attention for the Paper Club this Thursday; going to work on Multi-Headed Attention tomorrow. PR is open if anyone would like to give some feedback:
2
3
15
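A simplified numpy attention really is short; a self-contained sketch of scaled dot-product attention (single head, random toy matrices):

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq, seq) similarity scores
    weights = softmax(scores, axis=-1)  # each row is a distribution
    return weights @ V, weights


rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 tokens, d_k = 8
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, w = attention(Q, K, V)
```

Multi-headed attention is this same function run on h sliced projections of Q/K/V, with the outputs concatenated back together.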
@ivanleomk
Ivan Leo
17 days
General thoughts on gemini after blowing 60 USD on it in the last 2 weeks 1. You need much more detailed instructions, especially for pro 2. Flash is a good model, but when it comes to writing it seems to output consistently shorter text
7
0
14
@ivanleomk
Ivan Leo
9 months
I'm a bit of a perfectionist so this took some getting used to. I think the biggest shift for me was to move from feature-complete to forcing myself to prioritise things that can be done in 1-2 weeks. Life often gets in the way and so making sure that you're working on smaller
1
1
14
@ivanleomk
Ivan Leo
5 months
@NielsHoven I think this is very true! I got into programming because I had incredible friends who showed me how fun it was. I imagine math might have been the same
1
0
13
@ivanleomk
Ivan Leo
6 months
Created a small pomodoro timer with NextJS backed by D1 and deployed on @cloudflare with the help of @magicpatterns. Thought I'd showcase some of the nice components they created
3
2
14
@ivanleomk
Ivan Leo
8 months
We r so back lol - pretty phenomenal lineup this week for the @latentspacepod discord's paper club We've got @wregss and @BhattGantavya presenting on their paper Matryoshka Representation Learning which Open AI uses under the hood for their new embedding model We've got
1
4
14
@ivanleomk
Ivan Leo
5 months
Ran my first 12k today, amazing start to the week
2
0
14
@ivanleomk
Ivan Leo
4 months
This was one of the best talks at AI Eng for sure
@ExaAILabs
Exa
4 months
How does Exa serve billion-scale vector search? We combine binary quantization, Matryoshka embeddings, SIMD, and IVF into a novel system that can beat alternatives like HNSW. @shreyas4_   gave a talk today at the @aiDotEngineer World's Fair explaining our approach! ⬇️
12
56
464
1
0
14
@ivanleomk
Ivan Leo
8 months
@jh3yy A bit surprised at some of the replies to be honest, it's not an uncommon question. it's 2024, figured we'd have moved past that lol.
3
0
14
@ivanleomk
Ivan Leo
30 days
Unlock the full potential of your audio and video content with Instructor's new list support for Gemini in 1.4.3. Trying out the new pro and flash models has never been easier
Tweet media one
2
2
13
@ivanleomk
Ivan Leo
7 months
I've seen long bibliographies but 486 cited sources in a single paper takes the cake. Reading through this very comprehensive summary of LLMs over the past year with @bryanblackbee doing a recap tomorrow in the @latentspacepod . Can't wait to see how he's going to organise
Tweet media one
3
5
13