Greg Diamos Profile
Greg Diamos

@GregoryDiamos

3,009 Followers
104 Following
47 Media
633 Statuses

Lamini | I build AI supercomputers

Joined April 2013
Pinned Tweet
@GregoryDiamos
Greg Diamos
8 months
If you are an AI startup blocked on GPUs, send me a note. At Lamini, we figured out how to use AMD GPUs, which gives us a relatively large supply compared to the rest of the market.
28
120
1K
@GregoryDiamos
Greg Diamos
9 months
Lamini now supports JSON output at full speed. We guarantee that your LLM produces a valid JSON object according to your spec. Works for any LLM we support, e.g. Mistral 7B, shown below. API Docs:
4
18
220
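A minimal client-side sketch of that guarantee, assuming the lamini Python package's generate/output_type interface; the model name, prompt, and schema here are illustrative:

from lamini import Lamini

llm = Lamini(model_name="mistralai/Mistral-7B-Instruct-v0.1")

# output_type declares the schema; decoding is constrained server-side so
# the response is always a valid JSON object with exactly these fields
result = llm.generate(
    "Summarize AMD's quarter in one sentence and rate the tone 1-10.",
    output_type={"summary": "str", "tone": "int"},
)
print(result["summary"], result["tone"])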
@GregoryDiamos
Greg Diamos
8 months
@remusrisnov IMO, ROCm is 90% of the way there, but the market treats it like it is 0% of the way there. We filled in the 90%->100% gaps for LLM finetuning and inference with a team of a few good hackers.
5
13
195
@GregoryDiamos
Greg Diamos
7 months
New LLMs for new years! On my vacation I streamed a 5-hour walkthrough of building an optimized medical LLM from scratch on an AMD supercomputer, from a beach in Hawaii. All of the code is open source! via @YouTube
2
25
139
@GregoryDiamos
Greg Diamos
3 years
I'm so excited to release two giant speech datasets today, with clean CC licenses for academic and commercial use! Special shout out to Daniel Galvez, Mark Mazumder, and the whole team who put in a huge effort to create these.
3
28
132
@GregoryDiamos
Greg Diamos
5 months
Here's a new earnings call Q&A dataset I made with Lamini. It has millions of questions and answers generated using a Lamini LLM reading earnings call transcripts. Still uploading - 10k finished so far - should have about 1M by tonight
6
15
112
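A sketch of how such a dataset-generation loop can look, reusing the structured-output call shown earlier; the chunking and prompt wording are assumptions, not Lamini's published recipe:

from lamini import Lamini

llm = Lamini(model_name="mistralai/Mistral-7B-Instruct-v0.1")

def qa_from_transcript(transcript, chunk_chars=3000):
    # split the call transcript into prompt-sized chunks
    chunks = [transcript[i:i + chunk_chars]
              for i in range(0, len(transcript), chunk_chars)]
    for chunk in chunks:
        # one analyst-style Q&A per chunk, schema-constrained
        yield llm.generate(
            "Read this earnings call excerpt and write one question an "
            "analyst would ask, plus its answer.\n\n" + chunk,
            output_type={"question": "str", "answer": "str"},
        )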
@GregoryDiamos
Greg Diamos
1 year
Excited to release 1.109 billion times faster LLM switching using a PEFT cache in HBM. See the blog for details of how it works. Finding an opportunity for a 1 billion times speedup suggests that we are just scratching the surface of fine tuning custom LLMs.
@LaminiAI
Lamini
1 year
Training multiple LLMs taking forever? 😤 Costing you a fortune?💸 Enter PEFT! Get ready to multiply!! 🚀 1000 models, just 1 machine! 🤖 3 months of training -> 3 milliseconds ⚡️ Just one API call, load and train with Lamini! 👉 👀
1
7
42
5
6
107
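The mechanism behind adapter switching, sketched with Hugging Face peft rather than Lamini's actual (unpublished) implementation: keep one base model resident in HBM and cache the small LoRA adapters, so "switching" LLMs is a pointer flip instead of a full model reload.

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16, device_map="cuda")

model = None
cached = set()

def switch(adapter_id):
    # load each LoRA adapter into GPU memory once, then just flip to it
    global model
    if model is None:
        model = PeftModel.from_pretrained(base, adapter_id,
                                          adapter_name=adapter_id)
    elif adapter_id not in cached:
        model.load_adapter(adapter_id, adapter_name=adapter_id)
    cached.add(adapter_id)
    model.set_adapter(adapter_id)
    return model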
@GregoryDiamos
Greg Diamos
4 months
Hiring a world class ML Engineer at Lamini. Drop me a note if you want to push model accuracy to the limit on unlimited AMD compute.
9
15
96
@GregoryDiamos
Greg Diamos
3 months
This is the paper that convinced me, showing that a Frankenstein CUDA cluster could beat a 10,000-CPU MapReduce cluster
@karpathy
Andrej Karpathy
3 months
# CUDA/C++ origins of Deep Learning

Fun fact: many people might have heard about the ImageNet / AlexNet moment of 2012, and the deep learning revolution it started. What's maybe a bit less known is that the code backing this winning submission to the
166
901
7K
1
12
89
@GregoryDiamos
Greg Diamos
9 months
Nice work replicating this. The recipe is out. AMD GPUs work.
@cognitivecompai
Cognitive Computations
9 months
I've been working on my AI servers in my garage. (8x AMD Instinct mi100) Yesterday I got the first server running and inferencing with oobabooga. Now duplicating the os drive to get the 2nd server running. Then once infiniband is setup I will start working on getting
42
34
474
3
12
84
@GregoryDiamos
Greg Diamos
5 months
We are hiring an HPC (MPI / OpenAI Triton) Engineer at Lamini. Apply here: We are inventing and building the largest AMD LLM training system in the world. Join us in strong scaling to 1000s of GPUs and beyond.
2
16
85
@GregoryDiamos
Greg Diamos
10 months
Hit me up if you need GPUs for finetuning LLMs. We are bringing online more capacity together with AMD. Start training on AMD in 3 lines of code:

pip install lamini

from lamini import LlamaV2Runner
model = LlamaV2Runner()
model.load_data(path=...)
model.train(args=...)
@LisaSu
Lisa Su
10 months
Love working with @LaminiAI and @realSharonZhou making LLMs easy and accessible for all on @AMD @AMDInstinct GPUs! So cool what can be done with @LaminiAI LLM Superstations!!
18
58
270
2
13
72
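The snippet above, filled out as a runnable sketch; the dataset path is hypothetical and the elided train(args=...) is left at defaults:

# pip install lamini
from lamini import LlamaV2Runner

model = LlamaV2Runner()                      # Llama 2 behind Lamini's API
model.load_data(path="data/finetune.jsonl")  # hypothetical dataset path
model.train()                                # args were elided in the tweet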
@GregoryDiamos
Greg Diamos
8 months
I remember the same time; it wasn’t luck. My first exposure to the CUDA vision was from John Nickolls. He very clearly saw it as becoming the dominant form of computing.
@ctnzr
Bryan Catanzaro
8 months
I worked at Intel on Larrabee applications in 2007. Then I went to NVIDIA to work on ML in 2008. So I was there at both places at that time and I can say: NVIDIA's dominance didn't come from luck. It came from vision and execution. Which Intel lacked.
44
195
1K
3
5
67
@GregoryDiamos
Greg Diamos
7 years
We used a supercomputer to perform the largest study to date of how deep learning scales up with more data and faster computers. It turns out to be simple and predictable across diverse applications.
0
27
53
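For reference, the paper (Hestness et al. 2017, "Deep Learning Scaling is Predictable, Empirically") fits learning curves to a power law in training set size m:

error(m) ≈ α · m^(-β) + irreducible error

with measured exponents β roughly in the 0.07 to 0.35 range depending on the application (treat the exact range as approximate).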
@GregoryDiamos
Greg Diamos
5 months
Training LLMs in the Wild Wild West - or how I learned to stop worrying and do multi-node training on AMD GPUs.
0
6
52
@GregoryDiamos
Greg Diamos
6 years
Our latest enhancement to text to speech makes knowledge distillation much easier to train and results in high quality models that are also extremely efficient to deploy on GPUs.
0
15
48
@GregoryDiamos
Greg Diamos
5 months
Now over 1 million Q&As - I'm going to stop now. Generating this dataset using Claude 3 would have cost about $50,000. Enjoy it for free.
@GregoryDiamos
Greg Diamos
5 months
Here's a new earnings call Q&A dataset I made with Lamini. It has millions of questions and answers generated using a Lamini LLM reading earnings call transcripts. Still uploading - 10k finished so far - should have about 1M by tonight
6
15
112
3
7
46
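The $50,000 figure checks out on a back of the envelope; the per-pair token counts and Claude 3 Opus list pricing below are assumptions, not from the tweet:

n_pairs = 1_000_000
in_tok, out_tok = 1_500, 300               # assumed tokens per Q&A pair
in_price, out_price = 15 / 1e6, 75 / 1e6   # Claude 3 Opus $ per token
cost = n_pairs * (in_tok * in_price + out_tok * out_price)
print(f"${cost:,.0f}")                     # $45,000 -- near the quoted $50k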
@GregoryDiamos
Greg Diamos
8 months
@belacquant What info do you need? AMD CDNA has a whitepaper: ROCm is open source: Some of our SW is open and documented: Good docs take time, we will keep publishing.
2
6
49
@GregoryDiamos
Greg Diamos
5 years
I'm excited to pursue a new adventure.
6
1
46
@GregoryDiamos
Greg Diamos
6 years
Check out , a community supported benchmark for measuring deep neural network performance on software frameworks and hardware accelerators.
0
17
41
@GregoryDiamos
Greg Diamos
7 months
First MI300 live in production.
@realSharonZhou
Sharon Zhou
7 months
> rocm-smi like freshly baked bread, 8x MI300X is online if you're building on open LLMs and you're blocked on compute, lmk. Everyone should have access to this wizard technology called LLMs. that is to say, the next batch of @LaminiAI LLM pods are here.
16
36
228
1
8
39
@GregoryDiamos
Greg Diamos
10 months
Love deploying LLMs with ROCm on AMD Instinct GPUs.
@LisaSu
Lisa Su
10 months
Love working with @LaminiAI and @realSharonZhou making LLMs easy and accessible for all on @AMD @AMDInstinct GPUs! So cool what can be done with @LaminiAI LLM Superstations!!
18
58
270
2
3
37
@GregoryDiamos
Greg Diamos
8 months
@remusrisnov @realGeorgeHotz There are oh so many. IOMMU, I'm looking at you! It was essential to use MI class servers, e.g. MI210/MI250/MI300. I read that ROCm on Radeon/RDNA has gotten better recently, but last year it was not pretty and we have moved on completely to CDNA by now.
1
0
36
@GregoryDiamos
Greg Diamos
5 months
An LLM is a database that speaks English instead of SQL and runs on a GPU instead of a disk.
7
5
32
@GregoryDiamos
Greg Diamos
3 months
Announcing Lamini's Series A to build more powerful supercomputers, enabling enterprises to build their own Expert LLMs.
@realSharonZhou
Sharon Zhou
3 months
Super excited to announce our Series A!! ✨ @LaminiAI Raises $25M For Enterprises To Develop Top AI Capabilities In-House ▫️ We have incredible enterprise customers who are able to build LLMs with capabilities that exceed general LLMs, e.g. don't hallucinate on their revenue
43
35
438
4
2
29
@GregoryDiamos
Greg Diamos
10 months
Using Lamini, ROCm finally matches CUDA in SW compatibility for LLMs. It took years to get this to work, but it finally does. This blog begins to explain how it works; it was very unintuitive. More technical details to come...
@realSharonZhou
Sharon Zhou
10 months
Excited to announce a HUGE secret with @LisaSu : @LaminiAI has been building LLMs on @AMD GPUs *in production* for over the past year! We’ve made running LLMs on AMD super easy and a highly competitive option through our LLM Superstation, available now at ~10x lower cost than
36
103
782
0
7
26
@GregoryDiamos
Greg Diamos
11 months
I enjoyed Song Han's brain dump of 400+ slides & videos worth of content on MLSys methods. Interesting mix of practical methods like data/model parallelism, and bleeding edge research topics from automl to quantum ML
0
4
25
@GregoryDiamos
Greg Diamos
3 months
GPUs are currently disrupting cloud - if you walk into a datacenter you can see it happening
7
0
24
@GregoryDiamos
Greg Diamos
5 months
Excited to partner with @Meta to bring Llama finetuning to every enterprise.
@realSharonZhou
Sharon Zhou
5 months
Excited to partner with @Meta ! We have a ton of great F500 enterprises, building on open foundation models like Llama 2: ▫️ Deploying Lamini Stack on premise ▫️ Scaling on Lamini Instances in the cloud Let me know if you want to chat 🙌
5
12
96
2
2
23
@GregoryDiamos
Greg Diamos
5 years
My son opened his first GPU with TensorCores. Which one will learn faster?
4
0
23
@GregoryDiamos
Greg Diamos
3 months
Jensen said the same thing to us internally in 2009 - stand on the shoulders of giants “on CPUs and C” - and use the GPU to reach higher. In hindsight it’s amazing to see that CUDA enabled deep learning, which enabled LLMs.
@JonErlichman
Jon Erlichman
3 months
2024 valuations: Nvidia: $2.2 trillion Intel: $132 billion 2009 valuations: Intel: $90 billion Nvidia: $5 billion
190
4K
24K
0
2
23
@GregoryDiamos
Greg Diamos
10 months
@DylanOnChips @LaminiAI @AMD @nvidia @CRN It took years to get this to work, but ROCm finally can run a full training and inference stack at hundred GPU scale.
1
4
24
@GregoryDiamos
Greg Diamos
10 months
@LisaSu @LaminiAI @realSharonZhou @AMD @AMDInstinct We have been running on hundreds of @AMDInstinct GPUs in production for over one year. The hard part of this space has been software, but we are now at the point where ROCm works for LLMs.
1
1
19
@GregoryDiamos
Greg Diamos
4 months
Awni is one of my favorite people to work with ever
@awnihannun
Awni Hannun
4 months
We’re hiring people to work with us on MLX. If you’re interested, can write fast GPU kernels, and have machine learning experience, reach out. More here:
18
91
765
1
1
19
@GregoryDiamos
Greg Diamos
6 months
What worked for me was asking what I really cared about, and then just starting to build it.
@karpathy
Andrej Karpathy
6 months
Hi everyone yes, I left OpenAI yesterday. First of all nothing "happened" and it’s not a result of any particular event, issue or drama (but please keep the conspiracy theories coming as they are highly entertaining :)). Actually, being at OpenAI over the last ~year has been
2K
1K
23K
0
1
19
@GregoryDiamos
Greg Diamos
10 months
@DylanOnChips @jaygoldberg @AMD @nvidia I mean that we can run LLM training and inference for models that we use in production like Llama 2 (70B, Code, etc). It needs to work (usability) and be fast (e.g. close to peak TOPs). It's not easy to do this. Only ROCm and CUDA work out of the platforms I've tried.
1
3
17
@GregoryDiamos
Greg Diamos
4 months
Tons of code. If you expect someone else to write your code while you work on algorithms, get out. No salary. If you want a $1 million comp package, no way.
27
0
17
@GregoryDiamos
Greg Diamos
1 year
Check out @LaminiAI , the LLM engine that gives every developer the superpowers that took the world from GPT-3 to ChatGPT! Excited to work with @realSharonZhou to build this!
@realSharonZhou
Sharon Zhou
1 year
I’m super excited to announce @LaminiAI , the LLM engine that gives every developer the superpowers that took the world from GPT-3 to ChatGPT! We make it easy to rapidly train custom LLMs from @OpenAI @EleutherAI @Cerebras @Databricks @HuggingFace @Meta 🧵
94
605
3K
0
2
18
@GregoryDiamos
Greg Diamos
4 months
You can call Llama 3 on Lamini now. Check out the docs:
0
3
16
@GregoryDiamos
Greg Diamos
6 months
More AMD GPUs on the way
@LaminiAI
Lamini
6 months
Our 2024 first startup cohort is working hard at building LLMs on Lamini 💪 🌶️ We are now accepting applications for our next batch in March. If you are an early-stage startup building LLM applications and needing compute, please apply now! 🙌 🥳
0
3
15
0
1
15
@GregoryDiamos
Greg Diamos
8 months
@kadokaelan Thanks, opened
0
0
13
@GregoryDiamos
Greg Diamos
5 years
The MLPerf training paper is live! Can we improve ML training speed by more than 50% per year?
0
6
15
@GregoryDiamos
Greg Diamos
11 months
I didn’t realize how unique the NVIDIA flat culture was until I left. Jensen and team got it right and more of the industry should pay attention to how much faster you can innovate if you drop the hierarchy.
@danhockenmaier
Dan Hockenmaier
11 months
@petergyang Interview where he talked about his management style:
10
202
2K
2
0
14
@GregoryDiamos
Greg Diamos
7 years
I’m excited to present our recent work showing that deep learning scaling is simple and predictable at 1:50pm at the Deep Learning at Supercomputer Scale NIPS workshop today.
0
2
13
@GregoryDiamos
Greg Diamos
5 years
Einstein was an amazing research project that succeeded in having an impact on Volta in a surprisingly short amount of time. Industry researchers will know how hard this is to pull off. It wouldn’t have happened without an amazing development team and visionary leadership.
@ctnzr
Bryan Catanzaro
5 years
Indeed, it was. And several important ideas from Einstein’s vision made it into Volta’s design. I think Volta’s improvements over prior GPUs are generally underappreciated. Volta is really a huge change in the way GPUs are built. And now those ideas are in Turing as well.
2
18
53
0
4
14
@GregoryDiamos
Greg Diamos
6 months
It's insane that you can almost fit a 200B parameter FP8 LLM in one MI300 GPU. (192GB of HBM)
@GregoryDiamos
Greg Diamos
7 months
First MI300 live in production.
1
8
39
0
3
14
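The arithmetic behind "almost", as a quick sketch:

params = 200e9                 # 200B parameters
weight_gb = params * 1 / 1e9   # FP8 = 1 byte per parameter -> 200 GB
hbm_gb = 192                   # MI300X HBM capacity
print(weight_gb - hbm_gb)      # 8 GB over, before KV cache and activations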
@GregoryDiamos
Greg Diamos
5 months
I asked 20 engineers if they knew how allreduce works inside of frameworks like Megatron-LM. 0 / 20 explained it correctly. This tech is 10 years old, but it is so hidden in the guts of libraries that few people see it or implement it. Maybe it's time to visit this topic again?
@GregoryDiamos
Greg Diamos
5 months
Training LLMs in the Wild Wild West - or how I learned to stop worrying and do multi-node training on AMD GPUs.
0
6
52
2
0
13
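Since the point is that few people have seen it: a minimal single-process simulation of ring allreduce, the algorithm NCCL/RCCL-backed frameworks like Megatron-LM use under the hood. Each of P ranks owns an array; reduce-scatter followed by all-gather leaves every rank with the elementwise sum while each rank sends only 2*(P-1)/P of the data.

import numpy as np

def ring_allreduce(data):
    # data[r] is rank r's array; split each into P chunks
    P = len(data)
    chunks = [np.array_split(x.astype(float), P) for x in data]

    # Phase 1: reduce-scatter. At step s, rank r sends chunk (r - s) mod P
    # to rank r+1, which accumulates it. After P-1 steps, rank r holds the
    # fully reduced chunk (r + 1) mod P.
    for s in range(P - 1):
        sends = [(r, (r - s) % P, chunks[r][(r - s) % P].copy())
                 for r in range(P)]
        for r, i, payload in sends:
            chunks[(r + 1) % P][i] += payload

    # Phase 2: all-gather. Circulate the reduced chunks around the ring,
    # overwriting stale copies.
    for s in range(P - 1):
        sends = [(r, (r + 1 - s) % P, chunks[r][(r + 1 - s) % P].copy())
                 for r in range(P)]
        for r, i, payload in sends:
            chunks[(r + 1) % P][i] = payload

    return [np.concatenate(c) for c in chunks]

ranks = [np.arange(6) * (r + 1) for r in range(3)]
print(ring_allreduce(ranks))   # every rank ends with [0, 6, 12, 18, 24, 30]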
@GregoryDiamos
Greg Diamos
9 months
Now with code snippets in our Llama 2 70B playground.
1
2
13
@GregoryDiamos
Greg Diamos
4 months
No egos. If your goal is personal fame over building amazing models, this is not the right place for you. No research. If your goal is to publish and rack up citations, this is not the right place.
2
0
13
@GregoryDiamos
Greg Diamos
5 years
SysML hasn’t quite reached NeurIPS level of attendance, but wow, there are a lot of people here!
0
0
12
@GregoryDiamos
Greg Diamos
8 months
@yusufg We have optimized training and inference for LLMs, so we can offer a higher-level interface. Just submit 10k LLM requests to a cluster of 1000s of GPUs and we will handle it for you. You can also drop to a lower level, e.g. PyTorch, BLAS, etc.; that would work fine.
0
1
10
@GregoryDiamos
Greg Diamos
2 months
Lamini is up -
@BenjaminDEKR
Benjamin De Kraker 🏴‍☠️
2 months
ChatGPT, Claude, and Perplexity -- three totally different AI services -- are all down right now. What the actual F is going on?
825
273
3K
1
2
12
@GregoryDiamos
Greg Diamos
9 months
This was a good paper. One of the first clear demonstrations of what you can do with 40 GB of on-chip SRAM.
@CerebrasSystems
Cerebras
9 months
(1/2) Argonne National Laboratory researchers demonstrate the Cerebras CS-2 is 130x faster than NVidia A100 on a nuclear physics simulation workload. Read the paper here:
6
8
44
0
2
12
@GregoryDiamos
Greg Diamos
1 year
Let's democratize LLMs. Thank you @realSharonZhou and @AndrewYNg for creating a simple and accessible 1-hour course.
@realSharonZhou
Sharon Zhou
1 year
Super excited to announce Finetuning LLMs, a short-course made with my friend @AndrewYNg !! 🎉 By importing open-source @LaminiAI core & @HuggingFace & @PyTorch & @Weights_Biases , you can: ✅ Gain an expert's intuition behind finetuning* ✅ Understand how finetuning fits in vs.
18
122
749
1
2
12
@GregoryDiamos
Greg Diamos
3 years
@IanCutress Wow, MI200 is a monster.
1
0
12
@GregoryDiamos
Greg Diamos
5 years
The MLPerf inference benchmark is launched!
0
5
11
@GregoryDiamos
Greg Diamos
6 months
Open source Lamini software development kits are out:
0
2
11
@GregoryDiamos
Greg Diamos
1 year
This is how a tiny diffusion LLM generates a sentence.
0
0
11
@GregoryDiamos
Greg Diamos
4 months
Incredible return to open source!
@_philschmid
Philipp Schmid
4 months
Casual Easter Monday with a huge gift from @OpenAI !🤯 They just released an old GPT-3.5 version. 😍 👉
117
199
1K
2
0
11
@GregoryDiamos
Greg Diamos
9 months
Learn how we built an optimized LLM finetuning system on AMD's ROCm AI stack. Leveraging AMD Instinct GPUs & optimizations for major speedups with more to come! 🚀 👉 Read more in-depth technical details:
0
2
10
@GregoryDiamos
Greg Diamos
9 months
Get ready for bigger LLMs! As more AMD GPU capacity comes online we are opening up big model finetuning and inference to a bigger audience. How did we do it?
0
2
10
@GregoryDiamos
Greg Diamos
6 months
We open sourced the Lamini Python package here: Apache 2 License. Our philosophy is to make our framework code as open source and transparent as possible, because we believe that this is the best way to ensure high quality, simplicity, and security.
0
2
10
@GregoryDiamos
Greg Diamos
7 years
Excited to present our successful validation of mixed precision deep neural network training at @reworkAI !
1
1
10
@GregoryDiamos
Greg Diamos
9 months
We had to write a custom inference server from scratch. We do license it.
@cognitivecompai
Cognitive Computations
9 months
Anyone know how to inference on AMD? tgi and vllm don't work
10
0
13
2
1
8
@GregoryDiamos
Greg Diamos
1 year
Databricks built a high performance and secure data system. I'm excited to work together to enable every business to build and own LLMs trained on their own data. @LaminiAI @databricks . Thank you @matei_zaharia
@matei_zaharia
Matei Zaharia
1 year
This is a big addition to the Databricks platform, letting software vendors reach our 10,000+ customers much more easily! Thanks to our early dev partners @retool , @posit_pbc , @Kumo_ai_team and @LaminiAI !
1
9
51
0
1
9
@GregoryDiamos
Greg Diamos
9 months
These tiny models are so useful for LLM development. The unit tests of LLMs.
@StasBekman
Stas Bekman
9 months
If this is useful for your work, I have just created a <1MB tiny random llama2 model including a tiny 3k tokenizer. This is crucial for extremely fast testing/development. You can easily adapt the tiny model maker script to any other model
15
74
589
1
2
8
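A sketch of the same trick with transformers; the sizes below are illustrative, not the exact config from the linked script:

import torch
from transformers import LlamaConfig, LlamaForCausalLM

# a randomly initialized, tiny Llama for fast unit tests
cfg = LlamaConfig(
    vocab_size=3000, hidden_size=64, intermediate_size=128,
    num_hidden_layers=2, num_attention_heads=4,
    max_position_embeddings=128,
)
tiny = LlamaForCausalLM(cfg)
out = tiny(input_ids=torch.randint(0, cfg.vocab_size, (1, 16)))
print(out.logits.shape)   # torch.Size([1, 16, 3000])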
@GregoryDiamos
Greg Diamos
7 months
Mixtral paper is out. The experts specialize a bit per domain, e.g. math, code.
@dchaplot
Devendra Chaplot
7 months
We just released Mixtral 8x7B paper on Arxiv:
47
495
3K
0
0
9
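A toy version of the routing the paper describes: each token goes to its top-2 of n experts, and their outputs are mixed with gate scores renormalized over the chosen two. The expert bodies here are stand-in MLPs, not Mixtral's:

import torch

def top2_moe(x, gate_w, experts):
    logits = x @ gate_w                    # [tokens, n_experts]
    scores, idx = logits.topk(2, dim=-1)   # top-2 experts per token
    w = torch.softmax(scores, dim=-1)      # renormalize over the two
    out = torch.zeros_like(x)
    for slot in range(2):
        for e, expert in enumerate(experts):
            mask = idx[:, slot] == e       # tokens routed to expert e
            if mask.any():
                out[mask] += w[mask, slot].unsqueeze(-1) * expert(x[mask])
    return out

d, n_experts = 64, 8
experts = [torch.nn.Sequential(torch.nn.Linear(d, 4 * d), torch.nn.SiLU(),
                               torch.nn.Linear(4 * d, d))
           for _ in range(n_experts)]
y = top2_moe(torch.randn(10, d), torch.randn(d, n_experts), experts)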
@GregoryDiamos
Greg Diamos
7 years
Check out our recent work. Deep neural network generalization error scaling is simple and predictable, even for large complex architectures on hard open problems:
0
4
8
@GregoryDiamos
Greg Diamos
8 months
@belacquant We have some docs for LLM training and inference on AMD here: We also wrote a blog post about the AMD tech stack here:
1
0
9
@GregoryDiamos
Greg Diamos
5 years
The MLPerf community has made amazing progress in a short amount of time. I’m really excited to see the inference effort coming together.
0
1
9
@GregoryDiamos
Greg Diamos
8 months
Good old-fashioned ML is dying. LLMs popularized zero-shot learning, or “prompt engineering”, which is drastically easier to use and more effective than labeling data. IMO, it’s only a matter of time before this takes over all of what used to be called “deep learning”.
4
1
8
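What "prompt engineering instead of labeling" means in practice, as a sketch; llm_generate stands in for any chat-completion call and is hypothetical:

def classify(ticket, llm_generate):
    # the label schema lives in the prompt; no training set, no labels
    prompt = ("Label the following support ticket as exactly one of: "
              "billing, bug, feature_request, other.\n"
              f"Ticket: {ticket}\nLabel:")
    return llm_generate(prompt).strip()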
@GregoryDiamos
Greg Diamos
6 years
MLPerf training results are out!
1
6
9
@GregoryDiamos
Greg Diamos
5 years
If you want to partner with our team at Landing AI, we are open to working with you, and I would be excited to explore using cutting-edge AI technology to make your products even better and to level up the AI capability inside your organization.
1
1
9
@GregoryDiamos
Greg Diamos
3 months
We put Llama3 and Phi3 in the Lamini playground. You can submit tons of traffic to them and finetune them now. Playground: API Docs:
2
2
8
@GregoryDiamos
Greg Diamos
3 months
Appreciate your contributions getting us this far - you accelerated the scaling law timeline by years. Can’t wait to see what you do next.
@ilyasut
Ilya Sutskever
3 months
After almost a decade, I have made the decision to leave OpenAI.  The company’s trajectory has been nothing short of miraculous, and I’m confident that OpenAI will build AGI that is both safe and beneficial under the leadership of @sama , @gdb , @miramurati and now, under the
2K
3K
27K
0
2
8
@GregoryDiamos
Greg Diamos
5 years
I'm experimenting with short videos about systems and machine learning. Let me know what you think.
0
1
8
@GregoryDiamos
Greg Diamos
5 months
Without Infiniband and NVLink - we needed to optimize the software in our own collectives and models library to scale. See the section on Lamini collectives in our blog.
@realSharonZhou
Sharon Zhou
5 months
Excited to share how we’re scaling to thousands of GPUs in production! …with multi-node LLM training, on not just Nvidia but @AMD GPUs Details 👉 Great blog by our team, led by Ayushi 💅 tl;dr - Push the limits of training LLMs on enterprise data
6
15
163
1
0
8
@GregoryDiamos
Greg Diamos
5 months
Generate millions of questions and answers from 💰💰💰 earnings call transcripts 💰💰💰 with this simple open source Lamini library. It's easy to make your LLM an expert in the most up to date information. 🚀 🚀🚀
0
1
8
@GregoryDiamos
Greg Diamos
4 months
@nbrempel No “ai salary” - we pay above market rate for similar stage startups
1
0
4
@GregoryDiamos
Greg Diamos
7 years
Getting ready to survey progress on AI Chips and AI Datacenters at #OReillyAI
0
0
8
@GregoryDiamos
Greg Diamos
9 months
Why tokenize when you can just predict pixels? Reading papers is more fun than Twitter. Here's a fun one I went back over.
0
2
8
@GregoryDiamos
Greg Diamos
5 years
Are you an outstanding engineer who wants to break into AI? Reach out to me to learn more.
0
0
7
@GregoryDiamos
Greg Diamos
1 year
Public alpha is here: HPC LLM fine-tuning!
@realSharonZhou
Sharon Zhou
1 year
Excited to announce @LaminiAI ’s new public alpha: free, fast & furious LLM finetuning! Finetuning was complex & expensive before. Now, anyone can do it in: ⏳10 mins 🧑‍💻3-5 lines of code 💸$0 free 👉🏻 🧸Free tier: toy ~400M LLMs. This is just the start 🧵
12
43
193
1
0
7
@GregoryDiamos
Greg Diamos
7 years
We finally have evidence that low precision training works reliably, even for our biggest models!
@BaiduResearch
Baidu Research
7 years
Check out our paper on training deep neural networks using half precision FP numbers, a joint work w/ @NvidiaAI
1
51
115
0
1
7
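The recipe from that paper later became standard API surface. A sketch of the same two ideas, FP16 compute with FP32 master weights plus loss scaling, using PyTorch AMP rather than the paper's original code (needs a CUDA device):

import torch

model = torch.nn.Linear(1024, 1024).cuda()
opt = torch.optim.SGD(model.parameters(), lr=1e-3)   # FP32 master weights
scaler = torch.cuda.amp.GradScaler()                 # dynamic loss scaling

for _ in range(10):
    opt.zero_grad()
    x = torch.randn(32, 1024, device="cuda")
    with torch.cuda.amp.autocast():       # forward and loss run in FP16
        loss = model(x).square().mean()
    scaler.scale(loss).backward()         # scale so small grads survive FP16
    scaler.step(opt)                      # unscales; skips step on inf/nan
    scaler.update()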
@GregoryDiamos
Greg Diamos
8 months
I spent some time this week playing with Intel's new Neural Chat LLM which is at the top of the Huggingface LLM leaderboard. Nice work Intel. The pace of new LLMs is relentless. On Lamini, we support new models immediately. Try it out now at:
0
0
7
@GregoryDiamos
Greg Diamos
6 months
Over the last few months I’ve talked to many top AI researchers and software engineers leaving big tech. I was curious to hear where they want to go next. Two answers rise to the top: 1) OpenAI 2) Found an AI startup I wonder what that says about the job market
0
1
7
@GregoryDiamos
Greg Diamos
2 months
We got hit with a DDoS after this post. Maybe that had something to do with the outages last week. As the LLM space matures it is helpful to think about security. What would happen if there was a data breach that leaked all conversation history?
1
0
2
@GregoryDiamos
Greg Diamos
3 months
@ylecun @npinto @karpathy @AndrewYNg I remember looking at the shader assembly (SASS) of one of Bryan’s early conv kernels with David Tarjan - they were very close to matrix multiplication - one of the most compute-bound kernels we had in CUDA
1
0
6
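Why those kernels looked like matrix multiplication: a convolution lowers to one GEMM via im2col. A minimal single-channel numpy version (illustrative, nothing like the actual SASS):

import numpy as np

def conv2d_as_gemm(x, w):
    H, W = x.shape
    kh, kw = w.shape
    oh, ow = H - kh + 1, W - kw + 1
    # im2col: every kh x kw patch becomes one row of a dense matrix
    cols = np.stack([x[i:i + kh, j:j + kw].ravel()
                     for i in range(oh) for j in range(ow)])
    # the convolution is now a single matrix product
    return (cols @ w.ravel()).reshape(oh, ow)

out = conv2d_as_gemm(np.arange(25.0).reshape(5, 5), np.ones((3, 3)))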
@GregoryDiamos
Greg Diamos
6 months
Scaling laws: improve AI by sucking up more data
0
0
6
@GregoryDiamos
Greg Diamos
5 months
It includes scores from a reward model for DPO or RLHF training.
0
1
6
@GregoryDiamos
Greg Diamos
11 months
Interesting perspective from @petewarden : “Why Nvidia's AI Supremacy is Only Temporary”. I agree with a lot of this, but not that CPUs are good enough for inference. Scaling laws give an advantage to accelerators, but the SW also matters.
0
0
7
@GregoryDiamos
Greg Diamos
2 months
We have some good SW to go with that HW.
@realSharonZhou
Sharon Zhou
2 months
It’s official: Satya announced AMD GPUs on Azure! 🎉 If you want to run LLMs on the most cost-effective AMD GPUs on Azure, please contact us @LaminiAI :) Contact:
4
7
65
1
1
7
@GregoryDiamos
Greg Diamos
5 years
Even though AI is starting to work well, many companies still struggle to land it in mass production. After seeing many examples, we wrote down these best practices. I expect that many of them can be applied to your AI deployments.
0
3
7
@GregoryDiamos
Greg Diamos
3 months
Excited to see that LLMs with photographic memory are starting to work really well internally. We would never have found this with a pure research perspective and no user input.
0
0
6
@GregoryDiamos
Greg Diamos
2 months
@IanCutress @LaminiAI @realSharonZhou It was fun reminiscing about the early days of CUDA @IanCutress - and what comes next. Drop by anytime.
0
1
3