1/7 Spent the weekend with ControlNet, a new approach that gives precise, fine-grained control over image generation with diffusion models. It's a huge step forward and will change a number of industries. Here is an example.
About 15% of all Stanford undergrads (of all majors) are learning how to build LLMs. Stats for @chrmanning's CS224N Natural Language Processing with Deep Learning below.
That makes sense. LLMs are becoming a basic systems component like compute, networking and storage.
🔥 In a recent post, Sequoia's @DavidCahn6 argues AI infra is overbuilt:
- NVIDIA GPU revenue is $50b/y
- This requires $200b in "AI revenue"
- There is only $75b in "AI revenue"
Thus there is a $125b hole
I strongly disagree. AI Infra will be huge.
Grab🍿and read on (🧵1/7).
6/7 If you want to try this out, it's relatively easy to get running if you have a Windows PC with a high-end graphics card (12 GB of VRAM is ideal, but less works too). I used the Automatic1111 extension here.
I am excited to announce that I am joining @Intel as CTO of the Data Platforms Group. Intel is not without challenges. But I think its strength in CPUs, a diverse portfolio, massive scale and Navin's great team will make it successful. And it's awesome to work with @PGelsinger again!
1/5 For companies doing Generative AI, finding enough GPUs is difficult and expensive. We've seen companies spend 80%+ of total capital raised on compute resources. To help them, @casado, @BornsteinMatt and I wrote a post on how to navigate the cost of AI.
1/6 I fine-tuned a @bfl_ml Flux.1-dev LoRA on myself. It fine-tunes really well, easier than SDXL although not quite as easy as SD 1.5.
Tuning was done on @replicate, the model is hosted on @huggingface. Total cost is about $7 for 75 minutes on an A100. Full instructions below.
2/7 The basic idea of ControlNet is that your diffusion model works in tandem with a second model trained on a specific task. This is the sd15_scribble model. It helps Stable Diffusion 1.5 interpret sketches, and turns my "sketch" of an owl into an awesome drawing.
1/3 New benchmarks from @MLPerf, and they include the first good B200 numbers that I have seen. 11,264 tokens/s for Llama 2 70B is crazy, and about 3.7x the performance of the H100🤯
The bar for every AI silicon start-up out there just went up. Some thoughts below.
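For scale, here is the H100 throughput implied by the quoted figures — a quick back-of-the-envelope check using only the numbers in this thread:

```python
# Implied H100 throughput from the quoted MLPerf numbers.
b200_tps = 11_264       # B200 tokens/s on Llama 2 70B (from the benchmark)
speedup = 3.7           # quoted B200-vs-H100 performance ratio
h100_tps = b200_tps / speedup
print(round(h100_tps))  # → 3044
```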
1/8 I created a picture of three robots posing to spell the letters "AI". A month ago creating this would have taken me days; this weekend it took me 15 minutes. Below is an overview of ControlNet and OpenPose. Safe to say this will change image generation forever.
5/7 And last but not least, ControlNet combines with fine-tuning via DreamBooth. I can fine-tune a model on myself (like in the James Bond picture), and then use it to render myself into a specific scene. Here is our daughter who wanted to be Black Widow.
1/5 Meta launched their GPT-3 competitor LLaMA today. Here is a quick analysis of how it stacks up, how open it is and how it changes the industry landscape.
Meet the newest addition to our executive team: Guido Appenzeller! Guido joins us as the Chief Product Officer of Yubico to focus on product development and strategy. Learn more about Guido in our latest interview Q&A:
Exciting additions to Intel's executive team. A huge welcome to my former colleague from @vmware, Greg Lavender, joining as CTO, and my Ph.D. advisor Nick McKeown stepping up to GM. And thanks to Navin Shenoy for dragging me to @intel, I will miss you!
1/3 The amazing team at @lmsysorg have their new leaderboard up, and MPT-30B-Chat is the new #1 for open-source models. Congratulations @MosaicML!
Technically the model they tested is not fully OSS (it is CC-BY-NC-SA-4.0). But this doesn't matter.
4/7 ControlNet can also work with Pose. Here is a super interesting demo of taking the pose from a 3D character editing tool (Houdini) and using it to generate an exact pose in an image.
1/8 Copilot for @Microsoft @Office came out today. As a reformed CTO and fan of @GitHub Copilot, I love the idea and gave it a spin.
TL;DR: It's a mess and not anywhere close to adding value.
A thread🧵.
If you combine @Intel and @VMware, you get a great home server. New blog post on installing ESXi on an Intel NUC11 with a custom install image (as ESXi still lacks drivers).
Understanding the technology and market implications of Generative AI and Foundation Models is critical for navigating the tech economy. Based on what we read at @a16z to get up to speed, @derrickharris, @BornsteinMatt and I put together a reading list.
1/6 @gptzero soft-launched, and this is an opportunity to look at the state of the art in detecting AI-generated text. It works surprisingly well: for a simple prompt, it identifies the AI-generated parts in most cases.
However, with a good prompt you can fool it.
Naru and I announcing Mt Evans, @intel's Infrastructure Processing Unit. Make no mistake, this will change how data centers are architected for both hyperscale and enterprise alike. CPUs are for tenants, CSP code runs on the IPU.
Thread: I trained Stable Diffusion on photos of myself, and can generate pictures of myself purely based on text input. So for example entering "A photo of <appenz> as Captain America" will generate this. Total cost is about $0.30 of Google TPU time and $0.001 per picture.
Wohoo! After 20 hours in the air we made it from California to Punta Cana, Dominican Republic in a single engine propeller plane. An awesome way to end the year.
@dweekly My kids have started using GPT-3 for school assignments, quoting it as a source. It makes me wonder if teaching simple knowledge questions (e.g. "Define acid rain.") is worth it today, or if it is the equivalent of dividing large numbers manually without a pocket calculator.
7/7 In summary, there is no hole. AI will be a ubiquitous component in any product that contains software. A $50b spend on GPU infra can easily be amortized over a $5T worldwide IT spend. And I am super excited to invest in AI infra and the applications that are powered by it!
We've raised a $1.25B infrastructure fund! We love all infra, compute, network, storage, databases, data science, gen AI, dev tools ... from silicon to UIs.
Infra is the true root of value in tech. And we're deepening our commitment to it.
3/7 ControlNet has a number of these models and a framework to train new ones. The model used for the Bond photo uses the output of the Canny edge detector and creates an image with very similar edges. This creates a very close match in terms of overall layout.
I wrote a ChatGPT tool. It watches the clipboard on macOS. If it finds a trigger (default "@@"), it replaces the Clipboard with its ChatGPT output.
1. Copy "@@Capital of France?"
2. Wait 1-2 seconds
3. Paste "Paris..."
Super useful, 70 lines of Python.
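A minimal sketch of the core trigger logic, assuming a simple polling loop. The real tool calls the ChatGPT API and the macOS clipboard; here both are passed in as plain functions so the logic stands alone. The names `process`, `watch` and the `TRIGGER` default are illustrative, not the actual tool's code:

```python
import time

TRIGGER = "@@"  # default trigger prefix, per the tweet

def process(text, complete):
    """If the clipboard text starts with the trigger, strip it and
    return the model's completion; otherwise return None."""
    if not text.startswith(TRIGGER):
        return None
    return complete(text[len(TRIGGER):])

def watch(read_clipboard, write_clipboard, complete, poll_s=0.5):
    """Poll the clipboard and replace triggered text with the completion."""
    last = None
    while True:
        text = read_clipboard()
        if text != last:
            last = text
            reply = process(text, complete)
            if reply is not None:
                write_clipboard(reply)
                last = reply  # avoid re-triggering on our own output
        time.sleep(poll_s)
```

On macOS the `read_clipboard`/`write_clipboard` arguments could be `pyperclip.paste`/`pyperclip.copy`, and `complete` a call to the ChatGPT API.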
How hard is it to fly a small plane from the SF Bay Area to the Caribbean? Well, I am in the process of finding out. If you want to follow the adventure, I am posting updates on my home page.
Blog Post: I recently updated my home office to become more centered around video conferencing. Here are some lessons learned, the gear I use and why we are all just streamers.
5/7 But most importantly, it misses the magnitude of the AI revolution.
AI Models are an infra component like CPUs, databases & networks.
- Today almost all software uses CPU/DB/Network
- In the future the same will be true for models
So all software will be powered by AI infra.
Congrats to the @block_one_ team for an amazing #b1june event here in Washington DC today. It was great to be on stage, and we are excited about EOS using YubiKeys @Yubico and future development!
Very sad to see IEEE turning on its own members like this. Their mission is to promote engineering research and innovation, and instead they give their front page to a political activist with zero engineering background who uses it to attack AI research.
Sad to see @IEEESpectrum with a big fuck you to open source.
It's one thing to print irrelevant shit (which it's been doing for years). It's quite another to be actively anti-tech.
Good thing few people under the age of 50 read Spectrum any longer.
SMS for two-factor authentication or password reset is unsafe and will disappear. If you don't believe it, read this @mercurynews article about recent arrests of people who used SIM swapping to reset accounts and steal cryptocurrencies.
1/4 ServiceNow & Hugging Face just released StarCoder, a language model for code. The impressive stats: 15.5B parameters, trained on 1T tokens, 80+ programming languages. Here is a quick test of how it stacks up against Copilot.
Big day for @Intel as @PGelsinger and Amin Vahdat announce the collaboration on Mt Evans. With heterogeneous compute becoming the norm, the IPU is the most strategic socket in a modern hyperscale data center, and Intel+Google built a very strong product.
LoRA stacking for @bfl_ml #Flux is out and it works for combining people into one photo. Some bleed between the LoRAs' concepts; perfectionists should use inpainting.
Below:
- Both LoRAs stacked
- My LoRA, different ages
- Charlotte's LoRA
Model:
We are launching @Intel's Infrastructure Processing Unit (IPU), a strategic pillar of our data center strategy. It may look like a SmartNIC, but the main benefits are actually storage virtualization and workload provisioning for clouds.
#SixFiveSummit
Flux.1-dev with the right LoRAs is just ridiculously good for watercolor-style profile images. Via @venturetwins, she is working on a tutorial on how to do it.
1/5 @StabilityAI released their LLM today. It's called StableLM and they are planning versions up to 65b parameters. I tried the 7b parameter model and unfortunately it is not really usable yet. Example:
4/7 Second, you don't need $1 of electricity for each $1 of GPU. An H100 PCIe costs ~$30k and uses ~350W of power, maybe 1kW with server and cooling. At $0.10/kWh, this H100 costs about $0.15 in electricity over a 5-year life cycle for each $1 spent on hardware.
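The arithmetic checks out; a quick sketch using the tweet's own round-number assumptions (not exact H100 specs):

```python
# Electricity cost per $1 of H100 hardware, using the tweet's assumptions.
hw_cost = 30_000         # H100 PCIe price in USD (approximate)
power_kw = 1.0           # card plus server overhead and cooling
price_per_kwh = 0.10     # USD per kWh
hours = 5 * 365 * 24     # 5-year life cycle

electricity = power_kw * hours * price_per_kwh
print(round(electricity))               # → 4380, total electricity in USD
print(round(electricity / hw_cost, 2))  # → 0.15, per $1 of hardware
```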
6/7 Networking infra spend is $200b+ per year. Does this create $800b in "networking software" revenue? No, but Google uses networking infra to sell ads and it shows up as ad revenue. Or Microsoft sells O365.
The labels for infra spend and the revenue it generates are different.
Incredibly humbled that Nick McKeown, Isaac Keslassy & I were invited to write one of 20 editorials reflecting on 50 years of networking research for SIGCOMM. We have come a long way since we tried to explain why buffers are too large back in 2003.
In 2005 Isabelle and I climbed Kilimanjaro and this year I did it with our oldest kid. Not much has changed. The gear is a bit better, you now have 5G coverage at 19,341 ft altitude, Lemosho is now the favored route, but in the end it's all about an incredibly beautiful mountain.
Germans (like me) talking about AI is apparently against @OpenAI's content policy. Americans are fine.
This is why we need open source AI. We don't want a few large companies dictating "AI ethics" for all of us.
@bilawalsidhu Yes, I saw it on Reddit (and upvoted it!). That looks really amazing. 100% agreed that this will change interior design (or for that matter most design).
Is this based on depth? Or Canny??
Asked GPT-4 to remove unnecessary boilerplate from a message from my bank (@Chase), and it reduced its length by 85%.
This tells us a lot about the power of AI.
And about Chase.
1/3 New paper on Transformer models with long context windows. This is an important result, as at least for now it means that LLM-based search and knowledge extraction over large amounts of data will continue to use vector databases.
ControlNet experiment where I'm toggling through different styles of contemporary Indian décor, while keeping a consistent scene layout.
Loving how ControlNet is putting artists back in control of the AI image generation process.
🧵Thread
#ControlNet
#StableDiffusion
#EbSynth
1/2 Detection of AI-generated text:
1. It discriminates against essays from non-native speakers 😠
2. AI can be used to make human-generated text *less* likely to be flagged. Just tell it to re-write your essays to be more "native"🤣
@antona23 Yes, apparently this runs on M1 Macs, Reddit thread below. It will be slow, though. You may want to consider running it on Google Colab or something similar instead.
We are super excited about the partnership between RSA and Yubico. RSA pioneered strong authentication with their tokens and now we are jointly delivering FIDO2-based solutions to the market. Looking forward to working with @Zulfikar_Ramzan, Jim Ducharme and the entire RSA team!
NEW from RSA: We're partnering with @Yubico! Learn how we're tackling the growing challenge of #DigitalRisk together with a FIDO2-enabled device that delivers enterprise-grade security and risk-based #authentication:
Wohoo! Kubernetes is getting a native API gateway based on @envoyproxy. Very cool, as it allows you to run the service mesh @IstioMesh and an API gateway on the same foundational technology. Congrats @mattklein123 @Tetrateio @VMware & the steering committee!
[New program] a16z Open Source AI Grants
Hackers & independent devs are massively important to the AI ecosystem.
We're starting a grant funding program so they can continue their work without pressure to generate financial returns.
Apparently "Guido" is a "bad word" and prohibited in @Ring's support forums. Before you buy a Ring doorbell, you may want to check if your legal name is in compliance with their policy. Otherwise, change your name first (or buy Nest instead).
"Wouldn't you rather work from Hawaii?"
@Intel's Robert Noyce predicting 40 years ago that half of the work force could work remotely in the future, and that communication would be the gating element (I am looking at you, bad hotel WiFi).
#agedlikefinewine
2/6 I used @lucataco93's trainer on Replicate yesterday, although Replicate came out with their own trainer later in the day. It looks like it is easier to use:
Great slide from Bill Dally at @hotchipsorg 2023 on NVIDIA's performance gains. This graph, parallelism (i.e. transformers) and optimization (e.g. flash attention) are the foundation of the Generative AI boom. It's all about infra.
More here:
Modern AI hardware is just wild:
- Left: 1U 2xGH200 server, 2x 2.7kW power
- Right: Cabinet spec for Equinix SV2 is 2.4-5kW power
So a single server exceeds the power limit for the entire cabinet. Which has space for 42 servers 🤯 And Blackwell will make this even worse.
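Putting numbers on that gap — a rough sketch using only the figures above (actual cabinet power limits vary by site):

```python
# One 2xGH200 server vs. the quoted Equinix SV2 cabinet power budget.
server_kw = 2 * 2.7                # two GH200s at 2.7 kW each
cabinet_limit_kw = 5.0             # top of the quoted 2.4-5 kW range
full_cabinet_kw = 42 * server_kw   # if all 42 slots held such servers

print(server_kw > cabinet_limit_kw)       # → True: one server blows the budget
print(round(full_cabinet_kw, 1))          # → 226.8 kW for a full cabinet
```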
Nice flex from the @NVIDIA PR team: Press release includes quotes from the CEOs of Amazon, Microsoft, Google DeepMind, Meta, Oracle, Tesla and OpenAI 😲
PR team: Press release includes quotes from the CEOs of Amazon, Microsoft, Google Deepmind, Meta, Oracle, Tesla and OpenAI 😲
Picks & Shovels in an early gold rush are one hell of a business.
Today we are launching YubiEnterprise Delivery. You can now physically mail @Yubikeys to your employees or partners via a simple API call. Congrats to @Percy_Wadia, Mash and the team for getting this out the door.
I now have an "Add Types" for Python. 🤯
🔥Hell just froze over 🧊
Generative AI is completely changing the developer experience. This is CoPilot (beta) + CoPilot Labs + VSCode Insider.
It was a privilege to hang out with some of the leading minds in AI art last night.
Many artists appear to fear AI, but these individuals are not among them; they are boldly inventing the future of art.
Dubbing videos into different languages is looking like a solved problem. The video below was recorded in 🇬🇧 English and dubbed in 🇪🇸 Spanish. Below a description of the tech stack and a list of models.
Video from @kn @fatimasabar
🧵Thread 1/6
I fine-tuned SDXL on myself and I am really impressed. More detail, much more dramatic pictures, more consistency and good looking hands. Fine-tuning is still harder than SD 1.5. A few observations below.
1/5 🧵⬇️
The amount of hype around Generative AI is incredible, but is there real revenue? Turns out the answer is yes, and together with @BornsteinMatt and @casado we are trying to define the Generative AI stack and where value accrues.
[New Post] Where does the value accrue in the generative AI stack? Most is accruing to infra, and there aren't clear signs of long-term moats.
With a market this size (all of human endeavor?) that's tremendously exciting.
(w/ @BornsteinMatt, @appenz)
3/7 First, the illustrative math mixes capex (GPU cost), annual opex, cumulative revenues during GPU lifetime & annual revenues from AI apps to derive the headline $200b number. A more appropriate calc would be based on the annual return earned on capital invested by GPU buyers.
Other big announcement of the day at @s1p: #RSocket, built by @facebook, @netifi_inc and @Pivotal. Reactive messaging with back pressure propagated to the clients over the network is a major game changer!
I added U2F keys for a bunch of online accounts. So much badness in how these are implemented. Top annoyance: requiring TOTP setup before you can add U2F keys. So if you thought that with a blue Yubico product (not a full YubiKey) you'd avoid problems when switching phones: nope!