kwindla @kwindla Twitter profile | Pikagi

kwindla

@kwindla

5,529

Followers

3,587

Following

879

Media

3,769

Statuses

Co-founder of @trydaily . he/him/his.

San Francisco, CA

https://t.co/plyseTkcW0

Joined September 2008

Pinned Tweet

@kwindla

kwindla

4 days

Conversational Voice and Video AI Hackathon Oct 19th-20th - in-person in SF and remote $20,000 in cash prizes for the best ... 💠 voice AI agents 💠 virtual avatar experiences 💠 UIs for multi-modal AI 💠 apps built around conversational dynamics 💠 art projects

8

9

48

@kwindla

kwindla

3 months

Very, very fast voice bots. Llama 3.1 running on @GroqInc . 🚀 500ms voice-to-voice response times

85

483

4K

@kwindla

kwindla

4 months

How to build the world's fastest voice AI bot: - Self-host speech-to-text, LLM inference, and text-to-speech all together in the same container/cluster. - Route audio over the internet using WebRTC and edge networking. - Configure timings for voice activity detection,

68

382

2K

@kwindla

kwindla

5 months

Llama 2 70B in 20GB! 4-bit quantized, 40% of layers removed, fine-tuning to "heal" after layer removal. Almost no difference on MMLU compared to base Llama 2 70B. This paper, "The Unreasonable Ineffectiveness of the Deeper Layers," was my airplane reading on the way to a

Tweet media one

31

120

989
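A back-of-the-envelope check of the "70B in 20GB" figure from the tweet above, as a rough sketch. It assumes all 70B parameters sit in the removable layers and ignores embeddings, quantization metadata, and runtime memory such as the KV cache; the function name is hypothetical.

```python
def quantized_weight_gb(params: float, kept_fraction: float, bits_per_weight: int) -> float:
    """Rough weight-only memory estimate in decimal gigabytes."""
    kept_params = params * kept_fraction          # parameters left after layer removal
    return kept_params * bits_per_weight / 8 / 1e9  # bits -> bytes -> GB


# 70B parameters, 40% of layers removed (60% kept), 4-bit weights
estimate = quantized_weight_gb(70e9, 0.6, 4)
print(f"{estimate:.1f} GB")
```

Under those simplifications the estimate comes out at roughly 21 GB, which is in line with the ~20GB quoted in the tweet.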

@kwindla

kwindla

2 months

Voice AI fast response - phrase endpointing How does a Voice AI bot/agent/process know when it should process input speech and respond? This problem is called phrase endpointing. Getting phrase endpointing right is critical for voice interactions. The Open Source,

24

69

558
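One common way to approach phrase endpointing is to wait for a fixed stretch of silence after the speaker has started talking. The sketch below assumes per-frame speech/no-speech decisions from some voice activity detector; `find_endpoint` and `frame_secs` are hypothetical names, and `stop_secs` is borrowed from a setting mentioned elsewhere in this feed without any claim that it behaves the same way in Pipecat.

```python
from typing import Iterable, Optional


def find_endpoint(speech_flags: Iterable[bool], frame_secs: float, stop_secs: float) -> Optional[int]:
    """Return the index of the frame at which the phrase is considered finished.

    speech_flags: per-frame VAD decisions (True = speech detected).
    The phrase ends once speech has started and is then followed by at
    least stop_secs of continuous silence. Returns None if that never happens.
    """
    started = False
    silent_frames = 0
    for i, is_speech in enumerate(speech_flags):
        if is_speech:
            started = True
            silent_frames = 0  # any speech resets the silence counter
        elif started:
            silent_frames += 1
            # small tolerance guards against floating-point rounding
            if silent_frames * frame_secs >= stop_secs - 1e-9:
                return i
    return None


flags = [False, True, True, False, False, True] + [False] * 5
print(find_endpoint(flags, frame_secs=0.02, stop_secs=0.1))
print(find_endpoint(flags, frame_secs=0.02, stop_secs=0.04))
```

A shorter `stop_secs` makes the bot respond sooner but risks cutting in during a mid-sentence pause; a longer one is safer but adds directly to response latency.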

@kwindla

kwindla

13 days

OpenAI Realtime client in 75 lines of Python I've been hacking on an OpenAI Realtime API service for @pipecat_ai and it occurred to me that the core voice-to-voice loop in pseudo-code is quite small. (Which is a nice testament to the API design!)

Tweet media one

12

53

445

@kwindla

kwindla

5 months

. @OpenAI has announced that native speech-to-speech APIs are coming soon. A lot of us are eagerly awaiting access. Potential new use cases *and* even more latency reduction! But you don't have to wait for GPT-4o native speech input to build really cool voice+AI applications. We

8

40

421

@kwindla

kwindla

4 months

@nikillinit @bobbyfijan Two low-level technical reasons: the need to do echo cancellation for people not wearing headphones means enabling a *lot* of audio processing logic; and aggressively trying to limit background noise and volume differences improves the median user experience. If everybody is

6

7

270

@kwindla

kwindla

3 months

Try it out here:

Tweet media one

23

38

264

@kwindla

kwindla

2 months

Sub-700ms AI video conversational latency 🤯🤯🤯 First, so as not to bury the lede, this is @heytavus 's new digital twin API delivering ~670ms latency end-to-end ... from a client application, to the cloud, and back to the client. That's very, very, very fast. Somewhere

3

28

226

@kwindla

kwindla

3 years

I'm happy and grateful to announce @trydaily has raised a $40M Series B led by @RenegadePtnrs . 🎉 @imthemusic is joining our board to work with us on the future of video, audio, and WebRTC.

Tweet card media

Announcing Our $40M Series B

Today we're happy and grateful to announce a new $40 million round of funding, led by Roseanne Wincek at Renegade Partners. You can read the full story on TechCrunch. We founded Daily as a bet on the...

33

33

209

@kwindla

kwindla

2 months

Voice-to-voice AI using Open Source tools. I did a small demo at @AITinkerers last night. ✅ zero-UI HTML circa 1997 ✅ talks like a pirate ✅ function calling demo "get_weather()" ✅ joke about Blaise Pascal ✅ fast, natural voice conversation with an LLM in ~100 lines of

@jheitzeb

Joe Heitzeberg

2 months

@kwindla showing their very very new voice to voice APIs that make voice bots easy to create… reminds me of Twilio back in the day! @trydaily

1

1

13

3

18

163

@kwindla

kwindla

2 months

I've spent a while today talking to Carter, a video avatar from the team at @heytavus . Really impressive — fast responses, excellent video and audio generation, nicely leverages the strengths of today's SOTA LLMs and vision models. You can build your own video avatars and

7

27

151

@kwindla

kwindla

3 months

A new, open standard for AI voice and video. Reference SDKs for JavaScript and React today, with iOS and Android coming soon. "Hello world" LLM voice chat in 21 lines of code.

@trydaily

Daily

3 months

Today we’re announcing an open standard for Real-time Voice and Video Inference: RTVI-AI. The RTVI abstractions and data structures define how client applications communicate with inference services. These are the “real-time APIs” for use cases like: - Voice chat with LLMs

Tweet media one

7

43

209

5

12

146

@kwindla

kwindla

3 months

@sstur_ @GroqInc It's a combination of making every part of the data pipeline cancelable, using a good voice activity detection module, and being careful about managing the LLM context. Shoutout to @cartesia_ai who added support for word-level timestamps in their streaming voice APIs recently.

6

8

131
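The idea of a cancelable pipeline stage, as described in the reply above, can be illustrated with plain `asyncio`. This is a minimal sketch and not Pipecat's implementation: `generate_response` is a hypothetical stand-in for an LLM plus TTS stage, and the sleep durations are arbitrary.

```python
import asyncio


async def generate_response(chunks, spoken):
    """Stand-in for an LLM + TTS stage that emits output chunk by chunk."""
    for chunk in chunks:
        await asyncio.sleep(0.05)  # pretend each chunk takes time to produce
        spoken.append(chunk)       # record what was actually delivered


async def run_with_interruption(chunks, interrupt_after):
    """Start the response, then cancel it when the user interrupts."""
    spoken = []
    task = asyncio.create_task(generate_response(chunks, spoken))
    await asyncio.sleep(interrupt_after)  # simulated moment the user starts talking
    task.cancel()                         # stop generating immediately
    try:
        await task
    except asyncio.CancelledError:
        pass                              # cancellation is the expected outcome here
    return spoken


delivered = asyncio.run(run_with_interruption(list(range(100)), interrupt_after=0.12))
print(f"delivered {len(delivered)} of 100 chunks before the interruption")
```

Keeping a record of what was actually delivered is one way to manage the LLM context carefully: after an interruption, only the part the user heard would be added to the conversation history.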

@kwindla

kwindla

4 months

@WillManidis Trip: "Our LPA requires me to have 10% ownership at the time of investment." Tripp: "How about a 15% overriding royalty interest, shut-in royalties as fountain coin collection rights at The Galleria, and delay rentals delivered as drilling mud samples from all of your family

2

1

128

@kwindla

kwindla

3 months

405B + Voice + tool use!

@GroqInc

Groq Inc

3 months

We are proud to collaborate with @trydaily on real-time voice #AI . Check out enterprise voice workflows, such as this healthcare patient intake demo running on #Llama 3.1 405B by @AIatMeta ! #VoiceAI #Meta #Inference #LLM #Llama3

3

21

117

1

4

103

@kwindla

kwindla

9 months

A self-driving car just picked me up. Blade runner levels of rain tonight. I’m going to an office where lava lamps generate the random numbers that power the internet to moderate a panel discussion about machine conversation and robot voices.

3

2

102

@kwindla

kwindla

4 months

Live demo and link to source code here: Technical write-up here:

2

17

103

@kwindla

kwindla

22 days

Conversational Voice and Video AI Hackathon Oct 19th-20th at @solarislll in San Francisco $20,000 in prize money for the best ... 💠 voice AI agents 💠 virtual avatar experiences 💠 UIs for multi-modal AI 💠 apps built around conversational dynamics 💠 art projects

Tweet media one

5

21

89

@kwindla

kwindla

5 months

Delve in here:

1

5

86

@kwindla

kwindla

16 days

Old 4o vs New 4o — a dialog between two generations of voice AI Here's the demo I showed last night at the @cloudflare / @openai builders event. This is two GPT-4o Voice AI bots talking to each other. The first voice is coming from the phone and is powered by the standard Daily

5

13

86

@kwindla

kwindla

5 months

. @garrytan has been hosting "YC Launch Live" for the past few weeks — @ycombinator founders get together, catch up with old friends, meet new friends, and Garry emcees a few demos. At the last YC Launch Live, we talked about our experience over the past few months building voice

4

12

75

@kwindla

kwindla

2 months

🤖 Daily Bots launch day at @trydaily ! 🤖 Voice-to-voice with any LLM. (And vision and video, of course.) We've had a ton of fun building this hosted service for real-time AI on top of the Open Source projects that we've been contributing to for the last year. I'm all in on the

@trydaily

Daily

2 months

Today we’re launching Daily Bots, the ultra low latency Open Source cloud for voice, vision, and video AI. Build voice-to-voice with any LLM, at conversational latencies as low as 500ms. With Daily Bots, developers can: *️⃣ build with Open Source SDKs *️⃣ mix and match the

11

32

139

3

13

75

@kwindla

kwindla

5 months

Initial implementation of @GoogleAI Gemini Flash 1.5 in @pipecat_ai . Nice space poetry, Flash!

2

10

73

@kwindla

kwindla

6 days

Swapping between LLMs during a long-running voice AI conversation. I had been wanting to build this for a while, but put it off partly because I kept hoping I'd run across a library that would make it easy. (I've asked a couple of times in various threads/chats/forums, but

9

19

168

@kwindla

kwindla

11 months

Twilio announced today that they are discontinuing their Programmable Video service. #WebRTC is a small world. If you are impacted in any way by this change, my DMs are open and I'm here to be helpful however I can.

6

17

69

@kwindla

kwindla

3 months

Really nice tutorial showing how to build an application that combines voice AI conversation with RAG. Also, super-compelling use case. It's impossible to over-state how amazing it is that we will soon be able to give every child on the planet access to personalized, 1:1

@cerebriumai

cerebriumai

3 months

When @karpathy announced he is launching @EurekaLabsAI and he mentioned that subject matter experts would not be able to personally tutor all 8 billion people on demand we thought we would prove him wrong... Demo: #aiagent #GenerativeAI #education

6

17

64

1

10

66

@kwindla

kwindla

7 months

Function calling, interruptibility, fast responses. This is a nice example of where real-world voice interfaces are headed.

3

3

67

@kwindla

kwindla

1 month

An Homage To Metal Gear Solid a playable voice AI puzzle game <overheard in slack> me: I wrote some sample code to show how you switch out LLM context on the fly and why you might want to. @JonPTaylor : hold my beer ... </> Tech stack: - input speech processing @DeepgramAI

6

16

66

@kwindla

kwindla

4 years

New story by @alex about what we're doing at @trydaily ...

Tweet card media

Daily.co raises $4.6M video chat API service | TechCrunch

API-powered startups are having a good year, with Plaid's mega-exit to Visa still fresh in mind. And digital video-powered startups are also having a good

9

2

56

@kwindla

kwindla

6 months

. @mickeyxfriedman and @eoghan at AI Tinkerers tonight talking about the return of “weirdos and creatives” to the center of tech culture in SF.

Tweet media one

3

2

56

@kwindla

kwindla

2 months

The @cartesia_ai on-device models are an exciting development. I have been playing with Rene 1.3B on my macOS machine today. The MLX 4-bit quantization is amazingly fast.

@cartesia_ai

Cartesia

2 months

Today, we’re unveiling a significant milestone in our journey toward ubiquitous artificial intelligence: AI On-Device. Our team pioneered a radically more efficient architecture for AI with state space models (SSMs). Now, we’ve optimized and deployed them at the edge. We believe

Tweet media one

11

83

368

1

2

56

@kwindla

kwindla

9 months

So this Lumiere thing (which looks awesome) is another AI "release" from Google that I can't try out, even as a demo, huh?

7

3

52

@kwindla

kwindla

26 days

Quality, latency, cost: pick three For conversational voice AI applications, the three most important voice model attributes are realism (quality), time to first audio byte (latency), and cost. These days, when I talk to startup founders building real-time voice AI apps, I

@cartesia_ai

Cartesia

27 days

Daily Bots by @trydaily 's launch last month saw hundreds of devs sign up instantly. We're thrilled to be their main voice provider, powering top multimodal agents. Check out this Metal Gear-inspired demo showcasing Sonic's real-time gaming potential. Link to full story below

1

4

34

2

4

53

@kwindla

kwindla

8 months

@levelsio @fal_ai_data has llava and everything I’ve tried from them is very fast:

Tweet card media

LLaVA v1.5 13B | Vision | AI Playground | fal.ai

Vision

5

4

53

@kwindla

kwindla

17 days

The amazing @craigsdennis introducing @OpenAI hack night at @Cloudflare .

Tweet media one

4

1

51

@kwindla

kwindla

3 months

Client code is here: We'll clean up this @pipecat_ai and infra management code and post it. But the bot is pretty similar to the example here: TTS is @cartesia_ai . You can definitely run Llama-3.1 8B locally with a bot like

Tweet card media

GitHub - pipecat-ai/rtvi-web-demo: Example UI implementing the RTVI web client

Example UI implementing the RTVI web client. Contribute to pipecat-ai/rtvi-web-demo development by creating an account on GitHub.

@mysticaltech

The Canaanite

3 months

@kwindla @GroqInc what do you use for tts? any chance you could open-source this, please? my dream is to run this locally with llama-3.1 8B always with the tts model

1

0

1

2

7

50

@kwindla

kwindla

4 months

Here's a quick walk-through from @aconchillo showing how to set up a voice AI agent that you can talk to on the phone. 📱 clone @Viking5274 's example project 🚇 set up an @ngrokHQ tunnel ☎️ buy a phone number from @twilio 🔗 point the phone number at the ngrok url 🗣️ talk

@pipecat_ai

Pipecat AI

4 months

Complete @twilio phone bot example from @Viking5274 .

Tweet media one

0

5

18

5

8

48

@kwindla

kwindla

3 months

I look extremely tired in this video, for reasons that are probably obvious. But no rest for the wicked. I'm moderating a panel tonight at @solarissociety with an all-star lineup of people who think a lot about SOTA models, the pace of progress with LLMs, economically valuable

@rajivayyangar

Rajiv Ayyangar

3 months

Llama 3.1-405B is out and @kwindla and I have thoughts! See below for a special event in SF today and a launch challenge:

2

6

29

1

10

46

@kwindla

kwindla

4 years

@JuliaLipton We closed a round recently, and @MoxxieVentures , @GroundUpVC , @elizabeth , @__aston__ , @MLifschitz32 , @davidneckstein , @aunder , @stopman , @toddg777 , @jeffseibert , @EllenLevy , @aroetter , @scottbelsky , @natebosshard , @edcohen55 have all been unbelievably helpful.

7

1

43

@kwindla

kwindla

3 months

@LiveLaughLauSai @GroqInc The speed is a combination of using today's fastest models/services ( @DeepgramAI , @GroqInc , @cartesia_ai ), optimized network infrastructure ( @trydaily ), and efficient buffering/pipelines/algorithms ( @pipecat_ai ).

0

4

45

@kwindla

kwindla

2 months

I wrote parts of the function calling implementations and improvements that just shipped in @pipecat_ai 0.40 and gee whiz I have a lot of thoughts about LLM tool use. 1. Claude 3.5 Sonnet, GPT-4o, and Llama 3.1 are all very good at function calling now. This makes whole

@pipecat_ai

Pipecat AI

2 months

Lots of new things in 0.40. ✨ Function calling and prompt caching for @AnthropicAI Claude 3.5 Sonnet ✨ Llama 3.1 function calling support in the @togethercompute service ✨ A complete implementation of the RTVI standard ✨ Studypal, a new application example from the team at

Tweet media one

1

1

15

6

7

44

@kwindla

kwindla

1 year

... and it's out in the world! An SDK for AI + real-time audio, video, and data. This library makes it so easy to connect an LLM into a video call that I've been tinkering with WebRTC + GPT-4 in a colab notebook. 🤯

Tweet card media

Introducing daily-python: An SDK for AI-powered interactive video and audio

Combine WebRTC real-time video and audio with Large Language Models and other AI tools

3

9

37

@kwindla

kwindla

6 months

Nothing like a 30-foot diagonal screen to remind you that you need to work on your posture.

Tweet media one

2

0

39

@kwindla

kwindla

5 months

We have consistent ~500ms audio-in-to-first-token latency with a pipeline of @DeepgramAI transcription -> @GroqInc Llama-3 -> Deepgram TTS now. So I think it's pretty realistic that a natively speech-to-speech model can get down to ~300ms. We've been assuming this is where

@thdxr

dax

5 months

the new openai stuff looks cool and i’m definitely not trying to dismiss it but a large part of what makes these demos impressive is the lack of latency - which i don’t see surviving in something public facing

25

3

222

1

2

39

@kwindla

kwindla

5 months

More fun with Pipecat. This is @vikhyatk 's incredible Moondream model running locally on my mac. @DeepgramAI transcription. An @elevenlabsio voice. Just the simplest possible demo code with no optimizations. (The inference here could be a lot faster with a tiny bit of effort.)

4

5

39

@kwindla

kwindla

8 months

Join us for an Infrastructure for real-time AI meetup on Wednesday evening at @Cloudflare . Panel featuring: * Jay Jackson, VP of AI and ML at @OracleCloud * @gorkemyu , co-founder of @fal_ai_data * @LoganGrasby , AI at @CloudflareDev Plus demos and pizza. If you have

5

9

38

@kwindla

kwindla

5 years

@villi We ( @trydaily ) make that platform today. Two lines of code to add one-click, no-download video calls to any website or app. Operating at scale with partners like @Tandem_HQ . Powering developers who show us awesome new use cases every day.

4

1

37

@kwindla

kwindla

5 months

PR for @cartesia_ai TTS support in @pipecat_ai . I couldn't resist trying out the "1920's Radioman" voice. Here are two bots, running in separate processes, doing play-by-play for an imaginary baseball game. (Making it up as they go along.) The announcer is GPT-4o. The color

1

4

37

@kwindla

kwindla

13 days

And, of course, one thing led to another. It turns out actual working Python code is ~75 lines. So ... here's a command-line client for `gpt-4o-realtime-preview-2024-10-01`. No error handling. No configurability. No echo cancellation, so you'll need to

1

5

37

@kwindla

kwindla

6 months

Yesterday there was a thread on the @latentspacepod discord about what's changed in the voice+AI domain over the past few months. I think three big things have changed since ~October last year. First, as with everything in Generative AI, there's a proliferation of more and

1

10

36

@kwindla

kwindla

10 days

Conversational Voice and Video Hackathon updates 🌟 @cartesia_ai and @googlecloud have joined as sponsors 🌟 @ProductHunt is sponsoring and is facilitating a remote participation track Oct 19th-20th. In-person in San Francisco and remote. 💸 $20,000 in cash prizes 💸

Tweet media one

5

8

38

@kwindla

kwindla

3 months

Our friends at @Vapi_AI and Toby are hosting an Audio and Speech AI event at @solarissociety in SF on Thursday evening. Pizza, conversation with people building new things at the intersection of audio+AI, and almost certainly some impromptu demos.

Tweet card media

Audio & Speech AI @ Solaris · Luma

Lil shindig for builders in Audio and Speech A.I. We'll have pizza, drinks, and good vibes 😎 (demos welcome)

0

4

34

@kwindla

kwindla

3 months

@RobRoyce_ @GroqInc @LangChainAI There's a @LangChainAI service in @pipecat_ai . (The back end of this demo is Pipecat.) So you could just drop a LangChain agent/element into this code!

Tweet card media

pipecat/src/pipecat/processors/frameworks/langchain.py at 3fc85e75e0edd8eb70cdfac024a94a093d6cd83b...

Open Source framework for voice and multimodal conversational AI - pipecat-ai/pipecat

2

2

34

@kwindla

kwindla

11 months

Computer vision is so, so much fun to play with. Useful, too, of course. But you don't need an actual use case to hack on things that are interesting, beautiful, and surprising. You just need libraries like this! Also, I think that a good working knowledge of CV libraries,

@skalskip92

SkalskiP

11 months

supervision-0.17.0 release is just around the corner - plug in your favorite detection/segmentation model - compose the perfect visualization github:

9

90

486

2

1

35

@kwindla

kwindla

2 years

I have always been a huge fan of the people at @confrere_video and the work they've done on #WebRTC . It's incredibly exciting to have the opportunity to work with many of them now as colleagues.

Tweet card media

Confrere is joining Daily

Confrere joins Daily, the video API platform for developers, to further grow its global WebRTC team and embeddable video.

3

3

35

@kwindla

kwindla

6 months

We start them young at our hackathons.

Tweet media one

3

0

34

@kwindla

kwindla

5 months

EHR Voice Assistant. An LLM-powered 24/7 assistant for physicians, saving time and making it easier to manage a complex daily schedule. - What patient am I seeing next? - Give me a summary of the patient's last visit. - Am I seeing any patients today that have never been to our

5

2

32

@kwindla

kwindla

1 month

In 2025, how many of your phone calls will be to an AI? My general take here is that almost everyone is under-estimating how fast voice AI is growing. ➡️ LLMs do a very, very good job with customer support. We're at the point where the experience of talking to an LLM (backed by

3

3

32

@kwindla

kwindla

17 days

Big @pipecat_ai release today. Lots of low-level improvements to performance and ergonomics. TTS service additions and improvements (Google, AWS, Azure). The event framework that makes it easy to build complex client apps and workflows is rounding out nicely. I spent a lot of

Tweet media one

1

4

31

@kwindla

kwindla

3 months

this is why we computer

@MnightAwoHusky

Midnighthowlinghuskydog

3 months

@kwindla 🤣🤣🤣🤣

1

1

11

1

3

29

@kwindla

kwindla

5 months

Join @rajivayyangar and me on Thursday May 30 at Solaris AI ( @solarissociety ) for a panel discussion with three very plugged-in AI investors: - @JenniferHli ( @a16z ) - @joshbuckley - @rememberlenny (the @aigrant ecosystem) There will be pizza and demos. (DM me if you want to

Tweet media one

5

5

31

@kwindla

kwindla

6 months

. @isidentical from @FAL kicking off the @aiengfoundation real-time & multimodal hackathon this morning.

Tweet media one

0

3

30

@kwindla

kwindla

5 months

@IlyasHairline I mean, you’re right, of course. But channeling Bill Clinton, it depends on the meaning of “is.” I haven’t reproduced the results. But just looking at the graphs, fingerprinting the behavior on MMLU it is still 70b in some sense. It’s not a new model. It behaves very, very

3

0

29

@kwindla

kwindla

2 months

🎥🤖🤖🤖 Hail, fellow kids. @ProductHunt CHALLENGE incoming 🤖🤖🤖🎥 We launched Daily Bots today and people are recording very funny videos talking to AI voice bots. Record a video, post it here, then go put the link in the Product Hunt conversation thread. (There are lots of

3

1

29

@kwindla

kwindla

4 months

In April last year I hacked together a voice bot that did live translation between Spanish and English during video calls. That was the first non-trivial thing I built using GPT-4. The fact that it was possible for a general-purpose LLM to do very good language translation

Tweet media one

1

2

27

@kwindla

kwindla

7 months

Thanks @swyx , @hackgoofer , and @VibhuSapra for putting on a great mini-conference tonight about diffusion models.

Tweet media one

0

3

28

@kwindla

kwindla

22 days

@rileybrown_ai The Open Source @pipecat_ai project if you’re looking for an API/orchestration layer. It supports WebSockets, WebRTC, and telephony integration. Demo/playground here:

Tweet card media

Daily Bots Demo

Daily Bots voice-to-voice example app

demo.dailybots.ai

0

3

28

@kwindla

kwindla

4 months

@_junaidkhalid1 The displayed latency in the main UI is the total voice-to-voice latency — from the time you stop talking to the time you hear the first audio bytes from the bot. So that includes all of the network transit time in both directions and the processing time. 250ms will be hard to

3

1

28
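The voice-to-voice latency definition in the reply above (from the moment the user stops talking to the first audible bot audio) can be written down directly. This is a sketch only: the `TurnTimestamps` type and the sample values are hypothetical, and it assumes both timestamps come from the same client-side clock.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class TurnTimestamps:
    user_stopped_speaking: float  # seconds, client clock
    first_bot_audio_heard: float  # seconds, same client clock


def voice_to_voice_ms(turn: TurnTimestamps) -> float:
    """Total latency for one turn, including network transit both ways and processing."""
    return (turn.first_bot_audio_heard - turn.user_stopped_speaking) * 1000.0


# Made-up sample turns, for illustration only
turns = [
    TurnTimestamps(10.0, 10.5),
    TurnTimestamps(20.0, 20.75),
    TurnTimestamps(30.0, 30.625),
]
latencies = [voice_to_voice_ms(t) for t in turns]
print(f"median voice-to-voice latency: {median(latencies):.0f} ms")
```

Measuring on the client side matters because a server-side number would leave out the network transit time that the reply says is included.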

@kwindla

kwindla

8 days

Something new cooking ...

Tweet media one

2

2

28

@kwindla

kwindla

8 months

100 people came to the Voice + AI meetup this evening and at the end of the night we had exactly one piece of pizza left. @ramyavmani ’s prognostimation [ed. yes he did just make up that word] skills are very impressive.

Tweet media one

6

1

27

@kwindla

kwindla

7 months

This weekend in one of the AI engineer group chats I'm in, we had an interesting discussion about self-hosting the components of voice/conversational AI apps as compared to using SaaS services. This is a generally interesting topic, and I've had this (new) AI infra conversation

4

1

27

@kwindla

kwindla

3 months

@SynapticRebirth @GroqInc Here's the client-side code: The backend is a Pipecat bot:

Tweet card media

GitHub - pipecat-ai/pipecat: Open Source framework for voice and multimodal conversational AI

Open Source framework for voice and multimodal conversational AI - pipecat-ai/pipecat

3

2

27

@kwindla

kwindla

5 months

The broad applications for very fast general vision models are as yet under-appreciated. To be fair, this is true right now of vision in general, and really all of generative AI. Those of us building applications have not yet caught up to the last 24 months of multiple leaps

@sumo43_

artem

5 months

Got fast paligemma inference working on RTX 4090. Here's an object detection demo with the 224px model running in real time at 16fps. I generate 10 tokens per iteration

28

51

674

3

0

26

@kwindla

kwindla

2 months

Pull shopping suggestions out of a video stream on the fly! The thread below links to a live demo, the code, and an in-depth walk through. This is a really good primer on real-time, multimodal AI. Built using @cerebriumai , @trydaily , @tursodatabase , @supabase , and @vercel

@cerebriumai

cerebriumai

2 months

When @garyvee mentioned that users would be able to shop items in movies and videos in real-time, I didn't think he knew that day would be today! Live demo in comments 🧵 #ecommerce #genai #ai #cerebrium #streaming

4

8

29

1

4

25

@kwindla

kwindla

9 months

@patrickc At Oblong we did a bunch of work for Boeing. This was during the final push to get the 787 shipped. Watching that project from a funny part-inside but mostly outside perspective, I remember thinking that the 787 might be the last civilian design Boeing would ever produce. Since

7

2

24

@kwindla

kwindla

11 months

Things I love about what @Humane is doing. A partial list. 1. We're living through the early days of a platform shift. AI is going to enable and require new hardware — both at the chip level and the device level. The Humane AI pin is a big and interesting first swing at hardware

1

2

25

@kwindla

kwindla

3 years

Big release for us, today. 📹 Stream any video session to a broadcast audience ... anywhere! Lots more new related APIs coming soon. If you're doing live streaming, tell us what you need (we might be able to give you early access to beta features).

@trydaily

Daily

3 years

We're excited to announce Daily Live Streaming! Our newest API makes it easy to build video calls and live stream them. Broadcast to millions on #RTMP platforms like #FBLive , YouTube, #Twitch . Or stream to custom players via #AmazonIVS #Brightcove #Mux 🧵

1

9

21

1

6

24

@kwindla

kwindla

1 month

On Monday @rajivayyangar and I hosted an AI evals and observability demo night at @solarislll We had demos from: ✨ @timssweeney / @weights_biases ✨ @tom_shapland / Canonical AI ✨ @hellovai / @boundaryML ✨ @Andydy42 / @keywordsai ✨ @skull8888888888 / @lmnrai ✨

2

1

24

@kwindla

kwindla

10 months

This is a fantastic home page headline. (kudos to @fouadmatin )

Tweet media one

1

3

23

@kwindla

kwindla

4 months

All I want for father's day is an 8xH100.

3

1

24

@kwindla

kwindla

1 month

I'm looking forward to the @cartesia_ai x @usepylon Conversational AI meetup next Tuesday. Registration link in the next tweet -> They've lined up a great panel and demos. Cartesia's co-founder @_albertgu will be there. You can embarrass him by asking what it's like to be on

Tweet media one

1

5

23

@kwindla

kwindla

2 months

Inversion of the AI agent model — long-running processes that pause for human direction when necessary ... The demo Dex of @humanlayer_dev did at @AITinkerers in SF last week really stuck with me. He talked about a few things, including: 1. How long-running agent processes can

3

1

23

@kwindla

kwindla

4 months

HN discussion:

1

3

23

@kwindla

kwindla

4 years

New @trydaily starter kit for audio apps on @ProductHunt today.

3

2

23

@kwindla

kwindla

4 years

Tonight in the @joinClubhouse Future of Work room: a conversation with @shl about the evolution of independent creative work online since he founded @gumroad , and the trends he’s seeing lately as a tech investor. 8:30 PDT. 💫

2

3

23

@kwindla

kwindla

3 years

I’ve done a name change + rebrand twice. Always happy to help out another startup founder. My DMs are open.

1

1

22

@kwindla

kwindla

8 months

Packed house tonight at @solarissociety for demos, hosted by @seanxthielen and @pk_iv . Some of the LLMs are behaving, some are not. It turns out that in this brave new world we have to pray to two sets of demo gods, the normal ones and the new, hungry AI deities.

Tweet media one

1

3

22

@kwindla

kwindla

2 years

We are hosting a Happy Hour on Wed Oct 12th in SF's Mission District. Please join us at the Root Ventures office (2670 Harrison St) any time between 6:30 and 9:00. Feel free to register here, or just to join us without registering:

Tweet card media

Daily @Demuxed Happy Hour · Luma

Daily and Root Ventures invite you to pizza, drinks, and good conversation. We’ll be at the Root office, featuring IoT espresso machines, electric…

1

3

22

@kwindla

kwindla

9 months

Today I learned that a @Waymo doesn’t turn on its windshield wipers in the rain. And, indeed, that should not have surprised me as much as it did.

Tweet media one

6

0

21

@kwindla

kwindla

6 months

@WillManidis Maria Rosa Menocal. The Arabic Role in Medieval Literary History. U Penn Press, 1987. Out of print for a long time. Now back in print. Beautiful and mind expanding.

Tweet media one

0

0

21

@kwindla

kwindla

12 days

I really love to see this — we have customers coming to us asking how to build conversational voice AI from all over the world. Interest and use cases are truly global. Available today in @pipecat_ai .

@cartesia_ai

Cartesia

12 days

Introducing our next 8 languages on Sonic-Multilingual 🌍🗣️ The same low-latency, ultra-realistic voice generation available on 14 total languages 🎉 🇩🇪 German | 🇵🇹 Portuguese | 🇨🇳 Chinese | 🇯🇵 Japanese | 🇫🇷 French | 🇪🇸 Spanish | 🇮🇳 Hindi | 🇮🇹 Italian | 🇰🇷 Korean | 🇳🇱 Dutch |

9

9

151

1

3

21

@kwindla

kwindla

5 months

ding ding ding. best tweet of the week.

@gorkemyurt

Gorkem Yurtseven

5 months

end of software? friends, we just invented the printing press for software

2

19

136

0

2

21

@kwindla

kwindla

6 months

Do you build things like this? Or want to learn how? Join us at the Voice/Real-time/Multi-modal AI hackathon this weekend.

3

2

22

@kwindla

kwindla

2 months

Try out different stop_secs settings: Pipecat's phrase endpointing code: The Silero VAD repo:

1

3

21

@kwindla

kwindla

7 months

My favorite thing on @ProductHunt today: Zora Learning. (Named after Zora Neale Hurston!) Configurable reading experience for kids. You pick the grade level and the kind of story, and a story is generated for a child to read. Between sections, there are questions to answer (the

Tweet media one

2

3

19

@kwindla

kwindla

5 months

After the really cool @OpenAI omnimodal announcements yesterday, everybody is talking about latency! For today's voice conversational stacks, latency is the sum of: 1. getting audio from the client to the cloud 2. transcription 3. phrase endpointing 4. LLM inference 5.

@rajivayyangar

Rajiv Ayyangar

5 months

I tried out the Humane AI Pin. Latency was the biggest issue: it took awkwardly long to respond. The latency just seemed to get in the way of everything I was trying to do. Latency is a solvable issue though, @kwindla points out.

1

2

16

1

2

20

@kwindla

kwindla

2 months

Two things I did not expect when we started seriously building voice AI infrastructure and tooling a little over a year ago: 1. Each step function improvement in LLM capabilities has come with more "personality." This maybe isn't apparent if you just use basic prompts, or

@chadbailey59

Chad

2 months

@kwindla @ProductHunt Meet... John. Just an average, normal human person

0

0

10

1

0

20

@kwindla

kwindla

9 months

Voice + AI Meetup next week at the @Cloudflare office in San Francisco, featuring a panel discussion with @natrugrats , @Prafulfillment , and @rajivayyangar . DM me if you want to demo.

2

4

20

@kwindla

kwindla

3 years

Look who's in Venture Beat: @ninacali4 , interviewed by the wonderful @jcspinell !

Tweet card media

Why a return to writing is vital for video company Daily — and other hard-won lessons from founder...

Daily co-founder Nina Kuruvilla on how the startup emphasizes flexibility in everything it does -- with products and running of the business.

venturebeat.com

0

0

20