kwindla

@kwindla

5,529
Followers
3,587
Following
879
Media
3,769
Statuses

Co-founder of @trydaily . he/him/his.

San Francisco, CA
Joined September 2008
Pinned Tweet
@kwindla
kwindla
4 days
Conversational Voice and Video AI Hackathon Oct 19th-20th - in-person in SF and remote. $20,000 in cash prizes for the best ... 💠 voice AI agents 💠 virtual avatar experiences 💠 UIs for multi-modal AI 💠 apps built around conversational dynamics 💠 art projects
8
9
48
@kwindla
kwindla
3 months
Very, very fast voice bots. Llama 3.1 running on @GroqInc . 🚀 500ms voice-to-voice response times
85
483
4K
@kwindla
kwindla
4 months
How to build the world's fastest voice AI bot: - Self-host speech-to-text, LLM inference, and text-to-speech all together in the same container/cluster. - Route audio over the internet using WebRTC and edge networking. - Configure timings for voice activity detection,
68
382
2K
@kwindla
kwindla
5 months
Llama 2 70B in 20GB! 4-bit quantized, 40% of layers removed, fine-tuning to "heal" after layer removal. Almost no difference on MMLU compared to base Llama 2 70B. This paper, "The Unreasonable Ineffectiveness of the Deeper Layers," was my airplane reading on the way to a
31
120
989
@kwindla
kwindla
2 months
Voice AI fast response - phrase endpointing How does a Voice AI bot/agent/process know when it should process input speech and respond? This problem is called phrase endpointing. Getting phrase endpointing right is critical for voice interactions. The Open Source,
24
69
558
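As an illustration of the phrase endpointing problem described in the tweet above, here is a minimal sketch of the simplest approach: treat a phrase as finished once a fixed amount of silence follows speech. It assumes some voice activity detector already yields one speech/non-speech decision per audio frame. The class name, the `stop_ms` and `frame_ms` parameters, and their default values are all hypothetical and are not taken from any particular library.

```python
class SilenceEndpointer:
    """Toy phrase endpointer (hypothetical, not any library's implementation).

    Declares the end of a phrase once `stop_ms` of continuous silence
    follows at least one frame of speech.
    """

    def __init__(self, stop_ms=800, frame_ms=20):
        self.stop_ms = stop_ms      # assumed silence threshold
        self.frame_ms = frame_ms    # assumed duration of one VAD frame
        self._in_phrase = False
        self._silence_ms = 0

    def push(self, is_speech):
        """Feed one VAD decision; return True exactly when a phrase ends."""
        if is_speech:
            self._in_phrase = True
            self._silence_ms = 0
            return False
        if not self._in_phrase:
            return False            # leading silence: no phrase to end yet
        self._silence_ms += self.frame_ms
        if self._silence_ms >= self.stop_ms:
            self._in_phrase = False
            self._silence_ms = 0
            return True
        return False


def phrase_end_indices(vad_flags, stop_ms=800, frame_ms=20):
    """Return the frame indices at which a phrase end is detected."""
    endpointer = SilenceEndpointer(stop_ms, frame_ms)
    return [i for i, flag in enumerate(vad_flags) if endpointer.push(flag)]
```

A shorter threshold makes the bot respond sooner but risks cutting the speaker off mid-thought; a longer one feels safer but adds directly to response latency.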
@kwindla
kwindla
13 days
OpenAI Realtime client in 75 lines of Python I've been hacking on an OpenAI Realtime API service for @pipecat_ai and it occurred to me that the core voice-to-voice loop in pseudo-code is quite small. (Which is a nice testament to the API design!)
12
53
445
@kwindla
kwindla
5 months
. @OpenAI has announced that native speech-to-speech APIs are coming soon. A lot of us are eagerly awaiting access. Potential new use cases *and* even more latency reduction! But you don't have to wait for GPT-4o native speech input to build really cool voice+AI applications. We
8
40
421
@kwindla
kwindla
4 months
@nikillinit @bobbyfijan Two low-level technical reasons: the need to do echo cancellation for people not wearing headphones means enabling a *lot* of audio processing logic; and aggressively trying to limit background noise and volume differences improves the median user experience. If everybody is
6
7
270
@kwindla
kwindla
3 months
Try it out here:
23
38
264
@kwindla
kwindla
2 months
Sub-700ms AI video conversational latency 🤯🤯🤯 First, so as not to bury the lede, this is @heytavus 's new digital twin API delivering ~670ms latency end-to-end ... from a client application, to the cloud, and back to the client. That's very, very, very fast. Somewhere
3
28
226
@kwindla
kwindla
2 months
Voice-to-voice AI using Open Source tools. I did a small demo at @AITinkerers last night. ✅ zero-UI HTML circa 1997 ✅ talks like a pirate ✅ function calling demo "get_weather()" ✅ joke about Blaise Pascal ✅ fast, natural voice conversation with an LLM in ~100 lines of
@jheitzeb
Joe Heitzeberg
2 months
@kwindla showing their very very new voice to voice APIs that make voice bots easy to create… reminds me of Twilio back in the day! @trydaily
1
1
13
3
18
163
@kwindla
kwindla
2 months
I've spent a while today talking to Carter, a video avatar from the team at @heytavus . Really impressive — fast responses, excellent video and audio generation, nicely leverages the strengths of today's SOTA LLMs and vision models. You can build your own video avatars and
7
27
151
@kwindla
kwindla
3 months
A new, open standard for AI voice and video. Reference SDKs for JavaScript and React today, with iOS and Android coming soon. "Hello world" LLM voice chat in 21 lines of code.
@trydaily
Daily
3 months
Today we’re announcing an open standard for Real-time Voice and Video Inference: RTVI-AI. The RTVI abstractions and data structures define how client applications communicate with inference services. These are the “real-time APIs” for use cases like: - Voice chat with LLMs
7
43
209
5
12
146
@kwindla
kwindla
3 months
@sstur_ @GroqInc It's a combination of making every part of the data pipeline cancelable, using a good voice activity detection module, and being careful about managing the LLM context. Shoutout to @cartesia_ai who added support for word-level timestamps in their streaming voice APIs recently.
6
8
131
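The reply above mentions making every part of the data pipeline cancelable. A rough sketch of what that can look like with Python's `asyncio` follows. The stage here is a stand-in that "plays" text chunks with a sleep between them; the function names and timings are hypothetical, and a real pipeline would have separate transcription, LLM, and TTS stages.

```python
import asyncio


async def speak_response(chunks, spoken, delay=0.01):
    """Stand-in for an LLM -> TTS stage: 'plays' chunks one at a time."""
    for chunk in chunks:
        await asyncio.sleep(delay)   # each await is a point where cancel can land
        spoken.append(chunk)


async def run_with_interruption(chunks, interrupt_after):
    """Start a response, then cancel it when the (simulated) user barges in."""
    spoken = []
    task = asyncio.create_task(speak_response(chunks, spoken))
    await asyncio.sleep(interrupt_after)   # simulated: user starts talking
    task.cancel()                          # request cancellation of the stage
    try:
        await task
    except asyncio.CancelledError:
        pass                               # expected: the response was interrupted
    return spoken, task.cancelled()
```

Running `asyncio.run(run_with_interruption(["chunk"] * 1000, 0.05))` stops the response after only a few chunks instead of letting all 1000 play out. The design point is that work is split into small awaited steps, so an interruption takes effect quickly.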
@kwindla
kwindla
4 months
@WillManidis Trip: "Our LPA requires me to have 10% ownership at the time of investment." Tripp: "How about a 15% overriding royalty interest, shut-in royalties as fountain coin collection rights at The Galleria, and delay rentals delivered as drilling mud samples from all of your family
2
1
128
@kwindla
kwindla
3 months
405B + Voice + tool use!
@GroqInc
Groq Inc
3 months
We are proud to collaborate with @trydaily on real-time voice #AI . Check out enterprise voice workflows, such as this healthcare patient intake demo running on #Llama 3.1 405B by @AIatMeta ! #VoiceAI #Meta #Inference #LLM #Llama3
3
21
117
1
4
103
@kwindla
kwindla
9 months
A self-driving car just picked me up. Blade runner levels of rain tonight. I’m going to an office where lava lamps generate the random numbers that power the internet to moderate a panel discussion about machine conversation and robot voices.
3
2
102
@kwindla
kwindla
4 months
Live demo and link to source code here: Technical write-up here:
2
17
103
@kwindla
kwindla
22 days
Conversational Voice and Video AI Hackathon Oct 19th-20th at @solarislll in San Francisco. $20,000 in prize money for the best ... 💠 voice AI agents 💠 virtual avatar experiences 💠 UIs for multi-modal AI 💠 apps built around conversational dynamics 💠 art projects
5
21
89
@kwindla
kwindla
5 months
Delve in here:
1
5
86
@kwindla
kwindla
16 days
Old 4o vs New 4o — a dialog between two generations of voice AI Here's the demo I showed last night at the @cloudflare / @openai builders event. This is two GPT-4o Voice AI bots talking to each other. The first voice is coming from the phone and is powered by the standard Daily
5
13
86
@kwindla
kwindla
5 months
. @garrytan has been hosting "YC Launch Live" for the past few weeks — @ycombinator founders get together, catch up with old friends, meet new friends, and Garry emcees a few demos. At the last YC Launch Live, we talked about our experience over the past few months building voice
4
12
75
@kwindla
kwindla
2 months
🤖 Daily Bots launch day at @trydaily ! 🤖 Voice-to-voice with any LLM. (And vision and video, of course.) We've had a ton of fun building this hosted service for real-time AI on top of the Open Source projects that we've been contributing to for the last year. I'm all in on the
@trydaily
Daily
2 months
Today we’re launching Daily Bots, the ultra low latency Open Source cloud for voice, vision, and video AI. Build voice-to-voice with any LLM, at conversational latencies as low as 500ms. With Daily Bots, developers can: *️⃣  build with Open Source SDKs *️⃣  mix and match the
11
32
139
3
13
75
@kwindla
kwindla
5 months
Initial implementation of @GoogleAI Gemini Flash 1.5 in @pipecat_ai . Nice space poetry, Flash!
2
10
73
@kwindla
kwindla
6 days
Swapping between LLMs during a long-running voice AI conversation. I had been wanting to build this for a while, but put it off partly because I kept hoping I'd run across a library that would make it easy. (I've asked a couple of times in various threads/chats/forums, but
9
19
168
@kwindla
kwindla
11 months
Twilio announced today that they are discontinuing their Programmable Video service. #WebRTC is a small world. If you are impacted in any way by this change, my DMs are open and I'm here to be helpful however I can.
6
17
69
@kwindla
kwindla
3 months
Really nice tutorial showing how to build an application that combines voice AI conversation with RAG. Also, super-compelling use case. It's impossible to over-state how amazing it is that we will soon be able to give every child on the planet access to personalized, 1:1
@cerebriumai
cerebriumai
3 months
When @karpathy announced he is launching @EurekaLabsAI and he mentioned that subject matter experts would not be able to personally tutor all 8 billion people on demand we thought we would prove him wrong... Demo: #aiagent #GenerativeAI #education
6
17
64
1
10
66
@kwindla
kwindla
7 months
Function calling, interruptibility, fast responses. This is a nice example of where real-world voice interfaces are headed.
3
3
67
@kwindla
kwindla
1 month
An Homage To Metal Gear Solid a playable voice AI puzzle game <overheard in slack> me: I wrote some sample code to show how you switch out LLM context on the fly and why you might want to. @JonPTaylor : hold my beer ... </> Tech stack: - input speech processing @DeepgramAI
6
16
66
@kwindla
kwindla
6 months
. @mickeyxfriedman and @eoghan at AI Tinkerers tonight talking about the return of “weirdos and creatives” to the center of tech culture in SF.
3
2
56
@kwindla
kwindla
2 months
The @cartesia_ai on-device models are an exciting development. I have been playing with Rene 1.3B on my macOS machine today. The MLX 4-bit quantization is amazingly fast.
@cartesia_ai
Cartesia
2 months
Today, we’re unveiling a significant milestone in our journey toward ubiquitous artificial intelligence: AI On-Device. Our team pioneered a radically more efficient architecture for AI with state space models (SSMs). Now, we’ve optimized and deployed them at the edge. We believe
11
83
368
1
2
56
@kwindla
kwindla
9 months
So this Lumiere thing (which looks awesome) is another AI "release" from Google that I can't try out, even as a demo, huh?
7
3
52
@kwindla
kwindla
26 days
Quality, latency, cost: pick three For conversational voice AI applications, the three most important voice model attributes are realism (quality), time to first audio byte (latency), and cost. These days, when I talk to startup founders building real-time voice AI apps, I
@cartesia_ai
Cartesia
27 days
Daily Bots by @trydaily 's launch last month saw hundreds of devs sign up instantly. We're thrilled to be their main voice provider, powering top multimodal agents. Check out this Metal Gear-inspired demo showcasing Sonic's real-time gaming potential. Link to full story below
1
4
34
2
4
53
@kwindla
kwindla
8 months
@levelsio @fal_ai_data has llava and everything I’ve tried from them is very fast:
5
4
53
@kwindla
kwindla
17 days
The amazing @craigsdennis introducing @OpenAI hack night at @Cloudflare .
4
1
51
@kwindla
kwindla
3 months
Client code is here: We'll clean up this @pipecat_ai and infra management code and post it. But the bot is pretty similar to the example here: TTS is @cartesia_ai . You can definitely run Llama-3.1 8B locally with a bot like
@mysticaltech
The Canaanite
3 months
@kwindla @GroqInc what do you use for tts? any chance you could open-source this, please? my dream is to run this locally with llama-3.1 8B always with the tts model
1
0
1
2
7
50
@kwindla
kwindla
4 months
Here's a quick walk-through from @aconchillo showing how to set up a voice AI agent that you can talk to on the phone. 📱 clone @Viking5274 's example project 🚇 set up an @ngrokHQ tunnel ☎️ buy a phone number from @twilio 🔗 point the phone number at the ngrok url 🗣️ talk
@pipecat_ai
Pipecat AI
4 months
Complete @twilio phone bot example from @Viking5274 .
0
5
18
5
8
48
@kwindla
kwindla
3 months
I look extremely tired in this video, for reasons that are probably obvious. But no rest for the wicked. I'm moderating a panel tonight at @solarissociety with an all-star lineup of people who think a lot about SOTA models, the pace of progress with LLMs, economically valuable
@rajivayyangar
Rajiv Ayyangar
3 months
Llama 3.1-405B is out and @kwindla and I have thoughts! See below for a special event in SF today and a launch challenge:
2
6
29
1
10
46
@kwindla
kwindla
3 months
@LiveLaughLauSai @GroqInc The speed is a combination of using today's fastest models/services ( @DeepgramAI , @GroqInc , @cartesia_ai ), optimized network infrastructure ( @trydaily ), and efficient buffering/pipelines/algorithms ( @pipecat_ai ).
0
4
45
@kwindla
kwindla
2 months
I wrote parts of the function calling implementations and improvements that just shipped in @pipecat_ai 0.40 and gee whiz I have a lot of thoughts about LLM tool use. 1. Claude 3.5 Sonnet, GPT-4o, and Llama 3.1 are all very good at function calling now. This makes whole
@pipecat_ai
Pipecat AI
2 months
Lots of new things in 0.40. ✨ Function calling and prompt caching for @AnthropicAI Claude 3.5 Sonnet ✨ Llama 3.1 function calling support in the @togethercompute service ✨ A complete implementation of the RTVI standard ✨ Studypal, a new application example from the team at
1
1
15
6
7
44
@kwindla
kwindla
1 year
... and it's out in the world! An SDK for AI + real-time audio, video, and data. This library makes it so easy to connect an LLM into a video call that I've been tinkering with WebRTC + GPT-4 in a colab notebook. 🤯
3
9
37
@kwindla
kwindla
6 months
Nothing like a 30-foot diagonal screen to remind you that you need to work on your posture.
2
0
39
@kwindla
kwindla
5 months
We have consistent ~500ms audio-in-to-first-token latency with a pipeline of @DeepgramAI transcription -> @GroqInc Llama-3 -> Deepgram TTS now. So I think it's pretty realistic that a natively speech-to-speech model can get down to ~300ms. We've been assuming this is where
@thdxr
dax
5 months
the new openai stuff looks cool and i’m definitely not trying to dismiss it but a large part of what makes these demos impressive is the lack of latency - which i don’t see surviving in something public facing
25
3
222
1
2
39
@kwindla
kwindla
5 months
More fun with Pipecat. This is @vikhyatk 's incredible Moondream model running locally on my mac. @DeepgramAI transcription. An @elevenlabsio voice. Just the simplest possible demo code with no optimizations. (The inference here could be a lot faster with a tiny bit of effort.)
4
5
39
@kwindla
kwindla
8 months
Join us for an Infrastructure for real-time AI meetup on Wednesday evening at @Cloudflare . Panel featuring: * Jay Jackson, VP of AI and ML at @OracleCloud * @gorkemyu , co-founder of @fal_ai_data * @LoganGrasby , AI at @CloudflareDev Plus demos and pizza. If you have
5
9
38
@kwindla
kwindla
5 years
@villi We ( @trydaily ) make that platform today. Two lines of code to add one-click, no-download video calls to any website or app. Operating at scale with partners like @Tandem_HQ . Powering developers who show us awesome new use cases every day.
4
1
37
@kwindla
kwindla
5 months
PR for @cartesia_ai TTS support in @pipecat_ai . I couldn't resist trying out the "1920's Radioman" voice. Here are two bots, running in separate processes, doing play-by-play for an imaginary baseball game. (Making it up as they go along.) The announcer is GPT-4o. The color
1
4
37
@kwindla
kwindla
13 days
And, of course, one thing led to another. It turns out actual working Python code is ~75 lines. So ... here's a command-line client for `gpt-4o-realtime-preview-2024-10-01`. No error handling. No configurability. No echo cancellation, so you'll need to
1
5
37
@kwindla
kwindla
6 months
Yesterday there was a thread on the @latentspacepod discord about what's changed in the voice+AI domain over the past few months. I think three big things have changed since ~October last year. First, as with everything in Generative AI, there's a proliferation of more and
1
10
36
@kwindla
kwindla
10 days
Conversational Voice and Video Hackathon updates 🌟 @cartesia_ai and @googlecloud have joined as sponsors 🌟 @ProductHunt is sponsoring and is facilitating a remote participation track Oct 19th-20th. In-person in San Francisco and remote. 💸 $20,000 in cash prizes 💸
5
8
38
@kwindla
kwindla
3 months
Our friends at @Vapi_AI and Toby are hosting an Audio and Speech AI event at @solarissociety in SF on Thursday evening. Pizza, conversation with people building new things at the intersection of audio+AI, and almost certainly some impromptu demos.
0
4
34
@kwindla
kwindla
11 months
Computer vision is so, so much fun to play with. Useful, too, of course. But you don't need an actual use case to hack on things that are interesting, beautiful, and surprising. You just need libraries like this! Also, I think that a good working knowledge of CV libraries,
@skalskip92
SkalskiP
11 months
supervision-0.17.0 release is just around the corner - plug in your favorite detection/segmentation model - compose the perfect visualization github:
9
90
486
2
1
35
@kwindla
kwindla
2 years
I have always been a huge fan of the people at @confrere_video and the work they've done on #WebRTC . It's incredibly exciting to have the opportunity to work with many of them now as colleagues.
3
3
35
@kwindla
kwindla
6 months
We start them young at our hackathons.
3
0
34
@kwindla
kwindla
5 months
EHR Voice Assistant. An LLM-powered 24/7 assistant for physicians, saving time and making it easier to manage a complex daily schedule. - What patient am I seeing next? - Give me a summary of the patient's last visit. - Am I seeing any patients today that have never been to our
5
2
32
@kwindla
kwindla
1 month
In 2025, how many of your phone calls will be to an AI? My general take here is that almost everyone is under-estimating how fast voice AI is growing. ➡️ LLMs do a very, very good job with customer support. We're at the point where the experience of talking to an LLM (backed by
3
3
32
@kwindla
kwindla
17 days
Big @pipecat_ai release today. Lots of low-level improvements to performance and ergonomics. TTS service additions and improvements (Google, AWS, Azure). The event framework that makes it easy to build complex client apps and workflows is rounding out nicely. I spent a lot of
1
4
31
@kwindla
kwindla
3 months
this is why we computer
@MnightAwoHusky
Midnighthowlinghuskydog
3 months
@kwindla 🤣🤣🤣🤣
1
1
11
1
3
29
@kwindla
kwindla
5 months
Join @rajivayyangar and me on Thursday May 30 at Solaris AI ( @solarissociety ) for a panel discussion with three very plugged-in AI investors: - @JenniferHli ( @a16z ) - @joshbuckley - @rememberlenny (the @aigrant ecosystem) There will be pizza and demos. (DM me if you want to
5
5
31
@kwindla
kwindla
6 months
. @isidentical from @FAL kicking off the @aiengfoundation real-time & multimodal hackathon this morning.
0
3
30
@kwindla
kwindla
5 months
@IlyasHairline I mean, you’re right, of course. But channeling Bill Clinton, it depends on the meaning of “is.” I haven’t reproduced the results. But just looking at the graphs, fingerprinting the behavior on MMLU it is still 70b in some sense. It’s not a new model. It behaves very, very
3
0
29
@kwindla
kwindla
2 months
🎥🤖🤖🤖 Hail, fellow kids. @ProductHunt CHALLENGE incoming 🤖🤖🤖🎥 We launched Daily Bots today and people are recording very funny videos talking to AI voice bots. Record a video, post it here, then go put the link in the Product Hunt conversation thread. (There are lots of
3
1
29
@kwindla
kwindla
4 months
In April last year I hacked together a voice bot that did live translation between Spanish and English during video calls. That was the first non-trivial thing I built using GPT-4. The fact that it was possible for a general-purpose LLM to do very good language translation
1
2
27
@kwindla
kwindla
7 months
Thanks @swyx , @hackgoofer , and @VibhuSapra for putting on a great mini-conference tonight about diffusion models.
0
3
28
@kwindla
kwindla
22 days
@rileybrown_ai The Open Source @pipecat_ai project if you’re looking for an API/orchestration layer. It supports WebSockets, WebRTC, and telephony integration. Demo/playground here:
0
3
28
@kwindla
kwindla
4 months
@_junaidkhalid1 The displayed latency in the main UI is the total voice-to-voice latency — from the time you stop talking to the time you hear the first audio bytes from the bot. So that includes all of the network transit time in both directions and the processing time. 250ms will be hard to
3
1
28
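The definition in the reply above (voice-to-voice latency runs from the moment the user stops talking to the first audio bytes heard from the bot, and includes network transit in both directions plus processing) can be written out as simple arithmetic. The sketch below uses a hypothetical `TurnTiming` record; the field names and the example numbers are illustrative assumptions, not measurements from any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnTiming:
    """Hypothetical timing record for one conversational turn (milliseconds)."""

    user_stopped_ms: int      # client clock: user finished speaking
    bot_first_audio_ms: int   # client clock: first bot audio bytes heard
    uplink_ms: int            # assumed client-to-cloud network transit
    downlink_ms: int          # assumed cloud-to-client network transit

    @property
    def voice_to_voice_ms(self):
        """Total latency as the user experiences it."""
        return self.bot_first_audio_ms - self.user_stopped_ms

    @property
    def processing_ms(self):
        """What remains for server-side processing after network transit."""
        return self.voice_to_voice_ms - self.uplink_ms - self.downlink_ms


# Illustrative numbers only.
turn = TurnTiming(user_stopped_ms=1000, bot_first_audio_ms=1500,
                  uplink_ms=40, downlink_ms=40)
```

With these made-up numbers, `turn.voice_to_voice_ms` is 500 and `turn.processing_ms` is 420, which shows why network transit has to be budgeted separately from processing time when chasing a lower total.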
@kwindla
kwindla
8 days
Something new cooking ...
2
2
28
@kwindla
kwindla
8 months
100 people came to the Voice + AI meetup this evening and at the end of the night we had exactly one piece of pizza left. @ramyavmani ’s prognostimation [ed. yes he did just make up that word] skills are very impressive.
6
1
27
@kwindla
kwindla
7 months
This weekend in one of the AI engineer group chats I'm in, we had an interesting discussion about self-hosting the components of voice/conversational AI apps as compared to using SaaS services. This is a generally interesting topic, and I've had this (new) AI infra conversation
4
1
27
@kwindla
kwindla
5 months
The broad applications for very fast general vision models are as yet under-appreciated. To be fair, this is true right now of vision in general, and really all of generative AI. Those of us building applications have not yet caught up to the last 24 months of multiple leaps
@sumo43_
artem
5 months
Got fast paligemma inference working on RTX 4090. Here's an object detection demo with the 224px model running in real time at 16fps. I generate 10 tokens per iteration
28
51
674
3
0
26
@kwindla
kwindla
2 months
Pull shopping suggestions out of a video stream on the fly! The thread below links to a live demo, the code, and an in-depth walk through. This is a really good primer on real-time, multimodal AI. Built using @cerebriumai , @trydaily , @tursodatabase , @supabase , and @vercel
@cerebriumai
cerebriumai
2 months
When @garyvee mentioned that users would be able to shop items in movies and videos in real-time, I didn't think he knew that day would be today! Live demo in comments 🧵 #ecommerce #genai #ai #cerebrium #streaming
4
8
29
1
4
25
@kwindla
kwindla
9 months
@patrickc At Oblong we did a bunch of work for Boeing. This was during the final push to get the 787 shipped. Watching that project from a funny part-inside but mostly outside perspective, I remember thinking that the 787 might be the last civilian design Boeing would ever produce. Since
7
2
24
@kwindla
kwindla
11 months
Things I love about what @Humane is doing. A partial list. 1. We're living through the early days of a platform shift. AI is going to enable and require new hardware — both at the chip level and the device level. The Humane AI pin is a big and interesting first swing at hardware
1
2
25
@kwindla
kwindla
3 years
Big release for us, today. 📹 Stream any video session to a broadcast audience ... anywhere! Lots more new related APIs coming soon. If you're doing live streaming, tell us what you need (we might be able to give you early access to beta features).
@trydaily
Daily
3 years
We're excited to announce Daily Live Streaming! Our newest API makes it easy to build video calls and live stream them. Broadcast to millions on #RTMP platforms like #FBLive , YouTube, #Twitch . Or stream to custom players via #AmazonIVS #Brightcove #Mux 🧵
1
9
21
1
6
24
@kwindla
kwindla
1 month
On Monday @rajivayyangar and I hosted an AI evals and observability demo night at @solarislll We had demos from: ✨ @timssweeney / @weights_biases @tom_shapland / Canonical AI ✨ @hellovai / @boundaryML @Andydy42 / @keywordsai @skull8888888888 / @lmnrai
2
1
24
@kwindla
kwindla
10 months
This is a fantastic home page headline. (kudos to @fouadmatin )
1
3
23
@kwindla
kwindla
4 months
All I want for father's day is an 8xH100.
3
1
24
@kwindla
kwindla
1 month
I'm looking forward to the @cartesia_ai x @usepylon Conversational AI meetup next Tuesday. Registration link in the next tweet -> They've lined up a great panel and demos. Cartesia's co-founder @_albertgu will be there. You can embarrass him by asking what it's like to be on
1
5
23
@kwindla
kwindla
2 months
Inversion of the AI agent model — long-running processes that pause for human direction when necessary ... The demo Dex of @humanlayer_dev did at @AITinkerers in SF last week really stuck with me. He talked about a few things, including: 1. How long-running agent processes can
3
1
23
@kwindla
kwindla
4 months
HN discussion:
1
3
23
@kwindla
kwindla
4 years
New @trydaily starter kit for audio apps on @ProductHunt today.
3
2
23
@kwindla
kwindla
4 years
Tonight in the @joinClubhouse Future of Work room: a conversation with @shl about the evolution of independent creative work online since he founded @gumroad , and the trends he’s seeing lately as a tech investor. 8:30 PDT. 💫
2
3
23
@kwindla
kwindla
3 years
I’ve done a name change + rebrand twice. Always happy to help out another startup founder. My DMs are open.
1
1
22
@kwindla
kwindla
8 months
Packed house tonight at @solarissociety for demos, hosted by @seanxthielen and @pk_iv . Some of the LLMs are behaving, some are not. It turns out that in this brave new world we have to pray to two sets of demo gods, the normal ones and the new, hungry AI deities.
1
3
22
@kwindla
kwindla
2 years
We are hosting a Happy Hour on Wed Oct 12th in SF's Mission District. Please join us at the Root Ventures office (2670 Harrison St) any time between 6:30 and 9:00. Feel free to register here, or just to join us without registering:
1
3
22
@kwindla
kwindla
9 months
Today I learned that a @Waymo doesn’t turn on its windshield wipers in the rain. And, indeed, that should not have surprised me as much as it did.
6
0
21
@kwindla
kwindla
6 months
@WillManidis Maria Rosa Menocal. The Arabic Role in Medieval Literary History. U Penn Press, 1987. Out of print for a long time. Now back in print. Beautiful and mind expanding.
0
0
21
@kwindla
kwindla
12 days
I really love to see this — we have customers coming to us asking how to build conversational voice AI from all over the world. Interest and use cases are truly global. Available today in @pipecat_ai .
@cartesia_ai
Cartesia
12 days
Introducing our next 8 languages on Sonic-Multilingual 🌍🗣️ The same low-latency, ultra-realistic voice generation available on 14 total languages 🎉 🇩🇪 German | 🇵🇹 Portuguese | 🇨🇳 Chinese | 🇯🇵 Japanese | 🇫🇷 French | 🇪🇸 Spanish | 🇮🇳 Hindi | 🇮🇹 Italian | 🇰🇷 Korean | 🇳🇱 Dutch |
9
9
151
1
3
21
@kwindla
kwindla
5 months
ding ding ding. best tweet of the week.
@gorkemyurt
Gorkem Yurtseven
5 months
end of software? friends, we just invented the printing press for software
2
19
136
0
2
21
@kwindla
kwindla
6 months
Do you build things like this? Or want to learn how? Join us at the Voice/Real-time/Multi-modal AI hackathon this weekend.
3
2
22
@kwindla
kwindla
2 months
Try out different stop_secs settings: Pipecat's phrase endpointing code: The Silero VAD repo:
1
3
21
@kwindla
kwindla
7 months
My favorite thing on @ProductHunt today: Zora Learning. (Named after Zora Neale Hurston!) Configurable reading experience for kids. You pick the grade level and the kind of story, and a story is generated for a child to read. Between sections, there are questions to answer (the
2
3
19
@kwindla
kwindla
5 months
After the really cool @OpenAI omnimodal announcements yesterday, everybody is talking about latency! For today's voice conversational stacks, latency is the sum of: 1. getting audio from the client to the cloud 2. transcription 3. phrase endpointing 4. LLM inference 5.
@rajivayyangar
Rajiv Ayyangar
5 months
I tried out the Humane AI Pin. Latency was the biggest issue: it took awkwardly long to respond. The latency just seemed to get in the way of everything I was trying to do. Latency is a solvable issue though, @kwindla points out.
1
2
16
1
2
20
@kwindla
kwindla
2 months
Two things I did not expect when we started seriously building voice AI infrastructure and tooling a little over a year ago: 1. Each step function improvement in LLM capabilities has come with more "personality." This maybe isn't apparent if you just use basic prompts, or
@kwindla @ProductHunt Meet... John. Just an average, normal human person
0
0
10
1
0
20