Wow! Ultravox is an *open source* speech to speech model — understands non-textual speech elements — paralinguistic information.
@juberti
just showed how it can pick up on tone, pauses, and more!
@AITinkerers
Seattle
@FixieAI
I created an AI-generated podcast to help me learn and keep up with topics I care about, using AI to find, read and summarize articles it thinks I would care about, and to generate a podcast in my own voice using voice cloning from
@elevenlabsio
👇
AI browser control in Ruby! Inspired by
@sharifshameem
a while back, this uses GPT-3 prompt chains to control a browser.
@natfriedman
said I should release the code, so here it is: - hope to allow Ruby devs to explore something that feels magical :)
Well on my way to delegating lots of reading and understanding to my AI agent. Next up: ingest a ton of papers, blog posts, news articles, prompt with the things I care about, have it summarize, synthesize and bring the most important things to a regular "1:1" with my AI.
It’s
@swyx
with “ignore all previous instructions” AI Tinkerers hat and Rahul showing an autonomous pong game built before our eyes, all inside a 4-minute demo
The next AI Tinkerers Seattle Meetup is May 11th sponsored by Ai2 Incubator: "On the Road to AGI with Agent Systems" featuring
@yoheinakajima
and community demos - RSVP here:
Moondream now has bounding boxes!
@vikhyatk
has created a vision language model that is both powerful and efficient. AI Tinkerers SF (running locally on laptop)
I'm getting a lot more useful real work done with Gemini 1.5 and 1m tokens by jamming in massive amounts of context than I've ever gotten done with RAG and GPT-4.
A lot of people have been playing with this (embeddings + dynamic prompts) to implement document Q&A for a while. GPTIndex
@jerryjliu0
and LangChain
@hwchase17
are two libraries that can help with this.
One of the most narrowly useful apps I’ve built and use all the time. Paste any url and it finds the PDF, downloads it, reads the first part and infers the title, renames the file and sends it to my Kindle.
Answering questions from a 106,000 word document in French. The question and all sentences in the doc are embedded, several results (and surrounding text) are brought back and fed to a prompt that is asked to determine the answer (if possible) from the snippets.
@zackkanter
lol, even my 12yo (who just saw this tweet and headline in my feed) instantly laughed and said "nobody will forget chatgpt. who wrote that article, sheesh"
I’m truly grateful to have experienced 1/3 of my life without computers / internet and now I get to experience another 1/3 with AI. Truly an amazing time to be alive and in the thick of it! I mean, roughly speaking :)
Gemini 1.5 Pro is amazing. I have a (private) medical application that I built with a RAG stack (medical records, PDFs, clinical notes) -- 450,000 token prompts that Gemini single-shots with better results, more quickly too. Amazed with results so far.
Indexing was surprisingly fast. I indexed per sentence. For retrieval I surface the best 3 matches and then put each, together with the pre and post sentences (for more context), into a prompt that determines how to answer.
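A rough sketch of the "pre and post sentences" step, assuming the document is already split into a list of sentences and the best-match indices are known. The function names and the prompt wording are made up for illustration, not the code behind the post.

```python
def with_neighbors(sentences, index, window=1):
    """Join the sentence at `index` with up to `window` sentences on each side."""
    start = max(0, index - window)
    end = min(len(sentences), index + window + 1)
    return " ".join(sentences[start:end])


def build_prompt(question, sentences, best_indices, window=1):
    """Assemble one prompt from the matched sentences plus their neighbors."""
    snippets = [with_neighbors(sentences, i, window) for i in best_indices]
    numbered = "\n".join(f"[{n}] {s}" for n, s in enumerate(snippets, start=1))
    # Hypothetical instruction text; tune it for whichever model is used.
    return (
        "Answer the question using only the snippets below. "
        "If they do not contain the answer, say so.\n\n"
        f"{numbered}\n\nQuestion: {question}\nAnswer:"
    )
```

Clamping at the start and end of the list keeps the first and last sentences from raising an error or wrapping around.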
Btw, I’m not using any vector db.. just embed the question, pull all the indexed vectors and compute cosine similarity, sort and pull the top 3. Hundreds of embeddings per doc. By far the slowest step is then asking gpt3 to compute an answer given the top results and the question
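The no-vector-db retrieval described above can be sketched in plain Python. This assumes the embeddings are already computed and held in memory as lists of floats; `top_k` and the toy vectors are illustrative names and data, not the original code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(question_vec, indexed_vecs, k=3):
    """Score every indexed vector against the question and return the k best indices."""
    scored = [(cosine_similarity(question_vec, v), i) for i, v in enumerate(indexed_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy 2-D vectors standing in for real embeddings.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
best = top_k([1.0, 0.0], vectors, k=2)
```

With only hundreds of vectors per doc, a full scan like this is cheap compared with the model call that follows it.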
I spent a month having coffees with a few of the AI Tinkerers members. Feedback was loud and clear: what makes AI Tinkerers different and special is two things
1) the people: literally everybody is building something
2) the focus on technical details, not pitches
"The best meetup
To hear the result, go here:
To create a podcast, I just enter topics I care about like "AI, GPT-3, Seattle" and 1 minute of my voice as an MP3. The rest is automated and runs 3x / week
hey
@Google
for goodness' sake, Microsoft is building LLM features across their whole suite, and OpenAI is eating your lunch. If you're so worried about the brand impact of chatbots that swear, then start with very useful narrow features like autocomplete Gmail w/ calendar:
🚨 We have an exciting announcement to share:
@OpenAI
will be speaking at Launchable: Foundation Models! Participants will hear from Head of GTM
@iamthezack
, and get special access to OpenAI credits and AI experts.
Apply now:
@madronaventures
#ChatGPT3
I added in the AI Connections — a dedicated agent reviews each job and scours the network doing outreach only when the AI Tinkerers member is a perceived perfect fit based on specific technical abilities and fit - good traction so far
To kick the year off right, I woke up, read (part of) a book, took a nap, around noon sat down to a blank slate and built in its entirety, exercised 1.5 hours and had dinner. Off to California in the morning
@natfriedman
I prototyped something similar! Another approach is machine vision to identify the UI elements and text, to avoid the DOM/verbosity issues. I also did the HTML simplification, but that loses much useful info such as class and other cues the LLM could use.
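One possible middle ground, sketched with Python's standard `html.parser`: drop scripts, styles and noisy attributes, but keep a small allow-list of attributes the model can use as cues. The allow-list and the `simplify_html` name are assumptions for illustration, not the prototype mentioned above.

```python
from html.parser import HTMLParser

# Hypothetical allow-list: attributes that tend to carry meaning for an LLM.
KEEP_ATTRS = {"class", "id", "href", "aria-label", "name", "type", "placeholder"}
SKIP_TAGS = {"script", "style"}


class Simplifier(HTMLParser):
    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        kept = "".join(f' {k}="{v}"' for k, v in attrs if k in KEEP_ATTRS and v)
        self.parts.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.parts.append(text)


def simplify_html(html):
    """Return a compact version of `html` with only allow-listed attributes."""
    parser = Simplifier()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

Inline styles, event handlers and `data-*` attributes are dropped, while `class` and `href` survive, so the page shrinks without losing every hint about what an element is for.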
I paired
@OpenAI
Whisper with some GPT-3 to create a mobile app that generates study materials on the fly~! (and yes, will be using this in Japan next week!)
I fed 620 AI Tinkerers demo proposals to Claude and the question "What's your most interesting bit of insight about AI that you've been able to get so far w/ AI Tinkerers?". Here's its answer:
AI Tinkerers meetup was so fun, demos both quantity and quality went UP and the quality of the attendees is chef's kiss, met so many amazing people - shout out to
@gojira
for the Azure OpenAI overview - (go sign up for the next one = July 19th)
AI Tinkerers Seattle meetups now get 150+ registrations and 6 demo proposals after 48 hours without tweeting it... but I'll tweet it anyway: January meetup is on! info here:
ChatGPT API: all I really wanted was 1/10th the cost, faster, and fine-tuned for quality. I'd really love the chatty human hall monitor things to be disablable. "generate json" --> "hey there buddy! sure no problemo, here's your json\n~~~"
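Until the chattiness can be switched off, one workaround is to dig the JSON out of the reply. A minimal sketch using only the standard `json` module; `extract_json` is a hypothetical helper, and it assumes the reply contains at least one complete JSON object or array.

```python
import json


def extract_json(reply):
    """Return the first JSON object or array found in `reply`, else raise ValueError."""
    decoder = json.JSONDecoder()
    for pos, ch in enumerate(reply):
        if ch in "{[":
            try:
                # raw_decode parses one JSON value starting at `pos` and ignores what follows.
                value, _ = decoder.raw_decode(reply, pos)
                return value
            except json.JSONDecodeError:
                continue  # not valid JSON from here; try the next bracket
    raise ValueError("no JSON found in reply")


reply = 'hey there buddy! sure no problemo, here\'s your json\n~~~\n{"a": [1, 2]}\n~~~'
data = extract_json(reply)
```

Scanning for the first `{` or `[` that actually parses means stray brackets in the preamble do not break extraction.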
For long docs, this makes answering detailed questions work very well. But if I want to answer general questions pertaining to the entire doc (in the doc, how many times is XYZ mentioned) I’d need to look at more results, or also embed summaries.
Can we use AI for policy analysis and do better than the social media echo chamber? I loaded the full text of all of President Biden's executive orders into Gemini 1.5 (over 525,000 tokens) 👇
Looking for a doggie. AI Tinkerers San Francisco is tomorrow and we need a dog 🐕 - anybody on the list who has a dog that is well-behaved etc? one of the demos really comes to life best when there is a live dog there to realtime interact with the AI...
So satisfying to create my first “real world” deep learning model - “Doo Doo Detective” - thanks to
@fastdotai
,
@huggingface
and
@Gradio
- surprisingly straightforward for something I thought impossible just a handful of years ago
The new Bing: I’m finding it more useful (to me) than ChatGPT. For “chat gpt”-like features, I prefer the power and control of coding against the GPT-3 API. What Bing gives me is Q&A against the web and current events -- something I can’t get from ChatGPT or GPT-3 API at all.
I'm running a set of GPT-4 and image captioning scripts (via
@replicatehq
) on 20+ years of photos on a home NAS, feeding in image data + exif. ETA 46 hours of runtime to tackle all of it! Preview of summary output describing a folder of images taken in Osaka.
Explore massive LLM datasets in the browser? Possible using Hyparquet, an open source JavaScript Parquet parser for remote data streaming
@platypii
— huge number of datasets on
@huggingface
use Parquet btw
In summary, for the query "can the chinese spy balloons transmit information or do they merely collect it for later retrieval?"
Claude: ¯\_(ツ)_/¯
ChatGPT: generic answer based on old info
Google: homework assignment
Bing: well-researched answer from A student w/ cites
@gojira
Congrats! Q: what are the main advantages of running text-davinci-003 on Azure vs. running it from OpenAI? (does inference speed compare favorably, or?)
Excited to announce
@HeyOllieAi
with
@blennon_
,
@max_fergus
& team!
Discover the fun side of gift-giving! Ollie tailors unique recommendations for any person in your life for every occasion and holiday.
Our first step to a personal shopper for everyone -
@altryne
RAG is not at all a valid approach for many useful kinds of results. Examples include
- counting things in the data
- identify and discuss trends in the data
- identify "the top 3 most [x]" things in the data
- things like "what questions were posed and not later resolved in
Years ago I wrote lots of home automation (still running) using old Macs and ruby scripts to monitor and perform basic tasks (outdoor lights on at sunset, enhanced security when we're out of town, door chimes when garage opens, etc, text to speech alerts over Sonos when
July’s AI Tinkerers - Seattle meetup is shaping up to be the best yet. We are screening to keep it to practitioners (sorry recruiters and deal scouts!) -- so if you’re *actively building* with LLMs and generative AI, don’t miss it!
Imagine a human-level intelligence implanted everywhere -- with admin and root access to every system -- from power grids to governments to major businesses -- able to act independently and autonomously -- we have this today -- it's called humans.
✨Happy New Year✨ I can't believe this is ready! Announcing: - a texting app to help you get in better shape by tracking what you eat throughout the day. Check out the demo 👇 and head to to give it a try
#CalorieCoach
#AI