Joanfihu

@joanfihu

548
Followers
841
Following
291
Media
4,237
Statuses

Tech Lead & Founder @SpecifiedBy - Independent Machine Learning Researcher. #InternetML

Edinburgh, Scotland
Joined February 2011
Pinned Tweet
@joanfihu
Joanfihu
9 months
📖 Can LLMs find knowledge gaps in the Internet? 🚀 Excited to share new work on “Harnessing Retrieval-Augmented Generation (RAG) for Uncovering Knowledge Gaps”. In this paper, I simulate how users search the Internet but instead of searching for content that exists through
1
2
8
@joanfihu
Joanfihu
2 years
@NanouuSymeon A git is an unpleasant person. GitHub is where all the unpleasant people gather and peer review their unpleasantries.
Tweet media one
2
12
173
@joanfihu
Joanfihu
2 years
@fchollet From what I read, Japanese homes aren't built to last whereas in Europe, most homes can last a few generations. Particularly old builds or non-timber builds. To inherit a property, which isn't uncommon in Europe, is quite a life changing thing.
8
10
167
@joanfihu
Joanfihu
11 months
Lil trick for building LLM apps. Forget about latency and costs in the beginning. Do as many LLM calls as needed. Store those calls in a way that a dataset could be made. Train a smaller model with the dataset and unplug the LLM. I’ve replaced 4 LLM UX bridges for SVMs and
3
10
153
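A minimal sketch of the trick above, under stated assumptions: every LLM call is appended to a JSONL file as a (prompt, label) pair, and a small classifier is later trained on that file. The names `log_call`, `load_dataset` and `TinyNaiveBayes` are hypothetical, and the hand-written Naive Bayes is only a dependency-free stand-in for the SVM mentioned in the tweet, not the author's actual setup.

```python
import json
import math
from collections import Counter, defaultdict


def log_call(path, prompt, label):
    """Append one (prompt, LLM answer) pair to a JSONL file for later training."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"prompt": prompt, "label": label}) + "\n")


def load_dataset(path):
    """Read the logged calls back as a list of {"prompt", "label"} dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


class TinyNaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing.

    Stand-in for the smaller model (the tweet mentions SVMs).
    """

    def fit(self, rows):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter()
        self.vocab = set()
        for row in rows:
            tokens = row["prompt"].lower().split()
            self.class_counts[row["label"]] += 1
            self.word_counts[row["label"]].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)  # class prior
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Once the small model agrees with the logged LLM answers often enough on held-out rows, the LLM call can be swapped for `model.predict(prompt)`.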
@joanfihu
Joanfihu
2 years
@tunguz @Microsoft @OpenAI Clippy coming back with chatGPT’s 🧠 would be the biggest product comeback we have ever seen. It’s “memeable” too.
Tweet media one
7
3
106
@joanfihu
Joanfihu
1 year
Tweet media one
1
5
68
@joanfihu
Joanfihu
2 years
@dacey_nolan Code challenges check your engineering thought process. You should be able to do them if you want the prestige to be an engineer. But yeah, it should have been in JS though.
12
0
59
@joanfihu
Joanfihu
10 months
@jm_alexia LLMs often use Layer Normalization. This mitigates the need for bias terms by stabilizing the mean and variance of the layers' outputs. By ensuring that the activations have a controlled distribution, normalization can partially or fully compensate for the absence of a bias
1
2
58
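A small sketch of the point above, assuming a plain-Python layer norm with scalar `gamma` and `beta` for brevity (in real models these are usually per-feature vectors). It only illustrates the simplest case: a shift applied uniformly to all features is removed by the mean subtraction, and the norm's own `beta` provides a learnable offset afterwards. A per-feature bias is not removed in general.

```python
import math


def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a feature vector to zero mean and unit variance, then scale and shift."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in x]


# Adding the same constant to every feature leaves the output unchanged.
plain = layer_norm([1.0, 2.0, 3.0])
shifted = layer_norm([11.0, 12.0, 13.0])
```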
@joanfihu
Joanfihu
2 years
@Carnage4Life I talked with a CTO of one of the largest development agencies in Edinburgh and he said that projects for native app development are decreasing. Companies’ digital transformation strategy is web first. It’s too dangerous to give so much power to Apple. Web prevails.
4
1
56
@joanfihu
Joanfihu
3 years
@iamtrask Trough of disillusionment. That’s good news though. Hype is dying and common sense is back. Good time to re-engage.
Tweet media one
1
3
53
@joanfihu
Joanfihu
1 month
@giffmana WebP might be technically better but it's not as widely supported as JPG or PNG.
1
0
33
@joanfihu
Joanfihu
5 years
@chriscoyier You don’t have to be an average programmer to make mistakes. In fact, if you don’t make mistakes, you’re not learning fast enough or are already settled on your lees.
1
1
32
@joanfihu
Joanfihu
2 years
I’ve been training and deploying production ML and data systems for almost a decade. This is the process I follow:
3
7
30
@joanfihu
Joanfihu
4 months
@julien_c The plateau…
Tweet media one
1
0
32
@joanfihu
Joanfihu
1 year
@svpino Interpolating proxies to bypass the WAF is a form of abuse. This puts Bright Data in a dodgy situation. Be a good net citizen and disclose a genuine User-Agent and IP and let the website owner choose if they want their site to be crawled or not.
2
1
27
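One way to act on this, sketched with Python's standard `urllib.robotparser`: announce an honest User-Agent and check the site's robots.txt before fetching. The bot name and info URL are hypothetical, and the robots.txt text is passed in as a string so the example runs offline.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical, honest identifier that tells the site owner who is crawling.
USER_AGENT = "ExampleResearchBot/1.0 (+https://example.com/bot-info)"


def allowed(robots_txt, url, user_agent=USER_AGENT):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots = "User-agent: *\nDisallow: /private/\n"
print(allowed(robots, "https://example.com/public/page"))
print(allowed(robots, "https://example.com/private/data"))
```

If the owner blocks the agent or the path, the crawler skips the page rather than rotating identities to get around the block.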
@joanfihu
Joanfihu
7 years
@lopp Q: When should I buy? A: Never. Spend your time working on real problems!
5
1
15
@joanfihu
Joanfihu
2 years
Google’s reply to current events in AI and why they haven’t engaged in the same open manner as OpenAI, StabilityAI, etc. And, hint that chatGPT-like functionality is coming to Google Search this year. “Why we focus on AI”
4
5
27
@joanfihu
Joanfihu
2 years
@nathanbenaich From a scientific point of view -> Google + Deepmind. From a product/industry point of view -> MSFT + Open AI.
2
0
29
@joanfihu
Joanfihu
3 years
@iamdevloper if passive-aggressive was an image...
Tweet media one
0
0
27
@joanfihu
Joanfihu
2 years
@nikitabier People don’t fully understand how entrenched G is in search. FB threatened to dethrone G in search. It didn’t work. Idem w/ TikTok. The only way to dethrone G is to be able to provide 10x better results in a general purpose search engine app. No one is close to 1x. Plus 👇
Tweet media one
0
4
24
@joanfihu
Joanfihu
3 years
@jspeiser What would you have done differently? Diversify earlier?
1
1
24
@joanfihu
Joanfihu
3 months
@drewwilson Laravel + MySQL. All hosted in Fortrabbit because I'm too lazy to deal with servers
1
0
22
@joanfihu
Joanfihu
2 years
@pitdesi Definition of over-engineering.
0
0
20
@joanfihu
Joanfihu
1 year
@abacaj Infinite memory is coming. This is the same group OAI got inspiration from to get inputs up to 32K tokens.
0
1
19
@joanfihu
Joanfihu
2 years
@makispoke If you aim for power, the opposite also holds true. The longer you’re in the source company the more power you have. That’s why most successful C{X}Os have been in their companies for a while. Guess what comes alongside power? Money.
2
1
20
@joanfihu
Joanfihu
2 years
@Appyg99 They are: papers, PyTorch, TensorFlow, TPUs, Colab, etc. OpenAI makes closed-source technology. I can build LaMDA and BlenderBot. I can’t build chatGPT. If I build a business around chatGPT and they pull the rug -> ☠️
4
0
20
@joanfihu
Joanfihu
7 years
Tweet media one
1
0
17
@joanfihu
Joanfihu
5 months
@abacaj Someone seems to be prioritising users’ needs vs dopamine from technical challenges… Good for you!
1
0
19
@joanfihu
Joanfihu
9 months
@itsgeorgepi @bindureddy It's like an ensemble. 8 independent models trained in specific tasks (code, QA, etc). Given a prompt, the model routes (intent detection) it to the best expert to generate an answer.
0
0
18
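A toy sketch of the routing idea as described in the tweet, at prompt level: a gate scores each expert, a softmax turns scores into probabilities, and the top expert answers. The experts and keyword lists are hypothetical placeholders; a real gate would be a learned network rather than keyword counts.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical experts: each just tags its output so the routing is visible.
EXPERTS = {
    "code": lambda prompt: f"[code expert] {prompt}",
    "qa": lambda prompt: f"[qa expert] {prompt}",
}

# Hypothetical keyword lists standing in for a learned gating network.
KEYWORDS = {
    "code": ["python", "function", "bug"],
    "qa": ["who", "what", "when"],
}


def route(prompt):
    """Pick the highest-probability expert for the prompt and return its answer."""
    names = list(EXPERTS)
    tokens = prompt.lower().split()
    scores = [sum(tokens.count(k) for k in KEYWORDS[name]) for name in names]
    probs = softmax(scores)
    best = max(range(len(names)), key=probs.__getitem__)
    return names[best], EXPERTS[names[best]](prompt)
```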
@joanfihu
Joanfihu
1 month
@johnloeber Review scores are subjective. I don’t think there should be a score but rather a comment. Then do sentiment analysis and clustering to see what people are saying. This is also user friendly. All you need is a comment.
1
1
18
@joanfihu
Joanfihu
11 months
@tech_instigator Well… people are used to charging their phone every night and the current battery already lasts a day, at least the new ones. A longer lasting battery would break that 1 day pattern that people are already used to. That’s the whole reason I never got used to Fitbit, I had to
10
1
16
@joanfihu
Joanfihu
3 years
@JessTheVC Google Meet is the only videoconferencing app that doesn't crash my 2015 MacBook Pro. This tells you who's got the best engineering and product behind it. I'm not left out with GMeet.
0
1
17
@joanfihu
Joanfihu
3 years
@tunguz For DS, I would break down deliverables by pipe stages:
- Dataset creation
- EDA
- Feature Engineering
- Model Selection
- Training
- Deploy
Max 3 months for a beta release. After that, optimise individual modules. Add refactoring and testing.
1
0
17
@joanfihu
Joanfihu
2 years
@here_for_code @catalinmpit Scrum is literally the worst project management tool for startups. It’s a strong enough signal to not invest in a young company.
3
2
16
@joanfihu
Joanfihu
5 years
@RobotAndAIWorld It would have been more interesting if the bowl had gone in the opposite direction. \(◎o◎)/
0
1
15
@joanfihu
Joanfihu
2 years
@EMostaque I would say most academics can't really compete with Big Tech's well funded labs. AI is deeply rooted in academia.
1
0
15
@joanfihu
Joanfihu
1 year
@cwolferesearch @AnthropicAI @MosaicML Vector DBs are hard drives. Context length is RAM
2
3
14
@joanfihu
Joanfihu
2 years
@Jason I’ve had the Quest 2 for a year. Content is rapidly growing in entertainment and games. Horizons and Workrooms (MetaVerse) are a flop. I spent ~£150 on Quest content. I spent £0 on iOS Apps this year. Apple entering the market validates Meta. It’s the ATT moment for Meta.
1
2
13
@joanfihu
Joanfihu
3 years
@svpino I recently trained a product classifier and then embedded it into our products. 20% of the effort was on the model, 80% was on making it useful to users and customers. HOWEVER, @TensorFlow and @fchollet (keras) made it effortless to build the model, which is great.
1
0
13
@joanfihu
Joanfihu
2 years
@Carnage4Life TikTok is an entertainment content based recommender system not a social network
0
0
13
@joanfihu
Joanfihu
2 years
@AndyChenML What am I missing? This is terrible.
Tweet media one
1
0
12
@joanfihu
Joanfihu
2 years
@Wolfof53rdSt Guess who invented the technology (transformers) behind chatGPT in 2017? You’re correct, Google. Guess who already has generated answers internally? You’re correct, Google. People don’t fully understand how entrenched Google is in search. 👇
Tweet media one
0
0
12
@joanfihu
Joanfihu
10 months
@MIT_CSAIL @JonErlichman 1980 👌🎸🎸🎸🤘
1
0
12
@joanfihu
Joanfihu
1 year
@Carnage4Life Honestly, tech has a problem overhyping things. Now “the current thing” is AI, I’ve been involved in it for over 10 years and the amount of hype isn’t good for us.
1
0
11
@joanfihu
Joanfihu
11 months
@bindureddy Closed-source models perform best and are more convenient to use than open source. I tried over 20 models and none is close to GPT for real world application utility. It’s really hard to put guardrails on LLMs to build products and using the best tools for the job
2
1
10
@joanfihu
Joanfihu
2 years
How startups should be built. @PlausibleHQ
Tweet media one
1
0
11
@joanfihu
Joanfihu
2 years
@RamaswmySridhar @Neeva Another interesting point for Neeva is that Microsoft has imposed a restriction on using LLM output in the context of search results. So any other search engine that wants to provide straight answers like NeevaAI can’t do it. Having your own index has paid off here, Neeva.
0
2
11
@joanfihu
Joanfihu
4 months
@mark_cummins Common Crawl is a 3B web index sample. The public Internet is estimated to have 150B web pages. There are lots of tokens out there, that’s why OpenAI has set up their own crawling infrastructure, hence the speculation about a search product…
0
0
10
@joanfihu
Joanfihu
2 years
@iamdevloper I don't know, I need to look into it in more detail. Templated answer when I don't know things.
0
0
10
@joanfihu
Joanfihu
4 months
@fchollet Common Crawl is actually just 3B out of 150B so there is still a lot to learn from the Internet
1
0
10
@joanfihu
Joanfihu
5 months
Sharing my entry for the @kaggle QA w/ Gemma (2B/7B) competition. @GoogleAI @GoogleDeepMind The notebook (link at the end) covers the basic building blocks to adapt LLMs for your own use case:
- Data collection/generation/augmentation.
- Fine tuning Gemma with a P100 GPU and
0
0
9
@joanfihu
Joanfihu
11 months
@dmimno Scientists fall in love with the problem. Engineers fall in love with the utility.
0
1
9
@joanfihu
Joanfihu
3 years
@Web3Magnetic @SwiftOnSecurity @deletebot @primalanomaly @moxie How can you explain that the NFT disappeared from the wallet too?
2
0
8
@joanfihu
Joanfihu
1 year
@cwolferesearch Note that you don’t need vector embeddings to do it. A simpler approach is to generate a search query and use BM25 or TF-IDF to retrieve docs to be used as context. With the ever growing context size, chunking will eventually be unnecessary. Docs are already chunks.
1
1
9
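For illustration, a compact BM25 ranker in plain Python, as a sketch of retrieval without vector embeddings. Tokenization is a naive lowercase whitespace split, and `k1=1.5`, `b=0.75` are common defaults chosen here as assumptions; the function name `bm25_rank` is hypothetical.

```python
import math
from collections import Counter


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return document indices sorted by BM25 score for the query, best first."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)
```

The top-ranked documents can then be pasted into the prompt as context, whole, if they fit the context window.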
@joanfihu
Joanfihu
7 years
When the hares find out that @AdrianMChef5 and @OdkhuuMChef5 are home #MasterChef
1
2
7
@joanfihu
Joanfihu
2 years
@unsorsodicorda I use unit testing against evaluation metrics to help automate pipelines and also a bit of QA. It’s useful.
@joanfihu
Joanfihu
2 years
I’ve been training and deploying production ML and data systems for almost a decade. This is the process I follow:
3
7
30
1
0
9
@joanfihu
Joanfihu
6 years
@e_hothersall @jwiechers @Mummydoc1 This is the most confusing GDPR explanation I've seen so far.
0
0
9
@joanfihu
Joanfihu
11 months
@StasBekman ML libraries have been adapted to exploit Mac’s M GPU. You can get quite far with it before having to move to the cloud.
0
0
8
@joanfihu
Joanfihu
2 years
@jachiam0 As someone who creates 3D models for gaming, Generative AI art is great for inspiration. I have tons of books I use for reference. Now I can also just craft a prompt and get ideas. This empowers artists in the same way TensorFlow and PyTorch empower AI researchers.
0
1
9
@joanfihu
Joanfihu
11 months
@nathanbenaich @aaronpholmes Another idea to save costs
@joanfihu
Joanfihu
11 months
Lil trick for building LLM apps. Forget about latency and costs in the beginning. Do as many LLM calls as needed. Store those calls in a way that a dataset could be made. Train a smaller model with the dataset and unplug the LLM. I’ve replaced 4 LLM UX bridges for SVMs and
3
10
153
1
0
9
@joanfihu
Joanfihu
8 months
@bindureddy The biggest impact LLMs have is in how we search for information. Search engines will eventually be answer engines + navigation. Someone will eventually put ads there. Paid subscriptions to access free information make no sense.
0
0
6
@joanfihu
Joanfihu
2 years
@alexandr_wang Google has been using ~LLMs since the introduction of search snippets in 2014. They own distribution (i.e. G pays Apple $12B to be Safari’s and Siri’s default engine). Android and Chrome were made to keep Google as the default engine. Being a platform’s default means everything.
0
0
9
@joanfihu
Joanfihu
2 years
@benwbear Kubernetes was made by frontend engineers to troll their backend colleagues.
0
0
9
@joanfihu
Joanfihu
3 years
@tunguz In research, yes. In industry, no. The whole point of agile is to not work away in a cave for one year without a deliverable. As long as you don't do that, then you're already doing aGiLe! You're meant to adapt it to your needs. It's not a step-by-step process.
1
0
8
@joanfihu
Joanfihu
10 months
@alexgraveley You don’t even need a GPU for model distillation 👇
@joanfihu
Joanfihu
11 months
Lil trick for building LLM apps. Forget about latency and costs in the beginning. Do as many LLM calls as needed. Store those calls in a way that a dataset could be made. Train a smaller model with the dataset and unplug the LLM. I’ve replaced 4 LLM UX bridges for SVMs and
3
10
153
0
0
6
@joanfihu
Joanfihu
2 years
@Carnage4Life You missed an important one. Revenue. In an M&A, the acquirer also buys the revenue, which adds to the bottom line of the parent company. I can see why you might have missed it since Tech companies don’t focus on revenue. But in the real world, revenue is important.
1
0
6
@joanfihu
Joanfihu
2 years
@Appyg99 If I were a VC, I would not invest in companies that rely on OpenAI. A backup plan to develop their own foundational models must be in place should OpenAI pull the rug. Having said that, a Stable Diffusion moment (weight sharing) is imminent.
0
0
8
@joanfihu
Joanfihu
1 year
@Scobleizer Apple has hardware. I hope they are working really hard on a Siri revamp because an assistant with access to low level hardware APIs plus software APIs would be very useful. OAI doesn’t have hardware.
0
0
8
@joanfihu
Joanfihu
8 months
@cwolferesearch Note that you don’t need to use embeddings to do RAG, you can use an LLM to generate search queries against a traditional search engine like ElasticSearch + BM25. In fact, you could even use SQL and the LIKE statement for small datasets. This is easier to implement and doesn't
3
0
8
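A minimal sketch of the SQL `LIKE` variant for small datasets, using the standard `sqlite3` module. It assumes the search keywords come from a separate LLM call, which is left out here, and the table layout and function names are hypothetical.

```python
import sqlite3


def build_index(docs):
    """Load documents into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany("INSERT INTO docs (body) VALUES (?)", [(d,) for d in docs])
    return conn


def retrieve(conn, keywords, limit=3):
    """Return up to `limit` documents containing any of the keywords."""
    if not keywords:
        return []
    clause = " OR ".join("body LIKE ?" for _ in keywords)
    params = [f"%{k}%" for k in keywords] + [limit]
    rows = conn.execute(
        f"SELECT body FROM docs WHERE {clause} ORDER BY id LIMIT ?", params
    )
    return [row[0] for row in rows]


def build_prompt(question, context_docs):
    """Assemble the retrieved documents and the question into one prompt."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The keywords are bound as parameters rather than formatted into the SQL string, so LLM-generated text cannot alter the query.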
@joanfihu
Joanfihu
9 months
@jobergum RAG without a vector index is easier to implement. I’m not sure who decided to use a vector index as the standard way to do RAG?
4
1
8
@joanfihu
Joanfihu
8 months
@AndrewYNg I want to learn how to make AI agents work reliably :)
0
0
8
@joanfihu
Joanfihu
7 months
@Carnage4Life AI is an assistive technology due to its probabilistic nature. Full self driving is not assistive, it replaces the driver. It’s more about leveraging expectations, there are plenty of utilities for gen. AI. Startups don’t raise money without obnoxious claims though….
0
0
7
@joanfihu
Joanfihu
2 years
This step reveals how you need to prepare the data and hints at model architectures. This requires deep thinking. Assume you know nothing about the world. When it comes to architectures, go from simple to SOTA architectures. Apply @AndrewYNg logic for development.
Tweet media one
1
0
8
@joanfihu
Joanfihu
2 years
Refactor. Code is probably a mess at this point 😂 Unit tests (UT) help with refactoring. UT on eval. metrics can be used as a quality gate. At this point, the pipe can be automated and run periodically. If there is a regular data stream, the model automatically gets better.
1
0
8
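A rough sketch of a unit test on an evaluation metric used as a quality gate. Accuracy and the 0.9 threshold are assumptions picked for illustration, and `quality_gate` is a hypothetical helper; any metric and threshold agreed for the project would slot in the same way.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def quality_gate(y_true, y_pred, min_accuracy=0.9):
    """Raise AssertionError if accuracy falls below the (hypothetical) threshold."""
    score = accuracy(y_true, y_pred)
    if score < min_accuracy:
        raise AssertionError(f"accuracy {score:.3f} below gate {min_accuracy:.3f}")
    return score
```

Run on a fixed held-out set after every retrain, a failing gate stops the automated pipeline before a worse model is deployed.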
@joanfihu
Joanfihu
1 year
@greglinden A small and committed team can beat any large corporation. Startups are the best setup for product innovation.
1
0
8
@joanfihu
Joanfihu
25 days
@simonw @aiDotEngineer A big thing that kills the AI buzz is latency, people are used to ms and now they have to wait seconds for a response. I think Google did a study on bounce rate and response times.
2
0
8
@joanfihu
Joanfihu
8 years
@NandoDF you can find some data here:
0
6
8
@joanfihu
Joanfihu
1 year
@levelsio As a Catalan, I'm sorry you experienced it. Some Catalans can be very patriotic. The reason for being this way is that there are and have been many attempts to forbid Catalan from being spoken so some people refuse to speak any other language.
9
1
6
@joanfihu
Joanfihu
8 months
@sirbayes Data augmentation?
1
0
0
@joanfihu
Joanfihu
2 years
@dennishegstad Profit margin is 64% 👀 but market cap is still 10x. Probably still bloated. Amazon is 7% margin and 2x market cap.
Tweet media one
4
0
6
@joanfihu
Joanfihu
3 years
@elonmusk Lord of the ducks - The fellowship of the park.
2
0
6
@joanfihu
Joanfihu
1 year
@svpino Bright Data passes the responsibility to the customer as you would expect… To bypass a WAF as you suggest and Bright Data enables is wrong. If a website owner has blocked an IP or User-Agent, there is a reason that must be respected.
2
0
7
@joanfihu
Joanfihu
10 months
@thesephist You don’t need to chunk… for most real world applications, documents are already chunks. You will rarely find documents with more than 350 pages (128K context).
2
0
7
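The arithmetic behind the 350 pages / 128K figure, as a quick sketch. The only inputs are the two numbers from the tweet; the implied tokens-per-page value is derived from them and is not a measured constant.

```python
def tokens_per_page(context_tokens=128_000, pages=350):
    """Tokens per page implied by fitting `pages` pages into `context_tokens`."""
    return context_tokens / pages


def fits_in_context(pages, per_page, context_tokens=128_000):
    """Check whether a document of `pages` pages fits the context window."""
    return pages * per_page <= context_tokens


print(round(tokens_per_page()))  # about 366 tokens per page
```

Denser pages (more tokens each) lower the page count that fits, so the 350-page figure is best read as an order-of-magnitude estimate.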
@joanfihu
Joanfihu
9 years
#Swimathon15 done, 2,5Km in 00:49.33 and achieved the fundraising goal. Thanks @swimathon and @mariecurieuk 👌 http://t.co/YqPTEx5QzD
Tweet media one
2
2
7
@joanfihu
Joanfihu
11 months
@AnthropicAI Excellent work! Do you think AI assistants are sycophantic by design because end-users don’t want to be “roasted” by a machine?
0
0
6
@joanfihu
Joanfihu
7 months
@simonw You can prompt the LLM to generate function calls as HTML, most LLMs are good at it given in-context examples: <function name="$function_name"> <param name="$param_name"> <value>$value</value> </param> </function> You’re not vendor locked with this.
1
0
4
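A small sketch of the receiving side: parsing that markup out of the model's reply with the standard `xml.etree.ElementTree` module. It assumes the reply contains exactly one well-formed `<function>` element with straight quotes; `parse_function_call` and the example function name are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_function_call(markup):
    """Parse <function name=...><param name=...><value>..</value></param></function>.

    Returns (function_name, {param_name: value}) with values kept as strings.
    """
    root = ET.fromstring(markup)
    if root.tag != "function":
        raise ValueError("expected a <function> element")
    args = {}
    for param in root.findall("param"):
        value = param.find("value")
        args[param.get("name")] = (value.text or "") if value is not None else ""
    return root.get("name"), args


reply = (
    '<function name="get_weather">'
    '<param name="city"><value>Edinburgh</value></param>'
    "</function>"
)
name, args = parse_function_call(reply)
```

The parsed name can then be looked up in a dictionary of allowed functions, which keeps the dispatch independent of any one vendor's function-calling API.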
@joanfihu
Joanfihu
2 years
If APIs + heuristics don’t work, train your own model. A trick is to start manually labelling the data. Whilst doing that, identify your brain’s logic and what data/features the brain is paying attention to. This is non-trivial as most brain processes are unconscious.
1
0
6
@joanfihu
Joanfihu
5 years
@estarianne @meganromer Billionaires, more often than not, are self-made through very hard work that nobody else is willing to undertake. The majority of them invest back in society through foundations. Bill & Melinda Gates, Messi foundation, etc. They don't have to but they do. That says a lot...
7
0
6
@joanfihu
Joanfihu
19 days
@_philschmid @JinaAI_ Tip: if you ask ChatGPT to translate it, write unit tests and execute them. It’ll improve the code itself.
0
0
6
@joanfihu
Joanfihu
6 years
@biggyAC "Don't take my word for it" 😏
Tweet media one
0
0
6
@joanfihu
Joanfihu
5 years
@dmimno @chrmanning It would be easier if teachers learned students' current pop culture...
1
0
6
@joanfihu
Joanfihu
6 years
@drewsonix @MsMarvelous123 @KingMo786 People at the Ikea shop were laughing at it...
0
0
5
@joanfihu
Joanfihu
2 years
If the data stream changes significantly, quality gates flag when model architecture and/or data engineering needs to be changed. Manual QA of false negatives/positives is a good source of potential improvements. Please share if you found this useful. Thanks ☺️
0
0
6
@joanfihu
Joanfihu
5 years
@cleantechnica @elonmusk Jesus! I thought the model 3 was meant to be for the masses. This is a luxury price tag! I will have to continue using my mountain bike to get to places 😭😭😭
0
0
6
@joanfihu
Joanfihu
2 years
@patloeber As much as I don't like JS and also as a full stack engineer, being able to use the same language front and back is a nice perk.
0
0
6
@joanfihu
Joanfihu
2 years
@ID_AA_Carmack You feel powerless when someone with more authority makes decisions in an area where you’re undoubtedly the subject expert. But, you now have the authority and freedom to make decisions at Keen. Best of luck John.
0
0
6