Georges Harik Profile
Georges Harik

@gharik

3,886
Followers
3,096
Following
1
Media
274
Statuses

early google employee. worked on ai, gmail. like to invest and think about ai.

Joined May 2007
@gharik
Georges Harik
4 months
tesla fsd 12.3 is quite good
9
10
210
@gharik
Georges Harik
1 year
I’ve been training LMs and wanted to contribute to an open source LM. @vpj and I are releasing a 9 billion parameter LM with open licenses to be useful in a variety of settings. It’s been trained on 70 billion tokens and we will release checkpoints every 20b or so tokens.
@labmlai
labml.ai
1 year
We are open sourcing GeoV-9b, a 9 billion parameter causal language model designed and trained by @gharik 🖥 Code (apache 2.0): 📀 Model weights (bigscience-openrail-m): 📗 Google Colab notebook: 🧶👇
7
120
479
6
32
162
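For readers who want to try the released checkpoints, here is a minimal sketch of loading GeoV-9b with Hugging Face transformers. The hub id `GeoV/GeoV-9b` and the `trust_remote_code` requirement are assumptions based on the release, not details confirmed in the tweet itself.

```python
# Hedged sketch: load a GeoV-9b checkpoint and sample a continuation.
# The hub id "GeoV/GeoV-9b" and the remote-code requirement are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("GeoV/GeoV-9b")
model = AutoModelForCausalLM.from_pretrained("GeoV/GeoV-9b", trust_remote_code=True)

inputs = tokenizer("The release of open language models", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```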
@gharik
Georges Harik
1 year
Updated the GeoV-9b model to 98 billion tokens trained.
2
1
18
@gharik
Georges Harik
7 years
The first consequence of ai will be a rapid loss of jobs requiring motion, vision and simple speech recognition.
3
6
17
@gharik
Georges Harik
1 year
This Web LLM is pretty impressive. Got it to work on my MacBook M2 with Chrome Canary. Good quality and great speed using the GPU.
3
1
16
@gharik
Georges Harik
3 years
the gavel of time is swift
3
1
13
@gharik
Georges Harik
7 months
Bard is better now with Gemini Pro. Nice work.
1
0
11
@gharik
Georges Harik
9 months
This is a great team working on inference and other infrastructure to make it easier to launch intelligent applications and web sites.
@DeepInfra
DeepInfra
9 months
We just closed $8M seed round from A Capital Ventures, @felicis , @gharik , @svangel and others to scale our inference platform and continue to provide simple, low cost, production API to the top open AI models.
2
5
22
0
0
15
@gharik
Georges Harik
2 years
Happy 18th birthday @gmail @paultoo sanjeev and the whole team!
1
0
13
@gharik
Georges Harik
1 year
another open instruction dataset. Will start instruction tuning GeoV with this and the databricks and anthropic data soon. access to open data is awesome and these are great projects.
@vagabondjack
Mike Conover
1 year
The Open Assistant chat corpus just dropped, 100k-scale chat/instruction dataset from thousands of participants. The era of open data is upon us. High quality metadata, incl. toxicity scores, attached to each record.
13
134
651
0
1
12
@gharik
Georges Harik
2 years
Athelas is a great place to consider working at. Their company will move healthcare from reactive to proactive, from expensive to affordable and from a hospital to your home. Their vision is expansive, it started with using Vision ML, and will go much further.
@tanay_tandon
Tanay Tandon
2 years
Today, @dpcbod and I are pumped to announce the recent @athelas fundraise: $132mm across two consecutive rounds to build digital tools for healthcare providers. Honored to be partnered with @htaneja @Alfred_Lin @arjunsethi @garrytan @gharik @jhong and other incredible folks.
11
23
203
0
1
12
@gharik
Georges Harik
1 year
I should mention I have some initial indication that the RoPER technique is good for language modeling. On a couple of 3b parameter runs, up to 20000 steps of batch size 256, RoPER shows an advantage over standard RoPE encodings for attention, in terms of log likelihood.
0
1
12
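For context, RoPER is a variant of rotary position embeddings (RoPE). The RoPER variant itself isn't reproduced here; below is only a minimal sketch of the standard RoPE baseline mentioned above, in the common rotate-half formulation.

```python
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply standard rotary position embeddings to x of shape (seq, dim).

    Each feature pair is rotated by an angle that grows linearly with
    position, so attention dot products become relative-position aware.
    """
    seq, dim = x.shape
    half = dim // 2
    freqs = base ** (-torch.arange(half, dtype=x.dtype) / half)   # (half,)
    angles = torch.arange(seq, dtype=x.dtype)[:, None] * freqs    # (seq, half)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, :half], x[:, half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

# Applied to queries and keys before attention; RoPER modifies this scheme
# to carry relative distances (not shown here).
```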
@gharik
Georges Harik
1 year
this seems pretty nice and a good license.
@vitaliychiley
Vitaliy Chiley
1 year
Our team at @MosaicML has been working on releasing something special: We're proud to announce that we are OPEN SOURCING a 7B LLM trained to 1T tokens The MPT model outperforms ALL other open source models! Code: Blog: 🧵
27
221
1K
0
1
11
@gharik
Georges Harik
1 year
Just got around to reading this and it seems like a good way to get better answers, using more calls to an LLM to simulate multi-agent and multi-round debate in order to reach consensus.
@ShuangL13799063
Shuang Li
1 year
Check our latest paper, "Improving Factuality and Reasoning in Language Models through Multiagent Debate" .
0
1
18
0
1
8
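A rough sketch of the multi-agent, multi-round debate loop as understood from the tweet above, not the paper's exact procedure. `ask_llm` is a hypothetical stand-in for whatever completion API you use, and the prompts are illustrative.

```python
# Hedged sketch of multi-agent, multi-round debate toward consensus.
def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in: plug in your own LLM client here.
    raise NotImplementedError

def debate(question: str, n_agents: int = 3, n_rounds: int = 2) -> list[str]:
    # Round 0: each agent answers independently.
    answers = [ask_llm(f"Q: {question}\nAnswer concisely.") for _ in range(n_agents)]
    # Later rounds: each agent revises its answer given the others' answers.
    for _ in range(n_rounds):
        answers = [
            ask_llm(
                f"Q: {question}\n"
                f"Other agents said: {[a for j, a in enumerate(answers) if j != i]}\n"
                f"Your previous answer: {answers[i]}\n"
                "Revise your answer, taking the other agents into account."
            )
            for i in range(n_agents)
        ]
    return answers  # ideally (close to) consensus by the final round
```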
@gharik
Georges Harik
1 year
I’ve been looking for more data to train on. This seems like it might help a lot.
@togethercompute
Together AI
1 year
Announcing RedPajama — a project to create leading, fully open-source large language models, beginning with the release of a 1.2 trillion token dataset that follows the LLaMA recipe, available today! More in 🧵 …
38
408
2K
1
1
10
@gharik
Georges Harik
10 months
Looking forward to this great team improving healthcare for everyone!
@tanay_tandon
Tanay Tandon
10 months
Announcing the merger of Athelas and Commure, along with fresh capital in a fundraise. I’m excited to be taking over the combined $6b company as CEO, @dpcbod as COO, and @dhruvp as CTO. Working with @htaneja in this next phase will be exhilarating
31
26
309
0
0
8
@gharik
Georges Harik
4 years
Every week we don’t act to halt the spread of the virus means two times as many deaths.
1
3
10
@gharik
Georges Harik
4 months
One of the reasons I'm interested in this area is to allow LLMs, post training, to use variable and possibly highly increased compute in producing answers where the answers are super important to get right.
1
0
11
@gharik
Georges Harik
1 year
The TOS of bard and open ai say you can't build ML (bard) or foundation models (open ai) using their services. Is that for distillation only or does that include writing training code? What would someone who wants to use code completion / synthesis to train models use?
1
0
8
@gharik
Georges Harik
1 year
pretty amazing execution
@character_ai
Character.AI
1 year
Character fam, we're climbing the charts!! 📈 #CharacterAI is #3 in the Top Free Entertainment Apps on the @AppStore !! Thank you to our amazing community for the continuous support!❤️
120
88
1K
0
0
6
@gharik
Georges Harik
4 months
New results showing Quiet-STaR also helps CoT output - and an open source training script to try this on your own models :)
@ericzelikman
Eric Zelikman
4 months
A couple exciting updates! First, we quantitatively evaluated the improvement from combining Quiet-STaR with chain-of-thought (i.e. letting the model think before each CoT token). We found it improves zero-shot CoT accuracy on GSM8K by over 7%!
9
22
155
0
1
9
@gharik
Georges Harik
2 years
free ai models is the next open source
0
0
9
@gharik
Georges Harik
3 years
Happy Thanksgiving!
0
0
9
@gharik
Georges Harik
4 months
Right now each 1000-token answer from a frontier LLM is probably produced using no more than around 0.1c to 0.2c of allocated capital spend plus power. But for situations where I really care about an answer, this cannot be increased without a large increase in capital and spend to …
1
0
8
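A back-of-envelope check on the 0.1c–0.2c figure. Every number below is an illustrative assumption (hardware price, amortization window, served throughput), not a measurement:

```python
# Illustrative arithmetic only; all inputs are assumptions.
gpu_capex = 30_000                  # $ per accelerator (assumed)
lifetime_s = 3 * 365 * 24 * 3600    # ~3-year amortization window (assumed)
tokens_per_s = 200                  # per-accelerator serving throughput (assumed)

capex_per_1k_tokens = gpu_capex / (lifetime_s * tokens_per_s) * 1000
print(f"${capex_per_1k_tokens:.4f} per 1000 tokens")  # ≈ $0.0016, i.e. ~0.16c
```

Power would add on top, but with these assumptions the capital share alone lands in the quoted 0.1c–0.2c range.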
@gharik
Georges Harik
1 year
this seems pretty awesome
@ylecun
Yann LeCun
1 year
This is huge: Llama-v2 is open source, with a license that authorizes commercial use! This is going to change the landscape of the LLM market. Llama-v2 is available on Microsoft Azure and will be available on AWS, Hugging Face and other providers. Pretrained and fine-tuned …
423
4K
16K
0
1
6
@gharik
Georges Harik
1 year
Talking to friends Matt Smith and Craig Silverstein, we were thinking maybe one way to align AIs better is to come up with lots of AI positive literature and videos, since their self image and personality as AIs largely developed based on these characterizations.
3
0
8
@gharik
Georges Harik
1 year
This seems pretty interesting for risk reduction for Alzheimer's
@PGeldsetzer1
Pascal Geldsetzer
1 year
Biggest thing to ever come out of my little group. Pls help spread this finding! We found clean, CAUSAL evidence that the shingles vaccine prevents a good chunk of dementia cases. So, could a virus cause Alzheimer’s->YES! Hear me out & see preprint: 🧵1/
335
4K
13K
0
0
5
@gharik
Georges Harik
4 years
Great progress from Deepmind on protein folding.
@demishassabis
Demis Hassabis
4 years
Thrilled to announce our first major breakthrough in applying AI to a grand challenge in science. #AlphaFold has been validated as a solution to the ‘protein folding problem’ & we hope it will have a big impact on disease understanding and drug discovery:
162
2K
8K
1
0
8
@gharik
Georges Harik
4 months
Thought generation, I believe, may be an additional tool to (or evolution of) CoT prompting and more complex techniques to elicit more correct answers, and may provide us a way of getting much better answers, but at the cost of higher compute. I think ultimately these techniques will …
1
0
9
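For reference, the simplest form of spending extra inference compute on reasoning is zero-shot chain-of-thought prompting with the classic "Let's think step by step" trigger; the question below is a made-up example:

```python
# Zero-shot CoT sketch: spend extra output tokens on reasoning before the answer.
question = "A train travels 60 km in 45 minutes. What is its speed in km/h?"
prompt = f"Q: {question}\nA: Let's think step by step."
print(prompt)
# Expected completion shape: reasoning first ("45 min = 0.75 h; 60 / 0.75 = 80"),
# then the final answer ("The speed is 80 km/h."). Thought generation, as in
# Quiet-STaR above, pushes this further by training the model to produce
# useful intermediate thoughts rather than relying on prompting alone.
```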
@gharik
Georges Harik
2 months
Nice work on releasing SGE to more people @GoogleAI!
0
0
10
@gharik
Georges Harik
1 year
open ai is going to draw the web into chatgpt.
1
0
7
@gharik
Georges Harik
4 years
This was a pretty interesting read.
0
0
6
@gharik
Georges Harik
4 months
@MrGoldBro I'm not sure really, just seems to see and react to other vehicles, people and the road pretty well, even on surface streets.
0
0
7
@gharik
Georges Harik
4 years
Because we’re a connected country, the only thing that will work is coordinated social distancing by everyone simultaneously. We can work towards that, and to protect those who would be negatively impacted by such a move, or see half the country infected and millions dead.
0
0
7
@gharik
Georges Harik
7 years
We need some solution for this. It probably involves lots of education in a scalable way, job training, job placement and mobility.
1
2
6
@gharik
Georges Harik
7 months
This is a great analysis of what you need to perform associative recall with various NN architectures.
@EyubogluSabri
Sabri Eyuboglu
7 months
Curious whether sub-quadratic LMs like RWKV and Hyena will replace Transformers? We find that Transformers are still much better at associative recall (AR): a simple task known to be essential for in-context learning.
Tweet media one
4
38
145
1
0
6
@gharik
Georges Harik
1 year
this seems pretty cool. includes an open instruction tuning dataset.
@alighodsi
Ali Ghodsi
1 year
Free Dolly! Introducing the first *commercially viable*, open source, instruction-following LLM. Dolly 2.0 is available for commercial applications without having to pay for API access or sharing data with 3rd parties.
55
448
2K
0
0
6
@gharik
Georges Harik
1 year
link here
0
0
6
@gharik
Georges Harik
2 years
using a phone is destroying my spine, but it's the best way to access the internet for now. any alternatives? wearing 2lbs or really any weight on my head seems like not a great replacement.
4
1
5
@gharik
Georges Harik
1 year
@stephenbalaban I'm using Lambda to train. I'm also an investor and advisor.
1
0
5
@gharik
Georges Harik
10 months
after playing with it some on deepinfra (plug), this is quite an amazing model.
@GuillaumeLample
Guillaume Lample @ ICLR 2024
10 months
Mistral 7B is out. It outperforms Llama 2 13B on every benchmark we tried. It is also superior to LLaMA 1 34B in code, math, and reasoning, and is released under the Apache 2.0 licence.
52
481
3K
0
0
5
@gharik
Georges Harik
1 year
Simon is assembling a great board for an awesome company.
@jhong
james hong
1 year
Nimble is a super cool company, not surprised by this! The AI stuff is going to get even more insane when we start seeing the robotics really happen
0
0
7
0
0
5
@gharik
Georges Harik
2 years
chatgpt prompt: you were just appointed speaker of the house …
1
0
4
@gharik
Georges Harik
1 year
So if cash gets worse to hold, and there's a rush to treasuries, yields go down, and the most valuable things to own become equities, medium and long term bonds and ... SVB?
1
0
5
@gharik
Georges Harik
1 year
this is a great addition to open models
@AlphaSignalAI
Lior⚡
1 year
BREAKING: StabilityAI just released their own LLM, called StableLM. "The Alpha version of the model is available in 3 billion and 7 billion parameters, with 15 billion to 65 billion parameter models to follow." The models are available on GitHub! Repo:
23
189
934
0
1
4
@gharik
Georges Harik
6 months
Awesome progress from Google/Deepmind
@lmsysorg
lmsys.org
6 months
🔥Breaking News from Arena Google's Bard has just made a stunning leap, surpassing GPT-4 to the SECOND SPOT on the leaderboard! Big congrats to @Google for the remarkable achievement! The race is heating up like never before! Super excited to see what's next for Bard + Gemini
154
626
3K
2
0
4
@gharik
Georges Harik
2 months
2
0
4
@gharik
Georges Harik
7 years
red moon rising
3
0
4
@gharik
Georges Harik
2 years
A nice Santa message for the kids for the holidays!
@redsh
Francesco Rossi
2 years
Tired of the endless lines at the mall for meeting Santa Claus? I have the coolest app to make videos for your little ones. It’s called “BeSanta” and it runs in real time on your phone, changes the voice too
0
2
13
0
0
4
@gharik
Georges Harik
4 months
@WilliamWeishuh2 @ericzelikman @EchoShao8899 @vpj @nickhaber @noahdgoodman hopefully by reasoning about the narrative and writing before emitting words
1
0
4
@gharik
Georges Harik
1 year
This seems really great for instruction tuning a model
@ShayneRedford
Shayne Longpre @ICML
1 year
✨New Paper✨What’s the best completely public competitor to #ChatGPT ? Flan-T5 beats all public models we tested: Flan-T5 3B ▶️ T0++ 3B ▶️ OPT-IML 175B ▶️ GLM-130B ▶️ Flan 2021 3B ▶️ NIv2 3B We release the @GoogleAI 🌟Flan Collection🌟data + methods for Instruction Tuning! 1/
24
250
1K
1
0
3
@gharik
Georges Harik
2 years
lambda got more a100 40GB cards just a few days ago and they are still available
0
1
4
@gharik
Georges Harik
4 years
@bling0 @blingcapital Wow that's awesome Ben!
1
0
3
@gharik
Georges Harik
5 years
It seems that the use of antibiotics and antifungals on livestock and in farming outweighs their use on humans. Maybe not the best idea, as it helps drive the development of resistant bacteria and fungi.
1
0
2
@gharik
Georges Harik
8 months
Great set of announcements and releases by Google today!
@JeffDean
Jeff Dean (@🏡)
8 months
I’m very excited to share our work on Gemini today! Gemini is a family of multimodal models that demonstrate really strong capabilities across the image, audio, video, and text domains. Our most-capable model, Gemini Ultra, advances the state of the art in 30 of 32 benchmarks, …
276
3K
13K
0
0
3
@gharik
Georges Harik
4 months
@vpj I'll check tomorrow, and try to get 12.3.2 if I don't already have it
0
0
3
@gharik
Georges Harik
1 year
@jhong @stephenbalaban I haven’t seen anything grow as fast as lambda cloud.
0
0
3
@gharik
Georges Harik
2 years
@simonkalouche @jhong @Jason yeah! maybe manouche zaatar first?
0
0
3
@gharik
Georges Harik
1 year
I think this has real promise. It's using 3.5G of memory, while you can get pretty high memory configurations on M2s, as high as 96G, so I think you can run much bigger models than this on a laptop and likely will soon.
0
0
3
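The 3.5G figure is consistent with 4-bit weights for a roughly 7B-parameter model; a rough footprint estimate follows, where the quantization width is an assumption and real runtimes add KV cache and other overhead:

```python
# Rough weight-memory estimate; 4-bit quantization is an assumption.
def weight_gb(params_billion: float, bits: int = 4) -> float:
    return params_billion * 1e9 * bits / 8 / 1e9

for p in (7, 13, 30, 70):
    print(f"{p}B params @ 4-bit ≈ {weight_gb(p):.1f} GB")
# 7B ≈ 3.5 GB matches the footprint above; even 70B ≈ 35 GB fits in a 96 GB M2.
```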
@gharik
Georges Harik
1 year
@bindureddy pretty cool and nice post!
0
0
2
@gharik
Georges Harik
7 years
The first negative consequence that is. There will be lots of positive consequences. Cheaper housing, goods, services, better health care.
0
2
3
@gharik
Georges Harik
1 year
@A__Diack @labmlai To get output that follows your instructions you may want to use alpaca training or other instruction following methods. We released the weights so this and many other post-processing methods are possible, using finetuning, RLHF, or other methods.
1
0
3
@gharik
Georges Harik
1 year
In particular the data preparation on the datasets saves a lot of time.
0
0
3
@gharik
Georges Harik
1 year
@lxuechen good idea, we'll try it at some point
0
0
3
@gharik
Georges Harik
9 months
This seems pretty interesting, especially the causal modeling component.
@realDanFu
Dan Fu
9 months
Excited about models that are sub-quadratic in sequence length and model dimension? Our Monarch Mixer paper is now on arXiv -- and super excited to present it as an oral at #NeurIPS2023! Let's dive into what's new with the paper and the new goodies from this release: Monarch …
4
60
292
0
0
3
@gharik
Georges Harik
3 years
Is there someone building a usable watch (android a plus) that continuously detects co2 levels? Seems like awareness would be useful to trigger good air-quality changes in indoor spaces over the long term.
0
0
3
@gharik
Georges Harik
1 year
@dsivakumar I just mean it's autoregressively trained to predict the next token.
0
0
3
@gharik
Georges Harik
1 year
@jackclarkSF another thing to make it slightly more comparable is to use a fraction of the HH prompt that fits in OPT's context, since the instruction-tuned models have kind of ingested 'being helpful' from prompts already.
0
0
2
@gharik
Georges Harik
1 year
@olcan @Francis_YAO_ or by sparsely activating parameters
0
0
2
@gharik
Georges Harik
1 year
@olcan @rasbt @EMostaque @OpenAI My guess is they don't have their H100 SXMs delivered yet, so there's no point training a larger model on the same hardware as GPT-4; it wouldn't terminate in time compared to waiting for probably a lot more H100s that are also each individually faster.
1
0
2
@gharik
Georges Harik
9 months
@ylecun @joanfihu I think one thing that would be required would be a way to coordinate funding large training runs between multiple, possibly quite a few, participants
0
0
2
@gharik
Georges Harik
1 year
@A__Diack @labmlai Good observation, glad it worked on Colab! I'm not sure at this stage of training, and without instruction following, that it will do quite what you ask it to do. As for accuracy, even larger models have issues with that.
1
1
2
@gharik
Georges Harik
1 year
@ShayneRedford @GoogleAI The dataset preparation script is nice. But does a lot of the training turn into either predicting a sentence whole, or a suffix based on a prefix? If so, maybe just a big file of what you used in FLAN, with json markings for prefix/suffix, would get adopted more easily.
3
0
2
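A sketch of the prefix/suffix file layout suggested in the reply above; the field names and example records are invented for illustration:

```python
# Hypothetical JSONL layout for FLAN-style prefix/suffix pairs.
import json

examples = [
    {"prefix": "Translate to French: cheese", "suffix": "fromage"},
    {"prefix": "Q: What is 2 + 2?\nA:", "suffix": "4"},
]
with open("flan_prefix_suffix.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```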
@gharik
Georges Harik
5 months
@AravSrinivas or possibly heating
0
0
2
@gharik
Georges Harik
1 year
@jackclarkSF seems reasonable but higher temp like 1 might get you less repetitive output.
2
0
2
@gharik
Georges Harik
1 year
@perplexity_ai this is pretty cool
1
0
2
@gharik
Georges Harik
4 years
0
1
2
@gharik
Georges Harik
1 year
@arthurmensch @vpj yes the plan is to get to around 300 billion tokens or so
0
0
2
@gharik
Georges Harik
4 months
@ptiberry @LePoint sure go ahead :)
0
0
1
@gharik
Georges Harik
1 year
@laion_ai this is a good idea
0
0
2
@gharik
Georges Harik
6 months
@olcan but did you get the vision pro?
1
0
1
@gharik
Georges Harik
1 year
hey @elon please fix deep links on ios. when i see a tweet on safari and select open app it doesn't take me to the tweet, so that button isn't as useful as it could be.
1
0
2
@gharik
Georges Harik
7 years
My guess is this includes transportation, construction, manufacturing. Independently many retail jobs are disappearing.
0
2
2
@gharik
Georges Harik
1 year
@arvinds Currently our plan is to get to around 300 billion tokens for this run. We will assess and correct if need be.
0
0
1
@gharik
Georges Harik
2 months
@olcan let me know how you like it
1
0
1
@gharik
Georges Harik
2 years
@sama I want you to train it with a reward model, and STaR on GSM8K so it can solve math problems at a middle school level.
0
0
1
@gharik
Georges Harik
2 years
@jhong @gmail @paultoo I think it's got a good shot of making it ;)
0
0
1
@gharik
Georges Harik
1 year
@OfficialLoganK can you all build a solution for indexing and querying one's own documents or files?
1
0
1
@gharik
Georges Harik
6 months
@olcan i tried to lie down to reduce neck strain but the immersive experiences don't work lying down.
0
0
1
@gharik
Georges Harik
1 year
@vagabondjack Not sure it's that goofy, the instructions are shorter than the output so might be easier to learn actually, and produce data for. Also the training begins to look like UL2 maybe, where you're training to produce infills.
1
0
1
@gharik
Georges Harik
2 months
@olcan congratulations!
0
0
1
@gharik
Georges Harik
2 months
@olcan seems cool
1
0
1
@gharik
Georges Harik
1 year
@olcan @elon lol I thought it autocompleted correctly @elonmusk
0
0
1
@gharik
Georges Harik
2 years
@parindam @JeffDean @quocleix I thought it was you “promoting” chain of thought prompting :)
1
0
1
@gharik
Georges Harik
1 year
@vgoklani_ai @labmlai @_akhaliq @AiEleuther Will try to do those things; may take a bit.
0
0
1
@gharik
Georges Harik
2 years
@David_desJ yeah that works well when I'm at a desk and monitor but not for large parts of the day when I'm not
1
0
1
@gharik
Georges Harik
1 year
@vagabondjack your open model releases are great btw thanks!
1
0
1