🇺🇦 Dzmitry Bahdanau Profile
🇺🇦 Dzmitry Bahdanau

@DBahdanau

5,945
Followers
36
Following
9
Media
356
Statuses

Research Scientist & Research Lead at ServiceNow Research. Adjunct Prof @ McGill. Member of Mila, Quebec AI Institute. Stream of consciousness is my own.

Joined August 2017
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
I spent 1000s of hours on competitive programming (proof-link: ). This makes me qualified to comment on #AlphaCode by @DeepMind . The result is nice, the benchmark will be useful, some ideas are novel. But human level is still light years away. 1/n
26
217
1K
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
4 years
I'm excited to share with you that yesterday I became a PhD! Coming next: doing some good research science at fabulous @element_ai 👨‍🔬
32
14
648
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
1 year
RIP AI Research. Long Live AI Rat Race.
6
31
434
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
While the whole of Twitter is going nuts about ChatGPT, let me just say that the HELM paper by @StanfordCRFM and @StanfordHAI is an incredible masterpiece of scholarship. Make sure all your students read it and see what good research actually looks like.
5
60
345
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
You shut down your nuclear plants - you have to buy Russian gas. You don't want AI for killer drones - prepare to hide from Russian ones. Being overly virtuous and progressive in the 21st century is suicide. Ukraine is a sober wake-up call. AI for Western armies? Hell yes!!
21
31
284
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
To sum up: AlphaCode is a great contribution, and AI for coding is a very promising direction with lots of great applications ahead. But this is not AlphaGo in terms of beating humans and not AlphaFold in terms of revolutionizing an entire field of science. We've got work to do.
4
11
226
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
reinforcement learning is data generation
10
7
197
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
one great thing about writing a PhD thesis is learning just how vast the history of AI is and just how little related work most papers cite...
6
8
197
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
Just received an email from AAAI organizers, saying that the reviewer load will be 5-10 (10!!!) papers, that all requests to lower the load were ignored, and that "Unless you are able to take on a full load, you should withdraw from the PC". Strikes me as not constructive.
12
7
192
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
Are you curious about systematic generalization? Do you like small, carefully controlled studies with intriguing conclusions? Check out our latest paper: . Code & data at . Work done by @MILAMontreal with the help of @Element_ai
1
52
189
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
3 years
I am excited to share that as an Adjunct Prof at @mcgillu and member of @Mila_Quebec , I am looking to take 1-2 fully-funded MSc or PhD students this Fall. How to apply: (read carefully!). For possible research topics, see the thread.
3
41
178
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
4 years
URGENT. Make noise about Belarus. Tell your friends in the media. Rubber bullets and flash grenades are used against people in the streets.
@xnuinside
Iuliia Volkova
4 years
Belarus this night #Belarus #Belaruselection people vs dictator
1
55
63
2
81
161
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
3 years
Do you find it hard to reason about the scale of compute required for training large LMs? I have just written a tutorial for you:
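A minimal Python sketch of the kind of estimate the tutorial is about, assuming the common "FLOPs ≈ 6 × parameters × tokens" rule of thumb; the helper names, the A100-class peak throughput, and the utilization figure are illustrative assumptions, not taken from the tutorial:

# Rough training-compute estimate; throughput and utilization are assumed values.
def training_flops(n_params: float, n_tokens: float) -> float:
    # Common rule of thumb: ~6 FLOPs per parameter per training token.
    return 6.0 * n_params * n_tokens

def training_days(n_params: float, n_tokens: float, n_gpus: int,
                  peak_flops_per_gpu: float = 312e12, utilization: float = 0.4) -> float:
    # Wall-clock days at the assumed sustained throughput.
    seconds = training_flops(n_params, n_tokens) / (n_gpus * peak_flops_per_gpu * utilization)
    return seconds / 86_400

# Example: a 7B-parameter model trained on 1T tokens with 256 A100s -> roughly two weeks.
print(f"{training_days(7e9, 1e12, 256):.1f} days")

Swapping in different model sizes, token counts, or GPU fleets gives a first-order feel for the wall-clock cost.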
2
25
154
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
4 years
Do you need to remove comments from the source code before uploading it to CMT for ICML? Try this: find . -type f -name "*.py" -print0 | xargs -0 sed -i '/^[[:blank:]]*#/d;s/#.*//' P.S.: kudos to stackoverflow as usual
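A hedged Python alternative using the standard-library tokenize module, which leaves '#' characters inside string literals intact (the sed expression above strips everything after any '#'); the strip_comments helper below is a sketch, not the tweet's recipe:

import io
import pathlib
import tokenize

def strip_comments(source: str) -> str:
    # Drop COMMENT tokens and reassemble the source; string literals are untouched.
    tokens = [t for t in tokenize.generate_tokens(io.StringIO(source).readline)
              if t.type != tokenize.COMMENT]
    return tokenize.untokenize(tokens)

# Rewrites files in place, like the sed -i one-liner above.
for path in pathlib.Path(".").rglob("*.py"):
    path.write_text(strip_comments(path.read_text()))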
3
14
151
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
6 years
can I really just code the most complicated part of my code in numpy without obfuscating the code with TF pyfuncs or new Theano ops? OMG pytorch, you made deep learning too easy!!
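A minimal sketch of the point being celebrated here: a plain numpy routine dropped into a differentiable PyTorch computation via a custom autograd.Function. The NumpyClamp example below is hypothetical and just illustrates the pattern:

import numpy as np
import torch

class NumpyClamp(torch.autograd.Function):
    # The forward pass runs in plain numpy; backward supplies the matching gradient.
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.from_numpy(np.clip(x.detach().cpu().numpy(), -1.0, 1.0))

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Gradient is 1 inside the clamp range, 0 outside.
        return grad_output * ((x > -1.0) & (x < 1.0)).to(grad_output.dtype)

x = torch.randn(5, requires_grad=True)
NumpyClamp.apply(x).sum().backward()
print(x.grad)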
2
26
133
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
I found it very important to learn the basics of LISP to start understanding the symbolic AI literature. It seems like for many decades this programming language structured the way people thought and communicated with each other.
8
8
124
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
9 months
I want to try something different this year. I am looking for driven MSc students / interns who want to work on impact-oriented applied LLM projects. Bring your positive impact idea. Tell me how working under my supervision can accelerate you. Details and context below. 🧵
4
20
120
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
Importantly, the vast majority of the programs that #AlphaCode generates are wrong (Figure 8). It is the filtering using example tests that allows #AlphaCode to actually solve something. Example tests are part of the input (App. F), yet most sampled programs can't solve them.
3
5
115
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
If you want to do research on instruction following and/or language grounding, consider using our BabyAI platform: 10^19 synthetic instructions, 19 levels of varying difficulty. Work done by @MILAMontreal with the help of @Element_AI .
1
23
111
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
Let me also dilute these critical remarks with a note of appreciation. AlphaCode uses a very cool "clustering" method to marginalize out differently-written but semantically equivalent programs. I think forms of this approach can become a code generation staple.
1
4
106
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
Using example tests is fair game for comp. programming and perhaps for some real-world backend development. But for much of real-world code (e.g. code that defines front-end behavior), crafting tests is not much easier than coding itself.
3
2
93
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
the perfect argument supporting my belief that large scale language modelling won't deliver robust language understanding
@tallinzen
Tal Linzen
5 years
So there's a Facebook model similar to BERT (). The paper has better experiments, e.g. this one varying the amount of data. I calculated that at this rate we'll need a corpus of 2.14e+29 tokens to get to human performance on MNLI. Get scraping!
6
66
241
4
19
91
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
7 years
Dear Twitter, can you recommend a linguistics textbook for people with DL background?
15
24
88
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
Sec. 6.1 makes a point that #AlphaCode does not exactly copy sequences from training data. That's a low bar for originality: change a variable name and this is no longer copying. It would be interesting to look at nearest neighbor solutions found using neural representations.
2
1
87
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
6 years
Idea: conferences should send small gifts (e.g. a cup) to good (not just best!) reviewers. E.g. those who write decent reviews and reply at least once to author feedback. Small symbolic incentives could go a long way in encouraging people to participate, IMHO.
5
4
86
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
The system ranks behind 54.3% of participants. Note that many participants are high-school or college students who are just honing their problem-solving skills. Most people reading this could easily train to outperform #AlphaCode , especially if time pressure is removed...
2
2
82
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
1 year
What a move, copy-left license! Things are heating up in the world of LLMs. Seriously though, congratulations to @MetaAI for great results and unwavering commitment to actually open AI!
@ylecun
Yann LeCun
1 year
LLaMA is a new *open-source*, high-performance large language model from Meta AI - FAIR. Meta is committed to open research and releases all the models to the research community under a GPL v3 license. - Paper: - Github:
91
434
2K
4
12
84
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
The paper emphasizes the creative aspects of competitive programming, but from my experience it does involve writing lots of boilerplate code. Many problems involve the deployment of standard algorithms: Levenshtein-style DP, DFS/BFS graph traversals, max-flow, and so on.
1
0
70
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
Limited time (e.g. 3 hours to solve 6 problems) is a key difficulty in comp. programming. The baseline human is very constrained in this model-vs-human comparison. For #AlphaCode the pretraining data, the fine-tuning data, the model size, the sampling - all was nearly maxed out.
3
1
68
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
11 months
there will be no superhuman AI, because we train AI on data and reward it with code that is written by humans, not superhumans. Not until we let bio-robots roam free, make randomized copies of themselves, and compete for survival.
37
2
65
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
Impressions of @naaclmeeting :
- live poster sessions are energizing and helpful! No comparable virtual alternative at the moment.
- live talks are boring. Let's just watch videos!
- sad to use keyword-based Underline paper search at a conference with 20+ fancy retrieval papers
2
1
61
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
3 years
An existential threat for compositional/systematic generalization research is that we select our models on the test set. The in-distribution perf. that would be best to use for model selection is at 99+%, so we select models based on the hold-out OOD data. How can we do better?
8
5
61
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
Researchers, you don't know it yet, but y'all want to take 2 days off and *really* learn to use git. Not just remember 2 basic commands, but understand how this beautiful piece of software works and how much it can help with reproducibility and collaboration.
1
5
58
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
1 year
@janleike do you think LLMs can ever get that good? what is your evidence? is there enough quality text to make them that smart? oh, I forgot you can't tell me, cause everything at OpenAI is a secret. Meanwhile, I can't help but note that restrictions on LLMs mean extra $$ for "Open"AI
3
0
49
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
6 years
They call deep learning a black box, often deservedly. But deep RL is many times more opaque. You change a hyperparameter of the optimizer, this affects your exploration, which in turn affects the training signal, which changes the optimization problem you are trying to solve!!!
1
7
55
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
@FelixHill84 @kchonyc No, it's not. Unless you are a famous Swiss researcher. The whole of deep learning is based on a few easy, cheap ideas. It is natural that they come to many people independently. And then it is just the execution that matters.
2
1
54
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
We present CLOSURE, a systematic generalization test for visual reasoning models trained on the CLEVR dataset. Come to the poster session at Visually Grounded Interaction and Language Workshop to learn more!
1
10
54
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
3 years
@ylecun Yann, don't be like certain Swiss researchers...
2
0
50
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
For me, the turning point was reading this article on `git` internals: . It was like reading a linear algebra textbook and all of a sudden understanding what this PyTorch thing actually does ;)
0
6
50
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
@yoavgo Two ultimate positive NLP applications: 1) Help advanced knowledge workers (think climate scientists or MDs writing metareviews) deal with the deluge of information 2) Personalized education with explanations that work for *you*. Both are not great for quickly making dough.
5
2
48
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
My colleagues received a rejection notification from ACL after the arXiv freeze had started for EMNLP. Now they once again can't publicly share their work with others. The effective publication date is thus shifted by 6 months. Working as intended??
3
5
47
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
4 years
How to align academic research on natural language interfaces with the needs of real human users has long been on the minds of @harm_devries and myself. But now, together with @chrmanning , we wrote a paper about it. Comments welcome!
@chrmanning
Christopher Manning
4 years
The need for open data & benchmarks in modern ML research has led to an outpouring of #NLProc data creation. But @harm_devries , @DBahdanau & I suggest the low ecological validity of most of this data undermines the resulting research. Comments welcome!
11
84
335
0
4
48
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
1 year
You can think various things about Meta and about energy-based models, but @ylecun 's position on the LLMs is very reasonable. Policy-makers have limited time and energy, and the public has a limited attention span. Making them think about hypothetical dangers is wasteful.
1
3
44
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
@drjwrae Good point, but at the current level of safety and controllability ChatGPT is only entertainment. Few real dialog applications would tolerate its unpredictable and creative behavior. People like their FSTs because they know what they do. We'll see in a few years, ofc.
1
2
45
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
FYI: our BabyAI paper has been updated to contain more accurate sample efficiency estimates.
0
9
42
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
Excited to be in Seattle at @naaclmeeting , so nice to be at a conference in-person after a 2.5 year break. Please feel free to DM if you'd like to meet or catch up!
1
3
39
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
10 months
fun to be at that delightful and lovely stage of life when you're exchanging baby pics with other fellow nerds, with whom you used to talk only about the relative advantages of neural architectures 👶
2
0
39
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
11 months
@deliprao the real issue is that cramming research on Human Language Technologies together with Computational Linguistics in one conference no longer works. The cultures are just incompatible. Basically, LLM research needs another publishing venue, one that respects empiricism and tolerates the rush
2
3
39
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
I have just written to my MP and asked that Canada stops buying any Russian oil and gasoline. Consider writing to your political representative. Demand the strongest possible response. #Ukraine #NoWar #RussiaUkraineWar
1
1
40
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
@NandoDF Hmm. In my experience, best research is often made almost impossible when you can't rerun the code. Research is not always about new ideas. It's often about rigorously testing existing ones. And rigorous testing is best done when you have the original code.
1
2
39
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
6 years
Wanna know what I worked on at DeepMind? Check this out!
@egrefen
Edward Grefenstette
6 years
Happy to share our new @DeepMindAI paper on AGILE, a method for training agents to follow language instructions by jointly learning a reward model from examples. No more template languages, or problems with hard/impossible to code reward functions!
2
51
210
0
7
38
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
1 year
The closest appointment slot for US visitor visa in Canada is August 2024 in Vancouver. Any ideas how international researchers in Canada can attend @icmlconf and @NeurIPSConf this year?
4
2
38
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
1 year
Of the many famous smart people I was privileged to meet, I found @geoffreyhinton to be the warmest and the kindest. It is heartwarming that he now joins the public AI discourse. It gives me hope.
0
0
38
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
So cool to see this happening. Back in my undergrad years in Belarus, I would sell my soul to the devil to attend such a summer school!
@EEMLcommunity
EEML
6 years
We are proud to announce the 2019 edition of EEML summer school, 1-6 July, Bucharest, Romania. Topics covered: DL, RL, computer vision, bayesian learning, medical imaging, and NLP. An amazing set of speakers confirmed so far! More info coming soon! Check !
0
72
157
0
2
37
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
1 year
Super proud of @BigCodeProject final deliverable - capable and fast StarCoder! Numbers don't lie, this model truly feels like a leap forward for small open code+lang models. It was humbling to see how much work of how many amazing people this took. CONGRATS!!!
@BigCodeProject
BigCode
1 year
Introducing: 💫StarCoder StarCoder is a 15B LLM for code with 8k context and trained only on permissive data in 80+ programming languages. It can be prompted to reach 40% pass@1 on HumanEval and act as a Tech Assistant. Try it here: Release thread 🧵
76
668
3K
0
5
36
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
I used to be proud that I started my career at Yandex. Now I am ashamed. contains nothing but propaganda. @yandex , where are Meduza and Novaya Gazeta on your website? Where is the video of a rocket hitting Kharkiv Freedom Square?
@yaroslavvb
Yaroslav Bulatov
2 years
Yandex is a key tool in shaping the alternative reality that allows Ukraine war to continue with popular support. Many people are associated with @yandex or @YandexAI and remain silent on the issue. Silence is complicity.
7
36
96
0
0
33
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
10 months
Human evaluation in AI is like particle accelerators in physics.
Difficult ✔️
Messy ✔️
Laborious ✔️
The ultimate and the only source of truth ✔️🧑‍🔬
1
3
33
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
3 years
I am very excited to share the research () & applied research () openings that we have at @element_ai , the research lab of @servicenow . See the thread to learn more. Also, this week I'm at ACL, so don't hesitate to reach out!
1
5
32
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
Are you excited about large language and code models? Do you like doing research? Do you like making GPUs go brrr? Come join my team as a Senior Research Developer!
2
5
32
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
Thrilled that @BigCodeProject is live! Come join an open effort led by @ServiceNowRSRCH and @huggingface to help us train a big code model on an open dataset, with open preprocessing pipeline, and with insightful ablations along the way. Data and first results are coming soon!
@ServiceNowRSRCH
ServiceNow Research
2 years
We're excited to announce our collaboration with @huggingface to develop state-of-the-art LLMs for code. Code LLMs enable the completion & synthesis of code & work across a wide range of domains, tasks, & programming languages. #BigCodeProject Read more:
0
8
21
0
0
31
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
1 year
@OpenAI @ilyasut please don't let down 100s of grad students currently using Codex for research. You are ruining their projects right now. Phase out Codex at the end of 2023 if you want to. If you want humanity to trust you to lead AGI, it's good to show empathy sometimes.
2
2
29
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
9 months
while it's not too late, can we redefine RLHF to mean getting feedback directly from humans, not from the reward model? plz
What is currently called RLHF should be called RLAIF. What is currently called RLAIF should be called zero-shot RLAIF, as no feedback examples are used.
1
3
29
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
10 months
No one ever brought peace closer by murdering civilians and party goers. My thoughts are with people of Israel today.
1
0
29
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
11 months
There is a Research Scientist opening on my team! We are the Conversational AssistanT team (😼), we do R&D on turning LLMs into radically grounded and safe assistants for enterprise. Apply at We work with product. We use cutting-edge stuff. We write papers.
0
7
28
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
6 years
I don't feel like reviewing for NIPS next year. 30% of reviewers is an arbitrary threshold. Everyone who did their due diligence and wrote reasonable reviews should be able to attend. #NIPS2018
1
2
27
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
6 years
@lmthang @GoogleAI Great results, but is it really a new era? Any chance such pretraining can give us models that are not brittle, generalize systematically, and cannot be broken with trivial adversarial examples?
0
0
27
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
4 months
what are 3 key papers / demos that I should talk about in a lecture on LLM agents?
11
2
27
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
9 months
To add to my previous tweet about impact-oriented research, if you want to do fundamental research on LLMs and you think you can keep up with the frantic pace of this super-crowded and overheated field, you can apply to work with me as well:
1
3
26
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
Your PhD in NLP is almost done? You need a change and you want to explore another research lab? Come join us as a Research Scientist at @ServiceNowRSRCH ! Why ServiceNow? Check out the piece I wrote: Apply here:
1
4
24
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
3 years
In heated discussions about foundation models people confuse 2 different kinds of merit: theoretical appropriateness and economic impact. Denying that these models will have important applications because they don't work the way you want is kinda missing the whole point.
3
1
25
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
6 years
And BTW, MILA is hiring!
0
9
24
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
1 year
Seriously, come work with my colleague @harmdevries77 at @ServiceNowRSRCH
1️⃣ ServiceNow loves open AI science and contributes back 🤓
2️⃣ We serve 85% of Forbes 500 and many governments 🧑‍💼
3️⃣ The work culture and work-life balance are 👏👏👏
@harmdevries77
Harm de Vries
1 year
We have a research engineer position open in my team at @ServiceNowRSRCH ! - Join the @BigCodeProject and help push the open and responsible development of cutting-edge LLMs - Publish and open-source your work - Amsterdam/Montreal
2
21
59
0
6
24
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
11 months
any evidence RLHF can improve performance on binary yes/no classification tasks like hallucination detection? my intuition is that it should have little to no impact compared to vanilla SFT
7
2
21
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
4 months
hacking on a gradio demo between two weeks full of meetings is basically therapy
1
3
22
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
@samipddd @DeepMind Up to a point - yes, symbolic reasoning of all kinds. At some point grounding might be needed. I think the most daring jumps of human problem-solving are grounded in our real-world experience. But even now code generation seems ready to help humans. Exciting times!
2
0
21
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
@BlancheMinerva @ClementDelangue @huggingface I have the opposite opinion. The all-modeling-in-1-file approach in HF Transformers is a key reason why the library is a success. Abstractions and hierarchies just don't work in fast-moving fields. Copy-paste is not ideal but better than an unreadable jungle of obscure concepts.
2
0
21
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
11 months
LLMs are a thing not because of any AI godfather. When we
- knew that brains contain interconnected neurons
- had semiconductor transistors
- had computer networks
the path was already charted. All individuals along the way were at the right place at the right time.
1
2
18
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
@karpathy People: we want to hang out with other people who live close by. Also people: I want my own house with a gigantic lawn and fences. No contradiction at all!
0
0
20
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
Russians, stop Putin!
1
1
19
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
8 months
if you are into compositional generalization and LLMs, come check out @arkil_patel 's poster! It's MAGNIFICent!
@arkil_patel
Arkil Patel
8 months
Presenting tomorrow at #EMNLP2023 : MAGNIFICo: Evaluating the In-Context Learning Ability of Large Language Models to Generalize to Novel Interpretations w/ amazing advisors and collaborators @DBahdanau , @sivareddyg , and @satwik1729
2
17
45
0
1
19
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
The former Element AI research group is now @ServiceNowRSRCH ! Very happy with my decision to stay at @ServiceNow after the acquisition. We've got an amazing balance between curiosity-driven research and proximity to product over here.
@ServiceNowRSRCH
ServiceNow Research
2 years
1/10 You may have noticed a few changes on our channel today! It's been a year since the acquisition of Element AI by @ServiceNow . While we have given our account a new name, we're still as committed as ever to making socially responsible contributions to the AI community.
2
11
37
0
1
19
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
11 months
how do you use AI to help search and read papers? I'd pay $$ for an assistant that digs out relevant papers from my Zotero bibliography and helps me read them
2
0
18
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
Don't forget to come to our Deep Learning for Code workshop this Friday! You can submit your questions beforehand.
@DL4Code
Deep Learning For Code Workshop
2 years
Feel free to submit your questions for the talks early, or ask them in our RocketChat Channel! The full schedule with the abstracts: #ICLR22 #ICLR2022
0
3
6
0
2
18
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
3 years
@RichardSocher Is it GPU, or the general principle that more FLOPs can only be achieved through massive parallelism?
0
1
18
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
6 years
does a paper with SoTA results but no ablation study and no source code contribute anything at all towards progress in AI?
1
0
18
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
3 years
Please come check out our Edge Transformer paper at NeurIPS, 7:30pm EST on Thursday, . We present a new neural model inspired by Transformers and logic (see thread). Joint work by Leon Bergen (UCSD), Tim O'Donnell (McGill) and myself ( @element_ai ).
1
6
17
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
7 months
life extension is the most dangerous technology under development. Do you want the likes of Putin, Stalin, and Mao Zedong to live forever?
3
1
15
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
9 months
This whole thing about models that are 100x GPT-4 must be a bluff, no? 25K A100 for 3 months, multiplying that by 100 is not an easy feat. I'm not even talking about inference cost and the training data required.
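Taking the tweet's rumored figures at face value, a quick back-of-envelope check of what "100x" would mean; the per-GPU throughput and utilization below are assumed, illustrative numbers:

# Assumed values: ~312 TFLOPS peak BF16 per A100 at ~40% sustained utilization.
sustained_flops_per_gpu = 312e12 * 0.4
gpt4_guess = 25_000 * sustained_flops_per_gpu * 90 * 86_400   # roughly 2.4e25 FLOPs
print(f"assumed GPT-4 training compute: {gpt4_guess:.1e} FLOPs")
print(f"100x that: {100 * gpt4_guess:.1e} FLOPs")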
3
0
15
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
4 years
It has been 3 months of non-stop repression and torture in Belarus after a fake election. The international pressure to end this must be stronger!
@nexta_tv
NEXTA
4 years
People who have been detained at today's march in Minsk are still standing with their hands up in the courtyard of one of the police stations in Minsk. They've been standing like that for over 5 hours now. In total, over 640 people have been detained today.
33
378
808
0
5
16
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
8 months
at first, our paper with @dem_fier and @ILaradji may seem modest to you, but then you realize it tackles a key challenge in practical AI: simulating challenging world configurations before they hit you in the face post-deployment. Go chat with @dem_fier to learn more!
@dem_fier
Gaurav Sahu
8 months
Excited to present our #EMNLP2023 paper, PromptMix: Class Boundary Augmentation Method for Large Language Model Distillation! I'm presenting it in the East Foyer. Come say hi! paper: code: #UWCheritonCS #ServiceNowResearch
2
7
19
0
1
14
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
11 months
I would totally love to have 20 different RLHF papers that carefully document RLHF applications to slightly different problems in slightly different ways. But ML confs would accept the 1st one and reject the other 19 for not being novel. That's how they become irrelevant.
1
0
14
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
4 years
@_rockt @kchonyc There is no better indicator of success than being Schmidhubered!
0
0
16
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
2 years
it is horrifying to see what is happening in Iran and at #Sharif_University . It is really about time for this bloody regime to crash and burn
0
3
14
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
7 months
My New Year challenge is to re-learn to love humans as they actually are. Aggressive, competitive, status-seeking.
0
0
14
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
When deep learning drives you crazy, there is only one thing you can do. BUILD. MORE. PLOTS.
1
0
15
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
3 years
That.
@jbhuang0604
Jia-Bin Huang
3 years
Me: Reviewing CS PhD/internship applications... Also me: Yep, I am absolutely sure that I will not get into any graduate programs and would get zero internship offers if I were the applicant now. Sooooo many talented candidates!
13
13
462
0
0
15
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
6 years
My new bedtime story
0
0
14
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
5 years
I have to post this as well:
0
2
15
@DBahdanau
๐Ÿ‡บ๐Ÿ‡ฆ Dzmitry Bahdanau
4 months
@ylecun @StevenLevy @kchonyc the problem is deeply cultural here - the audience expects a certain kind of story: first there was the stone age, and then came Prometheus with the fire, a.k.a. Transformers and Modern AI. People love simple narratives, but I'd expect more texture from Wired
0
3
14