Research Scientist & Research Lead at ServiceNow Research
Adjunct Prof @ McGill.
Member of Mila, Quebec AI Institute.
Stream of consciousness is my own.
I spent 1000s of hours on competitive programming (proof-link: ). This makes me qualified to comment on
#AlphaCode
by
@DeepMind
The result is nice, the benchmark will be useful, some ideas are novel. But human level is still light years away.
1/n
While the whole of Twitter is going nuts about ChatGPT, let me just say that the HELM paper by
@StanfordCRFM
and
@StanfordHAI
is an incredible masterpiece of scholarship.
Make sure all your students read it and see what good research actually looks like.
You shut down your nuclear plants - you have to buy Russian gas.
You don't want AI for killer drones - prepare to hide from Russian ones.
Being overly virtuous and progressive in the 21st century is suicide. Ukraine is a sober wake-up call.
AI for Western armies? Hell yes!!
To sum up: AlphaCode is a great contribution, and AI for coding is a very promising direction with lots of great applications ahead. But this is not AlphaGo in terms of beating humans and not AlphaFold in terms of revolutionizing an entire field of science. We've got work to do.
Just received an email from AAAI organizers, saying that the reviewer load will be 5-10 (10!!!) papers, that all requests to lower the load were ignored, and that "Unless you are able to take on a full load, you should withdraw from the PC". Strikes me as not constructive.
Are you curious about systematic generalization? Do you like small, carefully controlled studies with intriguing conclusions? Check out our latest paper: . Code & data at . Work done by
@MILAMontreal
with the help of
@Element_ai
I am excited to share that as an Adjunct Prof at
@mcgillu
and member of
@Mila_Quebec
, I am looking to take 1-2 fully funded MSc or PhD students this Fall. How to apply: (read carefully!). For possible research topics, see the thread.
Do you need to remove comments from the source code before uploading it to CMT for ICML?
Try this:
find . -type f -name "*.py" -print0 | xargs -0 sed -i '/^[[:blank:]]*#/d;s/#.*//'
P.S.: kudos to Stack Overflow, as usual
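One caveat with the sed one-liner above: `s/#.*//` will also mangle `#` characters inside string literals (and delete shebang lines). If that matters, a token-aware pass is safer. Here is a minimal sketch using Python's stdlib `tokenize` module; the helper name `strip_comments` is my own, not from the thread:

```python
import io
import tokenize

def strip_comments(source: str) -> str:
    """Drop # comments from Python source without touching # inside strings."""
    lines = source.splitlines(keepends=True)
    # Tokenizing the original source tells us exactly where real comments start.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            row, col = tok.start
            line = lines[row - 1]
            ending = "\n" if line.endswith("\n") else ""
            # Cut the line at the comment, keeping the newline.
            lines[row - 1] = line[:col].rstrip() + ending
    return "".join(lines)
```

Unlike the sed version, this leaves a blank line where a full-line comment was; an extra pass could drop those if needed.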
can I really just write the most complicated part of my code in numpy, without obfuscating it with TF pyfuncs or new Theano ops? OMG pytorch, you made deep learning too easy!!
I found it very important to learn the basics of LISP to start understanding the symbolic AI literature. It seems this programming language structured, for many decades, the way people thought and communicated with each other.
I want to try something different this year.
I am looking for driven MSc students / interns who want to work on impact-oriented applied LLM projects. Bring your positive-impact idea. Tell me how working under my supervision can accelerate you. Details and context below. 🧵
Importantly, the vast majority of the programs that
#AlphaCode
generates are wrong (Figure 8). It is the filtering using example tests that allows
#AlphaCode
to actually solve something. Example tests are part of the input (App. F), yet most sampled programs can't solve them.
If you want to do research on instruction following and/or language grounding, consider using our BabyAI platform: 10^19 synthetic instructions, 19 levels of varying difficulty. Work done by
@MILAMontreal
with the help of
@Element_AI
.
Let me also dilute these critical remarks with a note of appreciation. AlphaCode uses a very cool "clustering" method to marginalize out differently-written but semantically equivalent programs. I think forms of this approach can become a code generation staple.
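For intuition, here is a toy sketch of that clustering idea (my own simplification, not DeepMind's actual implementation; all names are mine): sampled candidate programs are grouped by their behavior on shared probe inputs, and the largest behavioral cluster is submitted first.

```python
from collections import defaultdict

def run_safely(prog, x):
    """Execute one candidate on one input; crashes count as a distinct behavior."""
    try:
        return prog(x)
    except Exception:
        return "<error>"

def cluster_by_behavior(programs, probe_inputs):
    """Group candidates that behave identically on the probe inputs,
    then rank clusters by size (largest = most 'agreed-upon' semantics)."""
    clusters = defaultdict(list)
    for prog in programs:
        signature = tuple(run_safely(prog, x) for x in probe_inputs)
        clusters[signature].append(prog)
    return sorted(clusters.values(), key=len, reverse=True)
```

In the paper the probe inputs themselves come from a learned test-input generator; here they are simply given.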
Using example tests is fair game for comp. programming, and perhaps for some real-world backend development. But for much of real-world code (e.g. code that defines front-end behavior), crafting tests is not much easier than coding itself.
So there's a Facebook model similar to BERT (). The paper has better experiments, e.g. this one varying the amount of data. I calculated that at this rate we'll need a corpus of 2.14e+29 tokens to get to human performance on MNLI. Get scraping!
Sec. 6.1 makes a point that
#AlphaCode
does not exactly copy sequences from the training data. That's a low bar for originality: change a variable name and it is no longer copying. It would be interesting to look at nearest-neighbor solutions found using neural representations.
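That nearest-neighbor probe is cheap once you have embeddings for the solutions. A minimal cosine-similarity sketch (function and variable names are mine, purely illustrative):

```python
import numpy as np

def nearest_neighbors(query_vec, corpus_vecs, k=3):
    """Indices of the k rows of corpus_vecs most cosine-similar to query_vec."""
    # Normalize rows so the dot product equals cosine similarity.
    corpus = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    query = query_vec / np.linalg.norm(query_vec)
    sims = corpus @ query
    return np.argsort(-sims)[:k]
```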
Idea: conferences should send small gifts (e.g. a cup) to good (not just best!) reviewers. E.g. those who write decent reviews and reply at least once to author feedback. Small symbolic incentives could go a long way in encouraging people to participate, IMHO.
The system ranks behind 54.3% of participants. Note that many participants are high-school or college students who are just honing their problem-solving skills. Most people reading this could easily train to outperform
#AlphaCode
, especially if time pressure is removed...
What a move, copy-left license! Things are heating up in the world of LLMs.
Seriously though, congratulations to
@MetaAI
for great results and unwavering commitment to actually open AI!
LLaMA is a new *open-source*, high-performance large language model from Meta AI - FAIR.
Meta is committed to open research and releases all the models to the research community under a GPL v3 license.
- Paper:
- Github:
The paper emphasizes the creative aspects of competitive programming, but from my experience it does involve writing lots of boilerplate code. Many problems call for deploying standard algorithms: Levenshtein-style DP, DFS/BFS graph traversals, max-flow, and so on.
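To make "boilerplate" concrete: the Levenshtein-style DP that shows up in so many problems is a dozen lines most contestants can type from memory. A standard textbook version (not code from the paper):

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic DP, keeping only the previous row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute ca -> cb
            ))
        prev = curr
    return prev[-1]
```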
Limited time (e.g. 3 hours to solve 6 problems) is a key difficulty in comp. programming. The baseline human is very constrained in this model-vs-human comparison. For
#AlphaCode
the pretraining data, the fine-tuning data, the model size, the sampling - all was nearly maxed out.
there will be no superhuman AI, because we train AI on data and reward it by code that is written by humans, not superhumans
not until we let bio-robots roam free, make randomized copies of themselves and compete for survival
Impressions of
@naaclmeeting
:
- live poster sessions are energizing and helpful! No comparable virtual alternative at the moment.
- live talks are boring. Let's just watch videos!
- sad to use keyword-based Underline paper search at a conference with 20+ fancy retrieval papers
An existential threat for compositional/systematic generalization research is that we select our models on the test set. The in-distribution perf. that would be best to use for model selection is at 99+%, so we select models based on the hold-out OOD data. How can we do better?
Researchers, you don't know it yet, but y'all want to take 2 days off and *really* learn to use git. Not just remember 2 basic commands, but understand how this beautiful piece of software works and how much it can help with reproducibility and collaboration.
@janleike
do you think LLMs can ever get that good? what is your evidence? is there enough quality text to make them that smart?
oh, I forgot you can't tell me, cause everything at OpenAI is a secret
meanwhile, I can't help but note that restrictions on LLMs mean extra $$ for "Open"AI
They call deep learning a black box, often deservedly. But deep RL is many times more opaque. You change a hyperparameter of the optimizer, this affects your exploration, which in turn affects the training signal, which changes the optimization problem you are trying to solve!!!
@FelixHill84
@kchonyc
No, it's not.
Unless you are a famous Swiss researcher.
The whole of deep learning is based on a few easy, cheap ideas. It is natural that they come to many people independently. And then it is just the execution that matters.
We present CLOSURE, a systematic generalization test for visual reasoning models trained on the CLEVR dataset. Come to the poster session at Visually Grounded Interaction and Language Workshop to learn more!
For me, the turning point was reading this article on `git` internals: . It was like reading a linear algebra textbook and suddenly understanding what this PyTorch thing actually does ;)
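A taste of what such articles get at: git's object store is just content-addressed hashing. The blob ID you see everywhere is a SHA-1 over a tiny header plus the file bytes, which you can reproduce in a few lines (a sketch of the documented object format, not git's actual code):

```python
import hashlib

def git_blob_sha(data: bytes) -> str:
    """Compute a blob's object ID the way `git hash-object` does:
    SHA-1 over b'blob <size>\\0' followed by the raw contents."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()
```

`git_blob_sha(b"hello\n")` matches what `echo hello | git hash-object --stdin` prints.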
@yoavgo
Two ultimate positive NLP applications:
1) Help advanced knowledge workers (think climate scientists, or MDs writing metareviews) deal with the deluge of information
2) Personalized education with explanations that work for *you*.
Both are not great for quickly making dough.
My colleagues received a rejection notification from ACL after the arXiv freeze had started for EMNLP. Now they again can't publicly share their work with others. The effective publication date is thus shifted by 6 months. Working as intended??
How to align academic research on natural language interfaces with the needs of real human users has long been on the minds of
@harm_devries
and myself. But now, together with
@chrmanning
, we wrote a paper about it. Comments welcome!
The need for open data & benchmarks in modern ML research has led to an outpouring of
#NLProc
data creation. But
@harm_devries
,
@DBahdanau
& I suggest the low ecological validity of most of this data undermines the resulting research. Comments welcome!
You can think various things about Meta and about energy-based models, but
@ylecun
's position on LLMs is very reasonable. Policy-makers have limited time and energy, and the public has a limited attention span. Making them think about hypothetical dangers is wasteful.
@drjwrae
Good point, but at the current level of safety and controllability ChatGPT is only entertainment. Few real dialog applications would tolerate its unpredictable and creative behavior. People like their FSTs because they know what they do.
We'll see in a few years, ofc.
Excited to be in Seattle at
@naaclmeeting
, so nice to be at a conference in-person after a 2.5 year break. Please feel free to DM if you'd like to meet or catch up!
fun to be at that delightful and lovely stage of life when you're exchanging baby pics with fellow nerds with whom you used to talk only about the relative advantages of neural architectures 👶
@deliprao
the real issue is that cramming research on Human Language Technologies with Computational Linguistics in one conference no longer works
the cultures are just incompatible
basically LLM research needs another publishing venue, one that respects empiricism and tolerates the rush
I have just written to my MP and asked that Canada stops buying any Russian oil and gasoline.
Consider writing to your political representative. Demand the strongest possible response.
#Ukraine
#NoWar
#RussiaUkraineWar
@NandoDF
Hmm. In my experience, best research is often made almost impossible when you can't rerun the code. Research is not always about new ideas. It's often about rigorously testing existing ones. And rigorous testing is best done when you have the original code.
Happy to share our new
@DeepMindAI
paper on AGILE, a method for training agents to follow language instructions by jointly learning a reward model from examples. No more template languages, or problems with hard/impossible to code reward functions!
The closest appointment slot for a US visitor visa in Canada is August 2024 in Vancouver.
Any ideas how international researchers in Canada can attend
@icmlconf
and
@NeurIPSConf
this year?
Of the many famous smart people I was privileged to meet, I found
@geoffreyhinton
to be the warmest and the kindest.
It is heartwarming that he now joins the public AI discourse. It gives me hope.
We are proud to announce the 2019 edition of EEML summer school, 1-6 July, Bucharest, Romania. Topics covered: DL, RL, computer vision, Bayesian learning, medical imaging, and NLP. An amazing set of speakers confirmed so far! More info coming soon! Check !
Super proud of
@BigCodeProject
final deliverable - capable and fast StarCoder! Numbers don't lie, this model truly feels like a leap forward for small open code+lang models.
It was humbling to see how much work of how many amazing people this took. CONGRATS!!!
Introducing: 💫StarCoder
StarCoder is a 15B LLM for code with 8k context, trained only on permissively licensed data in 80+ programming languages. It can be prompted to reach 40% pass@1 on HumanEval and act as a Tech Assistant.
Try it here:
Release thread 🧵
I used to be proud that I started my career at Yandex. Now I am ashamed. contains nothing but propaganda.
@yandex
, where is Meduza and Novaya Gazeta on your website? Where is the video of a rocket hitting Kharkiv Freedom Square?
Yandex is a key tool in shaping the alternative reality that allows the war in Ukraine to continue with popular support. Many people are associated with
@yandex
or
@YandexAI
and remain silent on the issue. Silence is complicity.
Human evaluation in AI is like particle accelerators in physics.
Difficult ✔️
Messy ✔️
Laborious ✔️
The ultimate and the only source of truth ✔️🧑‍🔬
I am very excited to share the research () & applied research () openings that we have at
@element_ai
, the research lab of
@servicenow
. See the thread to learn more. Also, this week I'm at ACL, so don't hesitate to reach out!
Are you excited about large language and code models? Do you like doing research? Do you like to make GPUs go brrr?
Come join my team as a Senior Research Developer!
Thrilled that
@BigCodeProject
is live! Come join an open effort led by
@ServiceNowRSRCH
and
@huggingface
to help us train a big code model on an open dataset, with open preprocessing pipeline, and with insightful ablations along the way. Data and first results are coming soon!
We're excited to announce our collaboration with
@huggingface
to develop state-of-the-art LLMs for code. Code LLMs enable the completion & synthesis of code & work across a wide range of domains, tasks, & programming languages.
#BigCodeProject
Read more:
@OpenAI
@ilyasut
please don't let down 100s of grad students currently using Codex for research. You are ruining their projects right now. Phase out Codex at the end of 2023 if you want to. If you want humanity to trust you to lead AGI, it's good to show empathy sometimes.
while it's not too late, can we redefine RLHF to mean getting feedback directly from humans, not from the reward model? plz
what is currently called RLHF, should be called RLAIF
what is currently called RLAIF, should be called zero-shot RLAIF, as no feedback examples are used
There is a Research Scientist opening in my team! We are the Conversational AssistanT team; we do R&D on turning LLMs into radically grounded and safe assistants for enterprise. Apply at
We work with product. We use cutting-edge stuff. We write papers.
I don't feel like reviewing for NIPS next year. 30% of reviewers is an arbitrary threshold. Everyone who did due diligence and wrote reasonable reviews should be able to attend.
#NIPS2018
@lmthang
@GoogleAI
Great results, but is it really a new era? Any chance such pretraining can give us models that are not brittle, generalize systematically and can not be broken with trivial adversarial examples?
To add to my previous tweet about impact-oriented research: if you want to do fundamental research on LLMs and you think you can keep up with the frantic pace of this super-crowded and overheated field, you can apply to work with me as well:
Your PhD in NLP is almost done? You need a change and you want to explore another research lab? Come join us as a Research Scientist at
@ServiceNowRSRCH
!
Why ServiceNow? Check out the piece I wrote:
Apply here:
In heated discussions about foundation models people confuse 2 different kinds of merit: theoretical appropriateness and economic impact. Denying that these models will have important applications because they don't work the way you want is kinda missing the whole point.
Seriously, come work with my colleague
@harmdevries77
at
@ServiceNowRSRCH
1️⃣ ServiceNow loves open AI science and contributes back 🤝
2️⃣ We serve 85% of the Forbes 500 and many governments 🧑‍💼
3️⃣ The work culture and work-life balance are excellent
We have a research engineer position open in my team at
@ServiceNowRSRCH
!
- Join the
@BigCodeProject
and help push the open and responsible development of cutting-edge LLMs
- Publish and open-source your work
- Amsterdam/Montreal
any evidence RLHF can improve performance on binary yes/no classification tasks like hallucination detection?
my intuition is that it should have little to no impact compared to vanilla SFT
@samipddd
@DeepMind
Up to a point - yes, symbolic reasoning of all kinds.
At some point grounding might be needed. I think the most daring jumps of human problem-solving are grounded in our real-world experience.
But even now code generation seems ready to help humans. Exciting times!
@BlancheMinerva
@ClementDelangue
@huggingface
I have the opposite opinion. The all-modeling-in-1-file approach in HF Transformers is a key reason why the library is a success. Abstractions and hierarchies just don't work in fast moving fields. Copy-paste is not ideal but better than unreadable jungle of obscure concepts.
LLMs are a thing not because of any AI godfather
When we
- knew that brains contain interconnected neurons
- had semiconductor transistors
- had computer networks
the path was already charted
All individuals along the way were at the right place at the right time
@karpathy
People: we want to hang out with other people who live close by.
Also people: I want my own house with a gigantic lawn and fences.
No contradiction at all!
Presenting tomorrow at
#EMNLP2023
:
MAGNIFICo: Evaluating the In-Context Learning Ability of Large Language Models to Generalize to Novel Interpretations
w/ amazing advisors and collaborators
@DBahdanau
,
@sivareddyg
, and
@satwik1729
The former Element AI research group is now
@ServiceNowRSRCH
!
Very happy with my decision to stay at
@ServiceNow
after the acquisition. We've got an amazing balance between curiosity-driven research and proximity to product over here.
1/10 You may have noticed a few changes on our channel today! Itโs been a year since the acquisition of Element AI by
@ServiceNow
. While we have given our account a new name, weโre still as committed as ever to making socially responsible contributions to the AI community.
how do you use AI to help search and read papers?
I'd pay $$ for an assistant that digs out relevant papers from my Zotero bibliography and helps me read them
Please come check out our Edge Transformer paper at NeurIPS, 7:30pm EST on Thursday, . We present a new neural model inspired by Transformers and logic (see thread). Joint work by Leon Bergen (UCSD), Tim O'Donnell (McGill) and myself (
@element_ai
).
This whole thing about models that are 100x GPT-4 must be a bluff, no? 25K A100 for 3 months, multiplying that by 100 is not an easy feat. I'm not even talking about inference cost and the training data required.
People who have been detained at today's march in Minsk are still standing with their hands up in the courtyard of one of the police stations in Minsk. They've been standing like that for over 5 hours now.
In total, over 640 people have been detained today.
at first, our paper with
@dem_fier
and
@ILaradji
may seem modest to you
but then you realize it tackles a key challenge in practical AI: simulating challenging world configurations before they hit you in the face post-deployment
go chat with
@dem_fier
to learn more!
Excited to present our
#EMNLP2023
paper, PromptMix: Class Boundary Augmentation Method for Large Language Model Distillation!
I'm presenting it in the East Foyer. Come say hi!
paper:
code:
#UWCheritonCS
#ServiceNowResearch
I would totally love to have 20 different RLHF papers that carefully document RLHF applications to slightly different problems in slightly different ways.
But ML confs would accept the 1st one and reject the 19 others for being not novel.
That's how they become irrelevant.
Me: Reviewing CS PhD/internship applications...
Also me: Yep, I am absolutely sure that I will not get into any graduate programs and would get zero internship offers if I were the applicant now. Sooooo many talented candidates!
@ylecun
@StevenLevy
@kchonyc
the problem is deeply cultural here - the audience expects a certain kind of story
first there was the Stone Age, and then came Prometheus with fire, a.k.a. Transformers and modern AI
people love simple narratives, but I'd expect more texture from Wired