To all my friends: over the past 3 months, whenever I said "Sorry, can't make it tonight, gotta work" or "Sorry, I'm busy this weekend" but couldn't really say what exactly we were working on, THIS was the monster we were building @DbrxMosaicAI. #DBRX
Meet DBRX, a new SOTA open LLM from @databricks. It's a 132B-parameter MoE with 36B active params, trained from scratch on 12T tokens. It sets a new bar on all the standard benchmarks, and, as an MoE, inference is blazingly fast. Simply put, it's the model your data has been waiting for.
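The 36B-active / 132B-total split is what top-k expert routing buys you: each token only runs through the few experts its router selects, so compute scales with active params, not total. A toy NumPy sketch of that mechanism (expert count, top-k, and dimensions here are illustrative, not DBRX's actual configuration):

```python
import numpy as np

def moe_forward(x, router_w, experts, k=4):
    """Toy top-k mixture-of-experts layer.

    Only the k selected experts run per token, which is why an MoE
    with many experts has far fewer *active* params than total params.
    """
    logits = x @ router_w                      # router scores, shape (n_experts,)
    top = np.argsort(logits)[-k:]              # indices of the top-k experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                       # softmax renormalized over top-k
    return sum(g * experts[i](x) for g, i in zip(gates, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.normal(size=d)
router_w = rng.normal(size=(d, n_experts))
# Each "expert" is just a linear map in this sketch.
weights = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda v, W=W: v @ W for W in weights]

y = moe_forward(x, router_w, experts, k=4)
print(y.shape)  # (8,)
```

With 16 experts and top-4 routing, only 4/16 of the expert parameters participate in any one forward pass, which is the intuition behind "blazingly fast" MoE inference.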
Thank you @DimitrisPapail for being such an amazing advisor!!! Your help and guidance have been invaluable throughout my PhD! I am also extremely lucky to have had the opportunity to collaborate with some really brilliant researchers from various universities and organizations!
It was great to have @shashank_r12 share with the BuzzRobot community details about DBRX, the large language model created by @databricks. Shashank walked us through the architecture of the model, the hyperparameter choices, and the software and hardware issues the team experienced.
A big advantage of the two-column paper format is that people can comfortably read your paper on their phones. Kind of embarrassing that I only realized this today, after years of reading papers on my phone 😅
@OfirPress @BlancheMinerva @xlr8harder In our (preliminary) experiments we also see ALiBi and RoPE have matching training curves (in fact ALiBi converges a bit faster initially). Performance is also similar for evals on sequence lengths up to the max seq len seen during training. Beyond that, ALiBi extrapolates better.
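For context on why ALiBi extrapolates: it adds a head-specific linear distance penalty to attention scores rather than rotating q/k as RoPE does, so nothing in it is tied to absolute positions seen in training. A minimal sketch of the bias matrix, using the geometric slope schedule from the ALiBi paper (head count and sequence length are illustrative):

```python
import numpy as np

def alibi_bias(n_heads: int, seq_len: int) -> np.ndarray:
    """Per-head ALiBi bias, shape (n_heads, seq_len, seq_len).

    The bias for query i attending to key j is -slope * (i - j),
    a linear penalty on distance. A causal mask is applied separately,
    so only the j <= i entries matter in practice.
    """
    # Geometric slope schedule (simple case: n_heads a power of 2).
    slopes = 2.0 ** (-8.0 * np.arange(1, n_heads + 1) / n_heads)
    pos = np.arange(seq_len)
    dist = pos[None, :] - pos[:, None]   # dist[i, j] = j - i (<= 0 for past keys)
    return slopes[:, None, None] * dist[None, :, :]

bias = alibi_bias(n_heads=8, seq_len=4)
print(bias.shape)      # (8, 4, 4)
print(bias[0, 3, 0])   # -1.5: head 0 (slope 0.5) penalizes a distance of 3
```

Because the penalty depends only on relative distance, the same bias formula applies unchanged at sequence lengths never seen in training, which matches the extrapolation behavior described above.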
@DbrxMosaicAI Feeling ecstatic that all the hard work by the team paid off! It was the greatest experience working with all the talented and hardworking folks @DbrxMosaicAI! Looking forward to building even bigger and better LLMs!
@HongyiWang10 is one of the best researchers I've worked with. I was really lucky to have him as a senior PhD student in the lab when I started my PhD. He is one of the few people I know who has comprehensive expertise in both ML and systems. Congratulations Hongyi!!!
[1/n] I'm thrilled to share that I will join the Rutgers CS Department @RutgersCS as a tenure-track Assistant Professor in the summer of 2025! I'm excited about, and looking forward to, this new chapter of my career!
@Gradient_AI_ @AIatMeta @huggingface @CrusoeEnergy Wow! Amazing work! It seems that you used 2.8 billion as the RoPE theta, which is much, much bigger than any RoPE theta seen in other models. How did you come up with that value?
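To see why a huge RoPE theta matters for long context: theta is the base of the per-pair rotation frequencies, and a larger base makes the low-frequency pairs rotate far more slowly, keeping positional phases distinguishable over much longer sequences. A small sketch comparing the common base of 10,000 against 2.8 billion (the dimension is illustrative):

```python
import numpy as np

def rope_freqs(dim: int, theta: float) -> np.ndarray:
    """RoPE rotation frequencies: freq_i = theta ** (-2i / dim)
    for each even index i. Larger theta => slower low-end rotations
    => phases wrap around much later in the sequence.
    """
    return theta ** (-np.arange(0, dim, 2) / dim)

# The slowest frequency shrinks by orders of magnitude as theta grows.
for theta in (10_000.0, 2.8e9):
    f = rope_freqs(dim=128, theta=theta)
    print(f"theta={theta:g}  slowest freq={f[-1]:.3e}")
```

The first pair always has frequency 1 regardless of theta; it's the tail of the spectrum that stretches out, which is the usual motivation for raising theta in long-context models.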
If WeChat is banned, then how will Chinese people working in the US communicate with their families back home? I've been told of alternatives like QQ and Skype, but what is the guarantee that the US, or even China (in retaliation), wouldn't ban those in the future?
@zeroXmusashi @madiator @_nikhilmehta @YiTayML @vqctran Yes, in order to use this to build a recommender system for a dataset, you only need two things: user session data, and some semantic data about each item. For TikTok, the latter could be things like the video title, tags, caption, or creator's name.
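Those two ingredients are enough for a bare-bones content-based recommender: embed each item from its semantic text, build a profile from the session history, and rank unseen items by similarity. A toy sketch with a bag-of-words "embedding" (the item IDs and texts below are invented for illustration; a real system would use learned embeddings):

```python
import math
from collections import Counter

# Ingredient 2: semantic data per item (titles/tags, here as toy strings).
items = {
    "v1": "cat funny pets",
    "v2": "dog funny pets",
    "v3": "cooking pasta recipe",
}

def embed(text: str) -> Counter:
    """Bag-of-words stand-in for a real semantic embedding."""
    return Counter(text.split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(session: list, items: dict, k: int = 1) -> list:
    """Ingredient 1: the user's session. Score every unseen item
    by similarity to the aggregated session profile."""
    profile = Counter()
    for item_id in session:
        profile += embed(items[item_id])
    scores = {
        i: cosine(profile, embed(t))
        for i, t in items.items() if i not in session
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend(["v1"], items))  # ['v2'] — shares "funny pets" with v1
```

The point of the sketch is just the data-flow claim from the tweet: nothing beyond session histories and per-item semantic text is required to get a working (if crude) recommender off the ground.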
My first PhD student defended today, and it filled my heart with bittersweet joy.
Congratulations Dr. Hongyi Wang @HongyiWang10!
It has been an incredible honor to serve as your advisor. I can't wait to see the great things you will accomplish.
@unsorsodicorda @aminkarbasi @DimitrisPapail @ten10_93 Great question! We have a margin assumption on the points, and in fact, the "exponential improvement" is in the dependence on the margin. Indeed, modern DNNs can interpolate training datasets, and this was one of the motivations for our work.