Zhuang Liu Profile
Zhuang Liu

@liuzhuang1234

3,702
Followers
1,019
Following
32
Media
230
Statuses

Research Scientist @MetaAI (FAIR, at NYC). machine learning, computer vision, neural networks. PhD from @Berkeley_EECS

New York
Joined April 2016
Pinned Tweet
@liuzhuang1234
Zhuang Liu
4 months
LLMs are great, but their internals are less explored. I'm excited to share very interesting findings in our paper “Massive Activations in Large Language Models”: LLMs have very few internal activations with drastically outsized magnitudes, e.g., 100,000x larger than others. (1/n)
Tweet media one
32
171
1K
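A minimal sketch (not the paper's procedure) of how one might look for such outsized activations: run a causal LM with hidden states exposed and compare each layer's largest activation magnitude to its median. "gpt2" is just a runnable placeholder; a larger LLM would be needed to see the full 100,000x-scale effect.

```python
# Minimal sketch (not the paper's procedure): expose a causal LM's hidden states
# and compare each layer's largest activation magnitude to its median.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # placeholder; swap in a larger LLM to reproduce the scale of the effect
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name).eval()

inputs = tok("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

for i, h in enumerate(out.hidden_states):
    mags = h.abs().float()
    top, median = mags.max().item(), mags.median().item()
    print(f"layer {i:2d}: max |act| = {top:9.1f}  median = {median:6.3f}  "
          f"ratio = {top / (median + 1e-9):8.0f}x")
```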
@liuzhuang1234
Zhuang Liu
6 months
How to choose a vision model for your specific needs? How do ConvNet / ViT, supervised / CLIP models compare with each other on metrics beyond ImageNet? Our work comprehensively compares common vision models on "non-standard" metrics. (1/n)
Tweet media one
10
149
757
@liuzhuang1234
Zhuang Liu
4 months
Diffusion models have achieved remarkable results in visual generation. We demonstrate they can also generate neural network parameters, in our new paper: "Neural Network Diffusion" (1/n)
@_akhaliq
AK
4 months
Neural Network Diffusion Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple, utilizing an autoencoder and a
23
250
1K
21
87
583
@liuzhuang1234
Zhuang Liu
2 years
Filed my Ph.D. dissertation "Efficient and Scalable Neural Architectures for Visual Recognition" yesterday! Hope this can be helpful to anyone who is interested in neural network architectures, especially if you are looking for a different angle.
Tweet media one
11
65
554
@liuzhuang1234
Zhuang Liu
2 years
Happy to share that ConvNeXt ("A ConvNet for the 2020s") is accepted at #CVPR2022 ! Also, check out our arXiv v2 version where we:
@_akhaliq
AK
2 years
A ConvNet for the 2020s abs: github: Constructed entirely from standard ConvNet modules, achieving 87.8% ImageNet top-1 accuracy and outperforming Swin Transformers on COCO detection and ADE20K segmentation
Tweet media one
11
210
976
9
83
483
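For context, a rough sketch of a single ConvNeXt block as described in the paper (7x7 depthwise conv, LayerNorm, 1x1 expand, GELU, 1x1 project, residual), simplified by omitting the layer scale and stochastic depth used in the official implementation.

```python
# Simplified ConvNeXt block: depthwise 7x7 conv -> LayerNorm -> 1x1 expand -> GELU
# -> 1x1 project -> residual. Layer scale and stochastic depth are omitted here.
import torch
import torch.nn as nn

class ConvNeXtBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)
        self.norm = nn.LayerNorm(dim)            # applied over the channel dim
        self.pwconv1 = nn.Linear(dim, 4 * dim)   # 1x1 conv expressed as Linear in NHWC
        self.act = nn.GELU()
        self.pwconv2 = nn.Linear(4 * dim, dim)

    def forward(self, x):                        # x: (N, C, H, W)
        shortcut = x
        x = self.dwconv(x)
        x = x.permute(0, 2, 3, 1)                # to (N, H, W, C) for LayerNorm/Linear
        x = self.pwconv2(self.act(self.pwconv1(self.norm(x))))
        x = x.permute(0, 3, 1, 2)                # back to (N, C, H, W)
        return shortcut + x

block = ConvNeXtBlock(96)
print(block(torch.randn(1, 96, 56, 56)).shape)   # torch.Size([1, 96, 56, 56])
```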
@liuzhuang1234
Zhuang Liu
2 years
Meta AI is hiring research interns on computer vision and deep learning for 2023 summer and fall! Apply using the link below. If you are interested in working with me, please also send me an email with your CV and research interests :)
4
22
178
@liuzhuang1234
Zhuang Liu
4 months
Very excited to share one of the most interesting projects I've ever worked on, but first, a small game: Here are 15 images from three of the largest and most diverse modern image datasets: YFCC100M, CC12M and DataComp-1B. Can you guess which images are from which datasets?
Tweet media one
10
23
149
@liuzhuang1234
Zhuang Liu
1 year
Since AlexNet, dropout has been recognized for reducing overfitting. But did you know it can also mitigate underfitting? Excited to share our recent paper - "Dropout Reduces Underfitting". We find early dropout can lead to a lower train loss. ⬇️
Tweet media one
2
19
123
@liuzhuang1234
Zhuang Liu
1 year
Check out our latest work on pruning LLMs! It reduces the size of an LLM by half without retraining or weight updates, while largely maintaining zero-shot performance. My favorite part is its simplicity: multiply weights and activations and you get the metric.
@_mingjiesun
Mingjie Sun
1 year
How to reduce the size of a Large Language Model? Sharing our latest work on pruning LLMs - “A Simple and Effective Pruning Approach for Large Language Models”. We show LLMs have effective sparse networks without weight update or retraining. 🧵⬇️
Tweet media one
2
53
192
0
25
95
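A hedged toy version of the pruning metric described above: weight magnitude times the norm of the corresponding input activations, with the lowest-scoring weights removed per output row. The calibration setup and sparsity patterns in the actual paper are more involved.

```python
# Toy version of the metric: score = |W| * ||X||_2 per input channel, computed
# from a small batch of calibration activations, then zero the lowest-scoring
# weights within each output row.
import torch

def prune_linear_weight(weight: torch.Tensor, acts: torch.Tensor, sparsity: float = 0.5):
    """weight: (out_features, in_features); acts: (num_tokens, in_features)."""
    act_norm = acts.norm(p=2, dim=0)             # per-input-channel activation norm
    score = weight.abs() * act_norm              # broadcasts over output rows
    k = int(weight.shape[1] * sparsity)
    idx = torch.argsort(score, dim=1)[:, :k]     # k lowest scores per output row
    pruned = weight.clone()
    pruned.scatter_(1, idx, 0.0)
    return pruned

W, X = torch.randn(8, 16), torch.randn(32, 16)   # toy layer + calibration activations
print((prune_linear_weight(W, X) == 0).float().mean())  # ~0.5 sparsity
```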
@liuzhuang1234
Zhuang Liu
1 year
I'm here in Hawaii too for ICML! The same place where I entered the US for CVPR 2017 and then to start grad school. Looking to connect with old and new friends! Ping me if you'd like to :)
Tweet media one
4
3
84
@liuzhuang1234
Zhuang Liu
2 months
With 4 borderline rejects and 1 borderline accept after rebuttal (lower before it), I feel incredibly lucky to have this paper accepted to ICML'24. Really appreciate the hard decision by the AC to accept a paper with no new methods, and the feedback from the reviewers
@liuzhuang1234
Zhuang Liu
6 months
How to choose a vision model for your specific needs? How do ConvNet / ViT, supervised / CLIP models compare with each other on metrics beyond ImageNet? Our work comprehensively compares common vision models on "non-standard" metrics. (1/n)
Tweet media one
10
149
757
3
6
85
@liuzhuang1234
Zhuang Liu
4 months
While they are very rare, massive activations cannot be set to zero - this will destroy the model. But they can be set to input agnostic constant mean values, without hurting the model. This means massive activations act as fixed but important bias terms in LLMs.
Tweet media one
1
5
78
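A toy sketch of the intervention described above (not the paper's code): locate the few hugely outsized entries in a hidden state and overwrite them with precomputed, input-agnostic mean values rather than zeroing them. The `fixed_means` lookup is a hypothetical helper structure.

```python
# Toy sketch of the intervention: overwrite the outsized entries with fixed,
# input-agnostic means; zeroing them instead would, per the finding above,
# destroy the model.
import torch

def replace_massive_activations(hidden, fixed_means, threshold=1e3):
    """hidden: (seq_len, d); fixed_means maps (seq_pos, feat_dim) -> mean value
    estimated offline over many inputs (hypothetical helper structure)."""
    out = hidden.clone()
    for s, d in (hidden.abs() > threshold).nonzero(as_tuple=False).tolist():
        if (s, d) in fixed_means:
            out[s, d] = fixed_means[(s, d)]      # fixed, input-agnostic constant
    return out
```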
@liuzhuang1234
Zhuang Liu
2 months
The greater the paper, the easier it is to find a reason to reject it? (e.g., not SOTA, too trivial/not novel, no theory/experiments, or hard to understand) Looking back at history, I find this may be true, at least for papers above a certain low threshold.
7
3
75
@liuzhuang1234
Zhuang Liu
5 months
Diffusion models can do more than generation. Check out our new work on analyzing what's useful in diffusion models for visual representation learning! @endernewton @sainingxie
@_akhaliq
AK
5 months
Meta presents Deconstructing Denoising Diffusion Models for Self-Supervised Learning paper page: examine the representation learning abilities of Denoising Diffusion Models (DDM) that were originally purposed for image generation. Our philosophy is to
Tweet media one
2
110
507
0
5
72
@liuzhuang1234
Zhuang Liu
6 months
Lesson: look beyond pure accuracies! Instead, choose what suits your needs. Project led by our amazing Kirill Vishniakov @kirill_vish , who is seeking a PhD position. Hire him! (n/n) paper: code: web:
1
12
69
@liuzhuang1234
Zhuang Liu
8 months
Exactly! In the ConvNeXt paper, we did convey the same message two years ago on ImageNet-21k, with step-by-step experiments on what contributed to the ViT > ConvNet misconception. Check out "A ConvNet for the 2020s" if you haven't
@i_ikhatri
Ishan Khatri
8 months
Yeah I think people were/are caught up in the hype. It’s cool that google proved out the scaling laws in a way that only google can, but the ConvNeXt paper from Trevor Darrell’s group (and @Meta ) in 2020 had the same conclusion on ImageNet:
0
0
9
2
6
62
@liuzhuang1234
Zhuang Liu
4 months
Massive activations are closely connected to self-attention. They lead to the concentration of attention probabilities to their sequence dimensions.
Tweet media one
1
2
49
@liuzhuang1234
Zhuang Liu
4 months
We call them "massive activations", and they appear in various model sizes and families. They appear at particular sequence dimensions (e.g., start of sequence, period or newline tokens) and feature dimensions.
Tweet media one
1
2
46
@liuzhuang1234
Zhuang Liu
11 months
Come to poster 629 now to see how dropout can reduce underfitting!
Tweet media one
0
1
38
@liuzhuang1234
Zhuang Liu
2 years
I don't know who needs to hear this, but arxiv-utils is the browser extension everyone should use! It shows you the *actual* paper titles on tabs and in downloads, not xxxx.yyyyy.pdf. It can also jump from the PDF to the abs page.
@colinraffel
Colin Raffel
2 years
Tweet media one
18
139
2K
0
2
39
@liuzhuang1234
Zhuang Liu
4 months
LLMs use such concentrated attention patterns to enforce an implicit form of bias terms in the attention output.
Tweet media one
1
1
33
@liuzhuang1234
Zhuang Liu
7 months
Check our latest work on initializing a model with a larger, pretrained one! Faster learning with no added cost
@oscar_zhiqiu_xu
Zhiqiu (Oscar) Xu
7 months
You don’t have to train from scratch whenever developing a smaller model of an existing model family. Sharing our latest work - “Initializing Models with Larger Ones” arxiv preprint: code:
Tweet media one
6
53
360
1
5
33
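A hedged sketch of the general idea in the quoted work (initializing a smaller model's layers from a larger pretrained one). Here I simply slice the leading rows/columns of the larger weight matrix; the actual selection rule in the paper may differ.

```python
# Hedged sketch: fill a small Linear layer with a slice of a larger pretrained one.
import torch
import torch.nn as nn

def init_from_larger(small: nn.Linear, large: nn.Linear):
    out_s, in_s = small.weight.shape
    with torch.no_grad():
        small.weight.copy_(large.weight[:out_s, :in_s])
        if small.bias is not None and large.bias is not None:
            small.bias.copy_(large.bias[:out_s])

large = nn.Linear(1024, 1024)   # stands in for one layer of the pretrained large model
small = nn.Linear(256, 256)
init_from_larger(small, large)
```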
@liuzhuang1234
Zhuang Liu
4 months
Massive activations also exist in many Vision Transformers, but not all. When they do exist, their function is similar - fixed but important biases.
Tweet media one
1
0
32
@liuzhuang1234
Zhuang Liu
4 months
Joint work with Kaiming He Check the paper for more! (non-)code: arxiv: (Answer to the game: YFCC: 1, 4, 7, 10, 13; CC: 2, 5, 8, 11, 14; DataComp: 3, 6, 9, 12, 15)
0
4
33
@liuzhuang1234
Zhuang Liu
11 months
Given the bad situation for ML reviews, should we make paper-reviewer matching a high-stakes AI/NLP challenge (like ImageNet/COCO in vision)? If we use the winning solutions, we might get fewer random reviews and assignments? I feel the matching system is not optimized enough...
6
0
29
@liuzhuang1234
Zhuang Liu
1 year
Excited to be a part of the ImageBind project with the team! Our latest model embeds data from multiple modalities into a shared representation space, enabling representation arithmetic, generations, and more.
@DrJimFan
Jim Fan
1 year
Wow, @MetaAI is on open-source steroids since Llama. ImageBind: Meta's latest multimodal embedding, covering not only the usual suspects (text, image, audio), but also depth, thermal (infrared), and IMU signals! OpenAI Embedding is the foundation for AI-powered search and
41
373
2K
0
2
27
@liuzhuang1234
Zhuang Liu
1 year
Congratulations to the LLaMA 2 team! A big event for research and applications across academia and industry
@_akhaliq
AK
1 year
Meta releases Llama 2: Open Foundation and Fine-Tuned Chat Models paper: blog: develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion
Tweet media one
35
571
2K
0
3
27
@liuzhuang1234
Zhuang Liu
2 months
@aaron_defazio This point closely relates to two of my previous papers: using L1 sparsity for pruning/slimming a convnet, and demonstrating that structured pruning is actually about architectures, not weights, so you can train the pruned architecture from scratch
0
1
22
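A minimal sketch of the L1-sparsity idea mentioned above (in the spirit of the slimming approach), assuming a PyTorch model with BatchNorm layers; the penalty coefficient and usage shown are illustrative, not the paper's exact recipe.

```python
# Add an L1 penalty on BatchNorm scaling factors during training, so channels
# whose factors are driven toward zero can later be pruned.
import torch
import torch.nn as nn

def bn_l1_penalty(model: nn.Module, lam: float = 1e-4) -> torch.Tensor:
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            penalty = penalty + m.weight.abs().sum()   # gamma scaling factors
    return lam * penalty

# usage inside a training step (model/criterion/optimizer assumed to exist):
#   loss = criterion(model(x), y) + bn_l1_penalty(model)
#   loss.backward(); optimizer.step()
```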
@liuzhuang1234
Zhuang Liu
6 months
Robustness and Transferability: 1) Supervised models are superior in robustness benchmarks that are ImageNet variants. But when it comes to feature transferability, CLIP models are better. 2) Surprisingly, supervised ConvNeXt almost matches CLIP in transferability! (8/n)
Tweet media one
1
1
22
@liuzhuang1234
Zhuang Liu
4 months
We train 1) an Autoencoder for projecting NN parameters to a latent space (and back), and 2) a standard LDM to learn the distribution of high-performing parameters in the latent space. The new parameter generation process then follows standard LDMs. (3/n)
Tweet media one
1
1
20
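A very rough sketch of stage 1 as described in this tweet: an autoencoder over flattened parameter vectors. Stage 2 (not shown) would be a standard latent diffusion model trained on the resulting latents. Names, dimensions, and architecture here are illustrative only.

```python
# Stage 1 sketch: autoencoder over flattened checkpoints of a small network.
import torch
import torch.nn as nn

class ParamAutoencoder(nn.Module):
    def __init__(self, n_params: int, latent_dim: int = 256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_params, 1024), nn.ReLU(),
                                 nn.Linear(1024, latent_dim))
        self.dec = nn.Sequential(nn.Linear(latent_dim, 1024), nn.ReLU(),
                                 nn.Linear(1024, n_params))

    def forward(self, p):                      # p: (batch, n_params) flattened checkpoints
        z = self.enc(p)
        return self.dec(z), z

ae = ParamAutoencoder(n_params=10_000)
ckpts = torch.randn(8, 10_000)                 # stand-ins for saved high-performing weights
recon, z = ae(ckpts)
loss = nn.functional.mse_loss(recon, ckpts)    # reconstruction objective
```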
@liuzhuang1234
Zhuang Liu
6 months
Exploring synthetic data performance on PUG-ImageNet: ConvNeXt stands out! It consistently outperforms ViT. (5/n)
Tweet media one
1
0
21
@liuzhuang1234
Zhuang Liu
2 years
1. Added ImageNet-22k ConvNeXt-Tiny/Small models and results
2. Modified Figure 1 so ResNet & ViT results now use improved training settings
3. Added EfficientNet-V2 to the ImageNet result comparison and discussion
2
2
18
@liuzhuang1234
Zhuang Liu
4 months
Is p-diff only memorizing the neural network parameters used in its training? Through multiple experiments, we show the answer is no. p-diff-generated networks are not identical or near copies of the models used for training. (5/n)
Tweet media one
1
0
19
@liuzhuang1234
Zhuang Liu
8 months
Battle of pretrained models, with many ConvNet and Transformer variants. Happy to see ConvNeXt perform well on many tasks!
@micahgoldblum
Micah Goldblum
8 months
🚨Excited to announce a large-scale comparison of pretrained vision backbones including SSL, vision-language models, and CNNs vs ViTs across diverse downstream tasks ranging from classification to detection to OOD generalization and more! NeurIPS 2023🚨🧵
6
93
414
0
0
19
@liuzhuang1234
Zhuang Liu
6 months
Exploring model mistake factors using ImageNet-X: (1) CLIP models make fewer mistakes relative to their ImageNet accuracy than supervised. (2) All models suffer mostly from complex factors like occlusion. (3) Texture is the most challenging factor for all models. (4/n)
Tweet media one
1
0
18
@liuzhuang1234
Zhuang Liu
6 months
We analyze a wide range of behaviors for 1) ViT and ConvNeXt architectures, 2) supervised and CLIP training methods. With almost identical ImageNet accuracy within each training method, models can have vastly different behaviors, detailed below. (3/n)
Tweet media one
1
0
18
@liuzhuang1234
Zhuang Liu
6 months
Exploring model calibration on ImageNet and ImageNet-R: 1) CLIP models tend to be overconfident, and supervised models are slightly underconfident. 2) Supervised ConvNeXt outperforms ViT, challenging previous beliefs that ViTs are better calibrated than ConvNets. (6/n)
Tweet media one
1
0
18
@liuzhuang1234
Zhuang Liu
8 months
Is it just me or has ChatGPT become really slow, crashing/getting stuck too often
2
0
18
@liuzhuang1234
Zhuang Liu
4 months
1. Neural network training and diffusion generation processes are both transitions from random to highly-specific distributions. 2. High-performing NN parameters and high-quality images can both degrade to simple noise distributions, through compounded noise additions. (2/n)
Tweet media one
3
0
18
@liuzhuang1234
Zhuang Liu
2 months
Thinking of AlexNet/ResNet/ViT/GPTs/Transformers, it probably takes less time to come up with a reason to reject them than for an average accepted paper…
2
0
17
@liuzhuang1234
Zhuang Liu
6 months
Transformation Invariance (scale, shift, and resolution transform): 1) Supervised ConvNeXt is the most invariant model for all of the transforms. 2) Overall, models are more robust to shift than to scale/resolution. (9/n)
Tweet media one
1
1
17
@liuzhuang1234
Zhuang Liu
6 months
Exploring shape/texture bias on cue-conflict images: CLIP models are more shape-biased, showing improvements of 7% and 12% over supervised ViT and ConvNeXt. (7/n)
Tweet media one
1
0
15
@liuzhuang1234
Zhuang Liu
2 years
@giffmana We observed this too in our 2016 Stochastic Depth project. Even if the loss is plateauing, it's still better to wait a bit before step-decaying the lr. We didn't document this in the paper though. Curious if there's any explanation
1
0
15
@liuzhuang1234
Zhuang Liu
6 months
Why go beyond ImageNet accuracy? Choosing a model for practical tasks with different conditions naturally demands looking beyond standard performance measures. As more models achieve similarly high ImageNet accuracy, the number also becomes a little saturated. (2/n)
1
1
14
@liuzhuang1234
Zhuang Liu
4 months
Our results suggest these large-scale modern vision datasets are still incredibly biased in the eyes of neural networks. We hope our discovery will inspire the community to rethink the issue involving dataset bias and model capabilities.
1
0
12
@liuzhuang1234
Zhuang Liu
4 months
Back in 2011, the Torralba and Efros paper below called for a battle against dataset bias in the community, right before the dawn of the deep learning revolution. They found an SVM could classify images' dataset identity across 12 datasets much better than random guessing.
Tweet media one
1
0
10
@liuzhuang1234
Zhuang Liu
11 months
100% agree. I find there are situations where 1. maximizing a paper's impact for general readers and 2. trying to get it accepted lead to different ways of writing. This shouldn't be a choice at all, but sometimes it is... very unfortunate
@MattNiessner
Matthias Niessner
11 months
A structural issue in research is the short-focus on getting papers accepted. The optimization for good reviews, however, can be very local and is often uncorrelated with long-term impact.
5
7
83
0
0
11
@liuzhuang1234
Zhuang Liu
4 months
p-diff obtains favorable results compared to original SGD or ensemble baselines. (Table shows accuracy in the order of SGD / ensemble / p-diff) (4/n)
Tweet media one
1
0
10
@liuzhuang1234
Zhuang Liu
4 months
Motivated by this, we propose neural network diffusion (or p-diff, p=parameter). The approach is very simple.
1
0
10
@liuzhuang1234
Zhuang Liu
10 months
@jxmnop We have a very relevant discussion on dropout and overfitting / underfitting at the intro of our paper "Dropout Reduces Underfitting". Recommend a read for anyone interested in this topic
@liuzhuang1234
Zhuang Liu
1 year
Since AlexNet, dropout has been recognized for reducing overfitting. But did you know it can also mitigate underfitting? Excited to share our recent paper - "Dropout Reduces Underfitting". We find early dropout can lead to a lower train loss. ⬇️
Tweet media one
2
19
123
0
1
9
@liuzhuang1234
Zhuang Liu
1 year
There is so much we still don't know about the most basic components of deep learning. Curious to learn & explore more! Joint work with @OscarXu96574719 , Joseph Jin, @szq0214 , @trevordarrell We are excited to present our findings at ICML 2023! Code:
1
0
10
@liuzhuang1234
Zhuang Liu
2 years
(2/2) We redesign dense prediction vision models so that they output early results progressively, and use the confidence values at different spatial locations to guide later computations. This can save up to 50% of total computation while providing additional early predictions
Tweet media one
0
0
7
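A toy illustration of the confidence-adaptivity idea, assuming an early-exit head producing per-pixel logits: mark spatial locations that are already confident and only route the remaining pixels through later computation. The threshold and masking mechanism are placeholders, not the paper's exact design.

```python
# Toy confidence adaptivity: confident pixels from an early exit skip later stages.
import torch

def confident_mask(early_logits: torch.Tensor, threshold: float = 0.9) -> torch.Tensor:
    """early_logits: (N, num_classes, H, W) from an early prediction head."""
    confidence = early_logits.softmax(dim=1).max(dim=1).values   # (N, H, W)
    return confidence >= threshold      # True = already confident, skip later compute

logits = torch.randn(1, 19, 64, 64)     # e.g. 19 semantic-segmentation classes
mask = confident_mask(logits)
print(f"{mask.float().mean().item():.0%} of pixels could skip later stages")
```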
@liuzhuang1234
Zhuang Liu
4 months
Our further experiments show that such a dataset classifier could learn semantic features that are generalizable and transferable, which cannot be simply explained by memorization.
1
0
8
@liuzhuang1234
Zhuang Liu
4 months
For example, we report 84.7% accuracy on held-out validation data for the three-way classification problem over the YFCC, CC, and DataComp datasets, whose samples were shown at the start of this thread.
1
0
8
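A minimal sketch of the dataset-classification setup described in this thread, assuming images are organized in one folder per source dataset; the folder layout, backbone, and hyperparameters are placeholders rather than the paper's exact configuration.

```python
# Label every image by the dataset it came from, then train a standard classifier.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([transforms.Resize(256), transforms.CenterCrop(224),
                          transforms.ToTensor()])
# Assumed layout: one subfolder per source dataset (yfcc/, cc12m/, datacomp/),
# so the folder index doubles as the dataset-identity label.
data = datasets.ImageFolder("dataset_id_train/", transform=tfm)
loader = torch.utils.data.DataLoader(data, batch_size=64, shuffle=True)

model = models.resnet50(num_classes=3)    # 3-way: YFCC / CC / DataComp
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
crit = nn.CrossEntropyLoss()

for images, dataset_id in loader:
    opt.zero_grad()
    loss = crit(model(images), dataset_id)
    loss.backward()
    opt.step()
```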
@liuzhuang1234
Zhuang Liu
4 months
In this work, we revisit this “dataset classification” experiment suggested by Torralba and Efros, in the new era with large-scale, diverse, and hopefully less biased datasets as well as more capable neural network architectures.
1
0
7
@liuzhuang1234
Zhuang Liu
4 months
@tienhaophung Yes that's a great paper and the most relevant! The main difference is they generate parameters step by step, more like an optimizer, taking a previous checkpoint as input. We directly generate the whole set of parameters without previous weights as inputs.
0
0
7
@liuzhuang1234
Zhuang Liu
4 months
Though the game on modern datasets might seem hard for humans, surprisingly, we observe that modern neural networks can achieve excellent accuracy in classifying which dataset an image is from.
1
0
5
@liuzhuang1234
Zhuang Liu
1 year
@rasbt @francoisfleuret I'd like to clarify a bit: our paper finds *early dropout* reduces underfitting, and it's not necessarily only for ViTs, but also for other models. Thanks for bringing our paper though!
0
0
5
@liuzhuang1234
Zhuang Liu
6 months
@sidgairo18 We experimented with SSL models - MAE (ViT) and ConvNeXt V2 - in our initial tests. They have similar behaviors to our supervised models, possibly because they are also pure vision models and are fine-tuned on ImageNet-1K (needed for many evaluations). So we didn't include them.
2
0
6
@liuzhuang1234
Zhuang Liu
2 years
(1/2) Come to our paper "Anytime Dense Prediction with Confidence Adaptivity" at ICLR 2022 Poster Session 1 today! Paper: Code: Video & poster:
Tweet media one
1
0
5
@liuzhuang1234
Zhuang Liu
1 year
thoughts: humans and many other species seem to be trained to reproduce themselves, and in that process we gained intelligence. If we somehow train models using "reproducing themselves" as the objective and they indeed learn very well, soon we'll be in the danger zone?
3
0
6
@liuzhuang1234
Zhuang Liu
1 year
Our analysis of network training dynamics revealed an interesting insight - using dropout in early training can reduce mini-batch gradient variances. It effectively balances the stochasticity of SGD, enabling more consistent, whole-dataset aligned updates
Tweet media one
1
2
4
@liuzhuang1234
Zhuang Liu
1 year
Our experiments gave promising results on ImageNet classification (more results in paper):
Tweet media one
Tweet media two
1
0
4
@liuzhuang1234
Zhuang Liu
4 months
@AlexGDimakis Great question! They do appear from early in training, but we haven't followed their trend closely. We'll observe it and plan to add it
1
0
4
@liuzhuang1234
Zhuang Liu
1 year
@sirbayes So true. The concern I have is that if big players don't pick up your methods/papers, they'll go unnoticed even if the method is scalable..
0
0
4
@liuzhuang1234
Zhuang Liu
6 months
@karpathy I find it so hard to press all the keys together; same for pasting without formatting in many Microsoft Office products. They are disasters for ergonomics. I mapped the 4-key screenshot shortcut to a single key on my Logitech keyboard
1
0
2
@liuzhuang1234
Zhuang Liu
11 months
@jd92wang Yeah, unprofessional or ill-intentioned reviewers are a big problem too
0
0
3
@liuzhuang1234
Zhuang Liu
4 months
@thecharlieblake Thanks for the pointer. We indeed cited this work but may have missed this paragraph. We'll discuss it more
1
0
3
@liuzhuang1234
Zhuang Liu
1 year
@giffmana Thank you for the suggested experiment! We added it to our new arXiv version and camera-ready:
Tweet media one
1
0
2
@liuzhuang1234
Zhuang Liu
1 year
Is it just me or does ChatGPT crash more and more often? Most of my past 3-hour 25-message quota for GPT-4 went to waste (error in generating response)
0
0
3
@liuzhuang1234
Zhuang Liu
1 year
Inspired by this insight, we introduce "early dropout" for enhancing the fitting capabilities of smaller/underfitting models. We also propose a complementary method - "late dropout" for a more refined regularization of larger/overfitting models.
Tweet media one
1
0
2
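A hedged sketch of the early-dropout schedule described in this thread: dropout is active only for an initial portion of training, then turned off ("late dropout" would do the reverse). The cutoff step and rate here are placeholder hyperparameters, not the paper's values.

```python
# Early dropout: keep dropout on before a cutoff step, turn it off afterwards.
import torch.nn as nn

def set_dropout(model: nn.Module, p: float):
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.p = p

def early_dropout(model: nn.Module, step: int, cutoff: int = 5000, p: float = 0.1):
    set_dropout(model, p if step < cutoff else 0.0)

# usage inside the training loop (model/step assumed to exist):
#   early_dropout(model, step)
#   loss = criterion(model(x), y); loss.backward(); optimizer.step()
```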
@liuzhuang1234
Zhuang Liu
4 months
@peroxycarbonate It's a highly related, impressive work. The main difference is that our generated network is for recognition, while theirs further generates 3D data, so in some sense their use of diffusion models is ultimately for visual generation.
0
0
3
@liuzhuang1234
Zhuang Liu
11 months
It doesn't seem right that the design of a system that affects many people's education and careers is driven only by the goodwill of the OpenReview and TPMS authors... we as a community should give more attention to this task
0
0
3
@liuzhuang1234
Zhuang Liu
6 months
@ahatamiz1 Our comparisons are contextualized in each property, and we try not to make generic statements. Our overarching message is to choose models based on specific needs, rather than to recommend concrete models
1
0
0
@liuzhuang1234
Zhuang Liu
4 months
@hjy836 Thanks! Not yet, it's still a very fixed setting - but that is definitely worth exploring further
1
0
2
@liuzhuang1234
Zhuang Liu
1 year
@YangYou1991 Congrats!
0
0
2
@liuzhuang1234
Zhuang Liu
4 months
@Luck30893653 It's faster and cheaper for each generated network than SGD. Also the fact that it can be done is interesting
0
0
2
@liuzhuang1234
Zhuang Liu
11 months
@thegautamkamath @shadow_dnv @rasbt Someone can never prove themselves to be capable of doing "independent research" if all their papers have more than one author. It is an impossible criterion to evaluate, in my opinion, so it should be deprecated or at least improved.
0
0
2
@liuzhuang1234
Zhuang Liu
11 months
@rohitgUCF Maybe some papers will be leftovers that no one wants to review lol, but interesting proposal! This would make reviewers happy
0
0
2
@liuzhuang1234
Zhuang Liu
2 years
@TalSchuster @GoogleAI @adamjfisch @_jai_gupta @dara_bahri @m__dehghani @vqctran @YiTayML Great work! Impressed by the depth of the experiments. Check out our related exploration in vision & ConvNets if interested!
@liuzhuang1234
Zhuang Liu
2 years
@_akhaliq For computer vision, we had a related ICLR paper on dense prediction tasks :) Anytime Dense Prediction with Confidence Adaptivity
Tweet media one
1
0
1
0
0
2
@liuzhuang1234
Zhuang Liu
4 months
@jaeho_lee_ Great point! They are different - section 2.3's discussion addresses this
2
0
2
@liuzhuang1234
Zhuang Liu
1 year
@giffmana basically, yes, it is not effective when we double the default batch size. We couldn't reply to you right away during the ICML review period because of the social media ban. But thank you for the volunteer reviewing!
1
0
2
@liuzhuang1234
Zhuang Liu
2 years
@de_JQK @thegautamkamath @shortstein @icmlconf Thank you for bringing up our "Rethinking" project. I just would like to add that in that project we also experimented with and discovered the effects of learning rates on LTH :)
0
0
2
@liuzhuang1234
Zhuang Liu
11 months
@thegautamkamath @shadow_dnv @rasbt I get this point on developing a unique research vision, but sometimes I don't get the emphasis on "independence". Almost all papers have more than one author. If someone is truly 100% independent and did 100% of the work, then they should write single-author papers.
2
0
2
@liuzhuang1234
Zhuang Liu
5 years
@soumithchintala @arimorcos @WonderMicky @tydsh We had this transferring pruned structure experiment in: . We didn’t use the original init but used random reinit. The sparsity pattern is also visualized and has pretty clear patterns. Also we showed only “avg pattern” is needed, not the exact pattern.
0
0
2
@liuzhuang1234
Zhuang Liu
11 months
@LinjieXu @thegautamkamath @shadow_dnv @rasbt I agree, that is the right thing to strive for! The word "independence" seems to convey a different thing, at least in a literal sense, and we may want to change the word :)
0
0
2
@liuzhuang1234
Zhuang Liu
6 months
@LoadingALIAS @anshulkundaje Thank you! Yeah for adversarial example related stuff we only have ImageNet-A as part of the robustness. Yes it would be interesting to see the conventional adversarial results
1
0
2
@liuzhuang1234
Zhuang Liu
6 months
@ahatamiz1 It's hard to include models of all sizes. We prioritized the number of properties in this work, so only used 4 models we think are most representative for more clarity
1
0
2
@liuzhuang1234
Zhuang Liu
11 months
@thegautamkamath @LinjieXu @shadow_dnv @rasbt I've also seen people say that the most important thing for being *admitted* to a PhD program is to demonstrate you can do independent research... what? 🤣 That is also why I feel this criterion is often abused
0
0
2
@liuzhuang1234
Zhuang Liu
1 year
@alaaelnouby Congrats, Alaa!!
1
0
1
@liuzhuang1234
Zhuang Liu
1 month
@lofiMRI It was great to discuss with you all!
0
0
1