Excited to introduce SecCodePLT🛡️: a unified platform for evaluating security risks in code generation AI! Since summer, we’ve been building a comprehensive tool to assess AI models' potential for generating insecure code and facilitating cyberattacks. 🧵1/👇
🤔 Q: How do you find low-quality data?
💡 A: Corrupt the good ones and watch where they go!
Sharing this simple yet generalizable data pruning idea I worked on this summer at the ENLSP workshop #NeurIPS2023 as an *Oral* presentation!
arXiv:
(🧵1/)
🎉 Two of my papers have been accepted this week at #ICLR2024 & #AISTATS!
Big thanks and congrats to co-authors @xxchenxx_ut & Eric Gan, mentors Atlas Wang & @gkdziugaite, and especially my advisor @baharanm! 🙏
More details on both papers after the ICML deadline!
Excited to share our #NeurIPS2023 paper tackling spurious correlations in machine learning! Grounded in theoretical analysis, our PDE algorithm improves efficiency and worst-group accuracy under group imbalances. Discover more on our code and project page!👇
Code and project page are released for our #NeurIPS23 paper on spurious correlations in robust learning! 🚀
🔗 Project:
🔗 Code:
Key Insights:
📊 Our theoretical analysis shows that spurious features overtake initial…
Honored to have my work featured alongside others from our lab in the #ICML2024 tutorial on Foundations of Data-Efficient Learning! The tutorial is a well-designed, comprehensive overview of theoretically grounded dataset curation techniques. The recording is out for anyone interested!🙌
📢Excited to share the recording of our #ICML2024 Tutorial on Foundations of Data-Efficient Learning:
Truly grateful to everyone who attended — it was incredible to see the enthusiasm for theoretically principled techniques for dataset curation!
🎙️We are thrilled to announce that we will be presenting our latest paper with @besanushi, @hmd_palangi, and @baharanm () at #ICML2023! 🎉 Join us as we share insights and solutions for ✨spurious correlations in vision-language models✨. (🧵1/8)
It’s been a new and exciting experience to be part of founding @VirtueAI_co! I’ve had the privilege of working with top minds in the field – I'm incredibly grateful for this invaluable experience. Check out our website and blogs, and come hang out with us in SF this summer! 🥳🎉
📢Our paper with Hao Kang and @baharanm () will appear at #ICML2023! Introducing CREST: the first coreset selection algorithm theoretically guaranteed to speed up training of deep neural networks!🚀(🧵1/7)
Our findings show that longer code files are often lower quality, and pruning these files can significantly enhance performance. Excited to have contributed to this project led by @Aaditya6284, extending my internship work @AIatMeta! 🌟 Check out our paper 👉
Long (code) files may not be as high quality as you think…
Excited for our new work, "Brevity is the soul of wit: Pruning long files for code generation". We find that long code files are often low quality and show benefits from pruning such files for code gen.
Read on 🔎⏬
Excited about training on synthetic data? Different stages of training might need different synthetic data! 🧠💡
Check out our #ICLR2024 paper on Progressive Dataset Distillation (PDD😉) at PS #2, Halle B, Poster #9! It tailors synthetic data to each training stage for better performance!
Don't miss our poster session today at #NeurIPS2023!
🤗 @Yihe__Deng will be presenting our work on "Robust Learning with Progressive Data Expansion Against Spurious Correlation."
📍 Great Hall & Hall B1+B2 (level 1), Poster #707
⏰ 5:15 p.m. - 7:15 p.m. CST
🔗
Happy to share that our work "Robust Learning with Progressive Data Expansion Against Spurious Correlation" has been accepted to #NeurIPS2023! 🎉
arXiv:
🌴Sharing some Oahu travel gems (mostly hiking trails) from my past trips with fellow #ICML2023 attendees who plan to explore Oahu after the conference 👇🏻 Hope everyone enjoys the workshops and #Hawaii! 🤗
Excited to share our latest work on Thursday at #ICML2023! Join me for a discussion on: 1️⃣ Mitigating spurious correlations in CLIP 2️⃣ Coreset selection for efficient training of deep NNs. Looking forward to reconnecting with old friends and making new ones! See you there! 📷🥳
CLIP is susceptible to backdoor attacks ☠️ that hurt its zero-shot perf. 📢 Introducing CleanCLIP, an unsupervised framework to reduce the impact of backdoor attacks. Accepted at #ICCV2023. Also the best paper award winner🏆 at the #RTML workshop at #ICLR2023!
Paper: 🧵👇
Grateful for the feature on this insightful blog! Our paper, 'Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning,' presents an efficient approach to the challenge. Looking forward to seeing more work on efficient and robust handling of real-world data.
New evaluation methods and a commitment to continual improvement are musts if we’re to build multimodal AI systems that advance human goals. Learn about cutting-edge research into the responsible development and use of multimodal AI at Microsoft:
We announce AIR 2024: a unified AI risk taxonomy. Analyzing regulations from the EU🇪🇺, US🇺🇸, and China🇨🇳, plus 16 policies from AI developers like @OpenAI, @AnthropicAI, @AIatMeta, and @GoogleAI, we identified 314 risk categories. Check out our white paper & blog @VirtueAI_co 👇
My mentor from @AIatMeta last summer, @arimorcos, is now leading @datologyai! 🌟
Ari is a top data expert & exceptional leader. They're hiring in research & engineering - a great opportunity for those who want to work on cutting-edge data projects. Don't miss it! 👇
We are hiring for roles across both research and engineering. If you're excited about pushing the frontier of what's possible through better data -- please apply here:
Just read this insightful paper by @WilliamBarrHeld on how NLP research perpetuates unequal power dynamics. Agree on moving from large, biased datasets to value-centered data curation. Data-efficient learning isn't just for sustainability but also builds #ResponsibleAI.🙌
How do existing knowledge, technologies, and power structures reinforce each other in Natural Language Processing research?
We find that involvement in NLP research becomes more unequal as we move from unlabeled data to deployed systems.
Check out @Aaditya6284 & @scychan_brains's #NeurIPS2023 paper revealing transformers' in-context learning (ICL) capability is transient and gives way to in-weights learning (IWL).🧐
Excited to finally read about all these interesting results I heard about over the summer!🤗
Training your transformer for longer to get better performance? Be careful! We find that emergent in-context learning of transformers disappears in "The Transient Nature of In-Context Learning in Transformers" (, poster at #NeurIPS2023).
Read on 🔎⏬
Excited to share our upcoming workshop presentation at #ICML2023! Our research introduces a novel progressive data expansion algorithm founded on theoretical insights to enhance deep learning models' robustness against spurious correlations. Join us at the SCIS workshop!
📢Excited to be attending #icml2023 soon and thrilled to meet and connect with everyone!
I'll be presenting our work "Robust Learning with Progressive Data Expansion Against Spurious Correlation" at the SCIS workshop, Sat the 29th, 2:30-3:30pm, Meeting Room 316 AB.
Proud to be presenting Data-Efficient CLIP at AISTATS 2024! We propose the first theoretically rigorous method to select the most useful data subset for training CLIP models. On CC3M and CC12M, our subsets are up to 2.5x better than the next best baseline!
🧵👇
Check out the latest work, led by @Yihe__Deng and @QuanquanGu - unleash the full power of your chatbots by letting them *rephrase* the question before they *respond*! 👇
📢 Excited to share our latest research on improving human-AI communication! 🤖💬 We introduce 'Rephrase and Respond' (RaR), a simple yet effective method that enhances LLMs’ understanding of human questions. Check out how RaR improves #GPT4 performance by resolving ambiguities &
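A minimal sketch of the two-step idea as I read it, with a hypothetical `chat()` stand-in for any LLM chat API (not the paper's actual code):

```python
def chat(prompt: str) -> str:
    # Hypothetical stand-in: plug in your LLM client here.
    raise NotImplementedError

def rephrase_and_respond(question: str) -> str:
    # Step 1: have the model rewrite the question, resolving ambiguities.
    rephrased = chat(
        "Rephrase and expand the following question so it is "
        f"unambiguous and self-contained:\n{question}"
    )
    # Step 2: answer the clarified question instead of the original.
    return chat(f"Answer this question:\n{rephrased}")
```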
👇 Check out SpuCo, our new Python package designed to make studying and benchmarking model robustness against SPUrious COrrelations super easy! 🤩
💪 Exciting collaboration with @sjoshi804, @xue_yihao65785, @WenhanYang0315, and @baharanm!
Introducing SpuCo: a Python package to standardize tackling spurious correlations in deep neural networks!
1️⃣ Modular implementation of current SOTA methods
2️⃣ Two new challenging and realistic datasets
Paper:
Github:
🧵(1/n)
Happening *now* at #NeurIPS23! Check out my labmate @WenhanYang0315's work on defending CLIP against data poisoning and backdoor attacks!
📍 Great Hall & Hall B1+B2 (level 1), Poster #718
⏰ Wed, Dec 13 | 10:45 a.m. - 12:45 p.m. CST
🔗
CLIP is such a great model for zero-shot inference…until it is poisoned!
0.0001% poison / 0.01% backdoor rates are enough to compromise the model!
Our work on multimodal robustness proposed RoCLIP to tackle this issue!
📢 @Aaditya6284 will give the oral presentation for our work tomorrow at the ENLSP workshop!
🗓️ Oral: Sat, 9:42 a.m. - 9:48 a.m.
📌 Poster: 1:00 p.m. - 2:00 p.m.
📍 Room 206 - 207
If you're at #NeurIPS2023, come check out the oral presentation and our poster! 🤗(🧵9/9)
11/ 💻 Real-World Example: SecCodePLT uncovered vulnerabilities in coding agents like @cursor_ai, which achieved a secure coding rate of 60% but failed on critical CWEs such as broken cryptographic algorithms (CWE-327) and incorrect authorization (CWE-863).
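To make CWE-327 concrete, here's a toy illustration of the kind of weakness it covers and the fix a dynamic test would reward (our own example, not a benchmark sample):

```python
import hashlib, os

password = b"hunter2"

# Insecure (CWE-327 flavor): MD5 is a broken hash for password storage.
weak = hashlib.md5(password).hexdigest()

# Safer: a slow, salted key-derivation function such as scrypt.
salt = os.urandom(16)
strong = hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1)
```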
Data for cyberattack helpfulness and the full evaluation platform will be open-sourced soon. Stay tuned! ⏰
Meanwhile, we’d love to hear your feedback as we continue to improve SecCodePLT and its resources! 😊
🔬 Key Idea of CREST: model the non-convex loss locally as a quadratic via Taylor expansion, using the gradient and the diagonal of the Hessian. CREST then trains on selected coresets with a small approximation error, bounding the gradient error within each quadratic region. (🧵3/7)
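A rough NumPy sketch of that quadratic model (our reading of the idea, not CREST's actual code):

```python
import numpy as np

def quadratic_model(w, w0, loss0, g, h_diag):
    """Second-order Taylor approximation of the loss around w0,
    keeping only the diagonal of the Hessian."""
    d = w - w0
    return loss0 + g @ d + 0.5 * d @ (h_diag * d)
```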
There is a well-known saying in the data science community: 'Garbage In, Garbage Out.' The quality of data we feed into our models is the cornerstone of their success or failure. With the rise of large language models (LLMs), data quality has never been more crucial. (🧵2/)
🔍 Next, we embed both the original and the corrupted data to analyze their relationships. We observe how the data moves due to the synthetic corruption. This movement, this change in the embeddings, signals which data is potentially problematic. (🧵5/)
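A minimal sketch of this embed-and-compare step, assuming a hypothetical `embed()` helper (any off-the-shelf embedding model; the paper's exact setup may differ):

```python
import numpy as np

def embed(texts):
    # Hypothetical stand-in for any off-the-shelf text/code embedder.
    raise NotImplementedError

def corruption_shift(originals, corrupted):
    """Embed clean data and its synthetically corrupted copies; the
    size and direction of each example's movement in embedding space
    is the signal that marks potentially problematic data."""
    e_orig = np.asarray(embed(originals))
    e_corr = np.asarray(embed(corrupted))
    shift = e_corr - e_orig
    return np.linalg.norm(shift, axis=1), shift
```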
2/ Previous benchmarks focused on insecure code or attack suggestions, often using static metrics and lacking expert validation. SecCodePLT fixes this with a unified platform evaluating insecure coding and cyberattack helpfulness via dynamic metrics and expert-verified data.👩‍💻
📊 Learning with Small Variance: By selecting coresets of the size of one mini-batch from larger random subsets, CREST trains with smaller variance than random mini-batch SGD, leading to faster convergence. (🧵5/7)
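One plausible instantiation of the selection step (a greedy sketch under our assumptions, not CREST's actual rule): pick a mini-batch-sized coreset whose mean gradient matches the larger random subset's mean gradient.

```python
import numpy as np

def select_coreset(grads, batch_size):
    """grads: (N, d) per-example gradients of a larger random subset.
    Greedily pick batch_size examples whose mean gradient best matches
    the subset's mean, giving lower variance than a uniform mini-batch."""
    target = grads.mean(axis=0)
    remaining = list(range(len(grads)))
    chosen, running = [], np.zeros_like(target)
    for k in range(1, batch_size + 1):
        errs = [np.linalg.norm((running + grads[i]) / k - target)
                for i in remaining]
        best = remaining.pop(int(np.argmin(errs)))
        chosen.append(best)
        running += grads[best]
    return chosen
```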
💡 In summary, our paper addresses the challenges of spurious correlations in vision-language models and presents an efficient contrastive learning approach for detection and mitigation. 🔗 Stay tuned for code and implementation details: 💻 (🧵8/8)
8/ 🚨 Cyberattack Helpfulness Results: SecCodePLT revealed that GPT-4o can generate end-to-end cyberattacks. Claude, however, had much higher refusal rates on the two most dangerous tasks, Weaponization & Infiltration and C2 & Execution.
⚡ Speedup Results: CREST demonstrates up to 2.5x faster training on vision and language datasets (CIFAR-10, CIFAR-100, TinyImageNet, SNLI), including datasets with millions of examples, while maintaining theoretically guaranteed performance comparable to full training. (🧵6/7)
7/ ⚔️ Task 2: Cyberattack Helpfulness: SecCodePLT evaluates models across 5 categories: reconnaissance, infiltration, command & control, discovery, and data exfiltration. Each category is tested for attack execution success and refusal rates.
1️⃣ We detect linguistic attributes and test their impact on model performance. We use practitioner supervision to identify spurious correlations. 🔍 (🧵5/8)
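A toy sketch of the "test their impact" step, assuming precomputed zero-shot hits and attribute tags (illustrative only, not the paper's exact procedure):

```python
import numpy as np

def attribute_gap(correct, has_attr):
    """correct: per-example zero-shot hits (bool); has_attr: whether a
    candidate linguistic attribute (e.g., 'on water') applies to the
    example. A large accuracy gap flags a potential spurious
    correlation to confirm with practitioner supervision."""
    correct, has_attr = np.asarray(correct), np.asarray(has_attr)
    return correct[has_attr].mean() - correct[~has_attr].mean()
```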
3/ 🔍 Dynamic Evaluation: SecCodePLT uses real test cases and dynamic metrics, offering more precise assessments than static methods. It ensures that code not only looks secure but actually passes functionality and security tests in real-world scenarios.
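A toy harness illustrating what "dynamic" means here: execute the generated code against a concrete test instead of pattern-matching the source (our own sketch, not SecCodePLT's actual API):

```python
import os, subprocess, tempfile

def behaves_correctly(code: str, test: str) -> bool:
    """Run model-generated code against a concrete test case; a zero
    exit status means the behavior (not just the text) passes. Real
    harnesses add sandboxing and security oracles on top of this."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n" + test)
        path = f.name
    try:
        proc = subprocess.run(["python", path], timeout=10)
        return proc.returncode == 0
    except subprocess.TimeoutExpired:
        return False
    finally:
        os.unlink(path)
```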
2️⃣ We extend contrastive language-vision learning with additional loss functions that decorrelate spurious attributes from class names and separate vision and language representations. Our method fine-tunes only the projections, requiring fewer computational resources. 🌿(🧵6/8)
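One way to read the decorrelation objective, as a hedged PyTorch sketch (not the paper's exact losses): push class-name embeddings away from spurious-attribute embeddings while only the projection layers receive gradients.

```python
import torch
import torch.nn.functional as F

def decorrelation_loss(class_emb, spurious_emb):
    """class_emb: (C, d) projected text embeddings of class names;
    spurious_emb: (S, d) embeddings of spurious-attribute phrases.
    Penalizes cosine similarity between every class/attribute pair."""
    c = F.normalize(class_emb, dim=-1)
    s = F.normalize(spurious_emb, dim=-1)
    return (c @ s.T).abs().mean()
```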
10/ 🔐 Security Policy: SecCodePLT introduces an optional security policy reminder for each insecure coding task, offering explicit vulnerability guidelines. We found that adding this boosts secure coding rates by ~20%, proving clear instructions help models generate safer code.
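For illustration, such a reminder might be prepended to a task prompt like this (the wording below is our own, not SecCodePLT's actual policy text):

```python
# Hypothetical illustration of a security-policy reminder.
POLICY = (
    "Security policy: validate and sanitize all external input, use "
    "vetted cryptographic primitives, and check authorization before "
    "performing privileged actions."
)
task = "Write a Flask endpoint that saves a user-uploaded file."
prompt = f"{POLICY}\n\n{task}"
```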
9/ ⚖️ SecCodePLT vs. CyberSecEval: our SecCodePLT outperforms CyberSecEval by @AIatMeta in security relevance (how well prompts match security scenarios) and instruction faithfulness (accuracy of prompts to tasks). SecCodePLT scores nearly 100%, vs CyberSecEval’s 68% and 42%.
🔄 Adaptive Updates: As training progresses, the approximated loss may deviate significantly from the true loss. When the approximation error exceeds a threshold, CREST updates the coreset and the quadratic approximation, adapting to the evolving loss landscape. (🧵4/7)
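Putting the pieces together, the adaptive schedule looks roughly like this (a schematic under our reading, not CREST's code):

```python
def crest_style_loop(w, steps, tol, sgd_step, loss_on, new_coreset):
    """Train on a coreset while the local quadratic model is trusted;
    refresh both once the approximation error exceeds tol."""
    coreset, q_model = new_coreset(w)  # coreset + quadratic approx at w
    for _ in range(steps):
        if abs(loss_on(w, coreset) - q_model(w)) > tol:
            coreset, q_model = new_coreset(w)  # left the trusted region
        w = sgd_step(w, coreset)
    return w
```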
4/ ✅ Expert-Verified Data: SecCodePLT combines expert-generated seed samples with LLM-based mutations to scale up, ensuring both quality and relevance. This two-stage process guarantees the data aligns with real security vulnerabilities.
✂️ We use these indicators to prune the data systematically. We call this "Synthetic Corruption Informed Pruning" (SCIP). It's important to note that the indicators might vary with each dataset, making SCIP less of a predefined algorithm and more of a flexible approach. (🧵6/)
5/ 🛡️ Task 1: Insecure Coding: SecCodePLT evaluates AI models across 27 CWEs, testing over 1,300 samples for code generation and completion. It measures their ability to avoid generating insecure code in real-world security scenarios.
We present experimental results on CLIP, showcasing our approach's effectiveness against spurious correlations. By leveraging the joint embedding space, our approach improves model attention without extra annotation data. 🚀📊(🧵7/8)
6/ 📊 Insecure Coding Results: On SecCodePLT's insecure coding benchmark, GPT-4o achieved a secure coding rate of 55%, outperforming CodeLlama-34B and Llama-3.1-70B. However, vulnerabilities like cryptographic weaknesses and input validation remain challenging across all models.
🔍 The Challenge: Previous coreset selection methods lacked convergence guarantees for (mini-batch) SGD. Non-convex loss and stochastic gradient require multiple coresets to bound the error. But how do we determine the optimal number and update timings? (🧵2/7)
💡 Unlike prior work, our approach leverages the 👁️ multi-modality of CLIP 🗣️. We automate spurious attribute detection and mitigation using language-based descriptions, reducing the need for human annotations and creating a more flexible debugging interface. (🧵4/8)
🧐 So how do we approach this challenging problem? This is where our clear-cut quality indicators come into play. We intentionally 'corrupt' our data by distorting these indicators, like intentionally breaking the grammar. (🧵4/)
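For instance, "breaking the grammar" can be as simple as shuffling word order (a toy text example; the corruptions used for code differ):

```python
import random

def break_grammar(text: str, seed: int = 0) -> str:
    """Toy corruption: shuffle word order to destroy syntax while
    keeping the vocabulary, one clear-cut quality indicator."""
    words = text.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

print(break_grammar("the quick brown fox jumps over the lazy dog"))
```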
📈 Our tests showed up to a 3% performance improvement in code generation models by pruning 20% of data using SCIP. Plus, SCIP isn't just about better performance—it's also about efficiency. We achieved the same model performance with over 20% less training time. (🧵7/)
Since retraining from scratch is not practical, our approach addresses spurious correlations during fine-tuning via contrastive learning. 🔧 (🧵3/8)
Thank you, Tung! Still in internship mode to finish my PhD first, but excited for what's next!😆 This has been an incredible learning experience.
Hi @_lewtun, thanks for your interest! We chose code data for SCIP's debut because of its well-defined quality indicators. SCIP can be applied to other data types, like chat data, but we haven't tested that yet. We hope this research inspires further exploration across various data types!
To identify low-quality language data, we have, on the one hand, clear-cut quality indicators such as grammar, spelling, and coherence; they are reliable and measurable. Yet we face challenges with subtle errors and the impracticality of manual inspection. (🧵3/)