🎉 I'm happy to share my first book "LLM Prompt Engineering Simplified"
🎉 This book covers all the basic and intermediate-level concepts related to LLM prompt engineering.
🎉 The book is completely free and available online.
- Book link:
- Github Repo
🚀
@huggingface
Model Memory Calculator 🚀
✅ This tool will help you calculate how much GPU RAM is needed to
- train a model and
- perform big model inference on a model hosted on the Hugging Face Hub.
✅ Currently, this tool supports all hosted models that use the transformers library.
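The calculation behind such a tool can be sketched with a rough back-of-envelope formula. The numbers below are common rules of thumb, not the calculator's exact method: 2 bytes per parameter for fp16/bf16 inference (plus some overhead), and roughly 16 bytes per parameter for full fine-tuning with Adam in mixed precision.

```python
def inference_memory_gb(n_params_b: float, bytes_per_param: int = 2) -> float:
    """Rough VRAM (GB) to serve a model: weights in fp16/bf16 (2 bytes/param)
    plus ~20% overhead for activations and the KV cache."""
    return n_params_b * 1e9 * bytes_per_param * 1.2 / 1e9

def training_memory_gb(n_params_b: float) -> float:
    """Rough VRAM (GB) for full fine-tuning with Adam in mixed precision:
    ~2 bytes weights + 2 gradients + ~12 optimizer/master states ≈ 16 bytes/param."""
    return n_params_b * 16

# A 7B model: ~16.8 GB to serve in fp16, ~112 GB for full Adam fine-tuning.
```

This is why a model that fits comfortably on one GPU for inference can still be impossible to fine-tune on the same hardware.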
@LangChainAI
in one picture
🚀LangChain is a framework for developing applications powered by language models.
🚀 This framework consists of several parts.
⚡️LangChain Libraries: The Python and JavaScript libraries.
⚡️LangChain Templates: A collection of easily deployable
🎉I am happy to receive citations from the research papers of
@GoogleDeepMind
and
@Microsoft
. 🎉
🏅 Recently, when I checked my Google Scholar profile, I saw that one of my papers was cited by papers from two top companies, Google DeepMind and Microsoft.
🚀
@LangChainAI
in action 🚀
LangChain is a framework for developing applications powered by language models.
✅ This framework consists of several parts.
⚡️LangChain Libraries: The Python and JavaScript libraries.
⚡️LangChain Templates: A collection of easily deployable
🚀LangChain Templates in Action 🚀
➡️ LangChain is a framework for developing LLM applications.
➡️LangChain templates are pre-defined recipes for generating prompts for LLMs.
➡️LangChain templates include
- instructions,
- few-shot examples,
- specific context
- questions
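A conceptual sketch of what such a template assembles (plain Python, not LangChain's actual API; the names and example strings are illustrative):

```python
# Few-shot examples the template will inject into every prompt.
FEW_SHOT = [
    ("What is 2 + 2?", "4"),
    ("What is 3 * 3?", "9"),
]

def build_prompt(instruction: str, context: str, question: str) -> str:
    """Assemble instructions, few-shot examples, context, and the question."""
    examples = "\n".join(f"Q: {q}\nA: {a}" for q, a in FEW_SHOT)
    return (
        f"{instruction}\n\n"      # instructions
        f"{examples}\n\n"         # few-shot examples
        f"Context: {context}\n"   # specific context
        f"Q: {question}\nA:"      # question
    )

prompt = build_prompt("Answer arithmetic questions.", "Basic math.", "What is 5 + 5?")
```

A template library adds input validation, composition, and serialization on top of this basic fill-in-the-slots idea.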
🚀 LLaMA Beyond English
✅ This research paper explores the challenge of extending Llama to non-English languages.
☑️The authors conducted an extensive empirical investigation to study various options like
- vocabulary expansion,
- further pretraining,
- instruction tuning.
🚀 Airavata - Instruction Tuned Hindi LLM
✅ Airavata - an instruction-tuned model for Hindi built by finetuning OpenHathi LLM.
☑️ OpenHathi is an open-source foundational model for Hindi, developed by extending Llama 2.
✅ OpenHathi was introduced by Sarvam AI, a promising AI
@MasterJeongK
GPT-3, GPT-3.5 and the recent GPT-4 are really good in the general domain. However, the performance of these models in specialized domains like biomedicine is not as strong. Apart from this, there is still a lot of room for improvement in many aspects.
🚀SQLCoder beats GPT-4 in Text-to-SQL Generation
✅ SQLCoder is a state-of-the-art LLM for converting natural language questions to SQL queries.
✅ SQLCoder-34B outperforms gpt-4 and gpt-4-turbo for natural language to SQL generation tasks on the sql-eval framework.
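Text-to-SQL models are typically prompted with the database schema alongside the question. A minimal sketch of such a prompt builder (the section headers here are illustrative, not SQLCoder's exact format):

```python
def sql_prompt(schema: str, question: str) -> str:
    """Pair a database schema with a natural language question for a code LLM."""
    return (
        "### Task: Write a SQL query that answers the question below.\n"
        f"### Schema:\n{schema}\n"
        f"### Question: {question}\n"
        "### SQL:\n"
    )

schema = "CREATE TABLE orders (id INT, amount DECIMAL, created_at DATE);"
p = sql_prompt(schema, "What is the total order amount in 2023?")
```

Grounding the model in the actual schema is what keeps the generated SQL referencing real tables and columns.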
I'm a third-year Ph.D. student working in Clinical Natural Language Processing (social media text). What are the things I have to do, apart from publishing papers in reputed conferences and journals, to get a postdoc after my Ph.D.?
@annargrs
@sarkerabeed
@seb_ruder
@cocoweixu
@partha_p_t
[1] Machine Translation related research work done by researchers from
@ai4bharat
. For example, "IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled Indian Languages"
🚀 How Code Empowers LLMs to Serve as Intelligent Agents (Survey)
✅ The survey paper discusses the integration of code into large language models (LLMs) and its impact.
☑️ It highlights that modern LLMs are not only larger but also trained on a mix of natural language and
🚀 Extend Llama without Catastrophic forgetting
❌ Drawbacks ❌
❗️Catastrophic forgetting - When trained on new data, existing knowledge degrades significantly ("forgetting"). This is evident in the LLaMA family - LLaMA to CodeLLaMA.
✅ Proposed Solution
🔅 The authors
🚀 Cheetah - Multilingual LLM for 517 African Languages
The paper introduces Cheetah, a multilingual NLG model for African languages addressing low-resource challenges.
Cheetah supports 517 African languages, outperforming other models in five out of seven generation tasks.
LLMs for Information Extraction (Survey)
✅ Information Extraction (IE) focuses on extracting structural knowledge, such as entities, relations, and events, from natural language texts.
✅ This survey paper explores the recent trend of utilizing generative Large Language Models
🚀 Excellent demo of Online LLMs
➡️ Recently
@perplexity_ai
introduced Online LLMs, the first of their kind.
➡️ Drawbacks of existing LLMs
- Freshness: LLMs often struggle to share up-to-date information.
- Hallucinations: LLMs can also output inaccurate statements.
⚡️ OpenAI GPT Store Set to Launch Next Week
OpenAI plans to launch a store for GPTs, custom apps based on its text-generating AI models (e.g. GPT-4)
The GPT Store was announced last year during OpenAI’s first annual developer conference, DevDay.
GPTs don’t require coding
@SharonYixuanLi
The main reason for this race is the commercial benefits of these large language models.
-> 2013 - word2vec, 2014 - GloVe, 2017 - fastText (slow and steady progress)
🚀 MedLM - a family of foundation models fine-tuned for the healthcare industry
✅ MedLM models are built on top of MedPaLM-2.
✅ There are two models under MedLM.
➡️ The first MedLM model is larger, designed for complex tasks.
➡️ The second is a medium model, able to be
The rise of multiple open-source LLMs like Llama2, Falcon,
@llm360
, Mistral etc. supports
@ylecun
claims.
Many of these open-source LLMs have already outperformed the proprietary GPT-3.5 model on multiple benchmarks.
2024 will surely witness more advanced open-source LLMs which may
Koala - a new chatbot model approaching ChatGPT quality. Koala is initialized from LLaMA-13B and then trained on dialogue data scraped from the web and public datasets
Link:
LLM Life Cycle
LLM Life Cycle involves four important stages
- Data Collection
- Pretraining
- Meta Training (instruction tuning or RLHF)
- Model Serving
Picture Credit: "Training and Serving System of Foundation Models: A Comprehensive Survey"
#llms
#opensource
#generativeai
🚀LLM API Pricing Calculator 🚀
✅Large language models are powerful artificial intelligence systems that have the ability to analyze and generate human-like text.
✅LLMs gained a lot of popularity with the release of models like ChatGPT and GPT-4.
✅As these models are
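The core of such a pricing calculator is a simple per-token computation. A back-of-envelope version, using illustrative (not official or current) per-1K-token prices:

```python
# Illustrative per-1K-token prices in USD -- check the provider's current pricing.
PRICES = {
    "gpt-3.5-turbo": {"input": 0.0005, "output": 0.0015},
    "gpt-4-turbo":   {"input": 0.01,   "output": 0.03},
}

def api_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one API call: prompt and completion tokens are billed separately."""
    p = PRICES[model]
    return input_tokens / 1000 * p["input"] + output_tokens / 1000 * p["output"]

cost = api_cost("gpt-4-turbo", 2000, 500)  # 2K prompt tokens + 500 completion tokens
```

Multiplying the per-call cost by expected request volume is how such calculators project a monthly bill.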
♨️ Build with Gemini (Pro and Pro Vision)
1️⃣ The first versions of Gemini Pro and Gemini Pro Vision are now accessible via the Gemini API.
2️⃣ Gemini API comes with a range of features
- function calling,
- embeddings,
- semantic retrieval
- custom knowledge grounding,
- chat
Google Gemini to Open AI Q* Survey
This survey paper covers
- evolving landscape of generative AI, with a focus on MoE, multimodal learning, and AGI.
- impact of innovations like Google's Gemini and OpenAI's Q* project on research and applications.
- computational challenges,
BloombergGPT - 50B parameter language model for Finance domain.
- Mixed dataset training leads to good performance on finance tasks without sacrificing performance on general NLP tasks.
@business
@TechAtBloomberg
- Paper link:
🚀 Finance with LLMs: An Overview of Applications and Insights
1️⃣ LLMs like GPT-4 are becoming increasingly advanced and versatile.
2️⃣ LLMs are useful for various tasks in the financial sector like:
- Automating report generation.
- Forecasting market trends.
- Analyzing
🚀 RAG Survey
Retrieval Augmented Generation (RAG) refers to the retrieval of relevant information from external knowledge bases before answering questions with LLMs.
❌ Challenges of LLMs:
- Hallucinations (generating inaccurate information)
- Slow knowledge updates
- Lack
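The retrieve-then-prompt flow that RAG describes can be sketched in a few lines. The naive word-overlap scorer below is only a stand-in for a real embedding-based retriever, and the documents are made up for illustration:

```python
def retrieve(query, docs, k=2):
    """Rank documents by naive word overlap with the query (a stand-in for
    a real embedding-based similarity search) and return the top-k."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)[:k]

def rag_prompt(query, docs):
    """Build a prompt that grounds the LLM in retrieved context."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "Paris is the capital of France.",
    "The Nile is a river in Africa.",
    "France borders Spain and Italy.",
]
prompt = rag_prompt("What is the capital of France?", docs)
```

Because the answer is drawn from retrieved text rather than parametric memory alone, updating the knowledge base updates the answers without retraining, which addresses both the freshness and hallucination problems listed above.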
TweetNLP - Cutting-Edge NLP library (9 tasks) for Social Media. Library:
Good work from
@Cardiff_NLP
people in creating this library. For NLP transformers survey, refer
#nlproc
#nlp
🚀 Code Llama-70B (the latest Coding LLM from MetaAI)
✔️ Recently, MetaAI released Code Llama-70B, the largest model in the CodeLlama family.
🟦 This Code LLM is initialized from Llama 2 and then trained on large volumes of code data.
✔️ Code Llama-70B is available on Hugging
Satya Nadella, in his early days at Microsoft as Technical Marketing Manager
✔️Satya Nadella joined Microsoft in the early 1990s and has stayed with the company ever since.
✔️Satya Nadella quickly rose through the ranks at Microsoft and held leadership roles in both enterprise and
🎀
@PyTorch
Deep Learning
✅ This playlist covers the following
- Pytorch Deep Learning Series - Introduction
- PyTorch Deep Learning, Section 2: Deep Dive into Basics (Part 1)
- PyTorch Deep Learning, Section 2: Building Strong Foundations (Part 2)
- PyTorch Deep Learning,
9th Workshop on Noisy and User-generated Text (W-NUT)
@eaclmeeting
2024
✅ If you are working on problems at the intersection of NLP and Social media, the WNUT workshop (organized along with EACL 2024) is a good venue to submit your research paper.
✅ WNUT workshop is
Sam Altman is No Longer the CEO of
@OpenAI
📍 It's a big surprise that Sam Altman is sacked as the CEO of OpenAI.
📍This happened just a few days after OpenAI DevDay, where he revealed plans for a more advanced model, GPT-5.
📍 Undoubtedly, Sam Altman contributed a lot
Scale-LLM Workshop 2024
✅Workshop on the Scaling Behavior of Large Language Models (Scale-LLM Workshop
@eaclmeeting
2024)
✅The workshop will provide focused discussions on multiple topics in the general field of Scaling behavior of Large Language Models.
✅ Scale-LLM
🚀 DeepSeek-Coder - Open Source Code LLMs
DeepSeek-Coder - Family of open-source code models with sizes from 1.3B to 33B.
☑️ These LLMs are pretrained from scratch on 2 trillion tokens.
☑️ DeepSeek-Coder outperforms closed-source LLMs like Codex and GPT-3.5.
✅ These models
PEFT Methods (LoRA, QLoRA) Survey
LLMs with billions of parameters have been successful in NLP tasks.
Parameter Efficient Fine-Tuning (PEFT) reduces fine-tuning parameters and memory usage while maintaining performance.
This survey paper reviews PEFT methods, discusses
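The core idea behind LoRA-style PEFT can be shown with toy matrices: freeze the pretrained weight matrix W and train only a low-rank update B·A, which has far fewer parameters. A pure-Python sketch with made-up numbers:

```python
def matmul(X, Y):
    """Plain-Python matrix multiply for the toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

d, r = 4, 1                       # hidden size 4, rank-1 update
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen weights
B = [[0.1], [0.2], [0.0], [0.0]]  # d x r  (trainable)
A = [[1.0, 0.0, 0.0, 1.0]]        # r x d  (trainable)

delta = matmul(B, A)              # d x d low-rank update built from 2*d*r params
W_eff = [[W[i][j] + delta[i][j] for j in range(d)] for i in range(d)]
# Trainable params: 2*d*r = 8 instead of d*d = 16 -- the saving grows with d.
```

In real models d is in the thousands and r is small (4-64), so the trainable fraction shrinks to well under 1%, which is where the memory savings come from.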
@SwiggyCares
@zinqshere
MRP means Maximum Retail Price which is inclusive of all the taxes. Then how can you add taxes beyond MRP? Can you please clarify?
🚀 Build Gemini Chatbot using Streamlit
✅ Gemini is the latest and advanced Chatbot LLM introduced by Google AI.
☑️ Streamlit is a Python library for creating ML and chatbot apps in just a few lines of code.
✅ Building a Gemini chatbot with Streamlit involves the following steps
State of LLM Apps 2023 (by
@streamlit
)
Key takeaways
- OpenAI is dominant (73% use GPT models)
- The future is multi-agent (56% use orchestration)
- Most apps bypass vector magic (Only 19% use vector retrieval)
- Chatbots are on the rise (25% and growing are chatbots)
State of
Vanna - Chat with your SQL Database
Vanna is an open-source Python library that allows you to generate SQL queries from natural language questions.
It uses a Retrieval-Augmented Generation (RAG) framework to train a model on your data and then answer your questions.
Vanna is
🚀 LLMLingua - LLM Prompt Compressor
❌ Drawbacks of lengthy prompts ❌
1️⃣ Large language models (LLMs) have demonstrated remarkable capabilities.
2️⃣ Advanced techniques such as Chain-of-Thought (CoT), In-Context Learning (ICL), and Retrieval-Augmented Generation (RAG)
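A toy stand-in for the prompt-compression idea: drop low-information words so the prompt uses fewer tokens. LLMLingua itself scores tokens with a small language model; this naive stopword filter only illustrates the goal:

```python
# Words treated as low-information for this toy example.
STOPWORDS = {"the", "a", "an", "of", "to", "is", "are", "that", "and", "in"}

def compress(prompt: str) -> str:
    """Naive prompt compression: keep only words outside the stopword list.
    (LLMLingua instead uses a small LM to score token informativeness.)"""
    return " ".join(w for w in prompt.split() if w.lower() not in STOPWORDS)

long_prompt = "Summarize the main findings of the paper in a single sentence."
short_prompt = compress(long_prompt)  # fewer tokens, core content intact
```

Fewer prompt tokens mean lower API cost and more room in the context window, which is the motivation behind compressing long CoT, ICL, and RAG prompts.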
⚡️ DeepSeek LLM - New Open Source LLM (outperforms GPT-3.5)
❌ Existing research on scaling LLMs presents contradictory findings, making further scaling unclear.
✅ This paper proposes
- new scaling laws
- introduces DeepSeek LLM, a new open-source language model
PandasAI is a Python library that adds Generative AI capabilities to the pandas library.
- PandasAI uses
@OpenAI
models.
- PandasAI tutorial:
- PandasAI library:
♨️ PromptBench: A Library for Evaluation of Large Language Models
➡️ PromptBench is a unified library for evaluating large language models (LLMs).
➡️ Provides several key components for easy use and extension:
- Prompt construction
- Prompt engineering (e.g., few-shot,
🚀Fabricator - LLM Library for Labelled Data Generation
✔️ Most NLP tasks are modelled as supervised learning problems and thus require labelled training data to train effective models.
✔️ However, data labelling is an expensive, laborious and time-intensive process.
✔️FABRICATOR is an
🚀 Knowledge Fusion Of LLMs
Training LLMs from scratch is inefficient and expensive.
Merging existing models is a compelling and cost-effective alternative.
However, direct weight blending is impractical due to varied architectures.
This paper proposes a novel approach called
♨️ Retrieval-Augmented Generation (RAG) Survey
RAG is a technique that combines large language models (LLMs) with external knowledge bases to improve answer accuracy and reduce model hallucinations, particularly for knowledge-intensive tasks.
☑️ RAG achieves this by:
-
🚀 TEXTMACHINA - Seamless Generation of Machine-Generated Text Datasets
❌ Challenges
- Easy access to powerful Large Language Models (LLMs) leads to misuse and the need for robust detection/attribution tools.
❎ Existing solution
- Datasets for training MGT-related models,
💡 Large Language Models and The End of Programming
✅ Dr. Matt Welsh discusses the impact of AI models like ChatGPT on the future of computer science and programming.
☑️ He presents the argument that LLMs could fundamentally change how we build software, potentially leading to
A "Prompting Framework" (PF) is a framework for managing, simplifying, and facilitating interaction with large language models (LLMs).
✅ Prompting Framework is the upper layer which enables LLMs to interact with the external world.
✅ Some of the popular LLM prompting
🚀
@OpenAI
GPT Store
✅ GPTs are custom versions of ChatGPT.
☑️ Over 3M GPTs have been created.
✅ GPT Store - find useful and popular custom versions of ChatGPT.
☑️ Key Points
- Discover custom ChatGPT models created by OpenAI and the community.
- Explore various
LangServe - Deploy LangChain Apps
This video provides a step-by-step guide on how to deploy LangChain applications to
(i) Google Cloud and
(ii) LangServe hosted deployments.
Video tutorial:
#langchain
#llms
#generativeai
#nlproc
#deeplearning