
Important Papers on LLMs and GPTs

Large language models, or LLMs, vary in their training data and parameter counts. They are trained on corpora of hundreds of millions or billions of documents, essentially the accumulated record of what has been written over time. Few new business and social ideas are ever discovered; for decades, the words to describe almost any task have been uttered and captured. Mature LLMs (none exist in 2023) will provide trusted information, much as Encyclopedia Britannica was a trusted source in the 1960s and 1970s. And just as a number of competing encyclopedias were sold, a number of key LLMs will emerge.

It’s a bit like the old debates over oversize piston rings in an engine or memory speed in a computer, Ford vs. Chevy, Bank of America vs. Chase: the difference was rarely meaningful in practice.

These are titles and links to seminal papers on underlying AI research.

LLaMA: Open and Efficient Foundation Language Models
https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/
Semantic reconstruction of continuous language from non-invasive brain recordings
https://www.nature.com/articles/s41593-023-01304-9.epdf
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
https://arxiv.org/abs/2305.01210
Unlimiformer: Long-Range Transformers with Unlimited Length Input
https://arxiv.org/abs/2305.01625
Distill or Annotate? Cost-Efficient Fine-Tuning of Compact Models
https://arxiv.org/abs/2305.01645
Language Models: GPT and GPT-2
https://cameronrwolfe.substack.com/p/language-models-gpt-and-gpt-2
Transformer Puzzles
https://github.com/srush/Transformer-Puzzles
LlamaIndex 0.6.0: A New Query Interface Over your Data
https://betterprogramming.pub/llamaindex-0-6-0-a-new-query-interface-over-your-data-331996d47e89
The Ultimate Battle of Language Models: Lit-LLaMA vs GPT3.5 vs Bloom vs …
https://lightning.ai/pages/community/community-discussions/the-ultimate-battle-of-language-models-lit-llama-vs-gpt3.5-vs-bloom-vs/
Harnessing LLMs
https://www.linkedin.com/pulse/harnessing-llms-part-i-peter-bull/
How to train your own Large Language Models
https://blog.replit.com/llm-training
Scaling Forward Gradient With Local Losses
https://arxiv.org/abs/2210.03310
Introducing Lamini, the LLM Engine for Rapidly Customizing Models
https://lamini.ai/blog/introducing-lamini
Categorification of Group Equivariant Neural Networks
https://arxiv.org/pdf/2304.14144v1.pdf
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
https://arxiv.org/abs/2304.13712
The Practical Guides for Large Language Models
https://github.com/Mooler0410/LLMsPracticalGuide
Introduction to LangChain: A Framework for LLM Powered Applications
https://www.davidgentile.net/introduction-to-langchain/
Multi-Party Chat: Conversational Agents in Group Settings with Humans and Models
https://arxiv.org/abs/2304.13835
A large-scale comparison of human-written versus ChatGPT-generated essays
https://t.co/qLO7JV2Gbl
Evaluation of GPT-3.5 and GPT-4 for supporting real-world information needs in healthcare delivery
https://arxiv.org/abs/2304.13714
A Cookbook of Self-Supervised Learning
https://arxiv.org/abs/2304.12210
NeMo Guardrails
https://developer.nvidia.com/blog/nvidia-enables-trustworthy-safe-and-secure-large-language-model-conversational-systems/
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
https://arxiv.org/abs/2304.12995
State Spaces Aren’t Enough: Machine Translation Needs Attention
https://arxiv.org/abs/2304.12776
Answering Questions by Meta-Reasoning over Multiple Chains of Thought
https://arxiv.org/abs/2304.13007
Getting Started with LangChain: A Beginner’s Guide to Building LLM-Powered Applications
https://towardsdatascience.com/getting-started-with-langchain-a-beginners-guide-to-building-llm-powered-applications-95fc8898732c
Generative AI at Work
https://www.nber.org/papers/w31161
LLM+P: Empowering Large Language Models with Optimal Planning Proficiency
https://arxiv.org/abs/2304.11477
Language Modeling using LMUs: 10x Better Data Efficiency or Improved Scaling Compared to Transformers
https://arxiv.org/abs/2110.02402
Improving Document Retrieval with Contextual Compression
https://blog.langchain.dev/improving-document-retrieval-with-contextual-compression/
The Embedding Archives: Millions of Wikipedia Article Embeddings in Many Languages
https://txt.cohere.com/embedding-archives-wikipedia/
Hugging Face Hub
https://python.langchain.com/en/latest/modules/models/llms/integrations/huggingface_hub.html
Effective Instruction Tuning
https://twitter.com/vagabondjack/status/1649127428659265537
Reinforcement Learning with Human Feedback (RLHF)
https://github.com/opendilab/awesome-RLHF
Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes
https://arxiv.org/abs/2304.09433
Transformer Math 101
https://blog.eleuther.ai/transformer-math/
Open-source research on large language models (LLMs)
https://twitter.com/cwolferesearch/status/1647990311547797504
A visual guide to transformers
https://twitter.com/akshay_pachaar/status/1647940492712345601
Enhancing Vision-language Understanding with Advanced Large Language Models
https://github.com/Vision-CAIR/MiniGPT-4/blob/main/MiniGPT_4.pdf
Transformer: Attention Is All You Need
https://arxiv.org/abs/1706.03762
LLMs on personal devices
https://simonwillison.net/series/llms-on-personal-devices/
LLM Source Context Evaluation
https://twitter.com/jerryjliu0/status/1647626532519841793
Generative Agents: Interactive Simulacra of Human Behavior
https://arxiv.org/abs/2304.03442
Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study
https://arxiv.org/abs/2304.06762
Auto-evaluate LLM Q+A chains
https://twitter.com/RLanceMartin/status/1647645549875859456
Understanding Diffusion Models: A Unified Perspective
https://arxiv.org/abs/2208.11970
Building LLM applications for production
https://huyenchip.com/2023/04/11/llm-engineering.html
Boosted Prompt Ensembles for Large Language Models
https://arxiv.org/abs/2304.05970
Teaching Large Language Models to Self-Debug
https://arxiv.org/abs/2304.05128
The Power of Scale for Parameter-Efficient Prompt Tuning
https://arxiv.org/abs/2104.08691
Multimodal Procedural Planning via Dual Text-Image Prompting
https://arxiv.org/abs/2305.01795
Are Emergent Abilities of Large Language Models a Mirage?
https://arxiv.org/abs/2304.15004
An evolutionary tree of modern Large Language Models (LLMs) like ChatGPT.

BERT-style Language Models: Encoder-Decoder or Encoder-only:
BERT “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”
RoBERTa “RoBERTa: A Robustly Optimized BERT Pretraining Approach”
DistilBERT “DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter”
ALBERT “ALBERT: A Lite BERT for Self-supervised Learning of Language Representations”
ELECTRA “ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators”
T5 “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer”
GLM “GLM-130B: An Open Bilingual Pre-trained Model”
AlexaTM “AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model”

GPT-style Language Models: Decoder-only:
GPT-3 “Language Models are Few-Shot Learners” (NeurIPS)
OPT “OPT: Open Pre-trained Transformer Language Models”
PaLM “PaLM: Scaling Language Modeling with Pathways”
BLOOM “BLOOM: A 176B-Parameter Open-Access Multilingual Language Model”
MT-NLG “Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model”
GLaM “GLaM: Efficient Scaling of Language Models with Mixture-of-Experts”
Gopher “Scaling Language Models: Methods, Analysis & Insights from Training Gopher”
Chinchilla “Training Compute-Optimal Large Language Models”
LaMDA “LaMDA: Language Models for Dialog Applications”
LLaMA “LLaMA: Open and Efficient Foundation Language Models”
GPT-4 “GPT-4 Technical Report”
BloombergGPT “BloombergGPT: A Large Language Model for Finance”
GPT-NeoX-20B “GPT-NeoX-20B: An Open-Source Autoregressive Language Model”
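The branch labels above come down to how attention is masked: BERT-style encoders let every token attend to the full sequence (bidirectional), while GPT-style decoders restrict each token to its own and earlier positions (causal), which is what makes them generative. A minimal sketch of that distinction, using a hypothetical `attention_mask` helper (an illustration, not code from any of the papers listed):

```python
import numpy as np

def attention_mask(seq_len: int, causal: bool) -> np.ndarray:
    """Boolean matrix: entry [i, j] is True if token i may attend to token j."""
    if causal:
        # Decoder-style (GPT): lower-triangular mask, no looking ahead.
        return np.tril(np.ones((seq_len, seq_len), dtype=bool))
    # Encoder-style (BERT): every token attends to every position.
    return np.ones((seq_len, seq_len), dtype=bool)

# For a 4-token sequence, the encoder mask allows all 16 pairs,
# while the causal mask allows only the 10 pairs with j <= i.
```

The bidirectional mask suits understanding tasks (classification, fill-in-the-blank), while the causal mask is what allows decoder models to be sampled left to right at generation time.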
Search for papers by name. Thanks to @AlexAIDaily