Important Papers on LLMs and GPTs

Large language models, or LLMs, vary in how much data they are trained on and how many parameters they have. They are trained on hundreds of millions or billions of documents, capturing words that have been said and written over time. Few business and social ideas are genuinely new: for decades, the words to describe almost any task have been uttered and captured. Mature LLMs (none exist as of 2023) will provide trusted information, much as Encyclopedia Britannica was a trusted source in the 1960s and 1970s. Just as a number of competing encyclopedias were sold then, a number of key LLMs will emerge.

Choosing among them is kind of like the old debates over oversize piston rings in an engine or memory speed in a computer: Ford vs. Chevy, Bank of America vs. Chase. The difference rarely showed up in a meaningful way.

Below are titles and links to seminal papers, articles, and repositories on the underlying AI research.

Category | Title | Source
Research | LLaMA: Open and Efficient Foundation Language Models | https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/
Research | Semantic reconstruction of continuous language from non-invasive brain recordings | https://www.nature.com/articles/s41593-023-01304-9.epdf
Research | Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation | https://arxiv.org/abs/2305.01210
Research | Unlimiformer: Long-Range Transformers with Unlimited Length Input | https://arxiv.org/abs/2305.01625
Research | Distill or Annotate? Cost-Efficient Fine-Tuning of Compact Models | https://arxiv.org/abs/2305.01645
Research | Language Models: GPT and GPT-2 | https://cameronrwolfe.substack.com/p/language-models-gpt-and-gpt-2
Research | Transformer Puzzles | https://github.com/srush/Transformer-Puzzles
Research | LlamaIndex 0.6.0: A New Query Interface Over your Data | https://betterprogramming.pub/llamaindex-0-6-0-a-new-query-interface-over-your-data-331996d47e89
Research | The Ultimate Battle of Language Models: Lit-LLaMA vs GPT3.5 vs Bloom vs … | https://lightning.ai/pages/community/community-discussions/the-ultimate-battle-of-language-models-lit-llama-vs-gpt3.5-vs-bloom-vs/
Research | Harnessing LLMs | https://www.linkedin.com/pulse/harnessing-llms-part-i-peter-bull/
Research | How to train your own Large Language Models | https://blog.replit.com/llm-training
Research | Scaling Forward Gradient With Local Losses | https://arxiv.org/abs/2210.03310
Research | Introducing Lamini, the LLM Engine for Rapidly Customizing Models | https://lamini.ai/blog/introducing-lamini
Research | Categorification of Group Equivariant Neural Networks | https://arxiv.org/pdf/2304.14144v1.pdf
Research | Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond | https://arxiv.org/abs/2304.13712
Research | The Practical Guides for Large Language Models | https://github.com/Mooler0410/LLMsPracticalGuide
Research | Introduction to LangChain: A Framework for LLM Powered Applications | https://www.davidgentile.net/introduction-to-langchain/
Research | Multi-Party Chat: Conversational Agents in Group Settings with Humans and Models | https://arxiv.org/abs/2304.13835
Research | A large-scale comparison of human-written versus ChatGPT-generated essays | https://t.co/qLO7JV2Gbl
Research | Evaluation of GPT-3.5 and GPT-4 for supporting real-world information needs in healthcare delivery | https://arxiv.org/abs/2304.13714
Research | A Cookbook of Self-Supervised Learning | https://arxiv.org/abs/2304.12210
Research | NeMo Guardrails | https://developer.nvidia.com/blog/nvidia-enables-trustworthy-safe-and-secure-large-language-model-conversational-systems/
Research | AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head | https://arxiv.org/abs/2304.12995
Research | State Spaces Aren’t Enough: Machine Translation Needs Attention | https://arxiv.org/abs/2304.12776
Research | Answering Questions by Meta-Reasoning over Multiple Chains of Thought | https://arxiv.org/abs/2304.13007
Research | Getting Started with LangChain: A Beginner’s Guide to Building LLM-Powered Applications | https://towardsdatascience.com/getting-started-with-langchain-a-beginners-guide-to-building-llm-powered-applications-95fc8898732c
Research | Generative AI at Work | https://www.nber.org/papers/w31161
Research | LLM+P: Empowering Large Language Models with Optimal Planning Proficiency | https://arxiv.org/abs/2304.11477
Research | Language Modeling using LMUs: 10x Better Data Efficiency or Improved Scaling Compared to Transformers | https://arxiv.org/abs/2110.02402
Research | Improving Document Retrieval with Contextual Compression | https://blog.langchain.dev/improving-document-retrieval-with-contextual-compression/
Research | The Embedding Archives: Millions of Wikipedia Article Embeddings in Many Languages | https://txt.cohere.com/embedding-archives-wikipedia/
Research | Hugging Face Hub | https://python.langchain.com/en/latest/modules/models/llms/integrations/huggingface_hub.html
Research | Effective Instruction Tuning | https://twitter.com/vagabondjack/status/1649127428659265537
Research | Reinforcement Learning with Human Feedback (RLHF) | https://github.com/opendilab/awesome-RLHF
Research | Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes | https://arxiv.org/abs/2304.09433
Research | Transformer Math 101 | https://blog.eleuther.ai/transformer-math/
Research | Open-source research on large language models (LLMs) | https://twitter.com/cwolferesearch/status/1647990311547797504
Research | A visual guide to transformers | https://twitter.com/akshay_pachaar/status/1647940492712345601
Research | Enhancing Vision-language Understanding with Advanced Large Language Models | https://github.com/Vision-CAIR/MiniGPT-4/blob/main/MiniGPT_4.pdf
Research | Transformer: Attention Is All You Need | https://arxiv.org/abs/1706.03762
Research | LLMs on personal devices | https://simonwillison.net/series/llms-on-personal-devices/
Research | LLM Source Context Evaluation | https://twitter.com/jerryjliu0/status/1647626532519841793
Research | Generative Agents: Interactive Simulacra of Human Behavior | https://arxiv.org/abs/2304.03442
Research | Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study | https://arxiv.org/abs/2304.06762
Research | Auto-evaluate LLM Q+A chains | https://twitter.com/RLanceMartin/status/1647645549875859456
Research | Understanding Diffusion Models: A Unified Perspective | https://arxiv.org/abs/2208.11970
Research | Building LLM applications for production | https://huyenchip.com/2023/04/11/llm-engineering.html
Research | Boosted Prompt Ensembles for Large Language Models | https://arxiv.org/abs/2304.05970
Research | Teaching Large Language Models to Self-Debug | https://arxiv.org/abs/2304.05128
Research | The Power of Scale for Parameter-Efficient Prompt Tuning | https://arxiv.org/abs/2104.08691
Research | Multimodal Procedural Planning via Dual Text-Image Prompting | https://arxiv.org/abs/2305.01795
Research | Are Emergent Abilities of Large Language Models a Mirage? | https://arxiv.org/abs/2304.15004