Balance performance with cost optimization to unlock the potential of AI
With the rise of AI and machine learning, large language models (LLMs) have become increasingly popular, but their high computational costs can be a barrier to entry for many organizations. This book offers cost-effective approaches to building and deploying LLMs. At each stage of the process, from model selection and prompt engineering to fine-tuning and deployment, you can minimize costs without unduly sacrificing performance.
Written for developers and data scientists, Large Language Model-Based Solutions provides the practical, technical knowledge needed to implement valuable generative AI applications like search systems, agent assists, and autonomous agents. The book explores techniques for optimizing inference, such as model quantization and pruning, as well as opportunities for reducing costs at the infrastructure level. It also considers future trends in LLM cost optimization, so you can remain competitive for the next stage in generative AI.
Written by one of Amazon's leading data scientists, this book empowers you to overcome the challenges associated with LLMs and successfully implement generative AI.