This hands-on course systematically covers how to test and evaluate generative AI applications, with a focus on RAG (Retrieval-Augmented Generation) and Agentic AI workflows. Starting from the core process for evaluating Large Language Model (LLM) applications, it teaches how to compute key metrics such as context relevancy, faithfulness, and hallucination rate using the open-source RAGAs framework. Through practical exercises, learners will automate tests with Pytest, evaluate multi-agent systems with DeepEval, and trace and debug workflows with LangSmith. No prior knowledge of evaluation frameworks is required; only basic Python skills and a willingness to learn. By the end, developers will be able to build trustworthy, production-grade LLM applications, create custom evaluation datasets, and validate outputs against ground-truth answers.

Core topics include: evaluation methods for every RAG component, test automation for Agentic AI, tracing-based evaluation with LangSmith, and hands-on use of Python-based testing frameworks.

MP4 | Video: h264, 1280×720 | Audio: AAC, 44.1 kHz, 2 Ch
Language: English | Duration: 2h 47m | Size: 1.7 GB

Testing or Evaluating Generative AI: RAG, Agentic AI, Hands-on. Mastering LLM Evaluation: Hands-on with RAG Testing, Agentic AI Testing, DeepEval, LangSmith. Learn how to test GenAI.

Evaluating Large Language Model (LLM) applications is critical to ensuring reliability, accuracy, and user trust, especially as these systems are integrated into real-world solutions. This hands-on course guides you through the complete evaluation lifecycle of LLM-based applications, with a special focus on Retrieval-Augmented Generation (RAG) and Agentic AI workflows.

You'll begin by understanding the core evaluation process, exploring how to measure quality across the different stages of a RAG pipeline. Dive deep into RAGAs, the community-driven evaluation framework, and learn to compute key metrics such as context relevancy, faithfulness, and hallucination rate using open-source tools.

Through practical labs, you'll create and automate tests with Pytest, evaluate multi-agent systems, and implement tests using DeepEval. You'll also trace and debug your LLM workflows with LangSmith, gaining visibility into each component of your RAG or Agentic AI system.

By the end of the course, you'll know how to create custom evaluation datasets and validate LLM outputs against ground-truth responses. Whether you're a developer, quality engineer, or AI enthusiast, this course will equip you with the practical tools and techniques needed to build trustworthy, production-ready LLM applications.

No prior experience with evaluation frameworks is required; just basic Python knowledge and a curiosity to explore. Enroll and learn how to evaluate and test GenAI applications.

What you'll learn
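To give a rough intuition for faithfulness, one of the metrics mentioned above: it is the fraction of statements in the generated answer that are supported by the retrieved context. The sketch below is a simplified word-overlap approximation for illustration only; the actual RAGAs implementation uses an LLM judge to extract claims from the answer and verify each one against the context.

```python
import string


def toy_faithfulness(answer_sentences, context):
    """Toy stand-in for a faithfulness-style metric.

    A sentence counts as 'supported' if a strict majority of its
    meaningful words appear in the retrieved context. This is NOT
    the RAGAs algorithm, just an intuition-building approximation.
    """
    context_words = {w.strip(string.punctuation) for w in context.lower().split()}
    supported = 0
    for sentence in answer_sentences:
        words = [w.strip(string.punctuation) for w in sentence.lower().split()]
        words = [w for w in words if len(w) > 3]  # skip stopword-length tokens
        if not words:
            continue
        overlap = sum(w in context_words for w in words) / len(words)
        if overlap >= 0.5:
            supported += 1
    return supported / len(answer_sentences)


context = "Paris is the capital of France and hosts the Louvre museum."
answer = ["Paris is the capital of France.",  # supported by the context
          "Berlin is the capital of Germany."]  # not supported: hallucinated
score = toy_faithfulness(answer, context)  # 1 of 2 sentences supported -> 0.5
```

An answer that sticks to the retrieved context scores near 1.0; hallucinated claims pull the score toward 0, which is exactly the signal a faithfulness metric is designed to surface.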
Learn the end-to-end process of evaluating LLM applications: quality criteria, choosing the right evaluation method, and metrics for RAG and Agentic AI.
Learn to evaluate RAG with the RAGAs framework; understand RAG and which of its components to evaluate.
Learn to evaluate RAG with context precision and recall metrics, and how to test a RAG application with Python and RAGAs.
Learn to test and evaluate a RAG application with Pytest, including API automation of the RAG application.
Learn to test and evaluate an Agentic AI application using DeepEval, and automate Agentic AI testing with Pytest.
Learn to trace a RAG application using LangSmith, create evaluation datasets with Python, and evaluate against a dataset using LangSmith.
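The points above mention automating RAG tests with Pytest and validating outputs against ground truth. A minimal sketch of what such a test could look like, under stated assumptions: `ask_rag` is a hypothetical stub standing in for a real RAG pipeline call, and the token-overlap F1 threshold of 0.5 is an illustrative choice, not a value from the course.

```python
def token_f1(prediction: str, ground_truth: str) -> float:
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred = prediction.lower().split()
    ref = ground_truth.lower().split()
    common = sum(min(pred.count(t), ref.count(t)) for t in set(pred))
    if common == 0:
        return 0.0
    precision = common / len(pred)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)


def ask_rag(question: str) -> str:
    # Hypothetical stub; in a real suite this would call your RAG pipeline.
    return "RAGAs computes faithfulness and context precision"


def test_answer_matches_ground_truth():
    # Pytest discovers functions named test_*; a plain assert is the check.
    ground_truth = "RAGAs computes metrics such as faithfulness and context precision"
    answer = ask_rag("What does RAGAs compute?")
    assert token_f1(answer, ground_truth) > 0.5
```

Saved as e.g. `test_rag.py` and run with `pytest`, this pattern extends naturally to a loop over a custom evaluation dataset of question/ground-truth pairs.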

Requirements
A very basic understanding of Python and Generative AI.
