TL;DR
This video provides a guide to using GPT-5 effectively, focusing on agentic use cases and coding applications. It covers key prompting techniques, including controlling agentic eagerness, using tool preambles, and optimizing prompts for coding tasks. The video also introduces OpenAI's prompt optimization tool for refining prompts with direct feedback.
- Agentic eagerness controls how much decision-making GPT-5 does on its own.
- Tool preambles provide updates on GPT-5's progress.
- The prompt optimization tool helps refine prompts for better results.
Introduction to GPT-5 Prompting [0:00]
The video introduces a guide by OpenAI on how to maximize the quality of outputs from GPT-5, particularly for agentic use cases. GPT-5 excels at tool calling, instruction following, and long-context understanding, and the guide provides tips for controlling these agentic behaviors. The video covers the guide and OpenAI's prompt optimization tool.
Agentic Eagerness and Reasoning Effort [1:05]
Agentic eagerness is the ability to control how much decision-making GPT-5 does on its own versus how much direction it takes from the prompt. By default, GPT-5 is thorough, but it can be tuned for faster answers by changing the reasoning effort setting in the Playground or API, or by choosing a faster model in ChatGPT. Lowering reasoning effort reduces exploration depth, improving efficiency and latency: fewer tokens, fewer tool calls, less thinking, and faster, cheaper answers. Prompts can also define how the model interacts with tools, including search breadth and frequency.
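As a minimal sketch, assuming the OpenAI Python SDK, the gpt-5 model name, and the reasoning effort parameter described in the video, lowering reasoning effort on a single call might look like this:

```python
from openai import OpenAI

client = OpenAI()

# Lower reasoning effort for a faster, cheaper answer; the values are assumed
# to range over "minimal" | "low" | "medium" | "high" per the video.
response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "low"},
    input="Summarize the tradeoff of lowering reasoning effort in two sentences.",
)
print(response.output_text)
```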
Context Gathering and Tool Call Budget [2:57]
Prompts should define clear criteria for how the model explores the problem space, including methods for moving from broad to focused queries, deduplicating searches, and avoiding over-searching. Early stop criteria and escalation procedures can also be included. A tool call budget can be set to limit the number of tool calls, balancing speed against correctness. Providing an escape hatch allows GPT-5 to end context gathering early once it has sufficient information. Conversely, increasing the reasoning effort parameter makes GPT-5 more eager to collect all necessary context before answering.
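A rough sketch of such a prompt fragment, assuming the OpenAI Python SDK and a gpt-5 model; the tag name, budget value, and wording are illustrative rather than quoted from OpenAI's guide:

```python
from openai import OpenAI

client = OpenAI()

# Illustrative context-gathering rules: broad-to-focused search, deduplication,
# a tool call budget, an early stop criterion, and an escape hatch.
context_gathering_rules = """
<context_gathering>
- Start broad, then narrow to focused sub-queries; deduplicate overlapping searches.
- Tool call budget: at most 5 calls for this task.
- Early stop: once you can name the exact files or facts to act on, stop searching.
- Escape hatch: if the budget is spent, proceed with the best available context and
  state any remaining uncertainty instead of searching further.
</context_gathering>
"""

response = client.responses.create(
    model="gpt-5",
    instructions=context_gathering_rules,
    input="Find where retry logic is configured in this repository and explain it.",
)
```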
Defining Stop Conditions and Tool Preambles [7:15]
It's important to define stop conditions, safe and unsafe behaviors, and when to hand control back to the user. Tool preambles provide updates on what GPT-5 is doing, which tools it is using, and its status throughout the process. GPT-5 is trained to provide clear plans and progress updates, but the frequency, style, and content of tool preambles can be customized in the prompt.
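A minimal sketch of customizing preamble frequency and style, assuming the OpenAI Python SDK; the <tool_preambles> wording is illustrative of the technique, not quoted from the guide:

```python
from openai import OpenAI

client = OpenAI()

# Illustrative preamble rules: a plan up front, one-line progress updates, and a
# closing summary distinct from the plan. Tool definitions are omitted for brevity.
preamble_rules = """
<tool_preambles>
- Before the first tool call, restate the user's goal in one sentence and give a short plan.
- Before each later call, post a one-line progress update: what was done, what is next.
- End with a brief summary of completed work, separate from the original plan.
</tool_preambles>
"""

response = client.responses.create(
    model="gpt-5",
    instructions=preamble_rules,
    input="Audit this project for unused dependencies and remove them.",
)
```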
Responses API [9:27]
The GPT-5 API exposes two endpoints: Chat Completions and the Responses endpoint. The Responses API is recommended, having shown statistically significant improvements in evaluations. It allows reusing context across API calls, which improves agentic flows, lowers costs, and uses tokens more efficiently. The model can refer back to its previous reasoning traces, conserving chain-of-thought tokens and eliminating the need to reconstruct a plan after each tool call.
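A minimal sketch of chaining calls so the model can refer back to its earlier reasoning, assuming the OpenAI Python SDK and its previous_response_id parameter:

```python
from openai import OpenAI

client = OpenAI()

# First turn: the model produces a plan (and, internally, reasoning about it).
first = client.responses.create(
    model="gpt-5",
    input="Plan the steps to migrate this service from REST to gRPC.",
)

# Second turn: linking to the previous response carries that context forward,
# so the model does not have to reconstruct the plan from scratch.
follow_up = client.responses.create(
    model="gpt-5",
    previous_response_id=first.id,
    input="Carry out step 1 of the plan you just proposed.",
)
print(follow_up.output_text)
```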
Prompt Optimizations for Agentic Coding [10:26]
GPT-5 is particularly good at building frontends; the recommended frameworks include Next.js, TypeScript, React, and HTML, with Tailwind CSS, shadcn/ui, and Radix Themes suggested for styling. To one-shot a web application, instruct the model to create a rubric to measure itself against, improving output quality through thorough planning and self-reflection.
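A minimal sketch of the rubric and self-reflection instruction, assuming the OpenAI Python SDK; the rubric items and wording are illustrative:

```python
from openai import OpenAI

client = OpenAI()

# Illustrative self-reflection rules: build an internal rubric, then iterate
# against it before returning the final one-shot result.
self_reflection_rules = """
<self_reflection>
- First write a 5-7 item rubric for an excellent one-shot web app
  (e.g. visual hierarchy, responsiveness, accessibility, code organization).
- Keep the rubric internal; do not show it to the user.
- Draft the app, score it against the rubric, and revise until every item
  would score highly, then return only the final code.
</self_reflection>
"""

response = client.responses.create(
    model="gpt-5",
    instructions=self_reflection_rules,
    input="Build a landing page for a note-taking app with Next.js, TypeScript, "
          "React, and Tailwind CSS, in a single response.",
)
```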
Iterating on Existing Codebases [12:58]
When iterating on existing codebases, GPT-5 searches for reference context, such as reading package.json. This behavior can be enhanced by summarizing key aspects of the codebase in the prompt, such as engineering principles, directory structure, and best practices. The Cursor team found that GPT-5's verbosity could be tuned to keep text output brief while encouraging verbose output inside coding tools.
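A minimal sketch of that split, assuming the OpenAI Python SDK and the text verbosity parameter described in the video: the API parameter keeps chat text terse while the prompt asks for verbose, readable code.

```python
from openai import OpenAI

client = OpenAI()

# Prompt-level override: verbose, readable code even though overall verbosity is low.
verbosity_rules = """
Write code with high verbosity: descriptive names, clear structure, and comments
wherever intent is not obvious. Keep all other status text brief.
"""

response = client.responses.create(
    model="gpt-5",
    text={"verbosity": "low"},     # assumed parameter for terse final answers
    instructions=verbosity_rules,
    input="Refactor the session handling in auth.py for readability.",
)
```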
Cursor's Nuances and Key Tips [14:43]
The Cursor team refined prompts to encourage clarity in code, preferring readable solutions with clear names and comments. They also found that providing more detail about product behavior encouraged the model to carry out longer tasks with minimal interruption. GPT-5 was initially too inquisitive, so the Cursor team tuned this down by removing a "maximize" prefix from instructions and softening the language around thoroughness.
Optimizing Intelligence and Instruction Following [16:44]
Verbosity controls the length of the final answer, while reasoning effort controls the length of the thinking process. GPT-5 follows instructions with precision, so conflicting instructions or undefined edge cases in the prompt can cause issues. To address this, prompts should be logically consistent, and an AI model can be used to identify conflicts. Prompt engineering should be an iterative process of testing and adjusting.
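A minimal sketch of using a model to audit a prompt for contradictions before deploying it, assuming the OpenAI Python SDK; the developer message shown is a deliberately conflicting toy example:

```python
from openai import OpenAI

client = OpenAI()

# A toy developer message containing a contradiction:
# "never schedule without confirmation" vs. "automatically book".
draft_prompt = """
Never schedule an appointment without explicit patient confirmation.
If the requested slot is unavailable, automatically book the next available slot.
"""

review = client.responses.create(
    model="gpt-5",
    input="List any conflicting or ambiguous instructions in this prompt and "
          "suggest a consistent rewrite:\n" + draft_prompt,
)
print(review.output_text)
```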
Minimal Reasoning and Markdown Formatting [18:59]
Minimal reasoning is the fastest option for latency-sensitive use cases, but performance varies more from task to task. Performance can be improved by asking for a brief summary of the thought process, thorough tool-calling preambles, clear tool instructions, and prompted planning. Markdown formatting should be used only where semantically correct, with specific guidelines for file, directory, function, and class names.
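A minimal sketch of a minimal-reasoning call that compensates with prompting, assuming the OpenAI Python SDK and a "minimal" reasoning effort value; the instruction wording is illustrative:

```python
from openai import OpenAI

client = OpenAI()

# Compensate for minimal reasoning with explicit prompting: a brief thought summary,
# thorough preambles, up-front planning, and Markdown only where semantically correct.
minimal_reasoning_rules = """
- Begin your answer with a 1-2 sentence summary of your approach.
- Before any tool call, state what you are about to do and why.
- Plan the full task before acting.
- Use Markdown only where semantically correct; wrap file, directory, function,
  and class names in backticks.
"""

response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "minimal"},  # assumed fastest setting for latency-sensitive use
    instructions=minimal_reasoning_rules,
    input="Rename the config loader module and update all imports.",
)
```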
Metaprompting and Prompt Optimization Tool [20:42]
GPT-5 can be used as a metaprompter to optimize prompts, suggesting additions or deletions that elicit the desired behavior. The prompt optimization tool in the Playground lets users enter a developer message and a prompt, then optimizes the developer message to get the most out of the user message. The tool explains its changes and lets users request further modifications.
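A minimal sketch of the metaprompting idea, assuming the OpenAI Python SDK; GPT-5 is asked what to add to or remove from a prompt to elicit a specific behavior, and the prompt shown is a hypothetical example:

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical developer prompt that is not producing the desired behavior.
current_prompt = """
You are a coding assistant. Answer questions about the user's repository.
"""

meta = client.responses.create(
    model="gpt-5",
    input=(
        "Here is my current developer prompt:\n"
        + current_prompt
        + "\nThe model keeps answering without reading the relevant files first. "
          "What minimal additions or deletions would make it gather file context "
          "before answering? Keep the rest of the prompt unchanged."
    ),
)
print(meta.output_text)
```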