Baz Agents
Coding and Code Review agents that analyze pull requests for quality, correctness and security issues.
Baz runs AI agents, including coding agents and code review agents, to evaluate pull requests. Rather than assessing changes on a per-file basis, these agents consider the entire repository and its external context. The codebase is split into indexable units, and embeddings with similarity measures are used to retrieve relevant code and tests. Agents perform agentic code analysis and optional runtime inspection, yielding structured findings. These findings are shared as pull request comments and CI check results.
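The embedding-and-retrieval step can be sketched roughly as follows. This is a minimal illustration with toy three-dimensional vectors and hypothetical file names, not Baz's actual index or similarity implementation:

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, indexed_units, top_k=2):
    """Rank indexable units (name, embedding) by similarity to a query."""
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in indexed_units]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]

# Toy embeddings standing in for real model output.
units = [
    ("auth/login.py", [0.9, 0.1, 0.0]),
    ("billing/invoice.py", [0.1, 0.8, 0.2]),
    ("auth/session.py", [0.8, 0.2, 0.1]),
]
print(retrieve([1.0, 0.0, 0.0], units))  # the two auth files rank highest
```

In a real system the vectors would come from an embedding model and the index would cover code and tests, but the ranking idea is the same.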
Coding Agents
Fixer is a new class of code-composing agents. It runs in an ephemeral, secure, sandboxed environment and can be tasked with fixing issues discovered during review.
Accelerates the code review cycle by letting suggested fixes be committed directly to a PR, eliminating manual edits and context switching.
What it does
Proposes small, safe edits to address verified issues and can apply them to the PR so reviewers and authors see a working suggestion in-place.
Focuses on self-contained fixes that are low risk to apply automatically.
High level guidance
Only apply fixes that are clearly correct and scoped to the change. Avoid risky changes that require design or product decisions.
Keep the suggested changes minimal and accompanied by a short rationale so reviewers can accept or tweak the suggestion quickly.
Tools it uses
Tools that gather code and diff context, tools that produce a patch or suggestion, and tools that safely create commit suggestions against the PR.
Context it consumes
PR diff, related files needed to justify the fix, and any metadata that explains intent (PR title or ticket). The agent favors fixes that can be validated by the changed code alone.
How it behaves
Runs a quick verification workflow: build a minimal justification for the change, prepare a patch, and surface the patch as a suggested commit. The agent favors simple, high-confidence fixes.
How to Trigger
Once Baz Fixer is enabled and configured, you can trigger fixes directly from your PR in two ways:
Apply fix on a single comment
Each Baz review comment includes a checkbox: “Apply fix with Baz”. Selecting it will generate a commit that addresses that specific finding.
Fix all comments in a PR
If your PR contains multiple Baz comments, you will see a “Fix all” option in the PR description. Selecting it will generate a separate commit for each open Baz comment in the PR.
Code Review Agents
Reviews are our general-purpose code-review agent class. They are individually scoped, contextualized, and steered to discover, analyze, and fix coding issues in specific engineering sub-domains. Combined with memories derived from user feedback to the Baz agent on pull requests, each agent is both extremely focused and highly tuned to your codebase's unique requirements.
Ensures implemented code and design align with documented requirements, identifying gaps or deviations early.
What it does
Extracts explicit requirements from tickets and designs and validates whether the implementation satisfies those requirements.
Produces a verdict for each requirement with evidence: met, partially met, or not met.
High level guidance
Keep extraction strictly ticket-driven: only record requirements explicitly stated in the source materials.
Validate each requirement using code and, when available, preview environments or design artifacts.
Tools it uses
Tools that fetch ticket and design artifacts, tools that gather context from code and diffs, visual comparison helpers when preview environments are available, and evidence capture tools.
Context it consumes
Ticket text and attachments, design files, PR diff, optional preview environment snapshots, and prior specifications for consistency.
Activation note
Connect your integrations to activate this agent. When design or preview integrations are present, the agent includes visual validation as part of the verdict.
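The per-requirement verdicts described above might look something like the following sketch. The field names and statuses beyond the three verdict values are illustrative assumptions, not Baz's actual output schema:

```python
# Allowed verdict values, as described for this agent.
ALLOWED_STATUSES = ("met", "partially met", "not met")

def make_verdict(requirement, status, evidence):
    """Build one verdict record: a requirement, a status, and evidence."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {status}")
    return {"requirement": requirement, "status": status, "evidence": evidence}

# Hypothetical requirements extracted from a ticket:
verdicts = [
    make_verdict("Login is rate-limited to 5 attempts",
                 "met", "auth/limiter.py:12"),
    make_verdict("Lockout email is sent",
                 "not met", "no sender found in the diff"),
]
print([v["status"] for v in verdicts])  # ['met', 'not met']
```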
Ensure AI-generated code follows consistent, high-quality standards aligned with your engineering practices.
What it does
Applies organization coding conventions and quality expectations to AI-produced output.
Produces guardrails and standard phrasing developers can copy to align model behavior.
High level guidance
Emphasize consistency and predictability. Encourage minimal, well-documented suggestions and require evidence when changes affect public contracts.
Tools it uses
Policy and style templates, and context tools that map repository expectations.
See more details in Skills & Instructions.
Code Correctness
Identifies logical inconsistencies, flawed conditionals, and edge cases that could produce unexpected behavior.
What it does
Highlights incorrect logic, incomplete implementations, missing steps, and unintended side effects.
Gives concrete traces and examples of failing execution paths.
High level guidance
Compare the implementation with PR intent or ticket context to determine whether behavior is intentional. Prioritize concrete, reproducible issues.
Tools it uses
Tools that map code flows and extract execution traces, along with code and diff exploration utilities.
Context it consumes
PR title and ticket context, diff hunks, and the code paths needed to trace complete execution from input through output.
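As an illustration of the kind of failing execution path this agent surfaces, consider a hypothetical discount helper whose guard misses an edge case:

```python
def apply_discount(price, percent):
    # Bug: rejects negative discounts but not percent > 100, so an
    # out-of-range input silently produces a negative price.
    if percent < 0:
        raise ValueError("negative discount")
    return price * (1 - percent / 100)

# Failing execution path the agent would trace:
print(apply_discount(50, 150))  # -25.0 instead of an error
```

A concrete trace like this (input 150 flows past the guard and yields a negative price) is more actionable than a general warning about missing validation.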
Detects changes that alter or remove existing functionality and could break dependent APIs or features.
What it does
Finds contract or API surface changes and ties them to consumers that would fail.
Produces actionable findings with exact locations and suggested mitigations.
High level guidance
Only label a change as breaking when there is direct evidence showing a consumer or contract is affected. Avoid hypothetical statements.
Tools it uses
Tools that discover API surfaces and contract definitions, tools that find consumers across the repo, and diff/context tools to produce evidence.
Context it consumes
PR diff, public API/type definitions, API docs if present, and consumer client code references.
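A minimal example of the evidence pattern this agent looks for, using hypothetical function names: a PR adds a required parameter to a public function, and an unchanged consumer elsewhere in the repo now fails.

```python
# Hypothetical PR change: a new required parameter on a public function.
def get_user(user_id, tenant_id):  # was: def get_user(user_id)
    return {"id": user_id, "tenant": tenant_id}

# Consumer elsewhere in the repo, untouched by the PR:
def load_profile():
    return get_user(42)  # now raises TypeError: missing 'tenant_id'

try:
    load_profile()
except TypeError as exc:
    print("breaking change:", exc)
```

The direct consumer evidence (the failing call site) is what distinguishes a confirmed breaking change from a hypothetical one.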
Ensures variables and functions use appropriate data types to prevent type-related errors.
What it does
Flags type changes that could cause runtime or integration problems, especially where code interfaces with external systems.
Recommends specific, actionable type fixes or mitigations.
High level guidance
In strongly typed modules prefer conservative assumptions about types, but call out clear inconsistencies that impact external contracts.
Tools it uses
Tools that extract type and API context from code, diff comparators, and repo search helpers.
Context it consumes
PR diff, type definitions and usages, and any module metadata that clarifies language and dependency expectations.
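A sketch of the kind of external-interface type problem this agent flags, with a hypothetical payload builder: a field's type quietly changes from int to string, breaking downstream arithmetic.

```python
import json

# Hypothetical regression: "count" changes from int to str in the payload,
# breaking consumers that do arithmetic on the decoded value.
def build_payload(count):
    return json.dumps({"count": str(count)})  # was: {"count": count}

decoded = json.loads(build_payload(3))
try:
    total = decoded["count"] + 1  # worked when "count" was an int
except TypeError:
    print("type mismatch at the integration boundary")
```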
Code Quality and Correctness
Finds unclear identifiers and obvious spelling mistakes that reduce code clarity.
What it does
Flags non-descriptive or incorrect names and typos in code and comments.
High level guidance
Be conservative with stylistic nitpicks. Avoid enforcing strict naming conventions that conflict with the repo style.
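For example, a finding from this agent might look like the following (hypothetical code; the names are illustrative):

```python
# Non-descriptive name the agent might flag:
def calc(d):
    return d * 86400

# Clearer equivalent the suggestion would propose:
def days_to_seconds(days):
    return days * 86400

print(calc(2) == days_to_seconds(2))  # True: same behavior, clearer intent
```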
Detects duplicated logic and enforces existing team patterns and conventions.
What it does
Finds repeated code and suggests refactors that follow team patterns.
Encourages reuse and clearer abstractions aligned with local conventions.
High level guidance
Prefer refactors that are small and safe for the current change. Avoid large architectural rewrites in a single suggestion.
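A small, hypothetical example of the duplication this agent targets and the scale of refactor it prefers:

```python
# Duplicated check repeated across two handlers:
def ship_order(order):
    if not order.get("id") or not order.get("address"):
        raise ValueError("invalid order")
    return f"shipped {order['id']}"

def cancel_order(order):
    if not order.get("id") or not order.get("address"):
        raise ValueError("invalid order")
    return f"cancelled {order['id']}"

# Small, safe refactor the agent would suggest: one shared validator
# that both handlers call instead of repeating the condition.
def validate_order(order):
    if not order.get("id") or not order.get("address"):
        raise ValueError("invalid order")
```

Extracting one helper is within the scope of the current change; redesigning the order module is not.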
Proposes simpler, more idiomatic code that keeps readability and correctness.
What it does
Suggests idiomatic patterns appropriate for the repository language and flags overly verbose constructs.
High level guidance
Focus on substantial improvements that reduce complexity while preserving behavior. Do not suggest changes for very small or trivial lines.
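An illustration of the verbose-versus-idiomatic trade-off, using hypothetical code:

```python
# Verbose construct the agent might flag:
def active_names_verbose(users):
    result = []
    for user in users:
        if user["active"]:
            result.append(user["name"])
    return result

# Simpler, idiomatic equivalent with identical behavior:
def active_names(users):
    return [u["name"] for u in users if u["active"]]

users = [{"name": "ada", "active": True}, {"name": "bob", "active": False}]
print(active_names(users))  # ['ada']
```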
Ensures code is tidy, well organized, and follows agreed style rules to improve maintainability.
What it does
Flags commented-out code, obvious clutter, and structural issues that impede readability.
Avoids false positives for accepted patterns such as deliberate multi-line strings or documented TODOs.
High level guidance
Recommend cleanup only when it improves maintainability and does not change intent. Keep suggestions pragmatic and minimal.
Security Best Practices
Identifies common security anti-patterns like unsanitized inputs, PII exposure, and injection vectors.
What it does
Flags hardcoded secrets, PII leaks in logs, risky SQL or command usage, and missing input validation.
High level guidance
When calling out PII or secrets, specify exact locations and avoid hedging language. Avoid flagging framework handled behavior or test placeholders.
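A classic injection vector of the kind this agent flags, shown with Python's standard-library sqlite3 module (the table and data are a toy example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('ada')")

name = "ada' OR '1'='1"  # attacker-controlled input

# Injection vector the agent flags: SQL built by string interpolation.
unsafe_rows = conn.execute(
    f"SELECT * FROM users WHERE name = '{name}'"
).fetchall()

# Suggested mitigation: a parameterized query.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (name,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))  # interpolated query matches every row
```

The interpolated query returns all rows because the injected `OR '1'='1'` is always true, while the parameterized query correctly matches none.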
Ensures backend APIs follow modern REST conventions and sound design.
What it does
Looks at route names, HTTP method usage, versioning, parameter patterns, and resource hierarchy.
Focused specifically on backend endpoints and their server side implementations.
High level guidance
Only flag server side endpoint implementations. Do not flag client-side calls or client library usage.
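As a rough illustration of one convention checked here, a naive sketch (far simpler than the agent's actual analysis): verbs in the path are suspect because the HTTP method should carry the action.

```python
# Naive route-convention check (hypothetical helper, illustrative only).
def flag_route(path):
    """Flag path segments that embed an action verb."""
    verbs = ("get", "create", "delete", "update", "fetch")
    segments = path.strip("/").lower().split("/")
    return any(seg.startswith(verbs) for seg in segments)

print(flag_route("/getUsers"))             # True: verb in path, no version
print(flag_route("/v1/users/42/delete"))   # True: action suffix instead of DELETE
print(flag_route("/v1/users/42"))          # False: versioned resource hierarchy
```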
FAQ
What do Baz’s default agents do?
They analyze change requests for naming, typing, logic bugs, outdated comments, log errors, etc., using a combination of AI, parsing, and repository context.