Baz Reviewers
Baz's managed default reviewers analyze change requests for global naming, typing, and logic bugs.
Overview
Default Reviewers use an agentic retrieval and analysis system to evaluate code changes in the context of the entire codebase. A LangChain-based framework divides code into manageable chunks, with tree-sitter handling parsing for supported languages such as Python. For embedding and similarity search, Baz relies on Voyage-Code-3, a model optimized for code representation. This setup lets Baz analyze pull requests while accounting for dependencies and broader repository context, and identify issues such as breaking changes, outdated comments, and log errors.
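The chunk-then-retrieve flow described above can be illustrated with a small, self-contained sketch. It is not Baz's implementation: Python's built-in `ast` module stands in for tree-sitter, and a bag-of-words vector (`toy_embed`) stands in for Voyage-Code-3 embeddings. The function names and the `SAMPLE` source are hypothetical and exist only for this example.

```python
import ast
import math
import re
from collections import Counter

# Hypothetical source file used only for this illustration.
SAMPLE = '''
def parse_invoice(raw):
    return raw.strip().split(",")

def send_email(address, body):
    print(address, body)
'''


def chunk_python_source(source: str) -> list[dict]:
    """Split a module into one chunk per top-level function or class.

    Assumption: `ast` is used here as a stand-in for tree-sitter.
    """
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                "text": ast.get_source_segment(source, node),
            })
    return chunks


def toy_embed(text: str) -> Counter:
    """Placeholder for a real embedding model: counts lowercase word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def most_similar(query: str, chunks: list[dict]) -> dict:
    """Return the chunk whose toy embedding is closest to the query."""
    query_vector = toy_embed(query)
    return max(chunks, key=lambda chunk: cosine(query_vector, toy_embed(chunk["text"])))
```

Chunking along syntactic boundaries keeps each retrieved unit meaningful on its own (a whole function rather than an arbitrary window of lines), which is the usual motivation for parsing before embedding.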
Baz automates several steps in the code review process by integrating directly with GitHub. It evaluates outdated comments based on commit metadata and prior comment payloads, determining whether issues have been resolved. It identifies log errors by parsing GitHub Actions logs and attaches detailed comments at the relevant lines. Baz also flags specific issues such as typos, generic variable names, missing test assertions, and type mismatches. These insights are delivered as structured comments that developers can address directly in the GitHub interface.
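As a rough illustration of turning CI log output into line-level comments, the sketch below parses error annotations written in GitHub's workflow-command form (`::error file=...,line=...::message`). Whether Baz reads this particular form is an assumption; the `LineComment` type, the `extract_error_comments` function, and the sample log are hypothetical.

```python
import re
from dataclasses import dataclass

# Matches "::error file=path,line=N[,col=M]::message" anywhere in a log line.
ERROR_COMMAND = re.compile(r"::error\s+(?P<params>[^:]*)::(?P<message>.*)$")


@dataclass
class LineComment:
    path: str   # file the comment should be attached to
    line: int   # 1-based line number in that file
    body: str   # text of the comment


def extract_error_comments(log_text: str) -> list[LineComment]:
    """Collect error annotations that carry both a file and a line number."""
    comments = []
    for raw_line in log_text.splitlines():
        match = ERROR_COMMAND.search(raw_line)
        if not match:
            continue
        params = dict(
            pair.split("=", 1)
            for pair in match.group("params").split(",")
            if "=" in pair
        )
        line_number = params.get("line", "").strip()
        if "file" not in params or not line_number.isdigit():
            continue  # no usable location, so nothing to anchor a comment to
        comments.append(
            LineComment(params["file"].strip(), int(line_number), match.group("message").strip())
        )
    return comments


# Made-up log excerpt for demonstration purposes only.
SAMPLE_LOG = """\
2024-05-01T10:00:00Z Run pytest
2024-05-01T10:00:03Z ::error file=src/billing.py,line=42,col=9::NameError: name 'totl' is not defined
2024-05-01T10:00:04Z ::warning file=src/billing.py,line=7::unused import
2024-05-01T10:00:05Z ::error::Process completed with exit code 1
"""
```

Errors without a file and line (such as the final exit-code message in the sample) are skipped, since they cannot be attached to a specific location in the diff.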
The system is designed for efficient processing and scalability. Repository and organization data are stored in a single multi-tenant table, filtered by organization ID, repository name, and file path. Embeddings are stored in PostgreSQL using the pgvector extension, which enables similarity searches to locate relevant code sections. When files are updated, Baz reprocesses only the changed files, keeping overhead low while insights stay up to date. Because the work tracks the size of the change rather than the size of the repository, this approach scales to large repositories.
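The storage and update behavior described above can be sketched in memory. This is only an illustration under stated assumptions: a dictionary keyed by organization ID, repository name, and file path stands in for the multi-tenant table, and a SHA-256 content hash is one possible way to detect changed files (the source does not say how Baz detects changes). The `ChunkIndex` class and its methods are hypothetical.

```python
import hashlib


class ChunkIndex:
    """Toy stand-in for a multi-tenant index keyed by (org_id, repo, path)."""

    def __init__(self) -> None:
        # Maps (org_id, repo, path) to the hash of the last processed content.
        self._rows: dict[tuple[str, str, str], str] = {}

    def sync(self, org_id: str, repo: str, files: dict[str, str]) -> list[str]:
        """Record the given files and return the paths that needed reprocessing."""
        reprocessed = []
        for path, content in files.items():
            digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
            key = (org_id, repo, path)
            if self._rows.get(key) != digest:
                # New or changed file: a real system would re-chunk and
                # re-embed it here before storing the updated row.
                self._rows[key] = digest
                reprocessed.append(path)
        return sorted(reprocessed)

    def paths_for(self, org_id: str, repo: str) -> list[str]:
        """Return indexed paths for one tenant and repository only."""
        return sorted(
            path
            for (row_org, row_repo, path) in self._rows
            if row_org == org_id and row_repo == repo
        )
```

On a second call to `sync` with mostly unchanged content, only the files whose hash differs are returned, and `paths_for` never returns rows that belong to a different organization.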