December 2, 2024

How an AI Code Review Can Solve Inefficiencies in Development

The AI testing tools market is projected to exceed $2 billion by 2033, with AI code review driving innovation in software development and enhancing code quality in cloud and SaaS platforms. Read on to discover how this technology transforms workflows.

Meet our Editor-in-chief

Paul Estes

For 20 years, Paul struggled to balance his home life with fast-moving leadership roles at Dell, Amazon, and Microsoft, where he led a team of progressive HR, procurement, and legal trailblazers to launch Microsoft’s Gig Economy freelance program

  • Traditional code reviews take an average of 18 hours per pull request, while AI tools generate reviews in seconds, speeding up workflows and ensuring consistent, high-quality feedback.

  • Debugging consumes 50% of engineers' time at $160/hour, yet rule-based debugging tools catch only 10% of critical bugs. An AI code review can identify issues earlier, saving time and improving reliability.

  • With 40.9% of developers interested in AI code reviews, companies like VidMob saw a 50% reduction in unplanned work in two months, boosting efficiency and developer confidence.

Staff writer

From AI to FinOps, our team's collective brainpower fuels this blog.

“Today, software engineering and coding is the number-one area impacted by AI. At this point, software engineering without AI is a little bit like writing without a word processor.” - Hadi Partovi, Chief Executive of Code.org and Silicon Valley Investor and Adviser to Airbnb, Uber, Dropbox, and Facebook.

Most developers have probably wondered whether AI can be used for code reviews. AI code review is rapidly gaining popularity as a significant part of software development. Interestingly, the accuracy of AI code reviews is tied directly to improvements in large language models (LLMs): enhancing a model's ability to code also enhances its overall capacity to reason and solve non-coding problems. Mark Zuckerberg, the CEO of Meta Platforms, explained, "Training the models on coding helps them just be more rigorous and answer the question and help reason across many different types of domains."

This is why OpenAI and Google are in a race to advance their LLMs' coding capabilities, including their ability to conduct an AI code review. The stakes are high, and they know that coding accuracy is the key to unlocking new levels of AI performance. OpenAI started this race early, using Turing’s coders to vet and enhance its AI models. Realizing it was lagging, Google followed suit, striking a multimillion-dollar deal with Turing. More than 500 programmers were put to work evaluating Google’s AI models over thousands of coding tasks for six months, mirroring OpenAI’s approach.

The impact of a reliable AI code review isn't limited to improving software development. It’s transforming entire business operations. As large language models improve, the potential for automated insights and decisions expands across industries, promising efficiency gains beyond the software field.

AI Code Reviews Solve Traditional Bottlenecks

Traditional code review processes are fraught with pain points that lead to inefficiencies and delays. Time constraints and bottlenecks are typical - engineers often juggle multiple tasks, making it challenging to provide timely feedback. The average code review takes around 18 hours from pull request (PR) publishing to completion, resulting in considerable development slowdowns. Allocating resources effectively for code reviews can also be a struggle, especially in high-paced environments. Review consistency is another significant challenge, as different reviewers have varying standards, leading to inconsistent quality and prolonged review times.

Moreover, debugging is an enormous burden on engineers, consuming about 50% of their time at an average cost of $160 per hour. Despite these efforts, rule-based debugging tools catch only 10% of critical bugs. As Naomi Chopra, co-founder of Hatica, stated, “I saw the problems that developers have firsthand, and it’s not, as you might expect, about the difficulty of writing new code. Rather, it’s about inefficiency.”
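To make these cost figures concrete, here is a rough back-of-the-envelope sketch. The 50% share and the $160 hourly rate come from the figures above; the 2,000 working hours per year, the team size, the 20% reduction, and the function names are illustrative assumptions rather than data from any cited source.

```python
# Figures quoted in the article.
DEBUG_SHARE = 0.50        # share of engineering time spent debugging
HOURLY_COST_USD = 160     # average cost of one engineering hour

# Assumption (not from the article): roughly 40 hours x 50 weeks.
HOURS_PER_YEAR = 2_000


def annual_debug_cost(engineers: int, hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Estimated yearly spend on debugging for a team of the given size."""
    return engineers * hours_per_year * DEBUG_SHARE * HOURLY_COST_USD


def annual_savings(engineers: int, reduction: float) -> float:
    """Savings if debugging time falls by `reduction` (a hypothetical 0.0-1.0 input)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return annual_debug_cost(engineers) * reduction


if __name__ == "__main__":
    # Hypothetical team of 10 engineers and a hypothetical 20% reduction.
    print(f"Debugging cost per year: ${annual_debug_cost(10):,.0f}")
    print(f"Savings at 20% less debugging: ${annual_savings(10, 0.2):,.0f}")
```

Under these assumptions, even a modest cut in debugging time is worth a six-figure sum for a ten-person team, which is why earlier detection matters so much.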

AI solutions have emerged to solve these concerns, particularly in AI code reviews and automated code analysis. Since the inception of code review in the 1970s, the process has transformed dramatically. Large Language Models (LLMs) can generate a review in seconds, often delivering results faster than Continuous Integration (CI) systems. This speed is pivotal in addressing traditional bottlenecks. As OpenAI and Google race to create the best AI code review tools, the competition drives the innovation needed to overcome these challenges and reshape how developers approach code quality automation.

Caption: A look at the benefits of an AI code review.

Beyond Software Development: The Impact of AI Code Reviews

AI code reviews impact software development in critical areas, from accelerating velocity to enhancing team dynamics. By utilizing AI specifically trained on programming code, developers can address bottlenecks in their workflow more efficiently, leading to faster iteration cycles. The 2024 Stack Overflow Developer Survey shows that 40.9% of developers are interested in using AI for code committing and review - a clear signal of these tools' value in speeding up development timelines. Using specialized large language models (LLMs) trained on code makes these reviews more accurate and structured, elevating product quality by catching issues and suggesting improvements that might be overlooked in traditional processes.

Developers are showing significant interest in the concept of an AI code review.
Source: https://survey.stackoverflow.co/2024/ai#developer-tools-ai-tool-interested

An interesting result of training LLMs specifically on code is their enhanced ability to make decisions in other domains. In fact, research shows that LLMs trained on programming code (Code-LLMs) are better at solving structured reasoning tasks than LLMs trained on regular text.

Why is a Code-LLM with better decision-making abilities significant? These enhanced capabilities contribute to broader business outcomes by boosting efficiency and accuracy across various domains. This aligns with vertical AI - specialized, domain-focused models that outperform general-purpose solutions in specific tasks. 

By integrating Code-LLMs into software development, companies can achieve faster and higher-quality results while also creating opportunities for these models to expand their utility across other areas, such as project management, customer support automation, data analysis, and even decision-making processes in finance and healthcare. Vertical AI addresses immediate development challenges and evolves into a foundational technology that provides a competitive edge across industries.

OBDS Improves Code Quality and Reduces Delays with AI

One powerful example of how AI-driven tools are transforming business workflows comes from OBDS, a company focused on providing critical aviation software. OBDS is responsible for delivering up-to-date flight manuals and checklists that thousands of pilots rely on daily. With a growing codebase and the high stakes of aviation, maintaining code quality and avoiding errors that could lead to Aircraft on Ground (AOG) situations became increasingly difficult. Their manual code review process was slowing down development and leaving little room for error.

To address these challenges, OBDS integrated Bito’s AI Code Review Agent into their CI/CD pipeline. The AI tool automated code reviews and seamlessly integrated with their GitHub environment, analyzing pull requests in real-time. In just a short time, Bito reviewed 136,500 lines of code, flagged 473 issues, and saved the team 24 hours per sprint cycle. As Charles Guerin, CEO at OBDS, said, “One of the unexpected values of Bito’s AI code reviews is that it doesn’t accuse you of anything. Bito says: here’s what’s going on and here’s what you might consider doing about it. Engineers look at it as a time saver, not a supervisor. They’d rather the computer yell at them, not me!”
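The general shape of a pull-request review step in a CI pipeline can be sketched as follows. This is a minimal illustration and not Bito's implementation: it assumes git-style unified diff input, and `Finding`, `added_lines`, `review_diff`, and `toy_reviewer` are hypothetical names. In a real setup the `reviewer` callable would wrap a call to an LLM-based service; here a simple pattern check stands in for it.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional, Tuple

# Matches a unified-diff hunk header such as "@@ -10,3 +12,4 @@".
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


@dataclass
class Finding:
    path: str
    line: int
    message: str


def added_lines(diff_text: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, new_line_number, text) for every line the diff adds.

    Simplified: assumes well-formed `git diff` output, where each file
    section starts with a "diff " line.
    """
    path: Optional[str] = None
    new_line = 0
    in_hunk = False
    for raw in diff_text.splitlines():
        if raw.startswith("diff "):            # a new file section begins
            in_hunk = False
            continue
        match = HUNK_HEADER.match(raw)
        if match:
            new_line = int(match.group(1))     # first line number in the new file
            in_hunk = True
        elif not in_hunk:
            if raw.startswith("+++ "):         # header naming the new file
                path = raw[4:].removeprefix("b/")
        elif raw.startswith("+"):
            if path is not None:
                yield path, new_line, raw[1:]
            new_line += 1
        elif raw.startswith(" ") or raw == "":
            new_line += 1                      # unchanged context line
        # Removed lines ("-") and "\ No newline" markers do not advance.


Reviewer = Callable[[str], Optional[str]]


def review_diff(diff_text: str, reviewer: Reviewer) -> List[Finding]:
    """Run the reviewer over each added line and collect its comments."""
    findings = []
    for path, line_no, text in added_lines(diff_text):
        message = reviewer(text)
        if message:
            findings.append(Finding(path, line_no, message))
    return findings


def toy_reviewer(line: str) -> Optional[str]:
    """Stand-in for a model call: flags hard-coded credentials only."""
    if re.search(r"(password|secret|api_key)\s*=\s*['\"]", line, re.IGNORECASE):
        return "Possible hard-coded credential; consider a secrets manager."
    return None
```

Reviewing only the lines a pull request adds keeps feedback focused on the change itself, and phrasing each finding as a suggestion matches the non-accusatory tone Guerin describes.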

The results were immediate. Automating code reviews improved speed, reduced errors, and enhanced collaboration among engineers. With faster development cycles and higher-quality code, OBDS was able to meet the demanding standards of the aviation industry, ensuring safer and more reliable software for their customers.

Customizing Code Reviews to Meet Unique Needs

A great case study of how an AI-powered code review makes a difference is KukuFM’s adoption of CodeAnt AI to transform its development process. KukuFM, a leading audio storytelling platform in India, struggled with an inefficient code review process. As the platform grew, ensuring high-quality code and addressing bugs early became critical, but their manual review system slowed development and left room for errors. The team needed a faster, more reliable way to maintain quality and security while keeping up with best practices.

To solve this, Aman Bapna, KukuFM’s Director of Engineering, introduced CodeAnt AI’s AI Code Reviewer into their workflow. The tool's automated pull request generation identified potential defects early and allowed for customizable rules to match KukuFM’s coding standards. It also integrated with existing tools such as Bitbucket, saving the team time and effort. The team praised its ability to simplify the process, saying the automatically generated descriptions were “very useful to verify if the code meets all the requirements.”
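The idea of customizable review rules can be illustrated with a small sketch. This is not CodeAnt AI's configuration format or API; the `Rule` class, the `TEAM_RULES` names, and the regular-expression patterns are all hypothetical examples of how a team might encode its own standards.

```python
import re
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Rule:
    pattern: str               # regular expression applied to each line
    message: str               # suggestion shown to the developer
    severity: str = "warning"


# Hypothetical team rule set; names and patterns are examples only.
TEAM_RULES: Dict[str, Rule] = {
    "no-print": Rule(r"^\s*print\(", "Use the logging module instead of print()."),
    "no-bare-except": Rule(r"^\s*except\s*:", "Catch a specific exception type.", "error"),
    "no-todo": Rule(r"#\s*TODO\b", "Resolve or ticket TODOs before merging.", "info"),
}


def check(source: str, rules: Dict[str, Rule],
          disabled: FrozenSet[str] = frozenset()) -> List[str]:
    """Apply every enabled rule to every line and return readable reports."""
    compiled = {
        name: re.compile(rule.pattern)
        for name, rule in rules.items()
        if name not in disabled
    }
    reports = []
    for number, line in enumerate(source.splitlines(), start=1):
        for name, regex in compiled.items():
            if regex.search(line):
                rule = rules[name]
                reports.append(f"{number}: [{rule.severity}] {name}: {rule.message}")
    return reports
```

Keeping rules as data rather than code means a team can add, disable, or re-grade a rule without touching the checker, which is the kind of flexibility the case study points to.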

The impact was immediate. Bugs were significantly reduced, security improved, and development cycles sped up, freeing developers to focus on innovation. By streamlining a tedious but essential task, CodeAnt AI helped KukuFM scale more efficiently while maintaining the quality and reliability their growing audience expects.

Automated Code Analysis: Empowering Developers with Clarity

Another interesting story of AI-driven code review comes from VidMob, a company that faced significant challenges in managing its remote teams and complex codebase. The transition to microservices amplified these difficulties, making it daunting to onboard junior developers and address technical debt. Traditional code analysis tools provided excessive data but lacked actionable insights, leading to reactive decision-making and unclear priorities. This not only hindered progress but also strained team coordination and developer confidence.

To address these issues, VidMob adopted CodeScene (an automated code analysis tool) and integrated it into its development workflow. CodeScene’s tools, such as Hotspot Maps and Pull Request Integration, enabled targeted refactoring and improved code health. The Off-boarding Simulator and team alignment feature enhanced visibility and proactive management. Developers, even juniors, gained confidence through actionable insights and pre-merge quality gates, fostering a culture of learning and high-quality code.
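The intuition behind a hotspot analysis can be sketched in a few lines. This is a common approximation, not CodeScene's actual algorithm, which the case study does not describe: it assumes a hotspot is a file that is both changed often and large, and it uses lines of code as a crude stand-in for complexity. The function name and inputs are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def hotspots(commits: Sequence[Sequence[str]],
             lines_of_code: Dict[str, int],
             top: int = 3) -> List[Tuple[str, int]]:
    """Rank files by (number of commits touching the file) x (lines of code).

    `commits` is a list of commits, each given as the list of file paths it
    changed. Files missing from `lines_of_code` score zero.
    """
    # Count each file at most once per commit.
    changes = Counter(path for files in commits for path in set(files))
    scored = [(path, count * lines_of_code.get(path, 0))
              for path, count in changes.items()]
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:top]
```

Ranking files this way gives a team a short, prioritized list for refactoring instead of a wall of metrics, which is the gap between data and actionable insight that VidMob was trying to close.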

In just two months, VidMob reduced unplanned work by 50%, streamlined onboarding, and embedded the value of high-quality code across teams, transforming challenges into measurable success.

The Economics of AI: Scale AI’s Shift to Skilled Labor

As we've discussed, LLMs are at the core of any AI code review. Creating such LLMs requires significant investment in both computational power and human expertise. Training LLMs is not just about algorithms; it involves a dedicated effort from skilled human contractors. Scale AI, a pioneer in this line of work, has transitioned from relying on cheaper overseas labor to hiring highly educated U.S.-based professionals, including PhDs, doctors, and lawyers, to fine-tune these models.

Scale AI's CEO, Alexandr Wang, emphasized, “We need the best and brightest minds to be contributing data. Their work is how they have a very scaled society impact.” This focus on expertise has fueled Scale AI's revenue growth, with revenues expected to more than triple from $334 million in 2023 to just over $1 billion in 2024. However, this shift comes at a cost - labor expenses have increased sharply, resulting in a drop in gross margin from 59% to 49% between 2022 and 2023.

The move toward hiring highly skilled workers has also changed the team structure, pushing Scale AI to make its workforce more efficient. According to investor reports, they are now focused on identifying “efficient experts” to train LLMs, aiming to balance quality and cost as they scale their operations.

The Future Implications of AI Code Review Tools

The rapid adoption of AI code review tools has the potential to improve software development by making processes more efficient, identifying bugs faster, and improving the industry in meaningful ways. As Ariel Katz suggests, "The future of software development lies in the seamless integration of human creativity and AI capabilities." Developers who effectively merge AI's computational strength with their unique problem-solving abilities will shape the landscape ahead. 

Beyond improving productivity, AI will also fundamentally impact security. Joseph Thacker envisions numerous startups dedicated to AI-driven security, with AI SOC analysts, ethical hackers, and AI code review tools becoming standard. 

However, as AI features become more deeply embedded, they will also introduce new vulnerabilities. This emphasizes the growing importance of responsible AI, where security and ethical considerations must take priority. As Thacker notes, "Allowing AI systems to make decisions is convenient," but risks emerge if that convenience is not accompanied by rigorous security testing. The future will be defined by how effectively the industry navigates these challenges, integrating AI responsibly and creatively.

The Final Takeaway: AI Code Reviews Enhance Efficiency

The addition of an AI code review to the software development pipeline enhances workflows by removing bottlenecks, automating code analysis, and improving quality. Businesses using these tools can work more efficiently, make fewer mistakes, and encourage innovation. With the AI boom and a surge in cutting-edge innovations, companies adopting AI-driven code tools can have an extra edge that helps keep them competitive in today's market.

Frequently Asked Questions

Can AI replace coding?

AI cannot fully replace coding but can significantly enhance the process. It excels at generating code snippets, debugging, and automating repetitive tasks, but human developers are still crucial for understanding requirements, creating designs, and solving complex challenges. AI is a powerful assistant, not a full replacement for human expertise.

Which AI is better for coding?

GitHub Copilot is highly regarded for coding, providing smart code completions and even generating entire functions. Other tools such as Tabnine are also excellent for increasing coding efficiency (Kite, once a popular option, was discontinued in 2022). The best choice varies based on your programming language, workflow, and project size.

What is the best AI to check code?

OpenAI’s Codex, which originally powered GitHub Copilot, is considered one of the top choices for checking code, offering suggestions and debugging help. For more security-focused needs, Snyk Code stands out. The right AI for checking code depends on the type of project and your specific priorities.

What is the AI code review agent?

An AI code review agent is an automated tool that uses artificial intelligence to assist in reviewing code. It detects bugs, vulnerabilities, and inefficiencies while providing suggestions to improve code quality. These agents integrate seamlessly into development processes, making reviews faster and more consistent.

Which AI model is best for code review?

The best AI model for code review depends on your specific needs. GitHub Copilot and OpenAI’s Codex are popular choices for suggesting code improvements and identifying issues. For security and bug detection, tools like Snyk Code (formerly DeepCode) are highly effective.