Wednesday, June 19, 2024

Measuring AI effectiveness beyond productivity metrics


Last year was an AI milestone marked by enthusiasm, optimism, and caution. AI-powered developer tools promise to boost productivity by generating code and automating repetitive, tedious tasks. A year later, organizations are struggling to quantify the impact of their AI initiatives and are reevaluating metrics to ensure they reflect the desired business outcomes.

Measuring developer productivity has historically been a challenge, with or without the introduction of AI-powered developer tools. Last year, McKinsey & Company described developer productivity measurement as a “black box,” noting that in software development, “the link between inputs and outputs is considerably less clear” than in other functions.

Reporting on the productivity of AI-powered coding demands a more nuanced approach than traditional metrics such as lines of code produced, the number of code commits, or task completion. It requires a shift to evaluating real-world business outcomes that balance development speed, software quality, and security.

Although using AI to produce more code faster can be beneficial, it can also lead to technical debt if the resulting code isn’t high quality and secure. AI-generated code often requires more time to review, test, and maintain. For example, developers may save time using AI to write code, but that time will likely be spent later in the software development lifecycle. Additionally, any security flaws in AI-generated code will require engagement from security teams and additional time to mitigate potential security incidents.

When assessing the value AI brings to software development, it’s essential to consider that AI should be implemented and evaluated as a supplement to human developers, not a replacement. 

Better productivity metrics

Instead of focusing on acceptance rates or lines of code generated, organizations should aim for a more holistic view of AI’s impact on productivity and their bottom line. This approach ensures that the actual benefits of AI-aided software development are fully realized and appreciated.

The best approach involves merging quantitative data from throughout the software development lifecycle (SDLC) with qualitative insights from developers regarding the real impact of AI on their daily work and its influence on long-term development strategies. For example, GitLab’s research finds that developers spend about 75 percent of their time on tasks other than code generation, which means that a more productive use of AI could enable developers to spend less time reviewing, testing, and maintaining code. 

One recommended technique for measurement is the DORA framework, which looks at a development team’s performance over a specific timeframe. DORA metrics measure deployment frequency, lead time for changes, mean time to restore, change failure rate, and reliability to provide visibility into a team’s agility, operational efficiency, and velocity as a proxy for how well an engineering organization balances speed, quality, and security.
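As a rough illustration of how four of these metrics could be derived from deployment records, here is a minimal Python sketch. The `Deployment` record, its field names, and the `dora_summary` helper are hypothetical and do not come from any particular tool; real platforms define these measurements in their own ways. Reliability is left out because it depends on each team's service-level targets.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_summary(deployments: List[Deployment], window_days: int) -> dict:
    """Summarize four DORA-style metrics over a reporting window."""
    total = len(deployments)
    failures = [d for d in deployments if d.failed]
    lead_seconds = [
        (d.deployed_at - d.committed_at).total_seconds() for d in deployments
    ]
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]
    return {
        # Deployment frequency: deployments per day in the window
        "deployments_per_day": total / window_days,
        # Lead time for changes: median hours from commit to production
        "median_lead_time_hours": median(lead_seconds) / 3600 if lead_seconds else None,
        # Change failure rate: share of deployments that caused a failure
        "change_failure_rate": len(failures) / total if total else None,
        # Mean time to restore: average hours from failed deployment to recovery
        "mean_time_to_restore_hours": (
            sum(restore_times, timedelta()).total_seconds() / len(restore_times) / 3600
            if restore_times
            else None
        ),
    }
```

Running the same summary over windows before and after an AI tool is introduced gives a like-for-like comparison of speed and stability, rather than a count of generated lines.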

Additionally, teams should consider utilizing value stream analytics to evaluate the complete workflow from concept to production. Value stream analytics does not rely on a solitary metric; it continuously monitors metrics such as lead time, cycle time, deployment frequency, and production defects. This approach maintains a focus on business results rather than developer actions. 
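To make the lead time and cycle time distinction concrete, the following Python sketch computes both for a set of work items, along with production defects per item. The `WorkItem` fields and the `value_stream_snapshot` helper are hypothetical. The sketch assumes lead time runs from when an item is created to its release and cycle time runs from when work starts to its release; teams and tools may define these boundaries differently.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import List

SECONDS_PER_DAY = 86400


@dataclass
class WorkItem:
    """Hypothetical record of one unit of work moving from concept to production."""
    created_at: datetime         # when the idea or request was logged
    started_at: datetime         # when development work began
    released_at: datetime        # when the change reached production
    production_defects: int = 0  # defects later traced back to this item


def value_stream_snapshot(items: List[WorkItem]) -> dict:
    """Summarize several value stream metrics together, not a single number."""
    if not items:
        return {}
    lead = [(i.released_at - i.created_at).total_seconds() for i in items]
    cycle = [(i.released_at - i.started_at).total_seconds() for i in items]
    return {
        "mean_lead_time_days": mean(lead) / SECONDS_PER_DAY,
        "mean_cycle_time_days": mean(cycle) / SECONDS_PER_DAY,
        "defects_per_item": sum(i.production_defects for i in items) / len(items),
    }
```

Tracking these figures side by side shows whether faster cycle times are being paid for with more production defects.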

Implementing AI successfully

AI is still a new technology, and organizations should anticipate typical growing pains with the transition while recognizing that development and security teams may not trust AI yet. Introducing new AI tools to an existing workflow can require changes to related processes, such as code review, testing, and documentation.

To begin, teams should build best practices by working in a lower-risk segment before expanding their AI applications to ensure they scale safely and sustainably. For example, AI can help with scaffolding, test generation, syntax corrections, and documentation. This way, teams can build momentum and motivation by seeing better results and learning to use the tool more effectively. Initially, productivity may decline as teams adjust to these new workflows. Organizations should give their teams a grace period to determine how AI best fits their processes.

AI will play a critical role in the evolution of DevSecOps platforms, reshaping how development, security, and operations teams collaborate to accelerate software development without sacrificing quality and security. Business leaders will ask to see how their investments in AI-powered tools are paying off — and developers should embrace this scrutiny and leverage the opportunity to showcase how their work aligns with the organization’s broader goals. 

By adopting a holistic approach that evaluates code quality, collaboration, downstream costs, and developer experience, teams can leverage AI technologies to enhance human efforts.


Taylor McCaslin is AI/ML Product Lead for GitLab.
