Agile has revolutionized the software industry during the 21st century and replaced waterfall as the predominant method of software engineering. More recently, The Lean Startup transformed how people think about efficiently developing new technology products. Now, lean software engineering is emerging in response to limitations of traditional agile approaches like scrum.
With all these methodologies floating around, it can be hard to know what’s what. We are here to set the record straight on the differences between agile and lean, along with tips on how to fix common agile anti-patterns with lean methods.
The good news is that lean enhances agile rather than replacing it (the way agile replaced waterfall), so you can incrementally add elements of lean without having to make major changes.
The term agile is often used to describe a lot of different things. In the broadest sense, “agile” refers to the original 12 principles of the Agile Manifesto, which are deliberately somewhat vague.
In practice, “agile” usually means that teams have adopted the scrum methodology. Scrum involves fixed-cadence “sprints” where there is a predefined goal and scope. Each task is typically estimated using “story points” rather than direct time estimates, and the sprint scope will be set based on the total amount of work the team can get done.
At the end of each sprint, the team has a retrospective discussion where they assess whether the sprint was successful and look for opportunities to improve. A team is usually deemed successful in scrum if they are able to finish a predictable and consistent number of story points each sprint – commonly known as “sprint velocity.”
Tasks should be well-defined at the start of each sprint, but tasks that are further out on the backlog may or may not be planned in detail. One tenet of agile is to avoid pre-planning a lot of work in detail (as in old-school waterfall development) because plans are often subject to change.
Another common agile process is Kanban, which actually originates from lean principles. Kanban does not have any sprint boundaries and is less structured than scrum, with teams continuously moving tasks through a workflow. Kanban is more tolerant of scope changes and requires less up-front planning. Because Kanban does not define a cadence for retrospective discussions or velocity measurement, teams will need to do that on their own. Though we focus on scrum for the rest of this article, a typical Kanban implementation with regular retrospectives will be largely analogous to scrum.
For the purposes of this article, we use the term “agile” to refer to processes associated with agile as practiced with scrum or a similar methodology, rather than “agile” as an umbrella term for anything aligned with the abstract principles outlined in the Agile Manifesto.
Lean software development comes from a set of principles outlined in The Toyota Way that originate from the Toyota Production System. In 2003, the book Lean Software Development: An Agile Toolkit made the first attempt at translating those principles to software engineering.
The original Lean Software Development book recommends a lot of specific practices, but the key principle from which all others derive is value stream mapping. Value stream mapping is looking at a project timeline and highlighting only activities that directly add value for the customer. Everything else is waste and could theoretically be eliminated, including:
The key advantage of lean is that it provides a framework for measuring waste in an absolute sense and assessing how close a team is to optimal productivity, which agile does not do.
An ideal agile sprint culminates in delivering functionality to customers, and teams are judged by their ability to achieve this goal. What happens after delivery, however, is unspecified.
In practice, it’s easy for agile teams not to follow up and see whether customers are getting value out of features. Each sprint’s success is judged in a retrospective meeting that typically happens before there is time to gather good feedback from customers.
Lean Solution: Add a Validation Step for Features. The Lean Startup emphasizes this issue and discusses how to prioritize value, not just delivery. One key recommendation from this book is to put each new feature in a validation status after it launches to ensure that someone assesses its impact on customers. You can also have separate retrospective meetings dedicated to issues that recently finished validation, where teams talk about features they delivered in previous sprints that did not add as much value as expected.
Lean Solution: Measure Short-term Churn. To assess the impact of low-value development, you should also look at how much time was spent on code that was subsequently removed or replaced in future sprints. You can track this manually, or use a tool like minware to do it for you.
The scrum agile process operates in “sprints” of fixed time intervals, such as two weeks. Scrum also involves setting goals at the beginning of each sprint in terms of functionality delivered to customers.
Experienced software engineers know that you can have quality, speed, and functionality, but not all three. If you control two, then the third will suffer. Scrum makes exactly this trade-off by fixing speed (sprint length) and functionality (sprint goals), which leaves quality as the variable that suffers.
Scrum’s problem is that the key sprint velocity metric counts all work including both new features and bug fixes. Scrum ends up encouraging developers to cut corners to achieve the current sprint goal, because they are not penalized for creating bugs that show up in future sprints.
Lean Solution: Measure Fix vs. Feature Velocity. To fix this problem, measure how much of each sprint’s velocity is dedicated to bug or other quality fixes (e.g., user experience or performance problems), and make the goal to optimize long-term feature velocity after subtracting these fixes. This way, teams are ultimately judged by the time to implement features as well as fix any bugs that fall out, which removes the incentive to kick the can down the road on quality.
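As a rough illustration of the fix vs. feature split, here is a minimal Python sketch. It assumes each completed task carries a story point value and a label of either "feature" or "fix"; the `Task` class, the labels, and the sample numbers are hypothetical and not taken from any particular tracker.

```python
from dataclasses import dataclass


@dataclass
class Task:
    points: int  # story points completed in the sprint
    kind: str    # hypothetical label: "feature" or "fix"


def split_velocity(tasks):
    """Return (total, fix, feature) story points for one sprint."""
    total = sum(t.points for t in tasks)
    fix = sum(t.points for t in tasks if t.kind == "fix")
    # Feature velocity is whatever remains after subtracting fix work.
    return total, fix, total - fix


# Made-up sprint: 18 points done, 5 of which were bug or quality fixes.
sprint = [Task(5, "feature"), Task(3, "fix"), Task(8, "feature"), Task(2, "fix")]
total, fix, feature = split_velocity(sprint)
print(f"total={total} fix={fix} feature={feature}")
```

Tracking the feature number over many sprints, rather than the total, is what surfaces the cost of bugs created in earlier sprints.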
Lean Solution: Measure Customer Quality Impact. One way to increase feature velocity would be to just not fix any bugs, which would be an unintended outcome. To make sure this doesn’t happen, it’s important to also track the impact of quality problems on customers. You can do this with metrics like customer support contacts per customer activity (e.g., orders, daily active users), exception tracker error rates, net promoter scores, etc. If these metrics aren’t improving, then the fix vs. feature velocity metric may not accurately represent the time it would take to deliver well-functioning software.
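One way to normalize support contacts by customer activity is sketched below. The function name, the per-thousand scaling, and the sample figures are assumptions chosen for illustration; any consistent activity unit (orders, daily active users) would work the same way.

```python
def contacts_per_thousand(support_contacts: int, activity_count: int) -> float:
    """Support contacts per 1,000 units of customer activity."""
    if activity_count <= 0:
        raise ValueError("activity_count must be positive")
    return 1000 * support_contacts / activity_count


# Made-up month: 45 support contacts against 30,000 orders.
rate = contacts_per_thousand(45, 30_000)
print(rate)
```

Normalizing by activity keeps the metric comparable as the customer base grows, so a rising raw contact count is not mistaken for declining quality.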
Caution: Beware of Fix vs. File Rates. It can be helpful to look at how quickly teams are fixing bugs compared to how many bugs are filed. However, this metric can be misleading if people aren’t filing bugs that customers care about, or if they file too many bugs with a low impact. Customer contact rates are preferable because they’re more likely to be accurate, and they run less risk of creating tension between teams (“Hey, stop filing these low-priority bugs, you’re making me look bad!”).
The main metric in a scrum agile process is sprint velocity. Velocity is the number of “story points” (an abstract estimate of task difficulty) the team gets done in a fixed time period. Focusing on velocity pushes teams toward improving their predictability, which is a good thing. However, the problem is that there’s no way to tell what velocity should be, and therefore no way to identify process inefficiencies that don’t impact predictability. With scrum, a predictably slow and predictably fast team may have the same metrics!
Story points make matters worse by obscuring how people spend their time, and sprint reports only look at what got done by the end of the sprint without considering the length of each task.
The end result is that agile hides overhead caused by context switching, process bottlenecks (i.e., waiting for the next resource to become available), and other problems like having too many meetings or interruptions. A pessimist might even say that hiding things from bad micro-managers is a feature of agile rather than a problem. However, the Toyota Production System has shown that close partnership between managers and workers is a healthier dynamic.
Lean Solution: Measure Cycle Time. You can expose process problems by measuring how long it takes tasks to complete each stage of development, from starting work to opening a pull request, receiving a review, completing testing, merging, and launching to production. Tools like minware and Code Climate Velocity provide aggregate cycle time metrics, which you can then analyze to see where tasks are getting held up. Improving cycle times ultimately improves velocity by reducing overhead from context-switching and merge conflicts between long-running tasks.
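If you want to compute stage-level cycle times by hand, a minimal Python sketch might look like the following. It assumes you can export one timestamp per stage for each task; the stage names (a simplified subset of the stages listed above) and the sample data are hypothetical, and this is not how any specific tool calculates its metrics.

```python
from datetime import datetime
from statistics import median

# Hypothetical stage names, in the order a task passes through them.
STAGES = ["started", "pr_opened", "reviewed", "merged", "deployed"]


def stage_durations(task):
    """Hours spent between consecutive stage timestamps for one task."""
    return {
        f"{a}->{b}": (task[b] - task[a]).total_seconds() / 3600
        for a, b in zip(STAGES, STAGES[1:])
    }


def median_stage_hours(tasks):
    """Median hours per stage transition across all tasks."""
    per_task = [stage_durations(t) for t in tasks]
    return {k: median(d[k] for d in per_task) for k in per_task[0]}


# Made-up timestamps for two tasks.
tasks = [
    {"started": datetime(2024, 1, 1, 9), "pr_opened": datetime(2024, 1, 1, 17),
     "reviewed": datetime(2024, 1, 3, 17), "merged": datetime(2024, 1, 3, 18),
     "deployed": datetime(2024, 1, 4, 18)},
    {"started": datetime(2024, 1, 2, 9), "pr_opened": datetime(2024, 1, 2, 13),
     "reviewed": datetime(2024, 1, 3, 13), "merged": datetime(2024, 1, 3, 16),
     "deployed": datetime(2024, 1, 3, 18)},
]

medians = median_stage_hours(tasks)
bottleneck = max(medians, key=medians.get)  # slowest transition
print(medians, bottleneck)
```

In this made-up data the wait for review dominates, which is the kind of bottleneck that sprint velocity alone would not reveal.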
Lean Solution: Measure Work-in-Progress Inventory. Cycle time has some limitations, such as not appropriately weighting larger tasks and only being measurable after the fact. To further drive down context-switching overhead, you can track and limit the number of development days’ worth of work-in-progress “inventory” that the team has each day during a sprint.
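The sketch below shows one possible way to count work-in-progress inventory on a given day. It makes a simplifying assumption that is not from the original method: each open task accrues one development day of effort per sprint day it stays open. The tuple layout, the day numbering, and the `WIP_LIMIT` value are all hypothetical.

```python
def wip_inventory(tasks, day):
    """Development days already invested in tasks still open on `day`.

    Each task is (start_day, finish_day); finish_day is None if unfinished.
    Assumes one development day of effort per day a task is open.
    """
    total = 0
    for start, finish in tasks:
        is_open = start <= day and (finish is None or day < finish)
        if is_open:
            total += day - start + 1
    return total


# Made-up sprint: days are numbered from 1.
tasks = [(1, 4), (2, None), (3, 5)]
WIP_LIMIT = 5  # hypothetical team limit, in development days

for day in (3, 4):
    inventory = wip_inventory(tasks, day)
    status = "over limit" if inventory > WIP_LIMIT else "ok"
    print(day, inventory, status)
```

Unlike cycle time, this number is available while the sprint is still running, and long-open tasks weigh more heavily than short ones.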
A perennial challenge that engineers face with agile is getting product managers and other business stakeholders to care about technical debt. The reason is that tech debt’s cost is typically built into task estimates, so it doesn’t detract from velocity. As a result, managers have a hard time understanding how much more a team could get done with a lower technical debt burden.
Lean Solution: Record Time Lost to Technical Debt. The way to fix this problem is to expose the magnitude of “interest payments” that teams make by struggling with tech debt in each sprint. During retrospectives, ask each developer to estimate the amount of time they wasted struggling with technical debt, and subtract that from velocity to provide a “debt-free” velocity metric. This will align everyone on the importance of fixing technical debt.
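Because velocity is measured in story points and the developers' estimates are in time, some conversion is needed. The sketch below shows one possible interpretation, not a definitive formula: remove the self-reported debt days from the time the team worked, then project what the same pace would deliver over the full sprint. The function name and all numbers are hypothetical.

```python
def debt_free_velocity(points_done, dev_days, debt_days_lost):
    """Project the velocity a team might reach without time lost to tech debt.

    points_done: story points completed this sprint
    dev_days: total development days available in the sprint
    debt_days_lost: per-developer estimates of days lost to tech debt
    """
    productive_days = dev_days - sum(debt_days_lost)
    if productive_days <= 0:
        raise ValueError("debt estimates meet or exceed available time")
    points_per_day = points_done / productive_days
    return points_per_day * dev_days


# Made-up sprint: 40 points in 50 development days, 10 of them lost to debt.
projected = debt_free_velocity(40, 50, [2, 3, 5])
print(projected)
```

The gap between the actual and projected figures (40 vs. 50 points here) is the “interest payment” that stakeholders otherwise never see.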
Lean Solution: Track Effort Spent on Refactoring. It is also helpful to measure how much time teams invest in repaying the “principal” of tech debt by refactoring to ensure that it is sufficient. You can mark certain tasks as refactoring and see how much velocity is dedicated to tech debt elimination each sprint. Another helpful approach is looking at long-term churn with a tool like minware to see how much older code is being replaced. Very low long-term churn rates may indicate that code is not being properly maintained.
Because scrum operates in fixed sprint increments, teams often neglect critical long-term planning. While it is important not to plan too much in advance because circumstances can change, the reality is that many projects span multiple sprints and failing to mitigate big risks in advance can cause major delays.
For example, selecting a database technology to use for a project without vetting its performance characteristics at scale can cause a team to redo large amounts of work from previous sprints when they discover fundamental limitations during the final testing stage.
The Scaled Agile Framework (SAFe) provides some guidance for longer-term planning via a higher-level "program backlog." This is helpful, but does not provide an explicit mechanism for ensuring adequate technical planning, or for gathering feedback about the impact of planning misses.
Lean Solution: Introduce an Explicit Technical Planning Stage. To mitigate this issue, teams should have an explicit technical planning stage prior to implementation, with the goal of quickly mitigating major risks that could come up before project completion. This stage may involve technical proofs of concept or “spikes” in addition to written design documentation. Technical planning reduces the risk of redoing work in later sprints. A good technical plan should serve as a blueprint, making the implementation phase highly predictable.
Lean Solution: Measure Pre-merge Churn. To assess the impact of planning misses, you can measure the amount of time spent on rework for tasks within a project. You can ask developers to estimate rework time in retrospectives, and also use a tool like minware to calculate a precise number of development days that are lost to rework prior to merging code.
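For a rough manual approximation of pre-merge churn, you could compare how many lines were written on a branch with how many survive to the merge. This is a simplified line-based proxy under stated assumptions, not how minware or any other tool computes development days; the function name and numbers are hypothetical.

```python
def pre_merge_churn(lines_written: int, lines_in_merge: int) -> float:
    """Fraction of lines written on a branch that did not survive to merge."""
    if lines_written <= 0:
        raise ValueError("lines_written must be positive")
    if not 0 <= lines_in_merge <= lines_written:
        raise ValueError("lines_in_merge must be between 0 and lines_written")
    return (lines_written - lines_in_merge) / lines_written


# Made-up branch: 400 lines written across all commits, 300 in the final diff.
churn = pre_merge_churn(400, 300)
print(churn)
```

A consistently high ratio across a project's tasks suggests that work is being started before the approach is settled, which is what the technical planning stage is meant to prevent.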
Agile has been a tremendous step forward for software engineering teams, greatly improving efficiency over old-school waterfall methodology. However, it leaves too many things undefined and lacks the ability to counteract many anti-patterns that plague software development projects.
Lean principles provide a framework for filling the gaps left by agile and mitigating problems that often fly under the radar. You don’t need to throw out agile, but can instead follow the steps outlined here to make your agile process more lean and improve the quality of life for both engineers and managers building complex software.