The productivity promise
Every few months, another study emerges claiming AI makes knowledge workers dramatically more productive. Headlines tout 30% time savings, 40% quality gains, and the potential for trillion-dollar economic impact.
But the headlines miss the most important finding: the productivity gains are not uniform. They depend heavily on who is using the AI.
A junior consultant using AI to draft a market analysis sees a 43% improvement in output quality. A senior partner using AI to synthesize research across 50 industries and identify strategic patterns produces work that would have previously required a team of five analysts working for three weeks.
The difference is not 43%. It is multiplicative.
This article examines the research behind AI productivity gains and explains why AI paired with senior domain expertise produces a fundamentally different outcome than AI paired with junior labor.
The BCG/Harvard study
In 2023, Boston Consulting Group partnered with Harvard Business School to run the largest controlled study of AI productivity in knowledge work to date.
The setup: 758 BCG consultants were given 18 realistic consulting tasks and tracked for productivity and quality changes.
The results:
- Consultants using GPT-4 completed 12.2% more tasks on average
- They completed tasks 25.1% faster
- Their output was rated more than 40% higher in quality than the control group's
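- Consultants who scored below average on a baseline task (the group this article treats as juniors) saw a 43% improvement in task performance
- Consultants who scored above average (treated here as seniors) saw a 17% improvement in task performance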
At first glance, this looks like AI helps juniors more than seniors. But that interpretation misses the critical insight:
Juniors improved at tasks that are mechanical: formatting, data extraction, basic synthesis. These are tasks where AI output comes close to parity with skilled human output. A junior with AI can now perform formatting and data extraction almost as well as an experienced colleague, which represents a large percentage improvement on the junior's own baseline.
Seniors improved at tasks that are judgment-intensive — strategic synthesis, identifying non-obvious patterns, connecting disparate frameworks. These are tasks where AI amplifies an existing capability rather than replacing a mechanical process. A senior with AI can now synthesize 10x more source material, consider 3x more scenarios, and pressure-test recommendations against a broader knowledge base.
The study also identified two distinct patterns of AI use:
- "Centaurs" — Divide tasks strategically between human and AI based on each one's strengths
- "Cyborgs" — Completely integrate AI into their task flow, moving back and forth continuously
Senior practitioners were significantly more likely to adopt the "cyborg" approach — using AI not as a separate tool but as an integrated extension of their thinking.
McKinsey's Lilli
McKinsey deployed its proprietary AI tool "Lilli" to its consultants starting July 2023. The internal results provide a real-world validation of the research findings.
Adoption: 72% of McKinsey's ~45,000 employees were actively using Lilli within six months of launch
Usage: Lilli processes 500,000+ prompts per month
Results: Consultants report up to 30% time savings in research and knowledge synthesis
Lilli draws from 100,000+ internal documents and interview transcripts. It can create PowerPoint presentations, draft client proposals, find internal experts, and research industry trends.
The 30% time savings number deserves scrutiny. Consultants do not work 30% fewer hours; they reallocate up to 30% of their time from mechanical tasks (finding prior work, summarizing reports, formatting slides) to judgment tasks (strategic design, client conversations, insight generation).
A McKinsey partner can now ask Lilli to pull every piece of prior work on retail transformation in Southeast Asia, synthesize the common patterns, and draft a 10-slide overview in 20 minutes. That task used to take an analyst 3 days. The partner spends those 3 days on insight generation instead of knowledge aggregation.
The $4.4 trillion question
The McKinsey Global Institute estimates that generative AI could add up to $4.4 trillion in annual productivity globally, approximately 4% of global GDP.
The estimate assumes:
- GenAI can automate work activities absorbing 60-70% of employees' time today
- Knowledge workers using AI report 10-25% performance gains on typical tasks (writing, research, programming)
- Employees using AI report an average 40% productivity boost
But here is the critical nuance: the $4.4T number assumes widespread adoption at current capability levels. It does not account for the multiplicative effect when AI is paired with deep expertise.
The 40% productivity boost applies to tasks like "write an email" or "summarize a meeting." These are tasks where AI provides direct substitution for human labor.
The multiplicative effect happens when AI enables a fundamentally different scope of work. A senior strategist can now:
- Synthesize research across 100 sources instead of 10
- Benchmark against 50 companies instead of 5
- Prototype 10 financial models instead of 1
- Draft 5 strategic options instead of presenting a pre-selected recommendation
This is not 40% faster. This is at least 5x the analytical breadth for the same time investment.
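The multipliers implied by the list above can be checked with a few lines of Python. This is a minimal sketch: the counts are copied from the four bullets, while the dictionary keys and the function name `breadth_multipliers` are illustrative labels introduced here, not terms from any of the cited studies.

```python
# (with AI, without AI) counts copied from the four bullets above.
# The key names are illustrative labels, not terms from the cited studies.
scope = {
    "sources synthesized": (100, 10),
    "companies benchmarked": (50, 5),
    "financial models prototyped": (10, 1),
    "strategic options drafted": (5, 1),
}


def breadth_multipliers(pairs):
    """Return the with-AI / without-AI ratio for each activity."""
    return {
        name: with_ai / without_ai
        for name, (with_ai, without_ai) in pairs.items()
    }


multipliers = breadth_multipliers(scope)
for name, ratio in multipliers.items():
    print(f"{name}: {ratio:.0f}x")

# The smallest ratio is the conservative figure for "breadth".
print(f"smallest multiplier: {min(multipliers.values()):.0f}x")
```

Three of the four ratios come out at 10x and one at 5x, so 5x is the floor of the examples given rather than their average.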
The senior-junior paradox
Multiple studies show that AI produces larger percentage gains for novice workers:
- BCG/Harvard: 43% improvement for juniors vs 17% for seniors
- GitHub Copilot: 27-39% output increase for junior developers vs 8-13% for seniors
- Stanford/MIT customer support study: 35% gain for novices, minimal for experienced agents
This creates an apparent paradox: If AI helps juniors more, why do seniors produce greater value?
The answer lies in what is being measured.
Juniors see larger gains on mechanical tasks. AI narrows the gap on work that is codified, rules-based, or pattern-matching:
- Data extraction
- Formatting
- Basic synthesis
- Following templates
- Applying known frameworks
Seniors see multiplicative gains on judgment tasks. AI amplifies the work that requires domain expertise, tacit knowledge, and strategic thinking:
- Identifying which frameworks apply to a novel situation
- Recognizing patterns across disparate industries
- Pressure-testing recommendations against second-order effects
- Synthesizing qualitative insights that do not fit templates
In sales environments, the best agents were 2.8x more likely to close sales when using AI. The reason was not that AI wrote better scripts, but that experienced agents used AI to handle unexpected questions and adapt messaging in real time. Junior agents could not leverage AI in the same way because they lacked the judgment to know when to deviate from the script.
AI commoditizes entry-level work while amplifying senior value. The base of the pyramid becomes less necessary. The top becomes more powerful.
What 5x actually looks like
"5x output per senior hour" is not a metaphor. Here are concrete examples:
Research synthesis: A senior partner needs to understand the competitive landscape for digital banking in Southeast Asia. Previously, this required briefing two analysts, waiting 5 days for a synthesis deck, then spending 2 hours reviewing and refining. With AI, the partner spends 90 minutes directing AI to scan industry reports, regulatory filings, competitor financials, and news articles, then synthesizes the findings directly. The output quality is higher because the partner's strategic lens is applied from the start, not after junior analysts pre-filtered the data.
Scenario modeling: A CFO wants to understand the financial impact of 5 different go-to-market strategies. Previously, this required building a financial model (2 days), running 5 scenarios (1 day), and creating visualizations (half a day), about 3.5 days in total. With AI, the senior partner prototypes the model structure in 2 hours, directs AI to generate scenario outputs, and spends the remaining time on strategic interpretation rather than Excel formulas.
Benchmarking: A client asks how their AI governance framework compares to industry leaders. Previously, an analyst would research 5-10 companies, extract governance practices, and create a comparison matrix (1 week). With AI, the senior partner directs a scan of 50 companies, identifies the top 10 relevant frameworks, and synthesizes the comparison in 4 hours. The breadth is 5x greater. The strategic insight is deeper because the partner is doing the pattern recognition, not reviewing someone else's interpretation.
Deliverable iteration: A client wants to see 3 alternative strategic options instead of a single recommendation. Previously, this would require the team to draft 3 separate strategy narratives, build supporting analysis for each, and create 3 sets of slides (2-3 weeks of additional work). With AI, the senior partner outlines the 3 options, directs AI to draft the narrative arcs, generates supporting exhibits, and spends the time refining the strategic logic rather than formatting slides.
The common thread: AI handles the parallel work that used to require multiple people. The senior practitioner does the synthesis, judgment, and strategic design that only expertise enables.
Implications for consulting
The 5x multiplier has profound implications for consulting economics:
Traditional model: A consulting engagement bills $500K and staffs 1 partner (10% of hours), 1 manager (30% of hours), and 3 analysts (60% of hours) for 12 weeks. The client pays for 1,200 team hours, 1,080 of them from the manager and analysts, to get 120 hours of senior strategic thinking.
AI-augmented model: A consulting engagement bills $300K and staffs 1 senior partner with AI infrastructure for 6 weeks. The client pays for 240 senior hours of strategic thinking, augmented by AI-generated research, analysis, and drafting that stands in for the 1,080 manager and analyst hours.
The client gets:
- 2x the senior judgment (240 hours vs 120 hours)
- Faster delivery (6 weeks vs 12 weeks)
- Lower cost ($300K vs $500K)
- No handoff failures (1 person vs 5-person team)
The senior partner gets:
- Higher hourly realization ($300K over 240 hours is $1,250/hr, vs roughly $417/hr blended when $500K is spread over the traditional team's 1,200 hours)
- Direct impact (their expertise applied to the problem, not filtered through juniors)
- Reputation alignment (if the project succeeds, they get credit; if it fails, they are accountable)
The firm no longer needs to hire, train, and manage 10 junior analysts per partner. The economic model shifts from leverage through people to leverage through AI.
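The engagement arithmetic above can be reproduced in a short sketch. It rests on one assumption that should be read loudly: the 1,200 hours are taken as the whole team's hours, because 120 partner hours at 10% of the total implies a 1,200-hour engagement, which puts the manager at 360 hours and the analysts at 720. The function name `summarize` and the role labels are hypothetical and chosen only for illustration.

```python
def summarize(fee, hours_by_role, senior_role="partner"):
    """Summarize an engagement's hours and effective hourly rates.

    fee           -- total engagement fee in dollars
    hours_by_role -- mapping of role name to hours worked
    senior_role   -- the role counted as senior judgment
    """
    total = sum(hours_by_role.values())
    senior = hours_by_role[senior_role]
    return {
        "total_hours": total,
        "senior_hours": senior,
        "senior_share": senior / total,
        "blended_rate": fee / total,          # dollars per team hour
        "fee_per_senior_hour": fee / senior,  # dollars per senior hour
    }


# ASSUMPTION: 1,200 is the total team hours, split 10% / 30% / 60%.
traditional = summarize(
    500_000, {"partner": 120, "manager": 360, "analysts": 720}
)
# AI-augmented model: one senior partner, 240 hours over 6 weeks.
augmented = summarize(300_000, {"partner": 240})

for label, model in (("traditional", traditional), ("augmented", augmented)):
    print(
        f"{label}: {model['senior_hours']} senior hours, "
        f"${model['blended_rate']:,.0f}/hr blended, "
        f"${model['fee_per_senior_hour']:,.0f} per senior hour"
    )
```

Under this reading, the blended figure for the traditional team follows only from the $500K fee and the 1,200-hour total, so it will differ if the engagement is billed or staffed differently. What the sketch makes visible is that the client's cost per hour of senior judgment falls even though the rate per billed hour rises.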
The bottom line
AI productivity gains are real. But they are not uniform.
A junior consultant using AI gets 43% better at mechanical tasks. A senior partner using AI can produce analytical output equivalent to a team of five.
The difference is not 43%. It is 5x.
The consulting industry will use AI. The question is whether firms will use AI to augment juniors and preserve the pyramid — or whether they will use AI to amplify seniors and eliminate the need for leverage altogether.
At Pertama Partners, we made a choice: AI paired with senior domain expertise. The person doing the work has 10+ years of operating experience. The AI provides the scale that junior analysts used to provide.
The result is faster delivery, better outcomes, and no bait-and-switch.