Did AI or WFH Break The Career Ladder?
A rigorous new paper argues that the junior-hiring decline isn't really about AI. It's worth taking seriously, yet the methodology doesn't cleanly support the conclusion many people are drawing.
The first US labour data after the agentic shift has arrived - just. On June 2, 2026, the BLS released the Quarterly Census of Employment and Wages (QCEW) for Q4 2025. The national headline numbers were unremarkable. Employment up 0.2% year-on-year. Average weekly wages up 4.2%. Broadly, late 2025 looks like late 2024.
But underneath that, the supersector breakdown tells a different story. The Information sector (one of the most AI-exposed NAICS categories, covering software publishers, data processing, telecommunications, and related industries) is down 2.2% in employment year-on-year. Wages are up 6.9%. Employment falling while average wages rise is consistent with a workforce shifting upward (fewer lower-paid workers, a higher-paid remainder), though the supersector totals alone can't confirm which tier is leaving.
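The composition arithmetic is easy to check with made-up numbers. The sketch below uses a hypothetical 100-person workforce (the headcounts and weekly wages are illustrative assumptions, not QCEW figures) to show that the average wage can rise with nobody getting a raise, and that a quite different story can produce identical totals.

```python
def average_wage(tiers):
    """Headcount-weighted average weekly wage for a list of (headcount, wage) tiers."""
    total_pay = sum(headcount * wage for headcount, wage in tiers)
    total_heads = sum(headcount for headcount, _ in tiers)
    return total_pay / total_heads

# Hypothetical baseline: 60 juniors at $1,500/week, 40 seniors at $3,000/week.
baseline = [(60, 1500.0), (40, 3000.0)]

# Scenario A: five juniors leave, nobody gets a raise.
juniors_leave = [(55, 1500.0), (40, 3000.0)]

# Scenario B: five seniors leave, and everyone who stays gets the same raise,
# sized so the average lands exactly where scenario A put it.
seniors_leave = [(60, 1500.0), (35, 3000.0)]
raise_factor = average_wage(juniors_leave) / average_wage(seniors_leave)
seniors_leave_with_raise = [(n, w * raise_factor) for n, w in seniors_leave]

base_avg = average_wage(baseline)
avg_a = average_wage(juniors_leave)
avg_b = average_wage(seniors_leave_with_raise)

print(f"baseline:   100 workers, average ${base_avg:,.2f}")
print(f"scenario A:  95 workers, average ${avg_a:,.2f} (no raises)")
print(f"scenario B:  95 workers, average ${avg_b:,.2f} (raise of {raise_factor - 1:.1%})")
```

Both scenarios report the same employment drop and the same average wage, which is why the tier-level question needs data below the supersector totals.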
Recently, a growing body of research has viewed this kind of pattern as evidence that generative AI is substituting for entry-level knowledge work. The Stanford Canaries paper. The Maasoum-Lichtinger AI-integrator-firm study. Mahieu’s Flanders administrative-data paper. The Anthropic Economic Index work. Different datasets, different methodologies, all finding the same shape. Junior down. Senior stable.
That pattern does not prove AI substitution. It is only a useful reason to care about the identification problem: when junior-heavy knowledge work weakens, we need to know whether we are seeing AI, remote-work frictions, or a third force correlated with both.
In May 2026, that AI substitution perspective got a serious challenge. Not from someone arguing that AI technology isn’t real, but from a careful labour-economics paper arguing that the signal is being mis-attributed. The results are real, but is the full conclusion really justified based on the methodology?
The WFH challenge
"The Broken Ladder: AI, Remote Work, and Early-Career Hiring" is by Peter Lambert (University of Warwick and LSE) and Yannick Schindler (Ellison Institute of Technology, Oxford). The acknowledgments include John Van Reenen, Nick Bloom, Steve Pischke, Steven Davis, and Stephen Hansen. This is not a fringe paper.
The dataset is large. 243 million new-hire records assembled from résumé data, predominantly via LinkedIn. 407 million online job vacancy postings collected from across the web. Four countries: US, UK, Canada, Australia. Period: 2017 through 2025.
The starting insight is a correlation. At the 6-digit O*NET-SOC occupation level, the standard measure of generative-AI exposure (Eloundou et al. 2023) and the standard measure of work-from-home exposure (Hansen et al. 2023) have a Spearman rank correlation of 0.77. The same occupations sit at the top of both rankings - software developers, accountants, management consultants. The same occupations sit at the bottom - electricians, janitors, construction labourers. The two different shocks hit largely the same kinds of work.
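For readers who want to see what a rank correlation measures, here is a minimal sketch in plain Python. The exposure scores are invented for illustration (they are not the Eloundou or Hansen values), and the helper names are my own; the statistic itself is the standard one, a Pearson correlation computed on ranks.

```python
def average_ranks(values):
    """Rank values from 1 (lowest) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of the 1-based positions start..end
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical (AI exposure, WFH exposure) scores, invented for illustration only.
occupations = {
    "software developers":    (0.90, 0.95),
    "accountants":            (0.80, 0.70),
    "management consultants": (0.70, 0.85),
    "electricians":           (0.20, 0.10),
    "construction labourers": (0.15, 0.08),
    "janitors":               (0.10, 0.05),
}
ai_scores = [ai for ai, _ in occupations.values()]
wfh_scores = [wfh for _, wfh in occupations.values()]

rho = spearman(ai_scores, wfh_scores)
print(f"Spearman rank correlation: {rho:.3f}")
```

With these made-up scores only two neighbouring occupations swap places between the two rankings, so the statistic comes out at about 0.94. The point is only that the measure depends on ordering, not on the scale of either index.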
That correlation has a clean implication. Any study identifying AI effects through occupation-level exposure is also identifying WFH effects at the same time. The two cannot be disentangled at occupation level alone.
Lambert and Schindler design around this problem. They run joint difference-in-differences with both exposures entered together as co-treatments. When entered separately, the two exposures produce nearly identical event-study paths. Around 4-5 percentage points off the junior share of new hires by 2025. Around 3 percentage points off the share of postings requiring limited experience. Entered jointly, the WFH path remains essentially unchanged. But the AI-exposure path attenuates heavily and becomes statistically indistinguishable from zero.
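To see why "entered separately" and "entered jointly" can give such different answers, consider a toy simulation. It is a sketch under loud assumptions, not a reconstruction of the paper's event-study specification: two periods only (so the difference-in-differences reduces to regressing each occupation's change in junior share on its exposure), standardised exposures, a hypothetical world in which only WFH has a true effect, and 0.77 used as a Pearson correlation for convenience. All parameter values and names are invented.

```python
import random

def slope(x, y):
    """OLS slope of y on a single regressor x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def joint_slopes(x1, x2, y):
    """OLS slopes of y on x1 and x2 together (intercept included)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Hypothetical parameters, chosen only to illustrate the mechanism.
RHO = 0.77        # correlation between the two exposures
TRUE_WFH = -4.0   # assumed true effect of WFH exposure (percentage points)
TRUE_AI = 0.0     # assumed true effect of AI exposure: none, by construction
N = 5000          # simulated occupations

rng = random.Random(0)
wfh, ai, change = [], [], []
for _ in range(N):
    w = rng.gauss(0, 1)
    a = RHO * w + (1 - RHO ** 2) ** 0.5 * rng.gauss(0, 1)
    wfh.append(w)
    ai.append(a)
    # Post-minus-pre change in the junior share of new hires.
    change.append(TRUE_WFH * w + TRUE_AI * a + rng.gauss(0, 2))

separate_ai = slope(ai, change)
separate_wfh = slope(wfh, change)
joint_ai, joint_wfh = joint_slopes(ai, wfh, change)

print(f"separate: AI {separate_ai:+.2f}, WFH {separate_wfh:+.2f}")
print(f"joint:    AI {joint_ai:+.2f}, WFH {joint_wfh:+.2f}")
```

In this toy world the AI-only regression reports a large negative effect even though the true AI effect is zero, and the joint regression pushes it back toward zero while leaving WFH roughly unchanged. That matches the qualitative pattern described above, but it shows only that the pattern is what a WFH-only world would produce, not that it is the only world that would produce it.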
They run dozens of robustness checks. Cinelli-Hazlett selection-on-unobservables diagnostics. Monte Carlo simulations on classical measurement error. Leave-one-out exercises across 18 occupational groups. Country-by-country re-estimation. Alternative WFH and AI exposure measures. The WFH-dominant ranking survives almost all of it.
Their cleanest test goes one step further. They identify firms that visibly offered remote work in 2021-22 (using job-posting language) and use that direct adoption signal as treatment in its own right. That design also predicts the post-2022 junior-share decline. Their reading: it is the organisational frictions of remote work (higher supervision costs, slower on-the-job learning) that have shifted hiring away from junior workers, not AI substitution.
Then, on May 28, the New York Fed published a media advisory arguing that 64% of the rise in young-college-graduate unemployment from 2023 to 2025 is explained by remote work, not generative AI. The methodology is different, but a different method is not the same as independent evidence. The NY Fed headline rests on CPS microdata combined with a standard occupation-level index of how remotely a job's tasks can be done - the same family of task-content exposure measures whose confound Lambert and Schindler warn about. So rather than corroborating Lambert and Schindler from a clean angle, the NY Fed decomposition leans on the very identification strategy at issue. Two results pointing the same way, built on the same kind of exposure proxy, are closer to a shared methodological prior than to independent confirmation.
To their credit, the NY Fed authors also hold AI exposure constant and find the remotability gap persists - but conditioning one occupation-level index on another with which it is 0.77-correlated is the very move Lambert and Schindler show cannot separate the two shocks. Their genuinely stronger evidence is firm-level: using data from one Fortune 500 company, they show it hired fewer inexperienced and more experienced workers when offices closed, reverted when offices reopened, and kept favouring experienced workers specifically for distributed teams even after reopening. That within-firm, distributed-versus-colocated variation does sidestep the exposure confound, and on the narrow AI-versus-WFH question it is cleaner than anything in the exposure literature. But it is one firm, so external validity is the open question - and it is the part of the WFH case that doesn't reduce to a correlated proxy.
Overall, this challenge is significant. It is a serious empirical and institutional pushback from credible sources, arriving at roughly the same moment.
But is the conclusion correct?
Two things deserve careful examination before we accept this verdict.
Start with the asymmetric measurement. The Lambert-Schindler design pits actual WFH adoption against occupation-level AI exposure. Direct firm-level adoption for one shock. A task-content proxy for the other. The authors acknowledge this explicitly: “A symmetric measure of actual GenAI adoption would be a natural complement, but is presently not available at the scale and coverage of our WFH measure.” That asymmetry matters more than they suggest. What is actually shown is that exposure-based AI designs lose explanatory power when actual-WFH-adoption is included. That is not the same as showing that actual AI adoption has no effect on junior hiring. A firm using Claude or Copilot heavily could be reducing junior hiring in ways an occupation-level exposure index would never cleanly capture. The comparison we really need is actual-WFH-adoption against actual-AI-adoption, both at the firm level, in the same dataset. That comparison does not yet exist.
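One way to see why the asymmetry matters is the textbook case of classical measurement error in one of two correlated regressors. The sketch below computes large-sample joint OLS coefficients analytically, under assumptions that are mine rather than the paper's: both shocks have the same true effect, true exposures are standardised with correlation 0.77, WFH is measured exactly, and the AI measure is the truth plus independent noise. The function name and every parameter value are hypothetical. The paper reports its own Monte Carlo checks on classical measurement error, so this illustrates the mechanism rather than pointing to a flaw in their analysis.

```python
def joint_ols_limits(beta_ai, beta_wfh, rho, noise_var):
    """Large-sample joint OLS coefficients when the AI regressor is a noisy proxy.

    True AI and WFH regressors have variance 1 and correlation rho.
    The AI proxy is true AI plus independent noise with variance noise_var.
    WFH is observed without error. Returns (coef_on_ai_proxy, coef_on_wfh).
    """
    var_proxy = 1.0 + noise_var
    cov_proxy_wfh = rho
    cov_proxy_y = beta_ai + beta_wfh * rho
    cov_wfh_y = beta_ai * rho + beta_wfh
    # Solve the 2x2 normal equations for the two slopes.
    det = var_proxy * 1.0 - cov_proxy_wfh ** 2
    coef_ai = (cov_proxy_y - cov_proxy_wfh * cov_wfh_y) / det
    coef_wfh = (var_proxy * cov_wfh_y - cov_proxy_wfh * cov_proxy_y) / det
    return coef_ai, coef_wfh

# Hypothetical world: both shocks cut the junior share by 2 points per unit of exposure.
BETA_AI, BETA_WFH, RHO = -2.0, -2.0, 0.77

clean = joint_ols_limits(BETA_AI, BETA_WFH, RHO, noise_var=0.0)
noisy = joint_ols_limits(BETA_AI, BETA_WFH, RHO, noise_var=1.0)

print(f"AI measured exactly:  AI {clean[0]:+.2f}, WFH {clean[1]:+.2f}")
print(f"AI measured by proxy: AI {noisy[0]:+.2f}, WFH {noisy[1]:+.2f}")
```

With equal true effects, the noisily measured regressor is pulled toward zero and the well-measured one absorbs part of its effect. How much noise an occupation-level exposure index carries relative to firm-level adoption is exactly what is unknown, so the size of the distortion here is illustrative only.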
Then there is selection into WFH. Firms that visibly offered remote work in 2021-22 are not a random cross-section of the economy. They are deeply self-selected. The most dangerous dimension is interest-rate sensitivity and tech-funding-cycle exposure, which predicts both heavy WFH adoption and a junior-hiring collapse for reasons unrelated to either remote-work frictions or AI. The 2022 rate-hiking cycle and the tech-layoff wave that began with Meta's November 2022 cuts both landed hardest on exactly the high-WFH, high-exposure firms. The rest of the list compounds it: digital readiness, post-pandemic restructuring intensity, urban concentration, capital intensity, white-collar share. The paper handles selection by matching on AI and WFH exposure quintiles, which is likely the right move for the most obvious correlated treatments, but matching on those two quintiles does nothing to absorb the other dimensions on which WFH-adopting firms differ from non-adopters. The Cinelli-Hazlett exercise is useful as a generic omitted-variable benchmark, but it does not directly model this particular selection channel. If the rate shock, the venture-funding reversal, and tech-sector restructuring independently reduced junior hiring, the “actual WFH adoption” design may still be partly loading on a broader high-growth-tech correction rather than remote-work frictions alone.
Where is the real AI signal?
I think there are two qualifications we should also consider to close out this discussion.
First, all of this really needs to be overlaid against AI capability timing. Lambert and Schindler deliver a good analysis, and I believe they are very likely right for the 2023 through 2024 window. The kinds of AI tooling that could plausibly substitute for a junior software developer (reasoning models capable of multi-step planning and coding harnesses capable of executing it autonomously) only became mainstream from early to mid 2025 - OpenAI’s o1 in December 2024, Claude 3.7 in February 2025, then Claude Code, Cursor and Codex moving into production deployment through the rest of that year. Before that point, the enterprise-scale substitution case was much weaker. So the junior-hiring decline that Lambert and Schindler observe across 2023 and 2024 is extremely unlikely to be AI substitution. WFH organisational frictions, the 2022-23 rate-hike cycle, the venture-funding reversal, and the tech-sector restructuring that started in late 2022 are far more plausible drivers for that window. The substitution case was always going to make its empirical bid from late 2025 forward, as the agentic-coding capability shift fed enterprise hiring decisions through 2026 and beyond. Crediting Lambert and Schindler with the 2023-2024 attribution therefore does not really challenge the substitution argument. But extending the WFH attribution into 2025 and onwards, on the basis of a sample that mostly predates the capability shift, is, I think, overclaiming what their evidence can support.
Second, the cleanest place to test the substitution case is at the firms with the best information about what AI can actually do, and the least exposure to the WFH confound. The frontier labs are unique on both counts. They know what their tools can do. They are directly exposed to whether AI can substitute for the junior engineering work they do themselves. And they run heavily in-person operations by deliberate choice (Anthropic and OpenAI in San Francisco, Google DeepMind in London) which makes the WFH organisational-frictions story largely inapplicable.
What these firms are doing maps onto the substitution prediction cleanly. Anthropic does not run a summer internship program. OpenAI closed its residency program earlier than its historical cycle. And across the broader tech industry, Q1 2026 layoffs ran at 47.9% AI-attributed - a 5-6× jump from sub-8% across 2025.
These firms are running mostly co-located. They are at the leading edge of AI capability. And they are visibly redistributing hiring away from junior generalists toward senior researchers and specialists. That is the compositional signature appearing at the source, where the WFH-confound that bites the broader literature does not apply.
Where this leaves us
Lambert and Schindler have established something real and worth monitoring and exploring further. Studies that rely on occupation-level AI exposure indices to identify AI effects on hiring share a confounding problem with the rapid post-pandemic shift to remote work. Until firm-level direct measures of AI adoption become available at scale, the exposure-design literature cannot cleanly attribute the junior-hiring decline to either shock. That is a methodological flag worth attaching to every paper in this space that uses exposure-based identification.
But what their paper has not established is that AI adoption has no effect on junior hiring from 2025 onwards. The data limitation runs in both directions, and the evidence from the firms with the best information and the least exposure to the WFH confound (the frontier labs) points the other way.
The really clean test is the firm-level AI adoption data. That data is being built right now. The Anthropic Economic Index. The upcoming BLS adoption measures. The next round of empirical work, paired with the multi-quarter QCEW trajectory through Q1 and Q2 2026 (due in August and December), will all help to settle this debate properly.
One more thing worth saying. Lambert and Schindler are studying the same augmentation-phase window the cohort-pattern literature has been studying. If their WFH organisational-frictions story is right, the pattern should stabilise or improve as firms diffuse better remote-management practices. If the AI substitution story is right, the pattern should accelerate from 2026 onwards, because the late-2025 agentic-coding shift puts genuinely more capable AI tooling into the same firms. The next round of QCEW releases and résumé-and-posting data will bear directly on that prediction.
In the meantime, the Q4 2025 QCEW Information sector reading (employment down, wages up) is the leading-indicator signal worth watching. The pattern is real and visible across multiple datasets. What is driving it is what the next round of evidence will resolve.


