Business Issue #134

Amazon's Math: Cut 30,000, Hire 11,000

Employees gaming an AI-usage leaderboard exposed the crack in the demand story behind Big Tech's $700B capex

Opening

Hello, reader — this is Knowledge Talking. Last week, AWS CEO Matt Garman said something remarkable in an interview: “Replacing junior employees with AI is one of the dumbest things I’ve ever heard.” In the same interview, he revealed that Amazon is hiring 11,000 interns and new graduates this year. I was floored when I heard it.

Just months before those words, Amazon had cut roughly 30,000 corporate jobs. Amazon CEO Andy Jassy had sent employees a formal letter saying “AI will reduce our total corporate workforce.” The company is also pushing a plan to replace more than 500,000 jobs with robots.

Hiring and cutting at full throttle, simultaneously — strange math. But what caught my attention more than the contradiction was a different incident inside Amazon: employees deliberately gaming an AI-usage leaderboard, a practice they called “tokenmaxxing.”1

🏷️ The People Pretending to Use AI

Here is what the Financial Times reported in May. Amazon ran an internal leaderboard called KiroRank. It was a dashboard ranking employees by token2 consumption — how heavily they used Kiro, the company’s AI coding tool. Amazon had set a target: more than 80% of developers should use AI tools at least once a week.

The result was predictable. Employees automated tasks like code deployment, email triage, and Slack message handling through an internal agent3 tool called MeshClaw — except these tasks weren’t actually needed. They were feeding busywork to AI to climb the rankings. Employees called it tokenmaxxing.

One employee told the FT: “The pressure to use this tool is enormous. There are people maxxing out token usage with MeshClaw.” Amazon said token consumption wasn’t part of performance reviews, but employees felt managers were checking the data informally. Security concerns surfaced too — MeshClaw had the authority to deploy code and interact with internal systems on a user’s behalf. One employee said the “default security settings are scary” and that they “can’t let it wander around unsupervised.” I’ve covered this pattern before in this newsletter, with Uber’s case.

Earlier issue: “What happened after Uber burned a year’s budget in four months”

This wasn’t just Amazon. At Meta, a leaderboard called “Claudeonomics” tallied token consumption across 85,000 employees — 60 trillion tokens burned in 30 days. At public API prices, that’s roughly $9 billion worth. Top users earned titles like “Token Legend” and “Session Immortal.” Some employees left AI agents running research jobs unattended for hours just to climb the rankings. It was shut down within 48 hours of press coverage.
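To put the Meta numbers on a human scale, here is a minimal arithmetic sketch using only the figures quoted above. The per-employee average and the implied blended price per million tokens are derived values, not figures from the reporting, and the calculation assumes (unrealistically) that usage was spread evenly across all 85,000 employees.

```python
# Figures quoted in the text above.
TOKENS_BURNED = 60e12      # 60 trillion tokens
EMPLOYEES = 85_000
DAYS = 30
QUOTED_COST_USD = 9e9      # "roughly $9 billion" at public API prices


def tokens_per_employee_per_day(tokens: float, employees: int, days: int) -> float:
    """Average daily consumption, assuming an even spread across employees."""
    return tokens / employees / days


def implied_usd_per_million_tokens(cost_usd: float, tokens: float) -> float:
    """Blended price per million tokens implied by the quoted cost."""
    return cost_usd / (tokens / 1e6)


per_day = tokens_per_employee_per_day(TOKENS_BURNED, EMPLOYEES, DAYS)
rate = implied_usd_per_million_tokens(QUOTED_COST_USD, TOKENS_BURNED)

print(f"{per_day:,.0f} tokens per employee per day")
print(f"${rate:,.2f} per million tokens (implied)")
```

With the quoted figures, that works out to about 23.5 million tokens per employee per day and an implied blended rate of $150 per million tokens. The real mix of models, input versus output tokens, and caching is unknown, so treat the rate as a sanity check on the headline cost rather than as a price anyone actually paid.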

At Microsoft, president Julia Liuson sent an internal memo declaring that “AI usage is no longer optional — it’s core to every role and every level.” Nvidia’s Jensen Huang went a step further, saying publicly that he’d be “deeply worried about a $500,000-a-year engineer who isn’t spending $250,000 worth of tokens.”

The interesting part: all three companies shut down or restricted their leaderboards at almost the same time. Across Big Tech, AI usage itself had become a performance metric — and the industry simultaneously admitted the metric had failed.

📊 Is the Demand Behind $700 Billion Real?

Garman summed up the episode this way: “You have to measure the thing you’re actually trying to measure. We built the wrong metric, and people found ways to game the metric instead of the goal.”

Fair enough. But a more fundamental question remains: why was this metric built in the first place?

This year, combined AI capital expenditure (CAPEX)4 across Amazon, Alphabet, Meta, and Microsoft is roughly $700 billion (about ₩1,000 trillion), up more than 70% from $410 billion last year. The core premise of this investment is that “AI demand will keep outrunning supply.”
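The capex figures above can be checked with a few lines of arithmetic. This sketch uses only the three numbers quoted in the paragraph; the exchange rate it prints is the one implied by the rounded won figure, not an actual market rate.

```python
# Figures quoted in the text above.
CAPEX_THIS_YEAR_USD = 700e9   # roughly $700 billion
CAPEX_LAST_YEAR_USD = 410e9   # $410 billion
CAPEX_THIS_YEAR_KRW = 1000e12 # about 1,000 trillion won

# Year-over-year growth as a fraction of last year's spend.
growth = (CAPEX_THIS_YEAR_USD - CAPEX_LAST_YEAR_USD) / CAPEX_LAST_YEAR_USD

# Exchange rate implied by the rounded won figure (an inference, not a quote).
implied_krw_per_usd = CAPEX_THIS_YEAR_KRW / CAPEX_THIS_YEAR_USD

print(f"Growth: {growth:.1%}")
print(f"Implied exchange rate: {implied_krw_per_usd:,.0f} KRW per USD")
```

Growth comes out just under 71%, consistent with “more than 70%,” and the won figure implies a rate of roughly 1,430 won to the dollar.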

But what if part of that demand was tokenmaxxing? Of the 60 trillion tokens that 85,000 employees burned in 30 days, how much was real work — and how much was leaderboard gaming?

External paying-customer demand is a separate matter, of course. In a CIO survey Garman cited, roughly 90 out of 100 raised their hands to say they’re seeing ROI on AI investment. He pointed to insurance claims processing cut dramatically from 60–90 days, and agents lifting a business process success rate from 20% into the 90s.

And Amazon does ship AI products that actually work. Amazon Connect Talent, launched in April, is a recruiting tool where AI autonomously conducts voice interviews around the clock. “Our recruiters want to focus on finding candidates and building relationships,” Garman said. “Punching in details is not what they want to do.” So he’s selling a tool that replaces recruiting interviews with AI while calling junior-replacement “dumb.” The two statements look contradictory, but they coexist within Garman’s logic: if AI handles the repetitive work, people move up to higher-value work.

But Garman himself conceded something: “Just as many companies — more, actually — are shutting down PoCs5 that didn’t show results.” The point isn’t that demand is all phantom. The problem is that in the usage data underpinning hundreds of billions in infrastructure investment, performance signaling and actual value creation are mixed together with no way to tell them apart.

Economics has a concept for this: Goodhart’s Law.6 “When a measure becomes a target, it ceases to be a good measure.” The moment token consumption was set up as the metric for AI productivity, it lost its ability to measure AI productivity. Tokenmaxxing is the textbook case.
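A toy illustration of the mechanism, with entirely hypothetical employees and token counts: as long as nobody pads, ranking by total tokens matches ranking by useful work. Once one person starts feeding busywork to the tool, the leaderboard order diverges from the value order.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    name: str
    useful_tokens: int       # tokens spent on work that was actually needed
    padding_tokens: int = 0  # tokens spent on busywork to climb the board

    @property
    def total_tokens(self) -> int:
        return self.useful_tokens + self.padding_tokens


def rank(employees, key):
    """Return names ordered from highest to lowest by the given key."""
    return [e.name for e in sorted(employees, key=key, reverse=True)]


# Hypothetical team; all numbers are made up for illustration.
team = [
    Employee("A", useful_tokens=900),
    Employee("B", useful_tokens=600),
    Employee("C", useful_tokens=300),
]

value_rank = rank(team, key=lambda e: e.useful_tokens)
before = rank(team, key=lambda e: e.total_tokens)

# The leaderboard becomes a target: C starts running unneeded jobs.
team[2].padding_tokens = 2000
after = rank(team, key=lambda e: e.total_tokens)

print("By useful work:      ", value_rank)
print("Leaderboard (before):", before)
print("Leaderboard (after): ", after)
```

Before the padding, the leaderboard and the value ranking agree. Afterward, the employee doing the least useful work sits at the top, while the dashboard itself shows nothing that separates the two kinds of tokens.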

🔍 Korea Is Losing Its Ladder Too

Let me pause on Garman’s “Excel analogy.” He argued that “Excel eliminated manual bookkeeping jobs, but people learned computers and the labor market expanded.” He’s not wrong. But the structure is different this time.

A Stanford research team analyzing ADP payroll data found that in AI-exposed occupations, employment of early-career workers aged 22–25 fell about 20%, while employment of workers in their 30s and up actually rose 6–9%. AI isn’t eliminating all jobs — it’s automating the work juniors learn on: summarizing, organizing, drafting. The bottom rungs of the ladder.

Korean data shows the same pattern. According to the hiring platform Catch, entry-level job postings at large IT and telecom companies plunged 67% year over year. Entry-level regular hiring across all large companies fell 43%. In May, Korea’s count of regular salaried workers turned negative year over year — for the first time in 26 years and 5 months. Regular workers in their 20s fell by 57,000 in the information and communications sector alone, while workers in their 30s in the same sector grew by 26,000. Hiring is being restructured from entry-level to experienced.

When Excel arrived, bookkeepers could climb up to become accountants. But what AI is automating now is the climbing itself — the junior developer learning the system by writing boilerplate, the junior analyst absorbing business context by cleaning data. In the words of the Stanford Social Innovation Review, what AI ate was “the low-risk, repetitive work that taught people how to work in organizations.” When that learning-work disappears, a structural hole opens in the talent pipeline — and in 3–5 years, the pool of people who should become middle managers and seniors runs dry.

Some companies see the problem. IBM decided that a strategy built purely on AI efficiency isn’t sustainable long-term and announced it would triple entry-level hiring. IBM’s CHRO said “the companies that will succeed most in 3–5 years are the ones doubling their junior hiring right now.” Cognizant hired 20,000 new graduates in 2025 alone.

This is the context for reading Garman’s claim that “the most important skill for future employees is not any particular technology but the willingness to learn.” He’s right. But in a structure where the opportunities to learn are themselves shrinking, willingness alone cannot replace the ladder.

Oswarld’s Take

Two decades of building GTM and technology strategy for companies have taught me, over and over, that measurement creates behavior. Set MAU as the KPI and employees chase signups instead of retention. Measure code commits and meaningless commits multiply. Measure token consumption and tokenmaxxing is the inevitable result.

What I find more telling is how this phenomenon interlocks with the structural incentives of an infrastructure vendor. AWS sells AI infrastructure. Customers must use a lot of AI for servers to sell. So when Garman says “jobs will change, not disappear,” he may be sincere — but it’s also the position he is commercially required to hold. If customers fear AI adoption, they don’t buy infrastructure.

This isn’t about Garman personally. It’s the structural double bind every infrastructure vendor carries: you have to say “AI will change the world” while also saying “the world must not change too much.”

The real test of AI adoption is simple. Ask not “how much are we using it” but “what has actually changed.” If insurance claims processing dropped from 60 days to 3, that’s valuable regardless of how many tokens were burned. This is exactly why Korean companies pursuing AI transformation should design outcome metrics before usage metrics.

Closing

To sum up.

Amazon cutting 30,000 while hiring 11,000 isn’t a contradiction — it’s a portfolio swap. Whether that swap rests on real productivity or on the phantom of “usage,” tokenmaxxing-style, is still being tested. For the $700 billion Big Tech is pouring in this year, and for the AI transformation investments Korean companies are making in its wake, the simplest question that separates real demand from noise is this: not “how much AI are we using” but “what has AI actually changed.”

Have you ever been told to measure “usage rates” or “adoption” after your company rolled out AI tools? Was what you measured the tool’s value — or just evidence of use? Share your experience in the comments and I’ll fold it into a future issue.


The author, Kwangseob Ahn, is a professor of business administration at Sejong University and lead consultant at OBF (Oswarld Boutique Consulting Firm). He teaches statistics and data analysis — business data management and business analytics — while leading GTM and AI strategy consulting in the field, designing the seam between technology and business. He has published academic research on memory architectures for AI dialogue systems (HEMA) and runs Daily Arxiv, a daily curation of global AI papers. He holds a graduate degree from Korea University’s Graduate School of Technology Management and a KMBA. He is the author of Homo Brainless: The People Who Outsource Their Thinking (Korean edition).

Footnotes

  1. Tokenmaxxing: artificially inflating AI tool usage to climb internal leaderboard rankings. A portmanteau of “token” (the unit of data AI processes) and “maxxing” (maximizing) — a phenomenon that surfaced across Big Tech simultaneously in 2026.

  2. Token: the smallest unit of data an AI model uses to process text. One Korean character is roughly 2–3 tokens, and most AI services price by token consumption.

  3. Agent: AI software that judges and executes tasks on its own, without human intervention. Unlike a simple chatbot, it can autonomously handle real work — deploying code, processing email, managing schedules.

  4. CAPEX (Capital Expenditure): what companies invest in long-term assets like data centers, servers, and real estate. In the AI era, GPU servers and power infrastructure make up most of it.

  5. PoC (Proof of Concept): a small-scale experiment to confirm whether a new technology or idea actually works. Success leads to full adoption; failure ends it.

  6. Goodhart’s Law: the principle, named after British economist Charles Goodhart and usually summarized as “when a measure becomes a target, it ceases to be a good measure.” Token consumption lost its ability to measure productivity the moment it turned from metric into target, which makes it a textbook example.