Internet Companies Will Design AI Agents the Way They Designed Feeds: Path Dependence in Agent Product Design

Fri, 12 Jun 2026 08:00:00 +0800

TL;DR A few weeks ago I wrote about the structural shape of inbox-native agents vs. Chrome extensions. That post asked what shapes of agent exist and which one fits which workload. This post asks a different question: who is building these agents, and what organizational baggage will they ship by default? My answer, after a half-year running two agent products in production next to incumbents in the same space, is uncomfortable: internet companies will design AI agents the way they designed feeds — open loops, growth metrics, attention capture — because the org, the dashboards, and the muscle memory are the same. The agent you want (calm, draft-only, guardrail-first) is the one built by a team that was never paid to capture your attention. The agent you will be offered is the one built by a team that always was. The difference is not the technology. The difference is the org chart.

A few weeks ago I wrote about the structural difference between a Chrome extension and an inbox-native AI agent, drawing on six months running both in production for cross-border sourcing workflows. The post ended on what I thought was the structural point: the inbox agent is “a junior colleague who is always at the desk” — async, multi-device, memory-driven, compounding in value with every email it reads.

I still believe that. I am no longer sure it is the most important sentence in the post.

The most important sentence, in retrospect, is the one I almost cut:

The agent that only works when the user is at the keyboard is not the agent most operators need.

That sentence is true. It is also dangerous. Always-on, multi-device, memory-driven, and compounding are exactly the four properties that made the consumer attention economy work. The same four properties that make an agent useful to a Rotterdam importer on a 4-hour train are the four properties that, in the hands of a company whose business model is attention capture, will produce the next TikTok-shaped product — except this time the loop runs inside the inbox instead of inside a feed.

So this post is a follow-up. It is a builder/buyer’s warning about organizational path dependence in agent product design. It is not a product review, and it does not name specific companies. The pattern I want to expose is structural, and it shows up anywhere a consumer-internet organization ships a new product category on top of an old operating system.

What “path dependence” means inside a product org

Path dependence is the academic phrase for the obvious observation that the past shapes what is possible next. Inside a product organization, the past is not just the codebase — it is the operating system the org runs on: the dashboards, the OKRs, the hiring pipeline, the meeting cadence, the growth team, the engagement team, the weekly review template, the quarterly business review template, the comp plan.

These things are not the product. They are the infrastructure of product decisions. When a new product category shows up — say, AI agents — the org does not redesign the infrastructure. It bolts the new product onto the infrastructure it already has, and the infrastructure quietly decides what the product is allowed to be.

Here are three concrete dynamics I have seen play out in product organizations, anonymized and generalized. Anyone building or buying an agent product should be able to recognize them:

1. The metrics org keeps tracking the metrics that exist. The dashboards in the BI tool are wired to a specific shape: DAU, WAU, MAU, time-in-app, sessions per day, notification opt-in rate, retention curve, conversion. A new product is asked to report against these metrics because those are the metrics the dashboard already shows. The product team is not making a conscious decision to optimize for attention. They are answering a question the dashboard already knows how to ask.

2. The growth team keeps A/B testing the open-loop mechanic. A growth team is good at one thing: finding the variable in the product that most increases the metric in #1. For a feed, that variable is the next-item-to-scroll-to. For an inbox agent, the analogous variable is the next-draft-to-push-notify-about. The growth team does not need to be told to test this. They will test it because that is the muscle. The PM does not need to approve it. The A/B test will run because the infrastructure supports it.

3. The PM keeps being measured on engagement, not on outcomes. The PM’s comp plan and the PM’s quarterly review are tied to engagement metrics because that is how PMs have been evaluated for fifteen years. A PM who ships a product that the user uses less but gets more done with is, on paper, underperforming. A PM who ships a product that the user uses more but accomplishes less with is, on paper, a hero.

None of these people are bad actors. None of them are deliberately designing an attention black hole. They are running the org’s operating system. The agent that ships from a consumer-internet org will be optimized for the metric the org already knows how to measure, not the metric the new product category actually needs.

The five design decisions that betray the muscle memory

The way to see path dependence in an agent product is to look at the default design decisions, not the marketing decisions. Defaults are what the product does when the user does not actively intervene. Defaults are what the org shipped when nobody was watching. And defaults, in my experience, betray the muscle memory cleanly.

Here are the five default design decisions that, in my reading of the inbox-agent category, separate the products built by internet-incumbent teams from the products built by teams without that muscle memory. I have phrased each as a false default and a calm default, because that is the binary the org’s operating system is making for you.

False default 1 — Open-loop inbox (never hits zero). The agent generates drafts continuously, surfaces them in a constantly-refreshing queue, and the queue never closes. The user’s mental model: there is always more to do. The calm default: closed-loop task list. The agent finishes the day’s work, the queue closes, and the user has permission to stop. If the queue is not closed by design, the user will not close it by willpower.

False default 2 — Push notifications for drafts. The agent sends a phone notification when a draft is ready, when a follow-up is overdue, when “an insight” is available. The user is interrupted in the middle of dinner, a meeting, a kid’s bedtime. The calm default: pull-only. The user opens the agent when they want to see what is there. The agent does not knock. Especially in an inbox-native agent, push notifications are the single most reliable way to convert a productivity tool into an attention sink.

False default 3 — Variable reward on “agent insights.” Some insights are useful; most are not. The product surfaces them with the same chrome and the same frequency, and the user has to look at every one to find the useful ones. The calm default: trust-graded surfacing. Useful insights (overdue follow-up, supplier renegotiation) get a distinct, rare, high-signal treatment. Low-signal observations stay in a daily digest the user reads on their own time. The variable-reward mechanic is the same mechanic that made the feed addictive; importing it into an agent is importing the exact thing the user is trying to escape.

False default 4 — Auto-suggest next action. The agent drafts the next email before the user asks. The user opens the inbox, and there are 7 drafts waiting, each representing a decision the user now has to make. The calm default: wait-for-prompt. The agent watches; it does not draft unless the user explicitly asks. The auto-suggest mechanic, in the agent form factor, is a decision-debt generator. Every auto-suggested draft is a decision the user is now responsible for, even if they delete it.

False default 5 — Gamified streak / score. The product shows “you replied 23% faster this week” or “you sent 14 supplier follow-ups in 3 days.” The metric is real; the framing is borrowed straight from consumer apps. The calm default: outcome-based reporting. The product reports on what got done, not on the activity. “Quoted Hamburg Logistics at $4.20 — they accepted on the third round, 2% below your last cycle” is a calm report. “You sent 14 emails” is a gamification report.

These five are not the only defaults. They are the five I have watched the most carefully. In every case, the false default is the easier default to ship, because it is the default the org’s tooling, dashboards, and comp plans are already aligned with. The calm default is harder to ship, because the org has to choose it over the easier path, and the choice will not show up on a quarterly dashboard the org already knows how to read.
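To make the calm alternative to false default 3 concrete, here is a minimal sketch of trust-graded surfacing. It is an illustration under stated assumptions, not a description of any shipped product: the `Insight` type, the `kind` labels, and the `route` function are all hypothetical names, and a real agent would need a better signal test than a fixed set of labels.

```python
from dataclasses import dataclass

@dataclass
class Insight:
    text: str
    kind: str  # hypothetical label assigned upstream by the agent

# Assumption: only these kinds earn the rare, high-signal treatment.
HIGH_SIGNAL_KINDS = {"overdue_follow_up", "supplier_renegotiation"}

def route(insights):
    """Split insights into (surface distinctly, hold for the daily digest)."""
    surfaced, digest = [], []
    for insight in insights:
        if insight.kind in HIGH_SIGNAL_KINDS:
            surfaced.append(insight)
        else:
            digest.append(insight)
    return surfaced, digest

surfaced, digest = route([
    Insight("Follow-up to supplier is 5 days overdue", "overdue_follow_up"),
    Insight("You received 12 newsletters this week", "observation"),
])
```

The design choice is that the low-signal item is never dropped, only deferred: it lands in the digest the user reads on their own time, so nothing competes with the one item that actually needs attention.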

The mental-model trap: funnels vs. tasks

Internet PMs are trained to think in funnels, sessions, retention curves. The training is good. The categories are wrong for agents.

A funnel is a model of conversion — turning a non-user into a user, turning a user into a paying user, turning a paying user into a power user. The unit is the user, and the metric is the user’s progression through the funnel.

An agent product is not a funnel. The user is already a user — they have the inbox, they have the work, they have the relationships. The product’s job is to compress the work, not to convert the user. The unit is the task, and the metric is the task’s quality of completion.

This sounds abstract. It is concrete in practice. A funnel-trained PM asks: “how do I get the user to open the agent 3 times a day?” A task-trained PM asks: “how do I get the user to send 3 supplier follow-ups a week without opening the agent at all?” Same business. Same user. Opposite product.

The funnel-trained PM will ship notifications, streaks, and variable-reward surfacings. The task-trained PM will ship background execution, draft-only mode, and a daily calmness report. Both are trying to make the product successful. They will ship opposite products, because they are answering different questions, and they were not aware the question had changed.

The deepest version of the path-dependence problem is not that internet orgs ship the wrong feature. It is that internet orgs do not realize they are answering the wrong question. The funnel is the only question their org knows how to ask.
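The two PM questions can be put side by side as two metrics computed over the same event log. This is a toy sketch: the event names, the two functions, and both one-week logs are invented for illustration and are not data from any product.

```python
def funnel_metric(events, days):
    """Average agent opens per day: the number a funnel-trained PM watches."""
    return sum(1 for e in events if e == "open") / days

def task_metric(events, days):
    """Follow-ups sent per week: the number a task-trained PM watches."""
    return sum(1 for e in events if e == "follow_up_sent") * 7 / days

# Hypothetical one-week logs for the same user under two product designs.
engagement_design = ["open"] * 21 + ["follow_up_sent"] * 1
calm_design = ["open"] * 2 + ["follow_up_sent"] * 3

for name, log in [("engagement", engagement_design), ("calm", calm_design)]:
    print(name, round(funnel_metric(log, 7), 2), round(task_metric(log, 7), 2))
```

On these invented logs the engagement design wins the first metric and loses the second. Which design looks successful depends entirely on which function the dashboard calls.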

Two counter-examples I want to be honest about

It would be lazy to say “all internet companies will ship attention-capture agents.” The pattern is strong; it is not total. Two counter-examples are worth dwelling on, because each one teaches something specific about the path the calm-by-default product has to take to survive inside a consumer-internet org.

Counter-example 1: The “screen-time-style” guardrail that shipped inside an attention-capture business. You know the one. It is the feature inside a consumer-internet product that tells you how much time you have spent in the product and offers to limit it. It exists, it works, and it is the single most informative product decision of the last decade for the argument I am making. The reason it exists is not that the org woke up one morning and decided to protect user well-being. The reason it exists is that external pressure (regulators, public trust crises, executive-level edict) made the alternative, not shipping it, worse for the parent business than the lost engagement. That is the lesson: guardrails ship when the cost of not shipping them crosses a threshold. They do not ship from product instinct.

Counter-example 2: The smaller, productivity-native vendor that ships calm-by-default. A handful of agent products in the productivity space ship with do-not-disturb-by-default, daily action caps, weekend read-only modes, and weekly calmness reports. They are not internet incumbents. They are small teams, often with productivity or note-taking roots, often with founders who came from engineering or research rather than consumer growth. They ship calm-by-default not because they had a grand principle, but because they did not have the growth-team muscle pushing the other way. The absence of the muscle produced the calm product. That is not a recommendation for founding such a team. It is an observation that a product’s posture tracks its team’s muscle memory in both directions: teams without the growth muscle tend to ship the calm product, and teams with it tend to ship the attention-capture product.

The honest synthesis: the agent products that don’t feel like attention black holes are the ones built by people who never built attention black holes. The corollary, less comfortable: the agent products that do feel like attention black holes are the ones built by people who did.

What this means for the cross-border operator buying or building an agent

I want to land this in the same operator’s note voice as the parent post, because most readers of this blog are running real businesses and do not have time for framework-only essays. Two practical takeaways, one for buyers and one for builders.

For the buyer: read the org chart before the landing page

The single most informative question you can ask a vendor, when evaluating an agent product, is: “what metric does your product team get measured on?” Not what metric do they claim to optimize for. What metric does the team’s quarterly review actually score them on.

If the answer is engagement / retention / time-in-app / notification opt-in — the metric family the consumer-internet org already knows how to measure — the product will drift toward attention capture over the next 12 to 18 months, even if the current release is calm-by-default. The drift is structural. It is not a choice the product team is making. It is the org’s operating system doing what operating systems do.

If the answer is tasks completed, attention preserved, decision quality, user-reported calmness, or some outcome-based metric — and especially if the answer includes the word boundary — you may have found the rare product. Ask to see the dashboard. If they will not show it, that is a signal too.

The org chart is a leading indicator. The landing page is a lagging indicator. The landing page is what the org wants you to believe. The org chart is what the product is.

For the builder: your org’s blank slate is your product advantage

I am going to say something uncomfortable to anyone building an agent product at a small, productivity-native team. Your lack of a growth team is your product advantage. Spend it on calm-by-default.

The reason I say it uncomfortably is that this advantage is invisible to you. From inside the org, it looks like “we don’t have a growth function, we should hire one.” From inside the org, the open-loop inbox is a feature the product manager keeps proposing, and the product manager keeps proposing it because they have read six blog posts about engagement metrics. From inside the org, the calm default looks like “we are leaving growth on the table.” It is not leaving growth on the table. It is the product.

If you hire a growth team, they will A/B test the open-loop mechanic. The tests will work. The metric will go up. The product will get worse for the user, on a dimension the metric does not capture. This is not a failure of the growth team. It is a success of the growth team, measured by the wrong metric.

The builder’s job, in 2026 and 2027, is to choose the metric before building the product, and to defend that metric against the well-meaning people inside the org who will keep proposing the engagement metric. Calm-by-default is not a feature you ship. It is a metric you choose, and then everything else follows.

The five design defaults I would ship if I were building an inbox agent today

This is the part of the post I most wanted to write, because it is the part I can be concrete about. The list below is a starting point, not a destination. But every one of these defaults exists in tension with the false defaults I described above, and every one of them is the kind of default that, in my reading, a calm-by-default product has to ship on day one. You cannot bolt them on later: once the org is steering by an engagement metric, these defaults hold that metric down, and the org will not leave them on.

I want to acknowledge upfront: these are the defaults I would ship, drawn from the agent products I have built and the agent products I have used. Other builders will have other defaults. The point of the list is not to be canonical. The point is to show that calm-by-default is a set of specific, named, shippable decisions, not a vibe.

1. Draft-only by default. The agent produces drafts; the user reviews and clicks send. Auto-send exists only for specific, opt-in sequences (overdue-follow-up detection, in narrow contexts). The reasoning is laid out in the parent post — the user is responsible for the agent’s output, and the cost of an un-reviewed auto-send is structurally higher than the cost of an extra 20 seconds of review. This default is the foundation. Every other default depends on it.

2. Office hours for the agent. The agent does not generate drafts, send notifications, or surface insights between 22:00 and 08:00 in the user’s local timezone. The boundary is hard. The user’s mental model: the agent is at work when I am at work, and the agent is at home when I am at home. This default is the one that most clearly inverts the consumer-internet default of “always-on, always-pinging, always-engaging.” It is also the default that is most likely to be challenged inside a consumer-internet org, because the metric will visibly drop during the off-hours window. The metric is wrong. The boundary is right.

3. Bounded daily action quota. A hard cap on drafts per day — somewhere around 15 to 25, depending on the operator’s workload — with anything beyond the cap treated as observation, not action. The reasoning is structural: the inbox is a finite medium, and the agent’s job is to compress the work in the inbox, not to manufacture work that does not exist. If the agent is generating more than ~20 drafts per day for a single user, the agent is probably not compressing work — it is creating work. The quota is the guardrail that catches this early.

4. Read-only mode for weekends. The agent reads email, watches for follow-up signals, and prepares a Monday-morning summary. It does not draft, it does not notify, it does not surface insights in real-time. The user’s mental model: the work is being watched, not done. This default is the one that protects the longest continuous chunk of cognitive rest in most users’ week. It is also the one that the engagement metric will be most allergic to. Ship it anyway.

5. A weekly “calmness report” alongside the activity report. Most agent products report on activity: drafts generated, follow-ups sent, time saved. The calmness report reports on the opposite: how many notifications did the user not receive (because they were outside office hours), how many drafts did the agent not generate (because the quota was met), how many decisions did the user not have to make (because the agent waited for a prompt). The user sees, in numbers, the attention the agent preserved. This is the report that makes the calm posture visible to the user, which is the only way it survives a quarterly review.
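Defaults 2 through 5 can be sketched as a single gate the agent consults before drafting. This is a minimal sketch under assumptions, not the implementation of any product: `CalmPolicy`, `CalmGate`, and the reason labels are hypothetical names, the 08:00 to 22:00 window and the cap of 20 are the values discussed above, and the caller is assumed to pass a datetime already converted to the user's local timezone.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date, datetime, time
from typing import Optional

@dataclass
class CalmPolicy:
    office_start: time = time(8, 0)   # agent is "at work" from 08:00 local
    office_end: time = time(22, 0)    # and "at home" from 22:00 local
    daily_draft_cap: int = 20         # somewhere in the 15 to 25 range
    weekend_read_only: bool = True    # a flag, so it can be turned off

@dataclass
class CalmGate:
    policy: CalmPolicy = field(default_factory=CalmPolicy)
    day: Optional[date] = None
    drafts_today: int = 0
    suppressed: Counter = field(default_factory=Counter)

    def may_draft(self, now_local: datetime) -> bool:
        """Return True and count the draft, or record why it was withheld."""
        if now_local.date() != self.day:          # new day: reset the quota
            self.day, self.drafts_today = now_local.date(), 0
        p = self.policy
        if p.weekend_read_only and now_local.weekday() >= 5:  # Sat=5, Sun=6
            self.suppressed["weekend_read_only"] += 1
            return False
        if not (p.office_start <= now_local.time() < p.office_end):
            self.suppressed["outside_office_hours"] += 1
            return False
        if self.drafts_today >= p.daily_draft_cap:
            self.suppressed["quota_reached"] += 1
            return False
        self.drafts_today += 1
        return True

    def calmness_report(self) -> dict:
        """Counts of drafts the agent did not generate, by reason."""
        return dict(self.suppressed)
```

The point of routing every refusal through `suppressed` is that the calmness report falls out of the gate for free: the same check that withholds a draft also produces the number the user sees at the end of the week.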

I will say, honestly, that I have shipped four of these five in the inbox agent I have built. I have not shipped the weekend read-only mode as a default, because I am still worried about the user who has a supplier emergency on a Sunday. The worry may be wrong. For now it is a setting the user can opt into rather than a default, which is itself a small concession to the engagement-metric worldview. I want to be honest about that concession, because the post would be weaker if I claimed to have shipped all five cleanly.

What I think happens next (a personal view)

I expect three things, and I want to mark them clearly as predictions rather than observations. Predictions age badly, and I am willing to be wrong about each of these.

1. By 2027, “calm agent” becomes a recognized category descriptor. The way “privacy-first” became a category descriptor in the late 2010s — not because the vendors suddenly became virtuous, but because the buyer started asking the question and the marketing had to answer it — “calm agent” will become a phrase buyers use to filter products. The vendors that win will be the ones who ship the guardrails before the buyers ask for them. The vendors that lose will be the ones who bolt the guardrails on after a public trust incident.

2. Internet incumbents will discover, 18 months in, that the agent form factor breaks the funnel assumptions. You can scroll a feed forever. You cannot draft an email forever. The inbox is a finite medium, and the agent’s job, when done well, is to reduce the volume of email, not increase it. The engagement metric, applied to an agent product, will flatline around month 18. The product will not be deprecated — it will be quietly de-prioritized, the growth team will be reassigned, and the next quarterly review will talk about “agent adoption” rather than “agent engagement.” The shift is from “how often does the user use it” to “does the user still need it” — and the latter question does not support a growth-team operating system. I have not seen this shift happen yet. I expect to see it in 2027.

3. The first-mover advantage in calm-by-default is structural, not marketing. The vendors that ship calm-by-default on day one will accumulate trust that is hard to build later. The vendors that ship engagement-by-default and try to walk it back will discover that the walk-back is the trust event, not the original choice. This is the same pattern as “we collected your data and now we promise not to”: the promise is the scandal. The calm vendor does not have a scandal to walk back from, because they never built the open loop in the first place.

What I got wrong about path dependence in agent design

A few predictions I made early that the data has since corrected:

“Internet companies will be careful with agents because the regulatory risk is high.” No. The regulatory risk is 24+ months away. The quarterly metric pressure is this quarter. Path dependence wins on the timescale that matters, and the quarter is the timescale the org runs on. I underestimated how short the org’s horizon is. The horizon is this quarter. Anything beyond the quarter is a slide in a deck.

“Users will vote with their feet for the calm agent.” Partly. Most users vote for the frictionless agent, and the calm agent has, by design, more friction. The user who wants the calm agent is a specific user: the one who has already burned out on the engagement-by-default product and is looking for the alternative. The market for calm-by-default is real, but it is a replacement market, not a greenfield market. That is a different go-to-market motion, and a different business model, and most internet-incumbent orgs are not equipped to run it.

“The agent form factor is so different from the feed that the org will have to adapt.” Wrong. The org will frame the agent as a feed-shaped product (notifications, streaks, variable reward), and the user will experience it as a feed-shaped product, because the org is the one designing the defaults. The form factor does not force the org to adapt. The org can absorb the form factor into its existing operating system. I underestimated the org’s ability to absorb new shapes into old infrastructure.

The one-line honest answer for cross-border operators

If a European importer or a Chinese supplier operation asked me today: “I am about to buy an AI agent for my business. What should I look for?” — I would say:

The agent you want is the one built by a team that was never paid to capture your attention. The agent you will be offered is the one built by a team that always was. Read the org chart before you read the landing page.

That sentence is the structural answer to the question the parent post left open. The parent post told you what shape of agent to look for. This post tells you who will build it well. The two questions are not the same, and answering only one of them will leave you with a calm-looking product that drifts toward attention capture inside a year, because the org’s operating system will not let it stay calm.

FAQ: Internet Companies, Path Dependence, and AI Agent Design

Q: Is this just a critique of big tech? A: No — it is a structural observation about organizations whose product, metrics, and incentive systems were built for consumer attention capture, applied to a product category (agents) where the user is best served by the opposite posture. The critique is of the org, the dashboard, and the incentive system — not the people inside it. The pattern shows up most clearly at the largest consumer-internet orgs because the infrastructure is most entrenched there, but it shows up at smaller companies too, whenever the founders import the consumer-internet growth playbook.

Q: Are there internet companies building calm agents? A: A small number have shipped calm-by-default features (do-not-disturb by default, daily action caps, weekend read-only modes), usually after external pressure (regulation, public trust crises, executive edict) forced the move. The pattern is consistent: the guardrails ship when the alternative is worse for the parent business than the lost engagement. They do not ship from product instinct. If you find a consumer-internet-incumbent product that ships all five of the design defaults I listed above on day one, that is a signal — but the signal is more often a temporary posture than a structural commitment, and the product will drift when the external pressure fades.

Q: What about open-source agents — do they escape this problem? A: Open-source agents escape the organizational path dependence (no growth team, no engagement metrics, no quarterly OKR tied to time-in-app) and inherit a different one: the contributor’s mental model, which is usually engineering-flavored, not consumer-flavored. Open-source agents tend to ship calm-by-default not by design but by neglect of the attention-capture surface. That is not the same as principled calm-by-default, and it is fragile — the moment a company adopts the open-source agent and bolts a growth team onto it, the calm posture disappears. Open-source is a release from one set of path dependences and an entry into a different one.

Q: Should I just build my own agent in-house to avoid the trap? A: Building in-house lets you choose the metric, but it does not protect you from the metric you choose. Most in-house teams that build agents default to the metrics they have heard of — DAU, time-in-app, adoption — because those are the names of the dashboards. The trap is not “big company vs. small company.” It is “engagement metric vs. outcome metric.” Pick the metric first, then build the product. The five design defaults above are downstream of the metric choice; the metric choice is the one that has to come first.

Q: Does the same path-dependence argument apply to sidebar / Chrome-extension agents? A: Less so, but not zero. Sidebar agents are session-bound and DOM-anchored, which structurally limits the open-loop surface — there is no inbox that never hits zero, no background draft waiting for review at 23:00. The session boundary is itself a calm-by-default feature, even when the product team did not design it that way. The risk vector shifts from “always-on attention capture” to “session depth capture” (how long can we keep the user in the sidebar), which is a smaller and more familiar concern, and one that the extension’s session-bound architecture already partially constrains. The sidebar agent is the easier form factor to ship calmly by accident. The inbox-native agent is the form factor that requires intentional calm-by-default design.

Q: What is the earliest signal that an agent product is drifting toward attention-capture? A: Three early signals, in order of how early they appear. (1) The product ships a notification that interrupts the user with “a draft is ready” before the user asked — the agent has shifted from pull to push. (2) The marketing page starts using the word frictionless to describe the user experience — the team has started optimizing for time-in-agent, and frictionless is the engagement-metric team’s favorite word. (3) A streak or score feature appears in the product — the team has imported the consumer-app engagement model wholesale. If you see all three inside the first six months, the product will be hard to use calmly inside a year. The calm-by-default vendor will not ship any of these. The fact that they did not ship them is the leading indicator of whether they will ship them later.

About the MailMiner Editorial Team

The MailMiner Editorial Team is a group of cross-border e-commerce operators, TikTok Shop sellers, and AI tooling builders. We publish case studies drawn from real seller interviews and our own product experiments — never generic theory, never fabricated case studies.

Our focus areas include inbox-native agent design, cross-border sourcing automation, Chrome extension architecture for vertical AI tools, and AI tooling for cross-border operators. Past coverage includes a Spanish TikTok-to-Shopify founder’s journey, the Amazon refined-selection 90% framework, and the inbox-native agent vs. Chrome extension comparison that this post builds on.

Disclosure: Operational claims in this post — the 5 design defaults, the “screen-time-style guardrail” counter-example, the “engagement metric flatlines around month 18” prediction, and the structural observations about consumer-internet product organizations — are reported by the editor and not independently audited. The underlying claim that organizational metrics systems shape product defaults is widely discussed in product-management literature, but the specific application to AI agent design is the editor’s framework, drawn from 6 months of running two agent products in production next to incumbents in the same space. As with the parent post, browser- and email-platform behavior can vary across providers and versions; the Chrome extension security model and Gmail API access scope have changed multiple times in the last 24 months.

Building or buying an AI agent for cross-border operations, and trying to tell the calm-by-default product from the engagement-by-default product in the first 10 minutes of a sales call? Reach out via the About page — we read every message.
