“AI” vs. the Web
A concise, citation-rich memo on “AI” claims from a moderate Web software perspective.
Last updated: 14 July 2025
Authored by Lincoln Russell, 9 May 2025, CC BY-SA
What do we mean by “AI” today?
- Machine learning (ML) uses statistics-based algorithms to recognize patterns and generalize to unseen data, which allows it to perform tasks without specific instructions.
- ML is grounded in computational learning theory, a key tenet of which is “Probably Approximately Correct” (PAC) learning: given finite data and an uncertain future, the goal is a result that is correct with high probability, not a guaranteed correct one.
- A large language model (LLM) is a type of machine learning optimized for natural language processing that uses self-supervised learning on huge datasets.
- “LLM” entered the lexicon with OpenAI’s GPT-3 in 2020 (more broadly in 2022).
- A prompt is input to an LLM. Prompts are added at multiple levels: the service provider (e.g. OpenAI), the app (e.g. a bot provider), and finally the end user.
- Retrieval-augmented generation (RAG) is a way of feeding specific content into a pre-trained LLM to give it domain-specific information, or information not in its training data. It also allows the LLM to cite its sources in the response. However, it adds risks.
- Functionally, RAG searches your content first, then prompts the LLM with the user prompt combined with the documents retrieved by that search (a minimal sketch follows this list).
- Generation-augmented retrieval (GAR) flips the script: rather than outputting conversational text based on additional documents, you leverage the LLM to enhance & add context to your “normal” search results.
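For illustration, a minimal RAG sketch in Python. The OpenAI chat API is real; `search_documents` and the document fields are hypothetical stand-ins for your own content store, and the prompts show the provider / app / end-user layering described above.

```python
from openai import OpenAI

client = OpenAI()  # the provider layers its own hidden prompts on top of everything below

def answer_with_rag(user_prompt: str) -> str:
    # 1. Retrieve: search your own content for documents relevant to the question.
    docs = search_documents(user_prompt, limit=3)  # hypothetical search helper
    context = "\n\n".join(f"[{doc.title}] {doc.body}" for doc in docs)

    # 2. Augment: combine the app-level prompt, the retrieved documents, and the user prompt.
    messages = [
        {"role": "system",  # app-level prompt (e.g. a bot provider)
         "content": "Answer using only the documents provided, and cite them by title."},
        {"role": "user",    # end-user prompt, wrapped with the retrieved context
         "content": f"Documents:\n{context}\n\nQuestion: {user_prompt}"},
    ]

    # 3. Generate: the LLM returns a PAC answer grounded in (but not guaranteed by) the documents.
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response.choices[0].message.content
```

GAR inverts this flow: run the normal search first, then use the LLM only to summarize or annotate each result, keeping the search results themselves authoritative.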
AI is technically a broader term than ML (a subset of AI), but today we’re talking about LLMs, which are a subset of ML. In other words, “AI” is being used as a marketing term for “LLM” (the technical term). This matters because there is no intelligence / reasoning behind LLMs, but rather a statistical model for achieving PAC (“Probably Approximately Correct”) results.
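For reference, one standard statement of the PAC guarantee from learning theory: with enough samples, the learned hypothesis is approximately correct (error at most ε) with high probability (at least 1 − δ). Neither parameter is ever zero, which is why “not correct at all” remains a possible outcome.

```latex
% PAC guarantee: with probability at least (1 - \delta) over a training sample of size m,
% the learned hypothesis h has true error at most \epsilon.
\Pr\big[\,\mathrm{err}(h) \le \epsilon\,\big] \ge 1 - \delta,
\qquad m \ge \mathrm{poly}\!\left(\tfrac{1}{\epsilon}, \tfrac{1}{\delta}\right)
```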
Buzzwords explained
- “Generative AI” — Marketing term for LLM output (vs other ML-based tech).
- “Hallucination” — Marketing term for when PAC (“Probably Approximately Correct”) isn’t correct at all (a feature of the technology, not something that can be eliminated).
- “Reasoning” — Marketing term for an LLM reprocessing its own output in multiple passes / steps in an attempt to reduce incidents of “hallucination”.
- “Prompt Engineering” — Marketing term to associate LLM prompt writing with software engineering. (By contrast, engineering has deterministic outcomes.)
- “Agentic” — Integrating APIs with an LLM to translate PAC output into deterministic actions in other systems (a minimal sketch follows this list). Much of this existed prior to LLMs using other forms of ML, such as sentiment analysis and auto-escalation of support tickets.
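As referenced in the “Agentic” entry above, a minimal sketch of the pattern, assuming a hypothetical `ticket_api`: probabilistic LLM output is parsed and validated before it is allowed to trigger a deterministic action in another system.

```python
import json

ALLOWED_ACTIONS = {"escalate_ticket", "close_ticket"}  # the only deterministic actions we expose

def act_on_llm_decision(llm_output: str, ticket_id: int) -> None:
    """Translate PAC output into a deterministic action, refusing anything unexpected."""
    try:
        decision = json.loads(llm_output)  # e.g. '{"action": "escalate_ticket", "reason": "refund dispute"}'
    except json.JSONDecodeError:
        raise ValueError("LLM output was not valid JSON; refusing to act")

    action = decision.get("action")
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"LLM proposed an unrecognized action: {action!r}")

    if action == "escalate_ticket":
        ticket_api.escalate(ticket_id, reason=decision.get("reason", ""))  # hypothetical ticket system API
    else:
        ticket_api.close(ticket_id)
```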
We are in the midst of a game of one-upmanship among LLM vendors to invent new terms and “next generation” features. To assess the landscape today, refer to the LLM Leaderboard or Zapier’s list.
Benefits & strengths of “AI” today
The current generation of LLMs was a turning point in achieving PAC results for:
- Data trend identification
- Prose authoring & summarizing
- Image generation & identification
- Audio transcription
- Search enhancement
- Code authoring & summarizing (in limited capacity)
These tasks are automatable, and enabling non-engineers to achieve PAC results through prompt writing is a key benefit. Benefits lean into content discovery, pattern recognition, blending many inputs, and coping with unpredictable inputs (e.g. natural language).
Due to the rate of LLM releases and relative newness of mass-access, there is still much discovery to be done regarding how the above strengths can be remixed for individual business cases beyond surface-level integrations like chatbots, search, and summarization.
Barriers to realizing benefits
Tactically, the most common limitations in realizing value from LLMs are:
- Structured access to data
- Having a sufficient corpus of content to make an LLM domain-specific
- Structuring complex prompts to get PAC results within a desired threshold
Strategically, the primary barriers are:
- Unrealistic expectations rooted in magical thinking and conformity.
- Many AI implementations are simply marketing spin or fraud designed to generate positive buzz.
- The future stability of LLMs is uncertain (see below).
These distract from & impede discovery of useful implementations and threaten their future viability.
Known Risks & Costs: Product / Marketing
- Anthropomorphizing AI promotes over-trust, with negative consequences.
- Association with a broader AI backlash.
- What are we even doing? (Nov 2024)
- AI policy update proposal (Apr 2025)
- Study: Your coworkers hate you for using AI at work (May 2025)
- The people refusing to use AI (May 2025)
- Running afoul of specific academic concerns about use & impact of AI.
- AI, students, and epistemic crisis (Jul 2024)
- Lack of clarity on long-term costs of cloud-based LLMs currently operating at a loss.
- Revenue durability in the LLM world (Apr 2024)
- Personal & business privacy concerns from staff, customer, and vendor access.
- User-facing LLM content is unpredictable with potentially dramatic failure modes.
- “A modest defense of Marco Buscaglia” (AI vs. journalism), Gonzalez (June 2025)
Known Risks & Costs: Engineering / Tooling
- Emerging tech is R&D work, which is lower short-term ROI than strategic roadmap work.
- PAC creates a QA challenge because most automated test strategies will fail; it has the potential to dramatically increase testing costs (or defect rates). A sketch of a tolerance-based test follows this list.
- “Agentic AI” creates novel risks for security & compliance.
- “Agentic AI Red Teaming Guide” Cloud Security Alliance (CSA) (May 2025)
- “Companies are Discovering a Grim Problem With Vibe Coding” Tangermann, Futurism (May 2025)
- LLM spiders are a constant threat to Web infrastructure & “open” philosophy.
- Top-down tooling edicts are bad for morale & the business.
- Uncritical use of LLMs in code can speedrun creation of technical debt.
- We lack evidence that LLMs improve velocity on engineering teams.
- We have evidence that LLMs harm learning and promote dependence.
- “How AI Vaporizes Long-Term Learning”, Edutopia (Jan 2025)
- “Generative AI runs on gambling addiction”, Gerard, Pivot to AI (June 2025)
- Only 25% of AI initiatives are delivering ROI.
- No available data demonstrates AI-enabled capacity gains in software engineering, but engineers self-report spending less time doing valuable work when using AI.
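As referenced in the QA bullet above, a sketch of why conventional assertions fail against PAC output and what a tolerance-based test can look like instead. `summarize` is a hypothetical LLM-backed function and the 90% threshold is an illustrative choice, not a recommendation.

```python
TICKET_TEXT = "I was charged twice for my subscription and would like my money back."

# An exact-match assertion is flaky by design against probabilistic output:
#   assert summarize(TICKET_TEXT) == "Customer requests a refund"

def test_summary_usually_mentions_refund():
    # Tolerance-based alternative: run the nondeterministic step repeatedly
    # and assert on the pass rate rather than on a single deterministic result.
    passes = sum("refund" in summarize(TICKET_TEXT).lower() for _ in range(20))
    assert passes >= 18  # accept a 90% pass rate; note this multiplies test cost and runtime
```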
Analysis of risks & costs
“Genuine revolutions create genuinely irrational environments” and the best strategy under those circumstances, for both skeptics and true believers, is pessimistic adoption: “only adopt specific implementations that are well-tested and have a demonstrated effectiveness.”
While we do not know if LLMs are the future of software engineering, we lack data supporting a significant positive impact from them in the present.
Writing process vs software engineering process
For internal use cases involving natural language (e.g. emails, meetings, presentations), LLMs are a significant advancement with trivially constrained risks in the hands of capable users.
For use cases involving coding language, benefits are often more muted (or non-existent) while the risks & costs are dramatically higher. LLMs best assist with code authoring (the act of typing code), which is typically not a noteworthy bottleneck in the software development lifecycle.
Understanding how work impacts a broader system and having sufficient context to consistently make good decisions is a much larger concern and far more time-consuming than typing code. Using an LLM to author your code may impede or short-circuit this process entirely, emphasizing short-term delivery capacity over long-term sustainability and cumulative capacity, and creating an environment where capacity erodes over time as your team effectively suffers “brain drain” without any employee turnover.
While an email is likely consumed once, code must be accurately re-consumed many times over its lifespan by engineers, and it must remain flexible for unforeseen needs, which requires much deeper value judgements than LLMs are capable of performing. LLMs are not effective at code review, and LLM-generated code can add friction to the process by increasing the time required for review & communication about the code when neither party truly understands it.
Alternative ML-based strategies
Many use cases for applying LLMs to code were already possible with other ML technologies. While LLMs were a big leap for natural language enablement, they were a much smaller iteration for engineering enablement. It’s likely there are more appropriate existing tools for any given challenge, which makes it important to not presuppose AI is the solution to any of them. Data analysis and quality assurance are key domains where ML tools already proliferate.
Future stability of LLMs
There are a number of factors threatening the stability and success of LLMs (and the world on which they rely), which we can put in three buckets:
Training LLMs is prohibitively resource-intensive
- LLM web crawlers (“spiders”) relentlessly scrape content, ignoring instructions, circumventing IP bans, and imposing dramatically higher operating costs on Web site owners.
- They’re looting the Internet (Apr 2024)
- Artificial intelligence web crawlers are running amok (Jul 2024)
- Denial (Apr 2025)
- Environmental costs from the electricity & water consumption required.
- Humans are more involved in the training process than acknowledged.
- How Syrian Refugees in Lebanon Train AI (Jul 2024)
- Consuming illegally obtained copyrighted material is considered a requirement for training an LLM.
Diminishing returns on LLM investments
- They’ve run out of novel text to train the largest LLMs. Having scraped the known Internet and all accessible digitized data, there’s simply no way to continue scaling them this way.
- LLMs are starting to ingest other LLM output as training material as that content proliferates across the Web, which demonstrably degrades the quality of their output.
Lack of visibility into quality & methods
- There is no visibility into how LLMs work, making them a black-box technology.
- Uneven quality & target use cases make the market unpredictable.
- Artists & authors are actively poisoning content to disrupt the quality of LLM output.
- The Art of Poison-Pilling Music Files (Apr 2025)
- Malicious actors can easily plant misinformation in LLMs and create fake narratives (or dangerous code examples).