THE LLM LANDSCAPE - BEYOND THE HYPE
- candyandgrim

- Nov 18, 2025
- 7 min read

MY JOURNEY (PROBABLY YOURS TOO)
I started with Grammarly for polish and Google for research. Then ChatGPT arrived in November 2022, and like millions of others, I thought: "Finally - one tool to rule them all."
Until it wasn't.
The breaking point? I asked ChatGPT to analyse a transcript and extract quotes on a specific topic. It delivered - except the quotes didn't exist. Not paraphrased. Not summarised. Completely fabricated.
That's not a "harmless hallucination." That's professional negligence.
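A cheap safeguard I'd now run before trusting any extracted quote: check it against the source. Below is a minimal sketch in Python, assuming the transcript and the model's quotes are available as plain strings. The function names and sample text are hypothetical, and the check is deliberately simple: a verbatim match after normalising case, whitespace, and curly quotes, so it will also flag honest paraphrases.

```python
import re

def normalise(text: str) -> str:
    """Lowercase, straighten curly quotes, and collapse runs of whitespace."""
    text = text.replace("\u2018", "'").replace("\u2019", "'")
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    return re.sub(r"\s+", " ", text).strip().lower()

def find_fabricated_quotes(transcript: str, quotes: list[str]) -> list[str]:
    """Return the quotes that do not appear verbatim in the transcript."""
    haystack = normalise(transcript)
    return [q for q in quotes if normalise(q) not in haystack]

# Hypothetical sample data for illustration only.
transcript = "We shipped late.  Budget was the main blocker,\nnot the team."
quotes = ["Budget was the main blocker", "The team let us down"]

missing = find_fabricated_quotes(transcript, quotes)
print(missing)  # ['The team let us down']
```

Anything the function returns needs a manual look before it goes anywhere near a client.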
I needed an alternative immediately. Grok and DeepSeek were the hot options. Between Chinese government data access (DeepSeek) and Elon's ego (Grok), I chose Grok as the lesser evil.
Fast. Less filtered. But then the political bias became impossible to ignore - anything critical of Musk, Tesla, or Trump got sanitised. That's not an AI assistant; that's a PR shield.
Now? Primarily Claude for accuracy and citations, but I'm not monogamous. Different tools for different tasks:
Writing: Claude (accuracy) + Grammarly (polish)
Research: Perplexity (citations) or Claude
Quick queries: Whatever's fastest
The lesson: There is no "best" LLM. There's only "best for this specific task right now."
And when companies force Google Workspace or Microsoft 365 ecosystems top-to-bottom - Drive, Slides, Gemini/Copilot, the lot - I see lazy thinking disguised as efficiency.
Limit your stack? Absolutely. Prioritise interoperability? Yes. But Google Slides is objectively worse than PowerPoint or Keynote. Drive's clunkier than Dropbox. Forcing inferior tools because they share a subscription isn't strategy - it's surrender.
For some Claude tips, see this post by Joanna Lamadjieva: https://www.linkedin.com/feed/update/urn:li:activity:7389616583985221632/?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAS9-QoBRfb3h9Jwhawg6Xh5cP31jv0f4tw
THE LANDSCAPE: WHO'S ACTUALLY WINNING?
Market share (November 2025; estimates vary widely by source and by what's being measured, hence the ranges):
ChatGPT: 60-82% (timing + Microsoft muscle)
Gemini: 13-24% (mostly passive Workspace adoption)
Claude: 3-21% (low consumer, high enterprise/dev)
Perplexity: 6.2%
DeepSeek: 0.5-5.9% (exploded January 2025, then crashed)
Grok: 0.8% (Twitter bubble, nothing more)
Why ChatGPT dominates:
First-mover advantage (November 2022, 100M users in 2 months)
Distribution muscle (free tier, mobile apps, Microsoft bundling)
"Good enough" trap (solves 80% of use cases, so why switch?)
Institutional lock-in (universities adopted it, students trained on it)
But dominance ≠ quality. Blind tests show Claude Sonnet 4 actually ranks higher on accuracy and reasoning.
THE DEEP DIVE: STRENGTHS, WEAKNESSES, ETHICS
CHATGPT (OpenAI)
Strengths:
Ubiquity - everyone knows it, shares prompts for it
Conversational tone - friendly, accessible
Plugin ecosystem - web browsing, code interpreter, custom GPTs
Microsoft backing - Bing, Office, GitHub Copilot integration
Weaknesses:
Hallucinations - confidently invents information (my transcript disaster)
No native citations - manual verification required for everything
Sycophantic - tells you what you want to hear, not always what's true
Rate limits on free tier - frustrating caps mid-workflow
Ethical concerns:
Training opacity - scraped the web without permission, lawsuits mounting
Catastrophic economics - loses £1.80 for every £1 earned, burned £10.8B in H1 2025. Microsoft life support only.
Carbon cost - training GPT-4 consumed 50 GWh (powering 5,000 homes for a year)
IP muddle - can generate copyrighted-style content, putting users at risk
Verdict: The default choice, but increasingly a legacy play. Microsoft funding keeps it alive despite catastrophic unit economics.
CLAUDE (Anthropic)
Strengths:
Accuracy over speed - reasoning prioritises correctness
Native citations - artifact system shows sources, enables verification
Longer context - 200K tokens vs GPT's 128K (better for long documents)
Constitutional AI - trained with explicit ethics, less sycophantic
Artifacts - visual outputs (code, documents, diagrams) separate from chat
Weaknesses:
Slower - deliberate reasoning takes time
Less "fun" - more formal, less conversational than ChatGPT
Smaller ecosystem - fewer plugins, integrations
Phone verification required - barrier to casual trial
Ethical concerns:
Training data - also scraped the web, but emphasises Constitutional AI principles
Google backing - Alphabet invested £1.6B+, raises data-sharing questions
Carbon footprint - similar compute demands to GPT
Feature lag - often trails OpenAI on new capabilities
Verdict: The switchers' choice - people move to Claude after ChatGPT fails on accuracy-critical work. Enterprise and developer favourite.
DEEPSEEK (China)
Strengths:
Ridiculously cheap - API pricing undercuts everyone
Open-source R1 - released weights publicly (rare for frontier models)
Speed - optimised for fast inference
Weaknesses:
CCP censorship - blocks Tiananmen, Taiwan, Hong Kong, Uyghur topics
Data residency - all data stored in China, CCP access by law
Security holes - keystroke tracking, data exfiltration discovered
Geopolitical risk - US/EU warnings for sensitive work
Ethical concerns:
Mandatory censorship - not a bug, it's by design. Chinese law demands compliance.
Data sovereignty - UK/EU/US data crosses into CCP jurisdiction
Training mystery - scraped web + unknown Chinese datasets
Carbon - trained on Chinese grid (higher coal mix than US/EU)
Verdict: Peaked at 5.9% in January 2025, crashed to 0.5% after security revelations. Use at your own risk - fine for public content, catastrophic for proprietary work.
GROK (xAI / Elon Musk)
Strengths:
X integration - convenient if you live on the platform
Speed - optimised for fast responses
"Edgy" positioning - markets itself as less filtered
Weaknesses:
Political bias - manually tuned by Musk, censors criticism of Musk/Tesla/Trump
Accuracy issues - multiple hallucination reports
Twitter bubble - 0.8% market share despite hype
Ethical concerns:
Deliberate bias - Musk openly states he's tuning it "anti-woke"
Censorship - suppresses negative content about Musk, his companies, political allies
Training grab - scraped X/Twitter without user consent
Vanity risk - survives only as long as Musk's interest holds
Verdict: Avoid unless you're a Musk loyalist. Political bias makes it unsuitable for professional work. 0.8% market share tells the real story.
GEMINI (Google)
Strengths:
Search integration - can search web natively, cite sources
Android/Workspace ubiquity - baked into phones, Gmail, Docs
Multimodal - handles images, video, audio natively
Generous free tier - fewer rate limits than ChatGPT
Weaknesses:
Passive adoption - most users don't choose Gemini, it's just there
Trust issues - Google's surveillance capitalism reputation
Botched launch - image generation controversies damaged credibility
Corporate bloat - slower iteration than startups
Ethical concerns:
Data hoarding - Google's entire model is data collection
Training grab - scraped web, YouTube, Google Books (lawsuits pending)
Privacy theatre - claims GDPR compliance, incentives misaligned
Carbon - massive data centres, though offset with renewables
Verdict: Ambient AI, not chosen AI. If you're in Workspace, you'll use it by default. Otherwise, professionals actively choose ChatGPT or Claude.
PERPLEXITY
Strengths:
Research-focused - built for citations, source verification
Clean UI - no clutter, just search and answers
Fast - optimised for quick lookups
Pro mode - deeper reasoning when needed
Weaknesses:
Narrow use case - great for research, less useful for creative writing or coding
Smaller model - not as capable as GPT-4 or Claude for complex reasoning
Niche adoption - 6.2% market share
Ethical concerns:
Training opacity - less transparent than competitors
Sustainability unclear - VC-funded, unit economics unknown
Verdict: Excellent for research-specific tasks. Not a ChatGPT replacement, but a valuable specialist tool.
THE GEOPOLITICAL MINEFIELD
Using AI isn't just tech - it's politics.
DeepSeek: Data stored in China, CCP access guaranteed. Fine for public content, catastrophic for proprietary.
Grok: Musk's bias makes it unreliable for objectivity. If neutrality matters, avoid.
ChatGPT/Claude/Gemini: All US-based, subject to FISA and the CLOUD Act. Europeans especially wary post-Snowden.
The uncomfortable truth: No LLM is geopolitically neutral. You're choosing whose laws, whose values, whose bias you'll accept.
THE CARBON PROBLEM
Training a frontier LLM consumes:
50-100 GWh (powering a small city for a year)
Thousands of tonnes of CO2 (depends on grid carbon intensity)
Running inference adds up:
ChatGPT: ~0.001 kWh per query
At 100M+ daily users, even at just one query each per day = 36.5 GWh/year just for inference
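The arithmetic behind that figure, as a quick sketch. The per-query energy and the user count are the estimates quoted above. One query per user per day is an assumption I've added to make the numbers line up, and it's a floor rather than typical usage; heavier use scales the total linearly.

```python
KWH_PER_QUERY = 0.001          # estimate quoted above
DAILY_USERS = 100_000_000      # "100M+ daily users"
QUERIES_PER_USER_PER_DAY = 1   # assumption: a floor, not typical usage

def annual_inference_gwh(kwh_per_query: float, users: int, queries_per_user: float) -> float:
    """Yearly inference energy in GWh (1 GWh = 1,000,000 kWh)."""
    kwh_per_year = kwh_per_query * users * queries_per_user * 365
    return kwh_per_year / 1_000_000

baseline = annual_inference_gwh(KWH_PER_QUERY, DAILY_USERS, QUERIES_PER_USER_PER_DAY)
heavier = annual_inference_gwh(KWH_PER_QUERY, DAILY_USERS, 10)

print(round(baseline, 1))  # 36.5
print(round(heavier, 1))   # 365.0
```

At ten queries per user per day, the same estimate lands at roughly 365 GWh a year, which is several times the training figure above.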
Who's doing better?
Google/Anthropic: Both claim carbon-neutral data centres (renewables + offsets)
OpenAI: Microsoft data centres moving to renewables, not there yet
DeepSeek/Grok: Zero transparency
Reality check: AI isn't "clean tech." Every query costs the planet something. Choose wisely.
SO WHICH LLM SHOULD YOU USE?
Honest answer: It depends.
For accuracy-critical work: → Claude (citations, transparency, reasoning)
For speed and "good enough": → ChatGPT (ubiquitous, fast, conversational)
For research with citations: → Perplexity (built specifically for this)
For Workspace users: → Gemini (it's already there, decent quality)
For Twitter addicts: → Grok (if you can stomach the bias)
For cheap API (non-sensitive only): → DeepSeek (understand the geopolitical risks)
For professional creative work: → Multiple tools - Claude for accuracy, ChatGPT for speed, Grammarly for polish
BECOME A TOOL POLYGAMIST
Don't marry one LLM. Build a tool-fitness strategy:
Primary workhorse (Claude or ChatGPT - pick your poison)
Specialist backup (Perplexity for research, Grammarly for polish)
Emergency redundancy (if primary fails, what's your fallback?)
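If you call these tools from scripts rather than a chat window, the primary/fallback idea can be sketched as a simple chain. This is an illustration only: `ask_primary` and `ask_fallback` are hypothetical stand-ins for whatever client code you actually use, not real vendor APIs, and the simulated outage is there purely to show the handover.

```python
from typing import Callable

# A provider is a (name, function) pair; the function takes a prompt and returns text.
Provider = tuple[str, Callable[[str], str]]

def ask_with_fallback(prompt: str, providers: list[Provider]) -> tuple[str, str]:
    """Try each provider in order; return (name, answer) from the first that works."""
    errors = []
    for name, ask in providers:
        try:
            return name, ask(prompt)
        except Exception as exc:  # outage, rate limit, policy refusal, etc.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All providers failed: " + "; ".join(errors))

# Hypothetical stand-ins; swap in your real client calls.
def ask_primary(prompt: str) -> str:
    raise TimeoutError("simulated outage")

def ask_fallback(prompt: str) -> str:
    return f"fallback answer to: {prompt}"

used, answer = ask_with_fallback(
    "Summarise this brief",
    [("primary", ask_primary), ("fallback", ask_fallback)],
)
print(used, "->", answer)
```

The order of the list is the strategy: workhorse first, specialist or backup after it.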
Why?
Tool failure is inevitable (outages, rate limits, policy shifts)
No single LLM excels at everything
Market consolidation is coming - tools will die, pivot, get acquired
The creatives who survive won't be loyal to ChatGPT. They'll be fluent in the logic that spans all LLMs.
THE CORPORATE ECOSYSTEM TRAP
I've worked at two companies recently:
Company 1: Google-first (Drive, Docs, Slides, Gemini)
Company 2: Microsoft-first (OneDrive, Word, PowerPoint, Copilot)
Both claimed "efficiency through standardisation."
The reality:
Drive < Dropbox (sync issues, collaboration bugs)
Slides < PowerPoint/Keynote (animation limits, design tools)
Gemini/Copilot = IT's choice, not what's best for the work
This is security theatre masquerading as efficiency. Managing one vendor is easier than admitting some tools simply do the job better.
The solution: Limit your stack, yes. Prioritise interoperability, absolutely. But tool fitness > brand loyalty.
If Slides can't deliver, use Keynote and export to PDF. If ChatGPT hallucinates, use Claude and document your process. Work quality matters more than IT's convenience.
THE SWING VOTER PRINCIPLE
I'm not fickle - I'm a swing voter.
I only switch when:
Current option becomes untenable (broken, expensive, unreliable)
Alternative is dramatically better - not 10% better, but good enough to leave the old guard in the dust
Small improvements don't move the needle. Revolutions do.
ChatGPT dominated because it was a revolution (November 2022). Claude gains ground because it solved ChatGPT's biggest weakness (accuracy). DeepSeek briefly exploded because it was 10x cheaper (then geopolitics killed it).
The next LLM that wins won't be "slightly better than ChatGPT." It'll solve a problem so fundamental, so painful, that switching becomes inevitable.
Until then? Tool polygamy. Multiple options. Zero loyalty.
What's your LLM journey? What made you switch (or what's keeping you on ChatGPT)? What would it take to change tools?
Drop your story below - let's map the real landscape, not the marketing hype.



