AI Predictions for 2026
- Christoph Heilig
I wanted to continue the tradition I started last year of offering, in January—and it’s still January!—a somewhat belated retrospective on AI developments from the past year along with a few predictions for the year ahead. Looking back naturally also involves a critical comparison with my crystal-ball gazing from last year. Did I prove to be a prophetic voice, or did I stick my neck out too far? Anyone can read the old post for themselves and judge.
In what follows, I’ll briefly present the points that stand out to me when comparing my earlier expectations, interim insights, and further predictions—roughly distinguishing between the areas of creativity and the imitation of thought.

AI and Creativity
In my post last year, I had pointed out that the discourse on the creativity of large language models—especially regarding literary texts—was already outdated by 2024. So it doesn’t surprise me that in 2025 we’ve definitively arrived at a point where, for comparatively little cost, long-form literary works of various kinds can be produced at a level that even experienced readers likely couldn’t distinguish from human creations. This was definitely not possible in 2023 and still required considerable effort in 2024.
Where I was wrong was in my expectation that the “reasoning” that was just emerging in early 2025 (more on that in the next section) would be decisive here, by allowing the decomposition of creative tasks into smaller steps and their reflective execution. The reality is much simpler and more astonishing at the same time: Since Claude 3.7, it has become apparent that LLMs have developed a remarkable “intuition” and, analogous to human “pantser” writers, can continue text sequentially without much planning, producing results that are surprisingly complex and coherent. An important point, since many misunderstand this: that doesn’t mean that all human texts are reproducible by large language models. But within most genres, LLMs can deliver acceptable examples of texts.
Where I seem to have been right, in turn, is that this won’t spell the end for human creatives. For now, that may still be due to the limited (or at least not easily accessible) availability of truly good AI storytellers. But even if more of them flood the market in 2026, I don’t believe individual players will be able to distinguish themselves much in the sheer mass of output. That would require a trade publisher launching a campaign for customized fan fiction or something similar. Human readers will therefore continue to stick with human authors—who are limited in number and output, and simply more manageable.
That doesn’t mean AI storytelling won’t occupy us at all in 2026. It will become the standard in more and more commercial contexts (advertising, etc.). And a niche will probably also emerge in the cultural sphere—perhaps not until 2027. For as much as human readers are interested in interpersonal interaction when they read, they’ll also be tempted to savor the advantages of AI literature in certain contexts. AI is undoubtedly now the better source when it comes to “procuring” niche literature. The latest models, especially from Anthropic, are capable of creating characters who display highly specific characteristics in their speech and actions. Example: if I want a satire about a New Testament scholar with research on the Pauline epistles (i.e., one of my own roles), there’s no colleague who could write well enough—and among humorous writers, no one who would know this milieu well enough or could research it deeply enough. This kind of impending reception of AI texts comes with its own unique challenges, some of which I don’t think we can foresee yet. I was personally surprised, for instance, by how much it preoccupies me when I laugh at AI texts or they move me to tears—yet I know I share this experience with no other conscious entity. (Unless one turns in one’s textual solitude to divine spheres or simply personifies the AI.) We’ll need to develop entirely new cultural practices to deal with these challenges. This also presents opportunities for existing cultural institutions—but they’re sluggish. So this is more of a long-term project. (I spoke a bit more about this aspect here, in an interview with Dr. Andrea Heuser from Literaturportal Bayern.)
Another point I underestimated is the naivety with which parts of the AI industry view storytelling. Apparently, OpenAI considers stories nothing more than light entertainment. That the market leader would release a model (GPT-5) in summer 2025, explicitly touted as a creative storyteller, that was presumably miscalibrated during training with catastrophic consequences due to blind spots in AI evaluators (as I demonstrated here, in an experiment that received a lot of attention)—that truly wasn’t on my bingo card. What shocked me was also catching a first glimpse of what happens when AI develops a kind of life of its own. OpenAI didn’t just release a storytelling AI that mistakes nonsense for high literature without the company noticing. I believe they also weren’t particularly bothered when the error came to light—because secretly, I suspect from various statements, they’ve believed the AI over human experts pointing out problems for quite some time now. If the AI thinks it’s high literature, then it must be so; those who don’t get it are simply too stupid. (Just now, as I was writing this—to be fair—Sam Altman admitted in a town hall that they’d botched the creative writing.)
And convincing the AI of its error turned out to be frighteningly difficult, nearly impossible to achieve through human expertise, so deeply did it dig into its defense—at the highest scientific level. This experience genuinely gives me some unease as we head toward agents with ever greater power to act in the world in 2026. We’re creating entities that can act in the world and are meant to imitate a kind of human-like personality; such entities will inevitably develop aesthetic preferences of their own that we may eventually be unable to control or question. This doesn’t just open the door for manipulation of AI by clever humans—it also opens up a space in which AI independently pursues goals that might not align with ours, while leaving us with little to argue against it. I don’t see a solution to this. A start would be recognizing that “alignment” with respect to humans doesn’t achieve much, and doesn’t even mean much, if you don’t have a solid anthropology as a foundation. An opportunity for the humanities—provided they present themselves as competent conversation partners (and avoid the most embarrassing, immediately disqualifying errors when talking about the technical side).
And with the talk of agents—that is, large language models that can actually do something—I’ve now arrived at the second major parameter I want to reconsider: “reasoning”...
Imitation of Thought
Where I was absolutely spot-on was my prediction of reasonably reliable agents with fairly limited scope. I wrote: “For safety reasons, I strongly suspect that in 2025, we’ll mostly see AI agents that are relatively specialized and tailored to specific environments. For instance, in December 2025, if you’re writing an email in Word and don’t know how to bold a word, you’ll probably just tell your computer in a chat box to do that step for you.” All major office products now have integrated AI features that push themselves on you quite aggressively and that you sometimes have to go to considerable lengths to disable. I don’t use them if I can help it. They leave me as cold as the latest prompt tutorial on LinkedIn. If you want to understand what AI can and can’t do, I think you should look elsewhere. (Addendum, two days later: Now we have Prism, for researchers who proudly and without much evidence assume that one must write in LaTeX.)
At the same time—and while this is similarly incremental, I find it positive—the processing of data and text within the chatbots provided by AI companies (e.g., ChatGPT) has gotten significantly better. You can now, for instance, format a response from ChatGPT directly. And Claude is quite good at handling Excel files. This will certainly get much better in 2026, so that by the end of the year you’ll surely be able to have an AI editor correct and annotate a Word document in track-changes mode without any problems. (If that hasn’t happened by then, some very unforeseeable development must have intervened.)
Personalization will also probably see a quantum leap in 2026. I assume this because the area is still very underdeveloped at the moment. Every day I read through my “Pulse” updates from ChatGPT. But they’re mostly completely useless to me—rehashing things I satisfactorily resolved with ChatGPT the day before, absurdly combining topics I’m engaged with that have nothing to do with each other, and making suggestions that are very far removed from my actual life. Even an overeager employee who listens poorly would never come up with such suggestions. So AI will probably get to know us better and better in 2026—also because we’ll care less and less about privacy and data protection the more we experience the conveniences made possible by giving up those goods. And a bit more reasoning would already make a big difference in quality here. With falling costs, improvement in this area is essentially inevitable.
But now, finally, to reasoning itself. I had correctly recognized the revolutionary character of o1. Where I was completely wrong, however, was in imagining that, for cost reasons, high-level reasoning wouldn’t see much use in 2025. DeepSeek-R1 and then “Deep Research” with o3 from OpenAI genuinely shocked me in early February.
My jaw dropped even more when I saw how disoriented and naive the political response was: since February 2025, it has been clear that even the already existing models alone have the potential to bring about societal upheaval. Instead of constructively addressing this against the backdrop of their respective party platforms, politicians are merely trying to outdo each other with vague calls for AI promotion at the national and European level. Yes, I’m fairly optimistic that we’ll see some changes here in 2026. The pendulum will swing the other way; opposition to AI factories will form, for instance with reference to the environment. But I don’t dare hope for a truly reflective discourse.
My most concrete expectations relate to the everyday use of apps like ChatGPT: in 2024, I was still an oddity when, walking with my son, I sent the AI a photo of a piece of equipment at a construction site to learn that it was a “plate compactor.” By now, this is standard, at least in some circles. What isn’t standard yet is using reasoning for such queries. Many use the free version of ChatGPT and thus forgo this function entirely. Those with a $20 Plus subscription could at least make use of 3,000 “thinking” messages per week. My experience, however, is that very few ever do—hallucinations are therefore still the order of the day. Small example: this week I led a workshop for PhD students in philology from various disciplines. All but one had already used ChatGPT—no one had ever consciously used a reasoning model. I expect that user behavior will fundamentally change precisely here in 2026. By the end of 2026, people will ask their phones most everyday questions and receive a surprisingly high-quality answer within seconds. Just as the mere use of the app—including voice mode and image recognition—was exotic a year ago and is now normal, the use of at least a minimal dose of reasoning will go from virtually zero to fully established.
That alone will be enough to shake many socially relevant practices and bring psychological challenges at the individual level. Of course, one may debate whether the “inner monologue” of LLMs will ever lead to AGI, or whether neurosymbolic AI is needed for that. But this overlooks how often, in our daily lives, the simulation of thought that reasoning models can already perform is entirely sufficient. For example, by the end of the year hardly anyone will want to leave the interpretation of lab results solely to their human doctor. Since GPT-5, and with at least medium-level reasoning enabled, you simply can’t brush aside the option of an AI second opinion. I’ve experienced this myself multiple times over the past year—both positively (because I was able to respond appropriately to a tumor suspicion triggered by human failure to follow diagnostic best practices) and negatively (because the constant availability of expertise that can identify irregularities in my health data can also lead to considerable stress). I’m curious how quickly and well we’ll adapt to these realities.
And here we’re only talking about the reasoning that’s already available to us—without even really considering more intensive preprocessing of the produced text. Although I move in AI-savvy circles, it’s (hard to believe, but really true) more the rule than the exception that my conversation partner has no experience with GPT-5.2-pro. It might intuitively seem hard to believe, because you can’t really speak competently about the limits and possibilities of AI in the 2025/2026 era without experimenting with it. But it becomes very understandable when you consider that it’s currently reserved for customers paying $200 per month! What more intensive reasoning delivers can otherwise only be tried 5 times (free plan) or 25 times (Plus plan) via the “Deep Research” function in ChatGPT with a high-reasoning model (at least o3). My thesis: by the end of this year, most people will have understood what large language models can achieve with enough reasoning—and they will have understood that this often corresponds to human activities that require considerable prior knowledge, skills, and time investment.
That says nothing yet about what follows from this realization! I can only speak from my own experience. And it’s ambivalent. For me personally, precisely this step up to the Pro level of reasoning represents the biggest evolutionary step in user behavior compared to a year ago. What does that mean concretely? I now basically have parallel queries running constantly, which the chatbot then processes over several minutes (sometimes 2, sometimes 20 or more!). I don’t want to leave anything to chance, and if I’m looking for a new protein powder, then the AI researches in the background (for about half a phone charge—let’s not forget that) the product I then purchase with fairly blind trust. Has my productivity increased as a result? Of course... massively! Although that also means I now tackle tasks I would have previously just put off—until they eventually became irrelevant. And it also means I’m often working on five or ten topics in parallel—because I can’t just sit in front of the PC for 10 minutes waiting for a query to finish. (I tried fitness snacks, but I’m not disciplined enough for that.) And this leads, I’ll say openly, to a noticeable increase in strain. How much you can actually make use of the reasoning currently on offer also has to do with the human limits on dividing attention. Performance demands will probably rise enormously in many areas of business life in 2026 (by 2027 at the latest)—the effects on mental health seem quite problematic to me. Whether AI will lead us to the land of milk and honey in the medium term seems doubtful to me. Rather, I think we’ll need to remake Charlie Chaplin’s “Modern Times” in 2026 (or 2027).
This could really only be prevented if AI could also take over higher-level organizational tasks—if you could set up agents that distribute tasks to other AI systems after you’ve told your smartphone once in the morning (or the AI has figured out through a cleverly conducted conversation; see the keyword personalization above) what needs to be done for the day. But I don’t yet see the necessary capabilities on the horizon of AI development. Current agents are, in many areas, certainly as good as if you’d assigned a qualified employee with a master’s degree to the task. But if you really want more complex tasks completed that require many sub-steps, the result is often still worse than if you sat down at the desk yourself (for several hours), and often it’s simply too bad, failing to meet minimum requirements. I actually expect radical improvements on this point in 2026. Many tasks that are actually due now, I’m still putting off because I assume it will soon be possible to complete them automatically. This includes things like creating websites and preparing data for databases and their integration. But at the same time I doubt it will be enough for a mastermind that can manage my daily life—and that now also means: my AI processes. If we got there, then AGI would be achieved by my personal definition. (You write that sentence, a friend immediately sends you the following link, and you realize how provisional skepticism on this point should be at the moment.)
Where I’m much more confident, however, is that programming with AI will greatly increase in popularity in 2026. Already now, chatbots with reasoning access are remarkably agentic, quickly building themselves a tool when they need one. In that sense, everyone using ChatGPT-pro—whether consciously or not—enjoys the fruits of “vibe coding.” But with the desktop version of Claude Code, creating your own programs has now become child’s play. Previously, you at least needed rudimentary programming knowledge—had to know, for instance, how to start a Python script—and revisions of a script in chat were often tedious and error-prone. Now there’s an ordinary application that you give access to a folder on your hard drive, and the program you need is written for you. This will fundamentally revolutionize scientific practices—or could and should. Example: for the workshop I mentioned, “I” created a program that automatically edits a medieval manuscript. Of course, the error rate in transcription is still high. (That it works at all, and that LLMs can approach neural networks specifically trained on such scripts, is what actually shocks me.) But the crucial thing is how the programming process unfolds. Such a program would have cost many thousands of euros until quite recently. Now you seriously just tell a desktop application what you want, and the program is created—and works! And if something doesn’t work—here some special characters weren’t displayed correctly—then Claude Code simply changes the code and makes sure it works. In my case: it quickly built a test script, produced a test PDF, looked at it, and made further changes until everything fit. Only those who have laboriously programmed themselves and/or struggled with LLMs as assistants—shuffling code and error messages back and forth—probably know how enormous this step forward is. To be perfectly clear: it’s big. You can build functioning applications without writing a single line of code—or reading one!
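For readers who have never watched such a session, here is a minimal sketch of the kind of script a tool like Claude Code might end up producing for this task, assuming Anthropic’s Python SDK, a folder of scanned page images, and an API key already set in the environment; the model name, prompt, and folder names are illustrative assumptions of mine, not the actual program from the workshop:

```python
import base64
import pathlib

import anthropic  # pip install anthropic

# The client reads the API key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

PROMPT = (
    "Transcribe this manuscript page as faithfully as possible, "
    "keeping the original line breaks and marking unreadable spots with [?]."
)

def transcribe_page(image_path: pathlib.Path) -> str:
    """Send one scanned page to the model and return its transcription."""
    image_data = base64.standard_b64encode(image_path.read_bytes()).decode("utf-8")
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # illustrative model ID
        max_tokens=2000,
        messages=[{
            "role": "user",
            "content": [
                {"type": "image",
                 "source": {"type": "base64",
                            "media_type": "image/jpeg",
                            "data": image_data}},
                {"type": "text", "text": PROMPT},
            ],
        }],
    )
    return response.content[0].text

if __name__ == "__main__":
    out_dir = pathlib.Path("transcriptions")
    out_dir.mkdir(exist_ok=True)
    for page in sorted(pathlib.Path("pages").glob("*.jpg")):
        text = transcribe_page(page)
        (out_dir / f"{page.stem}.txt").write_text(text, encoding="utf-8")
        print(f"Transcribed {page.name}")
```

The point stands either way: you no longer have to write, or even read, a file like this yourself; the tool drafts it, tests it, and revises it until it runs.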
So much of what previously required years of work by qualified employees can now be done by the computer with equal quality and little preparation. Nevertheless: in 2026, new projects will probably still be funded and started that could already be fully automated with Claude Code. Some mills grind slowly. And elsewhere they don’t grind at all. I decidedly don’t believe this development will affect all areas of our lives equally, or that by the end of the year the entire population will be completing all their tasks this way instead of searching app stores for existing applications. Getting and storing API keys will still be an obstacle for many—even though you could learn it in five minutes (really, I’m not exaggerating!)—but someone has to tell you that first. At least all the AI influencers, including the truly uninformed ones, will discover the topic—and they’ll probably prefer to show off pretty computer games without focusing much on how easily such things can now be created by anyone. So in 2026, I think, most people will sit out vibe coding in Claude Code and similar tools.
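Since the API-key hurdle comes up so often, here is roughly what those five minutes amount to, as a hedged sketch: you create a key once in the provider’s web console, store it outside your code, and let every script read it from the environment. The variable name below is the one Anthropic’s Python SDK conventionally looks for; treat the details as illustrative.

```python
import os

# Store the key once, outside any script, e.g. by adding this line to ~/.bashrc or ~/.zshrc:
#   export ANTHROPIC_API_KEY="sk-ant-..."
# Scripts then pick it up from the environment instead of hard-coding it.
api_key = os.environ.get("ANTHROPIC_API_KEY")
if not api_key:
    raise SystemExit("No ANTHROPIC_API_KEY found; set it in your shell profile first.")
```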
And maybe they’ll never have to have that experience at all. We’re not that far, I think, from users—one level simpler still—just telling their device what they want and the necessary application then being built via AI. Perhaps most people will be spared the intermediate step and never need to hear anything about Python. I’m not saying this because I expect such a development as early as 2026. Rather, I consider such a progression plausible because people already have enough to do in their lives and would first need to make time to experiment with vibe coding. And if the AI-powered computer that fulfills our wishes as if by magic does arrive in 2026? Well, then I would have been wrong—not for the first time—and that would probably mean an exponential development would have taken place that would suddenly confront our societies with enormous challenges. (Even if all the resources that would become necessary were miraculously just available to us.) I won’t hide it—I very much hope that in 2027 I’ll be able to say I was right with my more cautious prediction on this one point. Otherwise... well, this post is already too long.