Author Archives: Jim Grisanzio

About Jim Grisanzio

Software, Science, Geopolitics, Money

Henri Tremblay at JavaOne 2026

Henri Tremblay at JavaOne 2026 | Duke’s Corner Java Podcast | May 18, 2026

Here’s the second interview I did at JavaOne in March with Henri Tremblay. Henri is a Java Champion, Montreal JUG leader, and EasyMock lead developer from Canada.

Henri’s session at JavaOne covered the Java Memory Model, which is a topic he believes every Java developer should understand well. He’s been to six JavaOnes and had warm words for the conference, which represents a rare opportunity to meet the people whose code runs on systems and devices all over the world.

He has clear advice for developers: read books, understand how and why your code works, and get out there and join the community.

We also talked about why Java still powers so much of the world’s critical infrastructure, from banks to the Mars rover. Henri pointed out that companies often start in C++ and then move to Java because Java runs nearly as fast once it’s going and is far easier to change later.

On AI, Henri had a balanced view. He uses it for tedious work, like sifting through a gigabyte of logs to find a single error. But he was also clear about the risks. “We should not get lazy at reviewing code because AI will generate tons and tons of code. It’s not bad at reviewing it, but still it makes mistakes.” He warned that AI reflects the average of what’s on GitHub, and most code on GitHub isn’t great. Your role, he said, is to find a better answer.

For students and junior developers, he says they should also leverage AI for learning, but he advises that they internalize the fundamentals of software engineering deeply. “Read books, please, please!” He pointed to Core Java, the book he originally learned from and is now helping revise. Blogs and YouTube videos only touch on surface-level issues. Books take you deep, and that’s the knowledge you need to grow your career.

Henri Tremblay on LinkedIn: https://www.linkedin.com/in/henritremblay/
Jim Grisanzio on LinkedIn: https://www.linkedin.com/in/jimgris/

Treating Populations, Not People

I’ve been following Nick Norwitz, MD/PhD, for a while now. He’s quite the researcher, that’s for sure. And his latest video covering his cholesterol situation is wild: I Bet My Life Against Cholesterol Dogma (And Won)

In the past Nick had severe inflammatory bowel disease, which altered his life significantly. At one point he dropped below 100 pounds and ended up in intensive care, an experience that drove him to question his life. I totally understand. I’ve been there a few times myself. So, out of desperation he tried a therapeutic ketogenic diet, which sent his disease into full remission. Now, that fact alone isn’t supposed to be possible, but it’s actually the side story at the moment.

The real issue here turned out to be his cholesterol. As a result of that new diet, Nick’s LDL (low-density lipoprotein) climbed from a so-called healthy 90 to near 600, while his total cholesterol topped out at 700! Under traditional medical science, those numbers represent an emergency right now and over time a death sentence from a heart attack or a stroke. Treatment was needed right away. But Nick made a different decision. As he says, for seven years he has been “running an experiment that should be killing me.” But he kept at it because “the thing that is causing my cholesterol to go so high is also saving my life.” That’s quite a position to be in. But it’s also one in which many people find themselves when dealing with two intractable medical conditions that doctors can’t simultaneously treat without side effects.

Since Nick’s been dealing with massive LDL for so long while also declining standard treatment with statin drugs, I was curious why he hadn’t yet gotten a cardiac calcium scan. Those scans are fast and relatively inexpensive. Why wouldn’t he want to check for plaque in his coronary arteries? It just seemed odd to me given how comprehensively he researches his own conditions.

Well, recently, he finally went ahead and got his scan. The result? Zero! No plaque at all! Personally, I wasn’t surprised because I’m familiar with his research, and I know others who have had similar results from similar conditions. But I’m sure he was relieved given his own history and also living with being attacked constantly for his unique approach. The result of zero plaque “doesn’t just poke a cholesterol dogma, it shatters it,” he says. It does, indeed. But medicine changes slowly. Perhaps in a hundred years or so the protocols will change to embrace new research. For now, though, he’ll likely remain an aberration, just like all the other aberrations out there. The difference here, though, is that Nick publishes prolifically, so he’s leaving a detailed scientific paper trail that other researchers are noticing.

Anyway, it’s a good story on its own for Nick’s gut and his heart. But the part I want to point to comes later when he steps back and explains why he thinks this happened. His argument concerns context. A single blood marker like cholesterol, or even height, only means something within a particular person’s situation. For example, Nick uses basketball superstar Shaquille O’Neal to illustrate the point. Shaq is seven foot one because of lucky genes. But Nick says that other people could be the same height because of a tumor messing with their growth hormones while the tumor eventually quietly kills them. Same measurement, different reality. “Context is everything.”

Then Nick makes the point that stuck with me because I’ve experienced it many times myself dealing with the medical industrial complex. It’s a very real phenomenon. Modern medicine, he says, often misses context entirely when dealing with individual patients. The context of the person gets traded away for algorithms that let the system run efficiently at scale as it implements particular protocols or policies. And as much as individual doctors may want the best for their patients, and Nick believes they do, “modern medicine treats populations, not people.”

That’s the line right there. I got it right away. Experienced it many times. Painfully. Nick says that treatment protocols get built on averages drawn from large groups of mostly sick people. That may work well enough for most people most of the time to get them out of some acute condition or to enable them to live longer with a chronic condition. But it fails the outliers. These are the people whose numbers on one test look alarming for a reason the average never accounts for while their other markers are exceptionally healthy. In Nick’s case, that reason is high cholesterol driven by a therapeutic diet rather than any other known disease. He returns to the idea near the end and tells us this: “Don’t settle for being treated like the population average, because almost none of us are.” He’s right. But the “settle” bit is challenging because the medical community as a system isn’t so warm and fuzzy when confronted with people who question it.

There’s much more in the video. He briefly reviews his experiment where he added Oreo cookies to his diet that cut his cholesterol sharply, which he offers as an indicator that his fat-burning metabolism may be the real driver to his cholesterol markers breaking the established reference ranges. He also explains why he walked away from a traditional medical career to instead work as a medical communicator. On placing that bet against decades of established cardiology, he says, “The data judges the ideas, not the credentials backing them.” His YouTube channel has over a million subscribers now, so he’s touching more people on any given day than would be possible had he gone into clinical medicine.

Systems are built to scale. But they rarely consider individuals, especially individuals whose conditions don’t fit published protocols. That’s why it’s critical to always do your own research, question your doctor, and act in your own interest. The doctors may care about you to a certain degree, but the protocols they implement don’t.

Good luck!

Bill Joy’s Future

Since there’s a lot of AI slop and doom chatter all over the news, podcasts, and social media these days, I figured I’d revisit Bill Joy in April 2000 for some context. I remember that Joy’s massive article “Why the Future Doesn’t Need Us” hit pretty hard twenty-six years ago not only because of the content but also because of who wrote it. Joy was a cofounder of Sun Microsystems, he helped build the internet, and here he was warning that three technologies might eventually lead to human extinction. The technologies he cited that could end us all include robotics, genetic engineering, and nanotechnology. And his core argument was actually pretty simple. These are not like past technologies. These new things can potentially replicate themselves, and that’s the bit that would change everything if they were used as weapons or simply were let loose by accident.

I first read Joy’s piece at a coffee shop in Cupertino, California across the street from Sun where I worked in software systems marketing. The article was widely read at Sun and also across Silicon Valley and resulted in discussions about Joy and his analysis for months. Many people just called him crazy. But that only demonstrates to me that those who made such flippant statements never read his article. Others knew better, though. They knew full well that Joy was documenting in detail the very real risks of rapidly developing technology without considering the consequences. That is, of course, the standard and pervasive culture of Silicon Valley. The valley may be big, but it is remarkably insular. I didn’t know Joy at the time I read his article, but I went on to meet him several times and worked closely with his teams promoting projects like SPARC, Solaris, Java, Jini, and later on JXTA. I didn’t really know him well, but he was always friendly and professional to me. He was quiet, too, and I always found him a serious thinker who obviously knew far more than he ever expressed. Sun was filled with such characters. They all fascinated me to no end.

The timing of the Joy article is interesting. He said he had been working on the essay since his 1998 conversation with Ray Kurzweil and continued to revise drafts through 1999. When Wired published the piece in April 2000, the tech world was at its peak. The NASDAQ hit a high of 5,048 in March of 2000 just a few weeks before the article dropped. And at that time Sun’s stock reached $250 a share, which gave the company a market cap around $200 billion. That was a significant achievement for 2000. Sun was one of the hottest companies in the valley back then, and it was quite a wild experience working there. The place was buzzing with activity. I loved it. Many of us did. So, it was into that environment of overt tech optimism running at manic levels that Joy published his thoughts about our potentially perilous future.

Andy Bechtolsheim, Vinod Khosla, Scott McNealy, Bill Joy at the Sun Reunion in Silicon Valley in October 2019. Photo by Jim Grisanzio.

Boom!

Then everything blew up. The bubble burst. By mid-April 2000 the NASDAQ suffered its worst week in history, falling more than 25 percent. Companies started dumping workers like I’ve never seen before. Joy’s dark warnings about unchecked technology landed precisely as that optimism crashed. He obviously couldn’t have seen the future, but his timing was remarkable.

Joy’s warnings took on an even darker tone the following year. After publishing the Wired article, he signed a book contract to expand on the article. He moved into a hotel room in New York City and surrounded himself with gloomy books on plagues and nuclear bombs and other such material he was studying on the future risks of technology. Then on September 11, 2001 came the terrorist attacks we have come to know as 9/11. I knew a few people from Sun who worked at the corporate building in New York City, but I did not know Joy was there at the time. He said he stood in the streets with everyone else and watched the impossible happen in real life. The next morning he walked out into the city streets past a long line of sanitation trucks parked on Houston Street ready to haul away the rubble. Everything below 14th Street was closed, he said. “It was quite a compelling experience,” he said in a TED Talk, “but not really, I suppose, a surprise to someone who had his room full of the books I was reading. I was not surprised that it happened at all.”

Joy eventually abandoned the book project. I point this out just as an aside since the event occurred shortly after he published the article I am writing about here in this post. Still, it does reflect the feeling of the times. How much had changed in Silicon Valley and the United States in just one year.

Anyway, back to the article.

Who Is Bill Joy?

Joy wasn’t a fringe thinker or outsider. He was born in 1954 in Farmington Hills, Michigan, and he was a child prodigy who started school early. One time his father took him to the local elementary school principal’s office when he was only three years old. Joy then promptly sat on the principal’s lap and read him a story. He later excelled in math and graduated high school at 16. He loved books and thinking and that became his escape. He also loved science fiction. He devoured Heinlein’s “Have Spacesuit Will Travel” and Asimov’s “I, Robot” with its Three Laws of Robotics. He wanted to be a ham radio operator, since ham radio operators were the Internet hackers of their day, but he couldn’t afford the equipment. On TV, Star Trek inspired his imagination every Thursday night when his parents went bowling. Gene Roddenberry’s “The Prime Directive” clearly resonated with Joy. You can see that ethic woven into his writing thereafter.

At Berkeley in the 1970s, he created the vi text editor, which, to his surprise, was still widely used more than twenty years later, and some hardcore developers still use it even now. He also developed the Berkeley version of the Unix operating system. When the other founders of Sun Microsystems (Andy Bechtolsheim, Vinod Khosla, and Scott McNealy) invited him to join them, he participated in the creation of advanced microprocessor technologies and Internet technologies such as Java and Jini. As codesigner of three microprocessor architectures (SPARC, picoJava, and MAJC), he helped drive innovations that shaped modern computing.

By the time he wrote his famous Wired essay, Joy was only 45 years old and at the peak of his influence among developers in Silicon Valley. But Joy was far more than just a coder. He was well connected to the broader scientific community. That is what made the article so jarring. He was not an uninformed critic glancing in from outside with yet another opinion. He was a core architect of the digital age expressing deep doubts about where his own work was leading. His self-reflection was pervasive during this time in his writings and conference presentations.

The Kurzweil Meeting

Joy’s concern seemed to begin at George Gilder’s Telecosm conference in 1998 when he met Ray Kurzweil, who was an inventor and futurist. Kurzweil talked about how the rate of technological improvement was accelerating and also how humans may merge with robots or download their consciousnesses to achieve near immortality. I remember attending several talks on this topic of immortality when I moved to Silicon Valley. It always sounded so silly to me. I wondered how such smart people could take that stuff so seriously. But even now some people in these circles talk about downloading themselves. It still sounds silly. The other bits, though, about intelligent robots and genetic engineering were much more reasonable given my own experience working in the biotech industry. Where to draw the line, however, sometimes really is not clear. Joy had heard such talk before and always felt sentient robots were science fiction. But hearing it from someone he respected changed things. Kurzweil gave him a preprint of “The Age of Spiritual Machines,” which outlined a utopian future where humans gained near immortality by becoming one with robotic technology. I don’t know how far Joy goes with respect to robotic sentience, but it’s clearly more than I’m willing to accept.

Nevertheless, Joy’s unease intensified after reading the book. He felt sure Kurzweil was understating the dangers. Then he found a passage in the book describing a dystopian future where machines become so capable that humans depend on them completely. The passage argued that we would not consciously hand over control to the bots. Instead, “the human race might easily permit itself to drift into a position of such dependence on the machines that it would have no practical choice but to accept all of the machines’ decisions.” In other words, we would become dependent gradually. I can surely see that as a potential reality, no question about it.

But that passage came from Ted Kaczynski, the Unabomber! Joy admits this realization was uncomfortable to say the very least since he was taking a point from a terrorist seriously. Many people have said the same thing after reading Kaczynski’s words. Kaczynski’s bombs had killed three people and wounded many others. One bomb gravely injured David Gelernter, one of Joy’s colleagues and friends. But Joy felt compelled to confront the argument because, however uncomfortable, he saw merit in that single passage about unintended consequences.

The Self-Replication Problem

Joy’s central concern circles around one key difference between the powerful technologies of the 21st century and those of the 20th century. Nuclear weapons required huge facilities and rare materials. But genetic engineering, nanotechnology, and robotics, what Joy called GNR technologies, require less infrastructure and can potentially make copies of themselves. A bomb explodes once, but a self-replicating machine does not stop. And that could be a serious problem if something goes wrong.

This matters because knowledge spreads freely. You cannot control ideas at all like you may be able to control uranium. Once people know how to genetically engineer bacteria or design tiny self-replicating machines, that knowledge exists in the world and will move rapidly. A small group, even one person, could potentially cause massive harm. Joy calls this “knowledge-enabled mass destruction.” As he wrote, “I think it is no exaggeration to say we are on the cusp of the further perfection of extreme evil, an evil whose possibility spreads well beyond that which weapons of mass destruction bequeathed to the nation-states, on to a surprising and terrible empowerment of extreme individuals.”

He made the point even sharper later. The real danger, he said, is no longer nation-states but individuals or small groups now empowered with “pandemic power.” These new digital, self-replicating technologies give extreme individuals the kind of destructive capability once reserved only for governments. That shift changes everything because the ramifications of mistakes or ill intent can’t be calculated or controlled.

Joy learned about complex systems and non-linear systems from physicists Stephen Wolfram and Brosl Hasslacher in the early 1980s. These are systems where small changes move in unpredictable ways and where feedback loops create unexpected outcomes. Later, conversations with Danny Hillis, biologist Stuart Kauffman, and Nobel laureate Murray Gell-Mann deepened his understanding. Hasslacher and Mark Reed also gave him insight into molecular electronics, which is the manipulation of matter at the atomic and molecular level where individual atoms replace transistors. When you get to this point in Joy’s article you can’t help but realize that he’s well beyond just a smart software developer who happened to strike it rich by helping found a successful tech company.

Joy knew that the merger of computers and the physical sciences was creating enormous power, a thought that few others had expressed in the popular tech media up to that point. By 2030, Joy calculated, we would likely build machines a million times as powerful as personal computers of 2000. That is enough computing power to make the scenarios that worried him technically possible. As he wrote, “But now, with the prospect of human-level computing power in about 30 years, a new idea suggests itself: that I may be working to create tools which will enable the construction of the technology that may replace our species. How do I feel about this? Very uncomfortable.” He admitted he had once been too optimistic about nanotechnology. Because he had struggled his entire career to build reliable software systems, it seemed to him more than likely that this future would not work out as well as some people might imagine.

Three Scenarios

Joy walks through what could actually happen with these technologies. What’s interesting is that evolution itself would be one of the driving forces moving these technologies from a positive outcome to something very negative.

One — Robots might simply out-compete us for resources the way better-adapted species have always displaced others. We would not need a robot uprising. Hans Moravec argued that “in a completely free marketplace, superior robots would surely affect humans as North American placentals affected South American marsupials.” Economic forces alone could push us aside. Also, the dream of robotics includes downloading our consciousnesses into machines. But Joy questions whether a downloaded consciousness would be human in any meaningful sense. The robots would not be our children, and on that path our humanity might be lost entirely. This is one of the reasons why I take Joy seriously. He sees the obvious problem with the downloading issue in a way that others in this field simply do not.

Two — Genetic engineering gives us power to create devastating plagues, either by accident or intention. Joy calls this the “White Plague” scenario, which references a Frank Herbert novel where a molecular biologist weaponizes his knowledge. We now know these profound changes in biological sciences are imminent and will challenge all our notions of what life is, and Joy points out that the public remains skeptical even by the standards of 2000.

Three — Nanotechnology could produce “gray goo,” which are self-replicating nanobots that consume the biosphere. Eric Drexler warned that “tough omnivorous ‘bacteria’ could out-compete real bacteria. They could spread like blowing pollen, replicate swiftly, and reduce the biosphere to dust in a matter of days.” Joy notes grimly, “Gray goo would surely be a depressing ending to our human adventure on Earth, far worse than mere fire or ice, and one that could stem from a simple laboratory accident. Oops.”

The Manhattan Project Parallel

Joy uses the atomic bomb as his template. After Hiroshima and Nagasaki, physicists were shocked by what they created. Oppenheimer later said the physicists “have known sin.” There was a real opportunity through the Acheson-Lilienthal report and Baruch Plan to prevent a nuclear arms race by internationalizing nuclear power. But it failed because political distrust and competitive pressure got in the way. Within years, the Soviets had the bomb, and the arms race was on.

Freeman Dyson captured the moment: “The glitter of nuclear weapons. It is irresistible if you come to them as a scientist. To feel it is there in your hands, to release this energy that fuels the stars, to let it do your bidding. To perform these miracles, to lift a million tons of rock into the sky. It is something that gives people an illusion of illimitable power, and it is, in some ways, responsible for all our troubles, this, what you might call technical arrogance, that overcomes people when they see what they can do with their minds.”

Up to that point, everyone feared nuclear bombs as the ultimate expression of madness. But Joy fears we are repeating this pattern with even more dangerous technologies, and the commercial incentives for their production are enormous. Nations compete. Corporations compete. Researchers want breakthrough innovations. The momentum builds, and pretty soon it is almost impossible to stop. We are being propelled into the new century with no plan, no control, no brakes. However, the driver is not military necessity this time but instead private sector economic gain and competitive pressure. This is human nature, though. Our history is literally filled with these processes. It’s who we are.

The Relinquishment Argument

This is the part that bothered many people. Joy argues for “relinquishment,” or the voluntary decision not to pursue certain lines of knowledge or technology because they are too dangerous. This goes against everything we believe about the value of knowledge and open inquiry, especially within the walls of Silicon Valley and across the scientific community generally.

That may be why Joy’s article struck such a nerve. The general population is used to various power centers attempting to curtail their freedoms for whatever reason of the day. But regular people are largely powerless to do much about it beyond voting, which is time consuming, or protesting, which brings its own personal risks. The scientific and technological elite, however, is something different. They are one of the power centers themselves, and here was one of their own, a high profile one at that, advocating for relinquishment.

But Joy asks, what if unlimited pursuit of knowledge puts us all in mortal danger? He points out that we have done this before. At a 1989 nanotechnology conference, Joy said, “We cannot simply do our science and not worry about these ethical issues.” The United States unilaterally abandoned biological weapons development because the logic was clear. These weapons were easy to replicate and could easily end up in the wrong hands. We would be more secure if nobody developed them, and we embodied this in the 1972 Biological Weapons Convention and 1993 Chemical Weapons Convention.

Joy quotes Thoreau: “We do not ride on the railroad. It rides upon us.” Then he asks directly, “The question is, indeed, Which is to be master? Will we survive our technologies?”

This does not mean stopping all research. It means being strategic about which lines of inquiry we pursue and which we intentionally avoid. It means international agreements, verification systems, and scientists adopting ethical codes like the Hippocratic oath. It requires transparency and cooperation.

Personal Responsibility

Joy writes honestly about his own sense of responsibility: “I feel, too, a deepened sense of personal responsibility, not for the work I have already done, but for the work that I might yet do, at the confluence of the sciences.” He was not speaking as an outside observer but as someone who helped create the technologies that might enable the dangers he feared.

He finds hope in the Dalai Lama’s “Ethics for the New Millennium,” which argues that the most important thing is to conduct our lives with love and compassion for others, and that our societies need a stronger notion of universal responsibility. Neither material progress nor the pursuit of knowledge is the key to happiness. We need to find alternative outlets for our creative forces beyond the culture of perpetual economic growth.

In the TED Talk he gave six years after publishing his article, Joy made the point even clearer. The solution, he said, cannot be technology alone. We need both better public policy and deeper moral progress. He spoke of the need for “the head and the heart,” echoing Russell and Einstein. He argued that scientists, technologists, and businessmen must be held personally accountable under the law for the consequences of their inventions. Today they face no such responsibility. That, he believed, has to change. What’s striking when hearing him articulate these perfectly reasonable points is that they aren’t taken seriously at all. Many thoughtful people say the same things at conferences or in political speeches. We all clap and support the concepts. Yet, very little actually changes. Or at least the changes take so long and occur in such small steps that we’re left unsatisfied in the present moment.

Was Joy Right?

Joy’s article was unique for its time in the popular press. When it appeared on Wired’s cover in April 2000, it created quite the rumble in tech circles. Wired had been a cheerleader for the digital age for nearly a decade. Its shift from cheering to warning marked an important and surprising moment in the digital zeitgeist. Also, Bill Joy was not some alarmist outsider. He was one of the architects. His warning came from inside the cathedral and it certainly resonated. At the time, I was working in marketing at Sun, and one of my jobs was to promote Sun’s technologies to the media. In press interviews during this period reporters would always bring up Joy’s article even if the meeting was booked on another issue. We had to draft briefing and messaging documents to prepare the executives, managers, and engineers that we brought into all interviews because we knew Joy’s article would always come up in the discussion.

The article presaged much of what we are experiencing now but not necessarily in the ways Joy anticipated. His specific predictions about nanotechnology have not materialized. The gray goo scenario is now considered flawed and implausible. Most scientists believe built-in limitations make runaway nanotechnology highly improbable.

But Joy’s underlying concerns were prescient. He worried about knowledge-enabled destruction, powerful technologies becoming widely available, and complex systems we do not fully understand. All of these have become more relevant as artificial intelligence has advanced today far faster than most people expected in 2000. Interestingly, Joy only explicitly mentioned artificial intelligence once in his article, possibly because he was writing at the tail end of the second “AI winter.” Yet his concerns about self-replicating technologies and systems beyond human control have found new relevance with modern AI these days.

The Sun Sets

After leaving Sun in 2003, Joy moved into venture capital and focused on green energy investments at Kleiner Perkins until 2014. Now 71, he works as principal investigator and chief scientist at Water Street Capital. But despite the renewed relevance of his warnings with the rise of AI, Joy has remained largely silent regarding the public debate he started 26 years ago. It is almost as if he said what he wanted to say in 2000 and we’re still digesting his thoughts all these years later.

He didn’t, however, stay stuck in doom or alarm. In the years after the Wired article he actively tried to move toward better outcomes. He joined Kleiner Perkins specifically to invest in solutions. He backed innovations in education, new materials for the environment, and a major $200 million biodefense fund aimed at closing the gaps that could lead to a pandemic. He came to believe we cannot solve the management of dangerous technology with more technology alone. Instead we need better policy, markets that price in the true cost of catastrophe, and a deeper moral shift.

Joy put it simply in that later talk at TED. “We can’t pick the future, but we can steer the future.” Over the years technologies have changed, but the fundamental challenge he identified in 2000 remains relevant. Figuring out how to pursue knowledge and innovation while maintaining enough wisdom and caution to survive the unintended consequences seems to be a question others should carefully consider today.

In his article, Joy compared coding to Michelangelo releasing statues from marble. He described his software engineering in a similar way with those ecstatic moments when the code emerged from his imagination as if it were already waiting in the machine to be freed. He ended his essay with that same image. After eighteen pages exploring multiple scientific disciplines and warning about existential dangers from the exploitation of technology, he wrote, “I am up late again, it is almost 6 am. I am trying to imagine some better answers, to break the spell and free them from the stone.”

Twenty-six years later, we are still up late too. We’re still searching for those answers, as well. We’re still trying to release better possibilities from the block of marble that is our technological future. And now we go forward in the wild world of AI.

Good luck to us all.

The AI Doom Vibe Change

For a couple of years now, the single story pounding our heads about AI all day every day has been exclusively about looming disaster. AI takes the jobs, then it takes everything else, then a few people get rich. A love story. It was an intentional positioning of the technology, obviously, but the question remains why. Well, Cal Newport has a good hypothesis that I wrote about the other day. But others are noticing as well. So, the phenomenon of the old pitch evolving into something new is probably real.

In a recent episode of The AI Daily Brief, host Nathaniel Whittemore says the previous extremist narrative may finally be cracking, and he cites a fair number of sources to substantiate his claim. It’s a different take from Cal Newport’s but there is some overlap. The signals are faint, he says, but they’re showing up in two key places at once, which may mean the shift has some legs. I think Whittemore may be pulling his punches a bit by saying “the signals are faint” just to cover himself since this shift has been so recent, really only the last few weeks. I think the shift is clearly underway. Remember, the IPOs are coming soon, baby! The companies and their simps can’t continue with the doom rhetoric. The American public has rejected that strategy. And it’s interesting that some public opinion polls in China lack this doom positioning. Anyway, back to Whittemore’s daily brief.

The first place Whittemore notices the vibe shift is within the never-ending chattering class in the media. He points to Ezra Klein’s recent New York Times article, “Why the A.I. Job Apocalypse (Probably) Won’t Happen.” I like the “probably” bit. But coming from a big voice on the political left and also one that’s outside the AI bubble, Klein may carry some weight in Whittemore’s eyes that a similar post from others, say, Marc Andreessen, simply wouldn’t. Klein cites economist Alex Imas from the University of Chicago and also a wider body of economic research to make a case rooted in Jevons Paradox. When something gets cheaper, we tend to use more of it, not less. So although computers may have changed or even eliminated specific tasks, the cost savings created enough new demand that the occupations expanded overall. As Klein puts it, “Every enthusiastic AI adopter I know is working harder than ever because there is more they can do.”

Whittemore points to more data that’s emerging. Software engineering, which is the job category most exposed to AI, is the one where postings have actually increased recently. Citadel Securities puts the increase at 18 percent since May of last year. Federal Reserve numbers also show software engineering jobs at their highest level since November 2023, although the current number is still well under the previous mark three years ago. Also, Stripe Atlas just hit 100,000 incorporations, with Q1 up 130 percent year over year. As Derek Thompson says, “AI agents are better at creating firms than destroying jobs.” A new trend?

The second place the shift is showing up for Whittemore is in markets themselves. Anthropic’s revenue, according to SemiAnalysis, has gone from 9 billion to more than 44 billion this year, which is roughly doubling every six weeks. Atlassian’s stock jumped about 30 percent recently after strong earnings, with customers who use its new Rovo AI tool growing their own ARR at twice the rate of those who don’t. The skeptics have been questioning how you justify trillions in infrastructure when seats only sell for 20 dollars a month. Well, that’s being answered by the move from seats to tokens taking place recently in the intelligent agent era. A single engineer with Claude Code might burn through hundreds or thousands of dollars in tokens each month, and the companies selling those tokens cannot keep up with demand.

There’s another piece of the vibe shift worth noting, one which I found most interesting since I’ve worked in both industries. The Associated Press recently reported on construction companies teaming up with big tech to push back on community opposition to data centers. Rob Bear of the Pennsylvania Building and Construction Trades Council told the AP that communities should figure out what they actually want from these projects rather than just saying no. “If you don’t ask, you’re never going to get,” he said, pointing to things like better project plans or money for local schools and infrastructure. Whittemore’s take is sharper. He calls it “an insane indictment of how poorly tech companies have run these projects that the issue has gotten this bad” given how many ways there are to make data centers genuinely valuable to nearby communities at a fraction of the total cost. He’s spot on. The AI companies deserve the public backlash. We’ll see how they adapt to the very real world they are now entering.

Even the AI labs are softening their messages. Sam Altman recently wrote that “jobs doomerism is likely long-term wrong” and that OpenAI wants “to build tools to augment and elevate people, not entities to replace them.” Whittemore says this is a meaningful pivot from a company whose stated goal used to look a lot more like replacement.

But Whittemore is careful not to declare victory too fast. The AI transition will still be painful for specific workers and communities, and history shows we generally don’t help them much at all when economies move through technological advancements. But he ends on a hopeful note.

“I find it extremely encouraging to feel the collective foot being taken off the gas of the AI doomerism for just a moment. If nothing else, it creates an opportunity to have a different type of conversation. One that’s neither doom nor utopia, but about how to adapt to and maximize the opportunity of the change that’s here and coming. I think the more time we spend on that conversation rather than in the extremes, the better off we’ll be.”

The AI Doom Fever Finally Fades

Is the AI Doom Fever Breaking? (It’s About Time!) — Cal Newport, AI Reality Check, Deep Questions Podcast

For many years now executives leading the big AI companies have been telling the public that their own products will destroy the economy and gut the white-collar workforce. As Cal Newport observes, this is the rough equivalent of a Pfizer executive announcing a new pill that cures psoriasis but also turns half the population into zombies. But lately the extremist rhetoric on AI has started to soften significantly. Instead of totally replacing entire segments of the workforce, these new AI systems will now simply augment existing workers and also lead to massive new opportunities for employment. That’s quite a radical shift in attitude, especially coming from people whose breathless messaging has been so bold.

Nevertheless, tech companies are still laying off tens of thousands of employees and citing AI as the reason. So, we’ll see. Newport thinks the previous over-the-top positioning on AI resulted more from culture, whereas the recent shift in tone is likely more tactical. His analysis is comprehensive and seems pretty accurate given that he’s been pushing back on this rhetorical issue for years now. 

A Strange Sales Pitch

In his podcast, Newport runs through many of the recent doom statements. Mustafa Suleyman of Microsoft AI has suggested that AI will be capable of automating most knowledge work within roughly a year. Dario Amodei of Anthropic has warned that the technology will soon replace up to half of all entry-level white-collar jobs in finance, consulting, and tech. Sam Altman of OpenAI speculated last summer about a future in which AI would “do everything,” leaving humans to find new ways to “participate” in the world.

“That’s basically what we’re getting from the AI CEOs,” Newport says. “And I think it’s just lunacy.” Newport has held this position for some time now. And it’s an opinion shared online by many advanced engineers who have been working with AI systems for a long time. However, there are many so-called social media influencers in the AI space who still push the end-of-the-world theme. It’s been an odd experience for sure. Granted, software executives have a long history of bragging in their corporate earnings calls that their systems will enable customers to cut expensive employees. It’s hard to think, though, of another industry whose leaders so cheerfully predict that their products will wreck civilization while making their founders and a few insiders rich beyond their wildest dreams. Seems like a difficult sell, eh?

The New Vibe

In late April 2026, Altman posted on X that OpenAI wants to “build tools to augment and elevate people, not entities to replace them,” and added that “jobs doomerism is likely long-term wrong.” A few days later, Nvidia CEO Jensen Huang pushed back even harder. In a Fortune interview posted May 2, he called the half-of-jobs prediction “ridiculous” and warned that becoming a CEO can leave a person with what he described as “a God complex,” speaking as if their position alone gave them the authority to predict civilization-scale outcomes. Huang also estimated that AI has already created more than half a million jobs, because companies that adopt it grow faster and hire more, and noted that demand for software engineers is actually rising.

Newport reads Huang’s comments as a thinly veiled dig at Amodei, but perhaps it was a subtle signal to the industry that things will be shifting. Either way, the tone has clearly migrated from outright apocalypse to smooth augmentation. Who knows. But at least it’s a welcome change for those directly affected by the recent massive layoffs attributed to AI systems that haven’t even been fully built and deployed yet.

Where the Doom Came From

To understand the old rhetoric, though, Newport argues that you have to look at the tech culture of San Francisco and Silicon Valley, especially among engineers, and especially content articulated in a few internet forums in recent decades. The most influential was LessWrong, founded by Eliezer Yudkowsky and devoted to refining the art of human rationality. Also, the blog Slate Star Codex, written by Scott Alexander, helped push the same themes into a wider readership. Out of this loose network grew the rationalist movement, built on the idea that if you trained yourself to think like a logical engineer, you could overcome cognitive bias and act more effectively in the world. Newport, who was trained in computer science at MIT, recognizes this culture. “I’m around engineers. I am an engineer. I know this way of thinking,” he says, adding that his wife once told him, “Don’t take me to the MIT Christmas parties because you guys are all so weird.”

Two important offshoots followed. One was effective altruism, which applies something called expected-value reasoning to charitable giving and was made famous — and then infamous — by Sam Bankman-Fried when he led FTX. The other was the existential risk community (X-risk), which argued that very rare disasters with very large costs deserve serious consideration right now. Newport summarizes the X-risk crowd as focusing especially on three threats: asteroid strikes, deadly pandemics, and superintelligent AI. Nick Bostrom’s 2002 paper Existential Risks is the foundational text, but the wider X-risk literature also covers nanotechnology, nuclear war, and what Bostrom calls “totalitarian lock-in.” Newport doesn’t mention Bill Joy’s shock article “Why the Future Doesn’t Need Us” in WIRED in 2000 but it certainly fits the paradigm.

The X-risk crew organized a closed-door conference that produced the open letter signed by Stephen Hawking, Elon Musk, Bill Gates, and many of the leading AI researchers of the day. Newport places it in Puerto Rico in 2017, but the actual event was the Future of Life Institute’s “Future of AI: Opportunities and Challenges” conference in San Juan in January 2015. The 2017 follow-up was the Beneficial AI conference at Asilomar in California. Robert McMillan’s WIRED piece from January 2015, “AI Has Arrived, and That Really Worries the World’s Brightest Minds,” captured the elite anxiety that emerged from the Puerto Rico meeting and may be the article Newport has in mind when he describes the era. There were other similar pieces in the elite media during this time period as well. The elites aren’t shy with the media. 

ChatGPT and the Hero Complex

Then ChatGPT arrived. For people who had spent ten years writing footnoted lists about the coming superintelligence, it felt like the moment they had been preparing for. Newport thinks this was both terrifying and intoxicating. “What if we were right about this risk,” Newport imagines them thinking, “and not only were we right, but it’s happening?” The rationalists sensed they were going to be Neo. They were going to be John Connor. They were the ones who saw it coming and would now lead everyone else through it. It may be hard for normal people to think this way, but we are talking about the tech elite, after all. They do actually live in a different world, one that’s in many ways disconnected from the normal reality of people who have to work for a living. Newport stresses that it’s important to realize that the current AI companies we see now all grew from that culture. 

OpenAI, Newport says, originally presented itself as a nonprofit AI safety organization heavily shaped by X-risk concerns. It was started as almost a hobby project for the rationalist crowd before commercial ambitions reshaped what it is now. Anthropic was founded by former OpenAI staff who, according to Newport, felt their old employer was not being rigorous enough about safety. Grok came out of the same orbit. The CEOs, Newport says, were not playing 4D chess with investors. They were just talking the way everyone they knew talked in Silicon Valley. The trouble started when their companies got too big to keep speaking only to their own closed subculture. As Newport puts it, “we’re not, you know, in the Mission District anymore.”

Why It’s Breaking Now

Newport sees three potential forces that may be accelerating the recent change in AI positioning:

First, there is real IPO pressure building. As OpenAI and Anthropic move toward public markets, more sober East Coast investors (who wear suits, Newport says) are quietly asking the founders to stop terrifying the customers they hope will pay for AI products.

Second, public opinion is turning. A Quinnipiac poll from March 2026 found that 55 percent of Americans now believe AI may do more harm than good in daily life, which is up from 44 percent a year earlier, with about seven in ten people expecting fewer job opportunities in the future.

And third, journalists are running out of patience. Ezra Klein’s May 2026 New York Times column “Why the A.I. Job Apocalypse (Probably) Won’t Happen” reports that the economists he interviewed are skeptical of mass joblessness. Also, a recent Ronan Farrow piece in The New Yorker even raises the question of whether Altman is actually a strong chief executive for a trillion-dollar company.

Newport says there may be additional pressures in the market pushing AI executives to temper their rhetoric in recent months, but his analysis of the three issues above seems pretty comprehensive as a working hypothesis.

A Welcome Maturation

The Silicon Valley monoculture has finally collided with the rest of the country, Newport argues. Wall Street realism, journalistic scrutiny, and ordinary public sentiment are forcing the language to evolve to the realities of the market. Newport sounds almost relieved. Somebody, he suggests, finally had to tell these founders to “stop talking like you’re Sarah Connor from Terminator 2.” His understated parting advice still applies. Take AI seriously, but not everything you hear about it.

Cluetrain Yesterday and Today

I read The Cluetrain Manifesto late in 2000 when I lived in California. It was a short book published in 1999 by Rick Levine, Christopher Locke, Doc Searls, and David Weinberger. I loved it. It represented a radical departure in marketing and communications at the time because it was heavily influenced by developers in the rapidly growing Free and Open Source Software (FOSS) movement. I was already familiar with the topics in the book because I worked at Sun Microsystems and mixed with FOSS engineers every day.

Cluetrain made bold predictions about markets becoming conversations, about authentic voice displacing canned corporate messaging, and about employees and customers breaking free from rigid command-and-control structures. We read it widely at Sun, especially those of us who were managing FOSS projects. Back then we were opening millions of lines of code, so the lessons in Cluetrain provided useful guidelines as we built projects and engaged existing development communities. It also came in handy for dealing with the media, which was my primary job. And Cluetrain fit Sun’s culture perfectly since the place seemed at times more like a frat house than a corporation.

The Prediction: Markets Are Conversations

The Cluetrain Manifesto’s central argument was straightforward. Mass media had interrupted human conversation and turned people into passive consumers and markets into targets. Even within Sun’s generally open culture, there were still many power centers in marketing and product development talking in terms of targeting the media, developers, and customers. That positioning drove me nuts because the engineers never spoke that way. Cluetrain argued the Internet would reverse that old paradigm. People could now talk directly to each other about products, communities, and companies because the Internet enabled many-to-many conversations efficiently across traditional corporate firewalls.

The authors based their conclusions on real observations and personal experience. They showed how customers were gathering in online forums to discuss products, share experiences, help each other make decisions, and actually get work done. Engineers working on projects could now openly explain their software or services in detail to their peers in the community and to potential customers. Companies that tried to control these conversations through legal threats or PR spin got mocked online. But many times the companies didn’t even know they were the butt of jokes online, so it fell to people like me to educate people internally. That was a painful process. However, I never had to justify the views articulated in the book to developers because in the FOSS community open communication was considered normal.

What Actually Happened

The first part of the prediction in Cluetrain proved accurate. Conversation did explode online. Developer conversations thrived. Customer reviews became crucial to purchasing decisions. Social media gave employees and customers platforms to speak publicly, and some companies actually encouraged these engagements. Sun and Microsoft, for example, were the first two big software companies to build blogging platforms for their employees to communicate with the outside world. At first it was the development project engineers who engaged externally, but over time Sun had over two thousand employees blogging and talking to whoever they needed to talk to in order to get their jobs done. This openness was such a relief from the constraints of corporate life. The old broadcast model of one-way corporate messaging did lose some of its power at least initially.

But the manifesto underestimated how quickly new gatekeepers would emerge and how differently those gatekeepers would operate. The authors imagined the Internet as inherently democratic, a medium that would naturally promote authentic conversation between equals. What actually emerged was far messier.

Facebook, Twitter, Instagram, YouTube, and other social media platforms did create spaces for open conversation. But they also monetized that conversation through advertising. Every interaction became data to be collected, analyzed, and used to target ads more effectively. The algorithm, not human choice, determined what conversations most people saw. The marketing empire was clearly striking back. Engagement metrics rewarded outrage and emotional volatility over thoughtful discussion. Over time bots and fake accounts muddied the distinction between authentic voices and manufactured ones. Fortunately most FOSS projects in the early days used Mailman mailing lists, so those discussions among engineers largely escaped the advertising swamp. As the years went on, though, more and more engineers did start engaging on social media, so they were also exposed to the new world of monetized discussions.

The manifesto also assumed that truth would naturally win in open conversation. But algorithmic amplification doesn’t work that way. A well-funded disinformation campaign using fake accounts can reach more people than a customer’s honest review. Negative news and conspiracy theories that trigger strong emotional responses get shared more widely than nuanced analysis about how complex products are built and deployed. We learned that the loudest voice does not necessarily contain the most authentic message. It’s just the most algorithmically optimized.

The Transformation of Corporate Communication

On the surface, companies did adapt to the trend of open communications. They hired social media managers for customer engagements and community managers for developer programs. They created corporate Twitter accounts. They posted on Facebook. They built vast user and developer communities around their brands. They also encouraged employees to become brand ambassadors. The language of the manifesto even got absorbed into corporate training materials and MBA programs, which gave us the impression that we were making progress on the goal to get more corporate conversations out into the open.

But something got lost in translation. The manifesto called for people to speak from genuine concern and real knowledge. What companies actually created was a new form of managed authenticity. Social media posts are now written by communications departments and checked by legal teams before posting. Influencers are paid to seem organic. Employee advocacy programs train workers on what they can and cannot say publicly. So the appearance of conversation replaced actual conversation, and that change happened quickly.

Some companies did embrace transparency more genuinely. They admitted mistakes. They let employees talk. They engaged with critics. At Sun we regularly aired our dirty laundry in public blogs, and it was clear the community appreciated those efforts. But these examples remain exceptions. Most corporations treat blogs and social media as another broadcast channel or a new pipe to push messages rather than a genuine space for dialogue.

The Complexity of Authentic Voice

The Cluetrain Manifesto also underestimated how difficult an authentic voice actually is to maintain in organizational settings. It assumed that if you removed restrictions on what employees could say, genuine conversation would naturally emerge. But people are complicated. They worry about job security. They experience social pressure. They have competing loyalties. They get tired and cynical. Organic growth only takes place up to a point when systems are relatively small. To scale things, however, takes significant effort on a consistent basis.

More fundamentally, Cluetrain treated authenticity as something simple and good. But authenticity can also be cruel, narrow-minded, and destructive. A customer’s authentic voice might include racist slurs. An employee’s honest opinion might be shaped by biases they don’t recognize. Not all conversation improves markets. Some of it pollutes them, and this process is well known among propagandists intent on wrecking any community. This is true even in more technical conversations on FOSS projects where trolls and flame wars were common, so over time projects had to write and enforce communications policies.

The manifesto also didn’t anticipate how authenticity itself would become a commodity. Brands now hire consultants to develop an authentic voice. Corporations spend millions to appear genuine. The performance of authenticity replaced authenticity itself. People learned to perform realness the way they once performed professionalism.

What the Manifesto Got Right

Despite its blind spots, the Cluetrain Manifesto identified something real about how business would change. Markets did become more transparent. Information asymmetries did shrink. Customers and development communities did gain more power relative to corporations. Microsoft, for example, went from calling Linux a cancer to later engaging in many Open Source development projects. Companies that ignored customer concerns did suffer damage. And authenticity and trustworthiness did become more valuable.

The manifesto was also right that conversation matters. The most successful companies today are not those that shout the loudest in advertising but those that foster real engagement at least at some level. Whether through product design, customer service, or community building, companies that create spaces for genuine interaction outperform those that don’t.

The insight about hyperlinked organizations proved prescient as well. Work did become more distributed and massively networked. Information did flow less predictably through hierarchies. Employees did gain access to information and connections that bypassed management. And remote work and globally distributed teams became normal.

What the Manifesto Missed

The authors didn’t anticipate the staying power of command-and-control management. They believed the Internet would make hierarchy obsolete. In reality, many hierarchies simply moved online. Power still concentrates at the top. Information still gets controlled. Employees still get managed rather than trusted.

They underestimated the problem of scale as well. Small online communities can maintain authentic conversation naturally. But millions of people cannot. When conversation scales to platforms with billions of users, something fundamental changes. Moderation breaks down, abuse flourishes, and manipulation becomes almost trivially easy.

They also didn’t foresee the rise of algorithmic curation. The manifesto assumed conversation would be transparent and visible. But now conversations get filtered through algorithms that most people don’t understand. What you see is not necessarily what others see. Your feed shows you different information than your neighbor’s feed. Markets become fragmented not into niches of increasing value but instead into isolated bubbles that can easily represent entirely different worlds.

Most importantly, the authors didn’t anticipate how effectively new forms of power would entrench themselves. Tech platforms became more concentrated than traditional media ever was. A handful of companies now control where most conversations happen online. They set the rules. They decide what gets amplified and what gets suppressed. They collect data on every interaction and sell that data on the open market. Some even delete users at will. The promise of decentralized conversation gave way to centralized control in new forms.

The AI Complication

And then came artificial intelligence, which is what we have now. Twenty-six years after the manifesto declared that markets are conversations and authentic human voice matters above all, we now face the very real possibility that voices themselves can be manufactured — not just managed or performed, but entirely synthesized with virtually no effort.

Generative AI can now write customer reviews that sound authentic. It can create social media posts that mimic human personality and show fake images and videos that look real. It can respond to customer service inquiries with warmth and empathy that it clearly doesn’t feel. Chatbots can hold conversations that fool many people into thinking they’re talking to humans. The manifesto worried about corporations faking authenticity. Modern AI now makes those fake voices mostly indistinguishable from real human beings.

This issue creates a crisis for the manifesto’s core premise. If authentic voice is what matters, but we can no longer tell which voices are authentic, what happens to open markets? If conversation is the basis of trust and connection, but many conversations are run by AI agents optimized to simulate trustworthiness, where does that leave those of us who want to connect?

Companies are already deploying AI at scale to handle customer interactions. The voice answering your question might be artificial. The review you’re reading might be AI-generated. The social media account sharing opinions about a product might be a bot. The email from customer service might never have been seen by human eyes. AI is now even writing the software that human developers used to code by hand. We’ve arrived at a strange inversion. Corporations can now fake human voice so convincingly that the manifesto’s call for authenticity becomes meaningless. Perhaps this is why many people now more than ever crave real experiences at live conferences because at least they know they are talking to other human beings.

Some argue that these changes demonstrate progress in efficiency. AI can provide customer service around the clock, respond instantly, never get tired or rude, and scale infinitely. It can analyze millions of customer conversations to identify patterns and improve products. It can find bugs in software faster than human engineers can. It can personalize communication to each individual. From a purely functional standpoint, AI might deliver better customer service than most human representatives in some industries.

But the manifesto wasn’t arguing for functional efficiency. It was arguing for something deeper. Human voice matters not because it transfers information efficiently but because it carries presence, concern, and authentic relationships. When you talk to someone who actually cares about your problem, who has real expertise honed from experience, or who might make a mistake but is genuinely trying to help, something happens that transcends the mere exchange of data. You know it when you experience it because you can literally feel it. AI can’t replicate that and everyone knows it.

Where Cluetrain Still Works

Here’s what the manifesto got fundamentally right and what remains valuable today. The principles work well at human scale — small developer communities working on software, local businesses with genuine relationships to their customers, teams working together on projects they care about. These are places where authentic conversation still thrives and still matters.

I see this every day in the FOSS projects I work with. A developer asks a question on a mailing list, and other developers quickly respond with actual knowledge and genuine helpfulness. They argue about the best approach. They share, review, and integrate code. They build things together. No corporate messaging, no brand management, no legal review. Just people talking about work they care about with others who share that interest. It just works.

The problem isn’t that Cluetrain’s principles were wrong. The problem is scale. When you try to apply those principles to platforms with billions of users with no clear meritocratic culture, things break down fast. The authentic conversation gets drowned in the noise. The algorithms take over, and the bots multiply. Corporate power reasserts itself in new forms.

But nothing prevents smaller operations from running on Cluetrain principles. Some examples include the following: a local bookstore that knows its customers by name and recommends books based on real conversations, a software consultancy where developers talk directly with clients about what both sides actually need, and a regional business where employees are trusted to speak for the company because they actually know what they’re talking about and care about doing good work. Even within large multinational companies in the software space, Cluetrain works well at the project level but rarely at the larger corporate level.

The manifesto’s failure wasn’t in its vision of how human conversation should work. It was in assuming that vision could scale to global platforms without fundamental transformation. It couldn’t. But at human scale, in smaller communities where people can actually know each other, the principles still hold.

This is why I still talk about Cluetrain at developer conferences, even though almost no one in the audience has heard of it. The book itself may be forgotten by younger generations, but the principles matter more than ever. In a world increasingly mediated by AI and controlled by algorithmic platforms, we need spaces where genuine human conversation can still happen. We need to protect and nurture those spaces. We need to remember that markets and communities are conversations, but only when they’re small enough for conversation to be real.

What We Need Now

The Internet did change business. Conversation did become more important. Authenticity still matters. But not in the straightforward way the manifesto imagined. Markets are conversations, yes, but those conversations now happen on platforms controlled by automated systems deployed by tech giants. They’re shaped by algorithms. They are monitored and sometimes conducted with AI. Genuine human exchange still occurs, but it competes with manufactured authenticity, disinformation, coordinated inauthentic behavior, and synthetic voices that can perfectly mimic humanity.

Companies still struggle with transparency and authentic communication. They still try to control their messages at the corporate level. They still fail when they ignore what customers, employees, and developers are saying. But now they have a new tool to cut people out of the experience. They can literally replace human conversation with automated authenticity. They can scale the human voice without involving actual people.

AI development raises questions the manifesto never considered. Does it matter if the voice responding to you is human as long as it gives you what you need? Is there value in knowing you’re talking to a person rather than a computer? Can markets function as conversations if we can’t tell who or what we’re conversing with? What happens when the boundary between human and synthetic voice dissolves completely? These aren’t questions about some distant future. We are there right now.

Some companies are being transparent about their use of AI. They disclose when you’re talking to a bot. They label AI-generated content. They treat artificial intelligence as a tool that assists human workers rather than replaces them. Others are less open. They let customers assume they’re talking to humans. They generate reviews and social media posts without disclosure. They use AI to create the appearance of engagement and concern while minimizing human involvement. And now some companies are cutting tens of thousands of highly paid jobs to support the development of new AI systems that will cut even more people.

And what’s most ironic is that it’s the software developers themselves who are building the systems that will ultimately replace their services. I guess that’s progress because it’s been the goal of most companies that build and deploy technology. If you listen to corporate earnings calls, you will hear executives explaining in detail how their newly autonomous systems will enable customers to better manage operations by reducing people. Instead of needing 25 administrators to manage the system, now you only need 5.

Check this out. I went clothes shopping recently in Tokyo. I expected a person to check me out by carefully picking up each item I brought, scanning the bar code or keying in the price by hand, and finally folding each item for me. This is Japan, after all. Service matters. But that wasn’t my experience. Instead, in this store I simply tossed my items into a bin and the AI immediately scanned everything accurately and listed each item on the screen, and I paid with a quick tap of my card. Poof. No human involved whatsoever. Let me be clear about this. I didn’t have to scan each item’s bar code myself. No. I just dumped a pile of 10 items in a big tangled mess into the bin and that was just fine for the AI. That’s remarkable.

The Cluetrain Manifesto identified real changes that were happening in 1999. But it underestimated the adaptability of power as well as the opportunities and unintended consequences of new technology. The Internet did liberate voice. It just liberated all voices, including exploitative ones. Markets did become conversations. They just became conversations mediated by profit-seeking platforms with incentives that often conflict with genuine dialogue. And now those conversations are increasingly generated by bots designed to simulate the very authenticity the manifesto celebrated.

The manifesto remains worth reading because the core insight is still true. People want to talk to each other. They want to be heard. They want to connect. And sometimes they also want to have a quick chat at the checkout counter with a real human being and have their clothes folded and placed neatly in a bag. That desire hasn’t changed. What has changed is everything else about how that conversation happens. The platforms, the algorithms, the AI, the incentives, the noise level, and the stakes all represent an entirely different world from 1999 when Cluetrain was published.

Understanding what The Cluetrain Manifesto got right and wrong helps us see more clearly what we actually need now. We need not just the ability to speak but systems designed to preserve genuine human connection in an age when that connection can be convincingly faked. We need ways to distinguish authentic voice from synthetic simulation. We need to decide whether that distinction even matters, and if so, why. We need to figure out what human conversation is actually for now that machines can do a remarkably competent imitation of it.

Most importantly, we need to recognize that Cluetrain principles still work. They just work at human scale, not platform scale. We need to create and protect spaces where authentic conversations and direct human interactions can thrive. Small communities. Local businesses. Project teams. Places where people actually know each other and can build trust through genuine connection. The future of authentic conversation isn’t in taming the big platforms. It’s in building alternatives that stay small enough to be human.

AI’s Perpetual Present

I’ve been reading “Why We Need Continual Learning” by Malika Aubakirova and Matt Bornstein recently. I also listened to a podcast interview with Malika on a16z. Now, I’m no AI researcher or developer. But I do like exploring the scientific foundations on which advanced software tools are built, especially since I use these applications every day and hope to leverage them more in the future. So although I don’t fully understand what’s actually happening underneath, poking around a bit is an interesting exercise. What follows below is what I’ve learned from the article. Consider it a work in progress. If you want the expert version from Malika and Matt, go read their original piece for a deep dive. This text here is just me working through things as best as I can at my level. At the end of this post, I include a list of terms and definitions. I’ll make that a standard feature in similar upcoming posts for my own short-term recall practice and also for long-term memory consolidation. Memory practice (the human kind) is a hobby of mine.

Anyway, here we go. The authors open their article on continual learning by referring back to Christopher Nolan’s “Memento,” which is a film about a man named Leonard Shelby who suffers from anterograde amnesia that prevents him from forming new memories. Every few minutes his world resets and he wakes up in the same perpetual present with no idea what just happened in the past. He tattoos notes on his body and carries Polaroids as memory aids just to function throughout the day. It turns out that he’s very resourceful because he uses whatever he can in his environment to get by. He even appears pretty capable within any given scene in the movie. But, as the authors put it, his tragedy is that “he can never compound. Every experience remains external.” So, I guess that means he can’t learn based on his present moment to prepare for the future like most of us who have normal memories.

That seems to be a good general description of where AI models are right now. Back before I knew about this issue, I actually inadvertently tripped over it when I first used ChatGPT and Grok a few years ago. It was clear from my chats at the time that the models were not “learning” from our conversations at all. I kept spinning around in circles explaining myself over and over again. And, in fact, some of those earlier models didn’t know even basic facts from current events, which was shocking since AI was sold to us as being so super smart. That’s when I realized that the “learning” for LLMs took place at some point in the past and then they were locked shut while life continued on. That experience of an AI not knowing simple bits in the news rarely happens now so the user experience has improved significantly. However, there’s a lot more to it that I didn’t realize from those first few frustrating conversations.

What’s Actually Happening When You Type Into That Text Box

Here’s what I didn’t fully understand before reading the article. When you type into a chat window and stuff happens before you get an answer, that process is not the model learning anything from your input. It’s reading what you gave it and generating a response. When the conversation ends, the model does not carry that conversation forward in its memory. The next conversation starts from exactly the same place as every other new conversation. Initially, that felt unnerving so I had to figure out ways to leverage the knowledge from the LLM without all that forgetting going on.

The text box we type into is just a door into the system. What matters is the context window behind the door, which is everything the model can see at once. So, your message, the whole conversation history, any documents you shared, and any background instructions — all of these things represent what the model is working with when it responds. And it has a size limit. When it fills up, older content gets dropped to make room for new content. So if you spend an hour explaining your company’s internal processes to an AI assistant and then start a fresh conversation the next day in a new text box, the AI has no memory of the previous conversation. You have to start over. Not because it forgot. Because it never learned in the first place.
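
To make the size limit concrete, here is a minimal Python sketch of a context window that drops its oldest messages when it overflows. Everything in it is an assumption for illustration: the `ContextWindow` class, the `count_tokens` helper, and the one-word-per-token counting are hypothetical, and real systems use proper tokenizers and may summarize or truncate differently.

```python
from collections import deque


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


class ContextWindow:
    """Hypothetical fixed-size window over the messages a model can see."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.messages = deque()

    def total_tokens(self) -> int:
        return sum(count_tokens(m) for m in self.messages)

    def add(self, message: str) -> None:
        self.messages.append(message)
        # Drop the oldest messages until everything fits again
        # (always keep at least the newest message).
        while self.total_tokens() > self.max_tokens and len(self.messages) > 1:
            self.messages.popleft()


window = ContextWindow(max_tokens=8)
window.add("our refund policy is thirty days")  # 6 tokens, fits
window.add("what is the refund policy")         # 5 more tokens, overflows
visible = list(window.messages)                 # the policy message is gone

# A new conversation starts with an empty window: nothing carries over.
fresh = ContextWindow(max_tokens=8)
```

In this toy run the explanation of the refund policy has already been pushed out by the time the question arrives, and the `fresh` window starts with nothing at all.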

There’s a name for this phenomenon. The article calls it in-context learning, which is really just the model making smart use of whatever sits in front of it right now. It’s temporary by design. The model reads, responds, and moves on. It’s similar to glancing at your notes before a meeting rather than actually deeply studying, internalizing, and using the material beforehand. When the meeting ends, those casual notes go back in the drawer and are forgotten.

The Frozen Model Problem

To understand why this matters, you need to know a little about what’s inside these models. During training, a model reads an insane amount of text and gradually adjusts billions of numerical values called parameters or weights. You can think of each weight as a dial on a pipe connecting two nodes in the network controlling how much signal flows through. The model trains by turning billions of those dials very slightly over and over again until it gets good at predicting language. That right there is really impressive to me given the scale of information these models are working with. But when the training process ends, all those dials get locked. That stage represents deployment. The model then goes out into the world with its knowledge frozen in place.

Training works because it’s a compression process. The model can’t store everything it reads verbatim. It has to find the underlying patterns, generalize the data, and build something compact that transfers to new situations it’s never seen before. The authors describe this as lossy compression, and that lossiness is actually what produces what seems like intelligence to us when we talk to an AI. When I first read that I thought of a camera compressing a RAW file to a JPEG file. The RAW image contains all the available data but it’s a massive size and requires editing in post production to produce a beautiful image. The JPEG, however, is much smaller because it’s been compressed by the camera to just what’s needed to display a good quality image at a certain size. I’ve always understood that process in photography, but I didn’t realize that LLMs are going through a similar process.

Here’s another way to think about it. Remember when you first learned how to ride a bike? You didn’t read the entire manual every time. You just got some guidance from a friend or a parent and you practiced. You fell down a few times and adjusted your technique, and then eventually your brain distilled your experience into something automatic and compact. That’s compression. You still remember falling down, but that falling down process is no longer helpful for riding once learning has taken place. What remains is the final skill of balancing to ride. An AI model that memorizes every training sentence perfectly would be less useful, not more, because it could retrieve but never generalize. It would behave more like a simple retrieval system than a sophisticated learner.

The painful irony the authors identify is this. The very mechanism that makes these models powerful during training is exactly what we stop them from doing once they’ve been deployed. We freeze the compression at the moment of release and replace it with what’s called external memory. That clarified the argument for me. The system is layered, and each layer is essentially a workaround for the fact that the compression stopped. Understanding that made the next part of the article click.

The Filing Cabinet

To compensate for frozen models, developers have built elaborate scaffolding systems, such as chat histories, retrieval databases, system prompts, external document stores, and more. All of these things make up what the article calls external memory. They are flexible and they live outside the model’s internal, frozen weights. When you need information, the system retrieves it and feeds it into the context window. Then the model reads it and responds.
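
The retrieve-then-read loop described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any particular product works: the `score`, `retrieve`, and `build_prompt` functions are hypothetical names, and the word-overlap scoring stands in for the embedding-based search that real retrieval systems typically use.

```python
def score(query: str, document: str) -> int:
    # Toy relevance score: how many distinct query words appear in the document.
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words)


def retrieve(query: str, store: list[str], top_k: int = 1) -> list[str]:
    # Rank every stored document against the query and keep the best matches.
    ranked = sorted(store, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, store: list[str]) -> str:
    # The retrieved notes are pasted into the context window next to the question.
    # The model's weights are never touched at any point in this process.
    notes = retrieve(query, store)
    return "Notes:\n" + "\n".join(notes) + "\n\nQuestion: " + query


store = [
    "refunds are accepted within thirty days of purchase",
    "the office is closed on public holidays",
]
prompt = build_prompt("when are refunds accepted", store)
```

The design point is that all of the "memory" lives in `store`. If the retrieval step surfaces the wrong note, the frozen model has nothing else to fall back on.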

This architecture works as is and the authors are honest about that. However, they make a point I hadn’t considered before. “A bigger filing cabinet is still a filing cabinet.” Retrieval is not learning. The model is looking things up, not actually knowing them. It just does it very quickly and uses natural language so you get the impression you are talking to someone who is intelligent.

Here’s another practical example. Say a hospital deploys an AI assistant to help with real world clinical decisions. That model was trained on medical literature through some cutoff date. A major new clinical trial or medical policy comes out afterward that changes how doctors treat a particular condition. The hospital can feed that paper into a retrieval database so the AI can surface it when it’s relevant. But the model doesn’t internalize that new research the way doctors would after reading it, applying it to patients, observing the outcomes, and revising their practice accordingly. The AI can retrieve the abstract. But it can’t reason from the new finding the way someone who has truly learned it can in practice. That’s the limitation these researchers are trying to fix.

The same problem exists in cybersecurity with threats evolving daily. A frozen model can be given descriptions of new attack patterns through retrieval, but it can’t compress and generalize from those patterns the way an analyst can after spending months chasing a specific class of threat. The knowledge stays external. It never becomes part of what the model actually knows unless the model is updated with a new learning process, which is time consuming and very expensive.

What Real Learning Requires

So what’s the alternative? The article introduces a concept called continual learning, which is the field of research aimed at letting models actually update their weights based on new experience after deployment. Not just read notes. Actually learn live like humans do.

And here’s where the Memento metaphor really makes sense. The authors say that today’s AI is stuck in Leonard Shelby’s perpetual present. The scaffolding, the Polaroids and tattoos, and other memory aids work well enough within any given scene. But the model can never compound in real time. Every new thing it encounters stays external.

Think about the difference between a doctor who simply retrieves a recent study and a doctor who has spent years treating patients with that knowledge fully and personally internalized. Or consider the difference between someone who has your email history in front of them and someone who actually knows how you think over time. The article frames this cleanly. “The difference between ‘Here is what you responded to this email before’ versus ‘I understand how you think well enough to anticipate what you need’ is the difference between retrieval and learning.” Even in normal human memory, immediate retrieval is necessary to manage your present experience. However, it’s also required that your present experience be embedded into long term memory for continual learning.

The authors bring up Fermat’s Last Theorem as one powerful example of the kind of hard discovery problem they have in mind. Mathematicians worked on the problem for more than 350 years. Eventually it was solved by Andrew Wiles. But he didn’t crack it by retrieving the right papers. He solved it by working in near total isolation for seven years and inventing entirely new mathematical techniques to bridge two previously disconnected fields. That kind of discovery required genuine compression, generalization, and creative combination. Not simply fast retrieval. And the article asks directly whether a model that can’t compound from experience could ever do anything like that. The honest answer is they don’t know yet.

Why Updating Weights Is So Hard

At this point I had to ask myself: if real-time continual learning is so important, why can’t LLMs do it now? The short answer is that updating a model’s weights after deployment is genuinely dangerous and technically unsolved at scale.

The most obvious problem is called catastrophic forgetting. When you update a model’s weights to learn something new, it tends to overwrite what it already knew. New learning crowds out old learning. If you fine-tune a general model specifically on medical records, it might get better at clinical language while getting noticeably worse at everything else because the new training has nudged weights that were also doing other jobs. The model gets better at one thing and potentially worse at everything it was already good at. When you understand this you can really appreciate how humans have benefited from millions of years of evolution. The AI machines seem rather clunky by comparison. When humans learn, new neural connections are made in the brain that stick for a long time as new learning is layered on top. But even in humans, old learning and memory do actually fade gradually over time if a specific neural pathway isn’t continually or at least occasionally reinforced. It just takes a very long period of time. With AI systems, however, new learning can wipe out old learning immediately. The authors didn’t address this issue directly in humans, but the example seems similar if you study biology.
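
A deliberately tiny Python sketch can show the overwriting effect. It assumes the simplest possible "model", a single shared weight `w` in `y = w * x`, trained with plain gradient descent. The tasks, the `train` and `mse` helpers, and the numbers are all made up for illustration. Real networks have billions of weights and forget less totally than this, but the mechanism of shared weights being pulled toward the newest task is the same idea.

```python
def train(w: float, data, lr: float = 0.1, steps: int = 200) -> float:
    # Plain gradient descent on mean squared error for the model y = w * x.
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def mse(w: float, data) -> float:
    # Mean squared error of the model y = w * x on a dataset.
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


task_a = [(1.0, 2.0), (2.0, 4.0)]    # old skill: y = 2x
task_b = [(1.0, -1.0), (2.0, -2.0)]  # new skill: y = -x

w = train(0.0, task_a)            # learn the old skill first
error_a_before = mse(w, task_a)   # close to zero

w = train(w, task_b)              # then train only on the new skill
error_a_after = mse(w, task_a)    # the old skill has been overwritten
```

After the second round of training, the weight fits task B and the error on task A jumps from roughly zero to a large value, because the only weight the model has was also doing the old job.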

There’s also the problem of data poisoning. If a model’s weights can be updated through interactions after deployment, bad actors could gradually manipulate its behavior through carefully crafted inputs over time. Unlike a one-time attack, poisoned weights persist across every future conversation. The damage would live in the model itself, so safety alignment could degrade unpredictably, either immediately or at some point in the future. The article notes that “even narrow fine-tuning on benign data can produce broadly misaligned behavior,” which is a sobering thought to sit with. Yet we all know this would happen right away based on our own experience being online every day fighting bots and hackers.

These aren’t hypothetical concerns. They’re real problems without clean solutions yet.

Where Things Are Heading

The article maps out a spectrum of approaches to continual learning that are organized around a question I found clarifying: where does the compaction actually happen? It seems there is a stack of technologies managing the process.

On one end you have pure retrieval. No compaction. The model just reads notes. That’s most of what exists today. In the middle there are modules, which are attachable and specialized components that let a model develop some expertise in a specific domain without retraining the entire thing from scratch. A hospital might attach a medical module to a general model so it performs at a specialist level on clinical questions, while the same base model with a different module handles legal contracts. Each module is swappable independently. That’s a practical and reasonable middle ground for now.

On the far end you have full parametric learning, where the model’s weights actually update from new experience after deployment. This is the goal, but it remains largely unsolved at scale with current technologies. Still, there are serious research efforts moving in this direction, such as test-time training, where the model runs brief learning cycles before it generates a response. There are also self-improvement approaches where models like AlphaEvolve have generated their own training data and genuinely improved from it, at least within constrained problem domains like mathematics.

The authors frame the path forward as layered. In-context learning stays as the first line of adaptation because it works now and keeps getting better. Modules offer some personalization and domain specialization. But for genuinely novel problems, adversarial scenarios, and knowledge too tacit to put into words, models may eventually need to compress new experience directly into their parameters after training. Otherwise, as the authors put it, we stay stuck in Memento’s perpetual present.

What I Took Away

I started reading this article as someone who uses AI tools every day without really thinking much about what’s happening underneath. What I came away with is a better sense of the gap between what these systems appear to do, what they’re actually doing, and what they’ll potentially do in the future. Right now they can respond to new information and adapt to what you give them. And most times they feel like they understand you. But the reality is that they don’t compound. They don’t learn. They don’t internalize new experience the way continual learning systems or humans would. Their dials are locked. And until engineers figure out how to update those dials safely and continuously after deployment, the models we’re using now are doing something more like reading notes than actually learning from the experience. That’s a distinction with a very big difference.

Check out the original article and Malika’s podcast for the technical details. Below is a list of related terms and definitions.


Continual Learning: Vocabulary List

This list of terms below is based on the a16z article “Why We Need Continual Learning” by Malika Aubakirova and Matt Bornstein and also the podcast with Malika discussing the article. Some definitions closely reflect the article itself, but others expand into broader concepts from the field for additional context. I error checked the terms and definitions with Grok, ChatGPT, Gemini, Perplexity, and DeepSeek.

Agentic Loops

A mode of operation where the model works autonomously step by step toward a goal without you typing each instruction. Each step produces output that feeds into the next. This process can go on for many cycles. The article identifies two related problems as steps accumulate: (1) the immediate symptom is coherence degradation, where the agent loses the thread and starts making poor decisions, and (2) the underlying cause is that maintaining a growing context becomes increasingly expensive and inefficient. Both concerns together represent why the article frames agentic loops as one of the pressure points on the current in-context learning paradigm. For example, an agent tasked with researching a topic, drafting a report, checking sources, and revising the draft might handle the first twenty steps cleanly. But by step eighty the accumulating context has grown so large and costly that the agent starts losing track of earlier decisions and repeating work it already did.
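
A rough back-of-the-envelope sketch in Python shows why the cost climbs so quickly. It rests on two loud assumptions that are not from the article: every step appends the same number of tokens to the context, and the agent re-reads the whole context at every step with no caching. The `tokens_read` function and the 500-token figure are hypothetical.

```python
def tokens_read(steps: int, tokens_per_step: int) -> int:
    # Total tokens the agent has to read across all steps, assuming
    # the full accumulated context is re-read at every step.
    total = 0
    context = 0
    for _ in range(steps):
        context += tokens_per_step  # this step's output joins the context
        total += context            # the next decision reads all of it
    return total


cost_at_20 = tokens_read(20, 500)   # 105,000 tokens
cost_at_80 = tokens_read(80, 500)   # 1,620,000 tokens
ratio = cost_at_80 / cost_at_20     # about 15x the cost for 4x the steps
```

Under these assumptions, four times as many steps means roughly fifteen times as much reading, which is one way to picture the pressure on long-running agents.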

Attention Heads

A key mechanism inside transformers that allows the model to weigh how relevant each part of the context is to every other part when generating a response. Multiple attention heads run in parallel, each learning to focus on different kinds of relationships in the text. One head might learn to track grammatical agreement between subject and verb across a long sentence, while another tracks thematic connections between paragraphs. Together they allow transformers to handle complex, long range dependencies in language that earlier architectures struggled with. For example, in the sentence “The lawyer who argued the case, despite the objections raised by her colleagues, ultimately won,” an attention head helps the model correctly connect “won” back to “lawyer” across all the intervening words.
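
Here is a minimal Python sketch of what a single attention head computes, using the standard scaled dot-product formula. The two-dimensional vectors are hand-picked assumptions so that the query for “won” lines up with the key for “lawyer”. In a real transformer these vectors are produced by learned projections and have far more dimensions.

```python
import math


def softmax(scores):
    # Turn raw scores into weights that are positive and sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    # Scaled dot-product attention for one query vector.
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    # The output is a weighted blend of the value vectors.
    output = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, output


tokens = ["lawyer", "case", "colleagues"]
keys = [[4.0, 0.0], [0.0, 1.0], [0.0, 2.0]]    # hypothetical key vectors
values = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # hypothetical value vectors
query_for_won = [1.0, 0.0]                     # hypothetical query for "won"

weights, output = attention(query_for_won, keys, values)
most_attended = tokens[weights.index(max(weights))]
```

With these made-up vectors, most of the attention weight lands on “lawyer”, which is the kind of long-range link the definition describes.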

Catastrophic Forgetting

When a model updates its weights to learn something new, it tends to overwrite what it already knew. In other words, new learning crowds out old learning, sometimes dramatically. This is one of the central unsolved problems in continual learning, and one of the main reasons models are not updated continuously after deployment. Think of it somewhat like overwriting parts of a hard drive. The new files go in, but the old ones can be partially or fully lost. For example, if you fine-tune a general purpose model specifically on a medical records archive, the model may get better at clinical language but noticeably worse at writing poetry or explaining history because the new training has nudged weights that were doing other jobs.

Compression / Compaction

The process of taking a vast amount of raw information and distilling it into something compact and generalized. During training, a model compresses an enormous amount of human writing into its parameters and finds the underlying patterns rather than storing things verbatim. The article uses “compaction” as a broad organizing term for how deeply new information gets digested, which ranges from not at all (pure retrieval, where facts just sit in a database) to fully (weight-level learning, where the model actually internalizes new knowledge). For example, rather than memorizing every recipe ever written, a well-trained model compresses the underlying logic of cooking: how heat transforms food, how flavors balance, how techniques generalize across cuisines.

Continual Learning

The broader field of research aimed at letting models learn from new experience after deployment, ideally by updating their weights rather than relying on external scaffolding. It’s the opposite of the current norm, where training and deployment are completely separate and weights are frozen the moment a model is released. The goal is something closer to how humans learn continuously from experience without needing to be retrained from scratch every time the world changes. For example, a customer service model using continual learning could gradually internalize patterns from thousands of resolved support tickets over time and get genuinely better at its job rather than just retrieving past examples.

Context Window

The full body of text the model can see at once when generating a response. It includes your message, the full conversation history, any documents you shared, and any background instructions passed to the model. It has a size limit measured in tokens. When it fills up, older content must be dropped to make space for new content. For example, if you have a long conversation with an AI assistant and then ask it to recall something you mentioned earlier, it may not be able to answer because that part of the conversation has already been pushed out of the window.

Data Poisoning

One of several serious governance and security risks the article raises around continuous weight updates. If a model’s weights can be updated through interactions after deployment, bad actors could gradually manipulate its behavior through carefully crafted inputs over time, which is a slow and hard-to-detect form of corruption that lives in the weights rather than just in the context. Unlike a one-time prompt injection attack, poisoned weights persist across every future conversation. The article groups this alongside other unsolved challenges: alignment degradation, the impossibility of unlearning toxic knowledge, auditability failures, and privacy risks from user interactions being compressed into parameters. For example, an adversary could repeatedly feed a customer-facing AI subtly misleading information about a competitor’s product until the model begins reproducing those inaccuracies on its own with no obvious sign of tampering.

Distillation

A process involving two models: (1) a large, capable, frozen teacher and (2) a smaller student. The student is trained to match the teacher’s outputs as closely as possible and absorb its knowledge in a more compact form. The result is a smaller, more efficient model that performs nearly as well as the larger model on the tasks it was trained for. It’s like an apprentice learning by closely watching and mimicking a master until the skill becomes their own. For example, a large hospital system might use a massive general-purpose model as the teacher and distill its medical reasoning capabilities into a smaller model that can run efficiently on local hospital hardware without requiring a cloud connection.

External Memory

Anything outside the model’s weights used to store and retrieve information. Chat history, databases, document stores, and agent notes are all examples of external memory. Information gets fed back into the context window when necessary. In current deployment architectures, the model typically does not update its weights from that information during inference. The key limitation is that external memory requires retrieval. The model has to be given the right information at the right moment, and if it isn’t, the knowledge might as well not exist. For example, a legal AI might have a database of ten thousand case summaries it can search, but if the retrieval system surfaces the wrong cases, the model has no way to compensate from its own knowledge.

Few-Shot Learning

The ability of a model to perform well on a new task after seeing only a handful of examples, rather than requiring thousands of training samples. Transformers are surprisingly good at this when examples are provided in the context window. Meta-learning approaches aim to make weight-level, few-shot learning just as effective, so the model can internalize new tasks from just a few examples even without them being available in the context. For example, if you show a model three examples of how you want your emails formatted and then ask it to format a fourth, it adapts immediately without any retraining. That’s few-shot learning in action.

Fine-Tuning

A more targeted form of additional training done after the initial training run. Instead of training from scratch on everything that’s known, you take an already-trained model and update it on a smaller or specific dataset. The new information shapes the model’s behavior for a particular use case without rebuilding it from the ground up, but the process still risks catastrophic forgetting if pushed too hard. For example, a company might take a general-purpose language model and fine-tune it on thousands of their internal support conversations, so the model learns the company’s terminology, tone, and common issue patterns without losing its broader language capabilities.

Gradient Descent

The mathematical process by which a model adjusts its weights during training. It measures how wrong the model’s predictions are on a given example and then calculates which direction to nudge each weight to reduce that error slightly. It’s called “descent” because the process is navigating downhill on a mathematical landscape, always moving toward lower error rates. Repeat this across billions of examples and the model gradually gets much better. For example, if the model predicts “cat” when the correct answer is “dog,” gradient descent works backward through the network to figure out which weights contributed to that wrong answer and adjusts them a tiny amount. Do that enough times and the model learns to tell cats from dogs reliably.
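The "nudge each weight downhill" idea can be sketched with a model that has a single weight. This is a toy under stated assumptions: one training example, a squared-error loss, and a hand-picked learning rate. Real training applies the same loop to billions of weights and examples.

```python
X, Y = 2.0, 6.0  # one hypothetical training example: input 2.0, correct answer 6.0

def loss(w):
    # How wrong the prediction w * X is, measured as squared error.
    return (w * X - Y) ** 2

def gradient(w):
    # Slope of the loss with respect to w; its sign says which way is uphill.
    return 2 * (w * X - Y) * X

w = 0.0                # start from an uninformed weight
learning_rate = 0.05   # size of each nudge
history = [loss(w)]
for step in range(50):
    w -= learning_rate * gradient(w)   # move a small step downhill
    history.append(loss(w))
```

After the loop, `w` sits very close to 3.0, the value that makes the prediction correct, and every entry in `history` is no larger than the one before it.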

In-Context Learning (ICL)

Everything the model reads and uses during a single conversation without updating its underlying knowledge. You paste in a document, it reads it and responds. You describe a task, it follows your instructions. But when the conversation ends, none of that experience changes the model itself. The next conversation starts with the same frozen weights as always. This is a smart use of temporary information, but it’s not genuine learning. For example, if you spend an hour teaching an AI assistant about your company’s internal processes and then start a new conversation the next day, the model will have no memory of the previous conversation. You would need to paste in that information all over again.

Inference

The act of a model generating a response from input. It’s the opposite of training. Training occurs when the model learns by adjusting its weights. Inference occurs when the frozen model takes what it already knows and produces an output. Any time you send a message and get a reply, that’s inference. The term “inference-time compute” (below) builds on this and refers specifically to spending extra computational effort during inference to get a better result. But plain inference just means the model is running, not learning. For example, asking a model what the capital of France is and getting back “Paris” in a fraction of a second is inference in its simplest form. No learning took place. The model generated an output from its existing weights without updating them.

Inference-Time Compute

The current dominant paradigm for improving model performance by spending more computational effort at the moment of response rather than updating weights. This includes chain-of-thought reasoning, tool use, search, and iterative problem-solving, all of which cost more compute at response time but produce better results. The article positions this process as a workaround, a scaling of what already works rather than a true solution to the learning problem. Test-time training is the most aggressive form of this learning because it actually runs gradient updates on new information during inference, which begins to compress it into weights in real time. This process sits at the boundary between the current paradigm and genuine parametric learning. For example, when you ask a model a complex math problem and it works through each step before giving a final answer rather than just guessing immediately, that is inference-time compute. The model is using more processing in the moment to arrive at a better result.

Instruction Tuning

A form of fine-tuning where the model is trained specifically on examples of instructions paired with ideal responses. It’s one of the main reasons modern models are so much better at following directions than earlier versions, which tended to just complete text rather than actually do what you asked. The model learns not just facts but the shape of helpful behavior, including how to interpret requests, how to structure answers, and when to ask for clarification. For example, an early language model asked to “summarize this article” might just continue writing in the same style as the article. An instruction-tuned model understands that the request calls for a concise, distinct summary and produces one.

KV Cache

Short for key-value cache. A technical mechanism that stores intermediate computations during inference so the model does not have to redo them from scratch for every token it generates. The article discusses it specifically in the context of KV cache compaction where the cache functions as a form of non-parametric memory but grows substantially as conversations and agent loops get longer. The authors argue that learning to compress this cache more efficiently is one of the meaningful challenges in moving from pure retrieval toward more durable knowledge storage. For example, in a long agentic task, the KV cache holds the computed representations of everything the model has processed so far. Without it, each new token would require reprocessing the entire history from scratch, which would be prohibitively slow.
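A rough way to see why the cache matters is to count how many per-token key and value computations are needed with and without it. The sketch below is bookkeeping only, not a real attention implementation, and the prompt length and token counts are made-up numbers.

```python
def without_cache(prompt_len, new_tokens):
    # Every generation step recomputes keys and values for the whole sequence so far.
    return sum(prompt_len + step for step in range(new_tokens))

def with_cache(prompt_len, new_tokens):
    # Keys and values are computed once per position and then reused.
    cache_size = 0
    computed = 0
    for step in range(new_tokens):
        seq_len = prompt_len + step
        fresh = seq_len - cache_size   # positions not yet in the cache
        computed += fresh
        cache_size = seq_len
    return computed, cache_size

recomputed = without_cache(1000, 100)          # 104950 computations
computed, cache_size = with_cache(1000, 100)   # 1099 computations, 1099 cached entries
```

The saving is large, but so is the catch: the cache ends up holding an entry for every position processed, which is why it keeps growing as conversations and agent loops get longer.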

Lossy Compression

Compression where some information is permanently lost in the process, as opposed to lossless compression where everything can be recovered exactly. For LLMs, the inability to store everything verbatim during training forces the model to find patterns, generalize, and abstract. That forced abstraction is precisely what makes the model seem intelligent and useful in new situations it has never seen before. A JPEG image is the familiar everyday example. Save a photo as a JPEG and the file shrinks dramatically because fine detail is discarded. But if you zoom in close enough you can see the degradation. For most purposes, though, the image is perfectly usable. The tradeoff is the point. For a language model, the equivalent is that it cannot recite every sentence it ever trained on, but it can write a new sentence in any style on any topic because it extracted the underlying structure rather than memorizing the surface.
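The same tradeoff shows up in a few lines of Python. This sketch rounds each number to the nearest multiple of a step size, which is a crude form of quantization chosen here purely for illustration. It is not how JPEG or a language model actually compresses anything.

```python
def compress(values, step=0.5):
    # Keep only the index of the nearest multiple of `step`; the remainder is discarded.
    return [round(v / step) for v in values]

def decompress(codes, step=0.5):
    # Rebuild approximate values from the stored indexes.
    return [c * step for c in codes]

original = [0.12, 0.49, 1.26, 2.74]
codes = compress(original)      # [0, 1, 3, 5]
restored = decompress(codes)    # [0.0, 0.5, 1.5, 2.5]
worst_error = max(abs(a - b) for a, b in zip(original, restored))
```

The restored values are close to the originals but not identical, and no amount of processing on `codes` can bring the discarded detail back.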

Meta-Learning

Teaching a model how to learn rather than what to learn. The model is pre-trained in a way that positions it to update quickly and effectively with just a few new examples, rather than requiring extensive retraining. It’s the difference between educating someone to be a quick study versus simply giving them a lot of facts to memorize. A quick study can walk into an unfamiliar subject and get up to speed fast, whereas someone who only memorized facts cannot. For example, a meta-learned model shown three examples of a new classification task, say sorting customer complaints into categories it has never seen before, should be able to generalize accurately to new complaints after just those three examples rather than needing hundreds.

Modules

The article uses this as a broad middle-ground category on the compaction spectrum that sits between pure retrieval and full weight-level learning. In practice, modules can take several forms: adapter layers, LoRA-style weight updates, memory components, or cached representations. What they share is the ability to specialize a general-purpose model for a specific domain without retraining the entire model from scratch. They offer more than retrieval in that some digestion of information happens, but less than full parametric learning in that the core model is typically left unchanged. For example, a hospital might attach a medical module to a general-purpose model so it performs at a specialist level on clinical questions, while the same base model with a legal module performs at a specialist level on contract review, with each module being swappable independently.
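One of the forms mentioned above, a LoRA-style update, can be sketched with tiny matrices. The idea is that the frozen base weights stay untouched while a small pair of matrices supplies a low-rank correction on top. The names `down` and `up`, the 3x3 size, and the 4096-wide example are all assumptions for illustration.

```python
def matmul(a, b):
    # Plain nested-list matrix multiplication.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply_module(base, down, up):
    # Effective weights = frozen base + low-rank update (down @ up).
    delta = matmul(down, up)
    return [[base[i][j] + delta[i][j] for j in range(len(base[0]))]
            for i in range(len(base))]

def full_params(width):
    # Number of values in a square weight matrix.
    return width * width

def module_params(width, rank):
    # Number of values in the two thin matrices that make up the module.
    return 2 * width * rank

base = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # frozen 3x3 weights
down = [[1.0], [0.0], [2.0]]                                # 3x1, trainable
up = [[0.5, 0.0, 0.25]]                                     # 1x3, trainable
adapted = apply_module(base, down, up)
```

For a hypothetical 4096-wide layer with rank 8, `full_params(4096)` is 16,777,216 while `module_params(4096, 8)` is 65,536, which is why swapping modules is so much cheaper than retraining the base.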

Multi-Agent Architectures

Systems where multiple AI models work in parallel with each one handling a slice of a larger task and communicating results to each other or to an orchestrating layer. If a single model is limited by its context window, a coordinated group of agents can collectively handle far more. But this shifts the problem rather than eliminating it. Each agent still faces its own context limit, and coordinating many smaller contexts introduces its own complexity for the system to manage. It’s a non-parametric workaround for scale, not a solution to the underlying constraint. For example, a research task that would overflow one model’s context window might be split across ten agents, each reading a different section of source material with a coordinating agent assembling their summaries into a final report.

Neural Network

The underlying computational structure of an LLM. It’s a network of interconnected nodes organized in layers, loosely inspired by neurons in the brain. But the analogy should not be pushed too far. Each connection between nodes has a weight that determines how strongly one node influences another. During inference, information flows forward through the layers, gets transformed at each step, and eventually produces an output. The network learns by adjusting those weights during training until it gets good at its task. For example, in an image recognition network, early layers might learn to detect simple edges and colors, middle layers might learn to recognize shapes, and later layers might learn to identify objects. Language models work on the same principle but applied to sequences of text.

Parameters / Weights

The billions of numerical values inside a model that encode everything it learned during training. Each value represents the strength of a connection between two nodes in the neural network. During training, these values get adjusted gradually until the model becomes good at predicting language. After training they are frozen, and the model’s knowledge and capabilities are entirely determined by those fixed numbers. “Parameters” and “weights” refer to the same thing and are used interchangeably throughout the article. For example, frontier models contain billions or even trillions of parameters. Each one is a small dial that was tuned during training and now stays locked in place, collectively encoding an enormous amount of compressed knowledge about language, facts, and reasoning patterns.

Parametric Learning

Learning that actually updates the model’s weights based on new experience, as opposed to in-context learning which uses information temporarily without changing anything permanent. It’s the deeper form of learning the article is ultimately arguing we need more of. When a model learns parametrically, new knowledge gets compressed into its weights the same way training data did and becomes a durable part of what it knows rather than a note it holds briefly and then discards. For example, a parametric update after a model encounters thousands of conversations about a new programming language would leave it genuinely better at that language going forward across all future conversations, not just within the session where it learned.

Regularization

A cautious approach to weight updates that penalizes changes to parameters deemed important to existing knowledge. Before updating a weight, the system estimates how critical that weight is to the model’s current capabilities. If it’s very important, the update is constrained or slowed down. This is one of the older approaches to continual learning and helps manage the stability-plasticity dilemma. But it tends to be brittle at scale. Think of it like a renovation rule that protects load-bearing walls. You can still remodel, but certain structures are off-limits because removing them would collapse the building. For example, EWC (Elastic Weight Consolidation), one of the most cited regularization methods, computes an importance score for each weight after training on a task and uses that score to resist changes when training on subsequent tasks.
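The EWC idea can be sketched as a penalty term added to the training loss: each weight's squared change is multiplied by its importance score, so moving an important weight costs far more than moving an unimportant one. The two-weight model, the importance values, and the `strength` parameter name below are assumptions for illustration.

```python
def ewc_penalty(weights, old_weights, importance, strength=1.0):
    # Quadratic penalty: changes to important weights are charged more heavily.
    return 0.5 * strength * sum(
        f * (w - w_old) ** 2
        for w, w_old, f in zip(weights, old_weights, importance)
    )

old_weights = [1.0, 1.0]
importance = [10.0, 0.1]   # first weight is a "load-bearing wall", second is not

move_important = ewc_penalty([1.5, 1.0], old_weights, importance)    # 1.25
move_unimportant = ewc_penalty([1.0, 1.5], old_weights, importance)  # 0.0125
```

The same half-unit change costs a hundred times more on the important weight, which pushes the optimizer to learn the new task through the weights that matter least to the old one.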

Reinforcement Learning (RL)

A training approach where a model learns from feedback signals rather than from labeled examples. It tries things, receives a reward or penalty based on how well it did, and adjusts its behavior accordingly over many iterations. The article mentions RL-based feedback loops as one direction in continual learning research where models could improve from real-world deployment signals like user corrections or task outcomes. However, it’s not the central mechanism the authors emphasize. The core focus of the article is on compaction, weight updates, and memory structures. For example, the systems that learned to play chess and Go at superhuman levels used reinforcement learning by playing millions of games against themselves and adjusting strategies based on wins and losses rather than being taught explicit strategies.

Retrieval-Augmented Generation (RAG)

A common approach to giving models access to current or specialized information without retraining. Instead of baking knowledge into weights, you build a searchable database the model can query at response time. The retrieved content gets injected into the context window and the model uses it to generate its answer. It’s purely non-parametric. The model retrieves information but never internalizes it. The limitation is that retrieval only works if the right information gets surfaced at the right time, and no amount of retrieval can substitute for knowledge the model needs to reason with flexibly. For example, a financial AI might use RAG to pull in the latest earnings reports before answering questions about a company’s performance because that information changes constantly and cannot be baked into training data.
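The retrieve-then-inject flow can be sketched in a few lines. This toy version scores documents by counting shared words, which stands in for the embedding search a real system would more likely use. The documents, the company name, and the prompt layout are all made up for illustration.

```python
def score(query, document):
    # Count distinct query words that also appear in the document.
    return len(set(query.lower().split()) & set(document.lower().split()))

def retrieve(query, documents, top_k=1):
    # Rank documents by score and keep the best matches.
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:top_k]

def build_prompt(query, documents):
    # Inject the retrieved text into the context ahead of the question.
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical document store
documents = [
    "acme revenue rose in the third quarter",
    "the cafeteria menu changes on friday",
    "acme opened a new office in montreal",
]
query = "what was acme revenue in the third quarter"
prompt = build_prompt(query, documents)
```

The model only ever sees what `retrieve` hands it. If the scoring surfaced the cafeteria menu instead, the answer would be built on the wrong context, which is the limitation described above.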

Safety Alignment

The work done during training to make a model helpful, honest, and safe to use. It involves carefully curated training data, human feedback on model outputs, and specific training objectives designed to shape the model’s values and behavior. One of the serious risks of continuous weight updates after deployment is that alignment can degrade unpredictably even from adding seemingly benign new data. It seems that fine-tuning on almost anything can shift the weights that govern behavior, not just the ones governing the specific knowledge update. For example, researchers have shown that even brief fine-tuning on ordinary instructional text can weaken safety guardrails in ways that are not obvious until the model is probed specifically for harmful outputs.

Self-Improvement

An approach where the model generates its own training data, filters out low-quality results, trains on the high-quality results, and repeats the cycle. It learns from its own work rather than from human-provided data and can improve capability over repeated iterations in constrained settings. The article cites AlphaEvolve and AlphaProof as examples of this kind of closed-loop improvement. But these systems operate in constrained domains like mathematics and algorithm optimization, not open-ended real-world learning. The article uses these examples to illustrate iterative self-training loops, and what qualifies as a genuinely new discovery in this context remains debated. For example, AlphaEvolve used self-generated solutions and automated evaluation to discover improvements to algorithms that human programmers could not find because it worked within a well-defined problem space where correctness could be verified automatically.

Stability-Plasticity Dilemma

The fundamental tension in any learning system between staying stable, meaning not forgetting what it already knows, and staying plastic, meaning remaining able to learn new things. Push too hard toward plasticity and you get catastrophic forgetting. Push too hard toward stability and the model cannot adapt to anything new. Solving this dilemma is one of the core engineering challenges in continual learning, and no approach has fully solved the problem at scale. The dilemma exists in biological brains too. From what I understand about biology, human memory consolidation is strongly associated with sleep and offline processing, which suggests the brain has its own version of this stability-plasticity problem built right in. For example, a model trained to be highly stable might refuse to update its belief that a particular drug is safe even after being shown new clinical evidence, while a model trained to be highly plastic might update so aggressively that it forgets basic grammar rules after a week of medical fine-tuning.

State Space Models (SSMs)

An alternative to traditional transformer architecture that the article highlights for offering a fundamentally better scaling profile for long contexts. The article describes them as using fixed memory layers interspersed with normal attention, which unlike transformers does not grow unboundedly with every token added to the context. Traditional transformers scale quadratically with context length, while SSMs aim for near-linear scaling. However, this remains an active area of research rather than a fully settled property. The article treats SSMs as a promising architectural direction for enabling much longer agentic loops rather than a definitive solution to the broader continual learning problem. For example, a transformer handling a 100,000-token conversation requires vastly more compute than handling a 10,000-token request. But an SSM handling the same expansion would ideally require only proportionally more, which could make very long agentic tasks far more practical.
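The arithmetic behind that comparison is worth spelling out. The sketch below uses idealized cost formulas, quadratic for full attention and linear for the best-case SSM, and ignores constants and everything else a real system does. The two functions are simplifications assumed for illustration.

```python
def attention_cost(tokens):
    # Full self-attention compares every token with every other token.
    return tokens * tokens

def linear_cost(tokens):
    # Idealized linear scaling: work grows in step with the number of tokens.
    return tokens

short_context, long_context = 10_000, 100_000
attention_ratio = attention_cost(long_context) / attention_cost(short_context)  # 100.0
linear_ratio = linear_cost(long_context) / linear_cost(short_context)           # 10.0
```

Ten times the context means roughly one hundred times the attention work under quadratic scaling, but only ten times the work under linear scaling.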

Temporal Disentanglement

A core limitation of parametric memory since a model’s weights do not separate timeless facts from information that changes over time. Both get compressed into the same parameters and are tangled together with no internal label distinguishing what’s permanent from what’s mutable. This makes continual weight updates risky because changing a time-sensitive piece of knowledge can corrupt stable knowledge stored in nearby weights. The article frames this as one of the fundamental unsolved problems standing between today’s frozen models and genuinely adaptive ones. For example, the fact that two plus two equals four and the fact that a particular person holds a particular job title are both encoded somewhere in the weights. Updating the job title risks disturbing the arithmetic, because the model has no mechanism for knowing which facts are stable laws and which are contingent facts about the world.

Test-Time Training

An approach that blurs the line between training and responding by letting the model do a small amount of learning before it generates a final answer. Rather than relying entirely on what it learned during the original training run, the model runs brief gradient updates based on what it’s currently seeing and then responds. The article describes this as running gradient descent on test-time data, compressing new information into parameters at the moment it matters, and treats it as one of the more substantive moves toward genuine continual learning because it actually changes weights at inference time. For example, if a model is asked to analyze a long, unusual technical document, test-time training would let it briefly train on that document before responding, compressing its key patterns into weights rather than just reading it as context. This method potentially produces a much more accurate analysis as a result.

The Bitter Lesson

A well-known observation in AI research. It holds that given more compute and data, general methods that let models figure things out at scale consistently outperform clever human-engineered solutions over time. Every time researchers have tried to hardcode structure and shortcuts into AI systems, the simpler but more scalable approaches have eventually won. The article invokes this phenomenon to question why we still hand-engineer memory and compression pipelines rather than letting models learn to do it themselves. For example, early chess programs used elaborate human-crafted rules about piece values and board positions. They were eventually crushed by systems that simply learned from millions of games with minimal human guidance and relied on scale rather than cleverness. The same pattern has repeated across nearly every domain in AI.

Token

The basic unit of text that a large language model processes. A token is roughly a word, though it can also be a fragment of a word, a punctuation mark, or a short common sequence like “ing” or “un.” Models do not read text the way humans do, character by character or word by word. Instead, they break input into tokens first and then process the sequence. The size of a context window is measured in tokens, not words or characters. For example, the sentence “The cat sat on the mat” would be broken into something like six tokens, roughly one per word. But a word like “unbelievable” might be broken into two or three tokens: “un,” “believ,” “able,” because it’s less common and gets split into recognizable subunits the model has seen frequently.
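Here is a toy tokenizer that splits words by greedy longest match against a tiny vocabulary. Real tokenizers learn their vocabularies from data and use different algorithms, so the vocabulary and the matching rule below are assumptions made purely to show the splitting behavior.

```python
VOCAB = {"un", "believ", "able", "the", "cat", "sat", "on", "mat"}  # hypothetical vocabulary

def tokenize_word(word, vocab=VOCAB):
    tokens = []
    start = 0
    while start < len(word):
        # Try the longest remaining piece first, then shrink until one matches.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab:
                tokens.append(piece)
                start = end
                break
        else:
            # No known piece starts here: fall back to a single character.
            tokens.append(word[start])
            start += 1
    return tokens

def tokenize(text):
    # Lowercase, split on whitespace, then split each word into vocabulary pieces.
    return [t for word in text.lower().split() for t in tokenize_word(word)]

sentence_tokens = tokenize("The cat sat on the mat")
word_tokens = tokenize("unbelievable")   # ['un', 'believ', 'able']
```

Common words come out whole because they are in the vocabulary, while the longer word is broken into pieces the vocabulary does contain.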

Training Run

The large-scale and expensive process of building a model’s knowledge by exposing it to massive amounts of data and adjusting its weights. Training involves feeding these huge datasets through the network repeatedly and using gradient descent to nudge weights toward better predictions. The process runs on clusters of specialized hardware for weeks at a time and consumes substantial amounts of electricity. It’s all carefully controlled, occurs before deployment, and produces a fixed set of weights that define everything the model knows. Once training ends, the weights are frozen and the model goes out into the world as-is. For example, training a frontier model like GPT-4 or Claude is estimated to cost tens or hundreds of millions of dollars and requires specialized data centers. This is precisely why continuous post-deployment learning is so appealing because rerunning a full training run every time the world changes isn’t practical.

Transformer

The dominant architecture underlying most major AI models today, including Claude, GPT, Gemini, and more. At its core, a transformer predicts the next token in a sequence of text based on everything that came before it. It generates outputs token by token at very high speed. That sounds simple but at scale it’s not. Models built on this architecture are trained on so much human-generated text that they model statistical relationships in language and attempt to produce behavior consistent with understanding context, logic, and meaning. For example, when you ask a transformer-based model to explain a complex idea, it makes predictions about what a good explanation would look like given your question based on patterns it absorbed from vast amounts of human writing on similar topics. That’s why it seems smart. It’s familiar. Whether the final output constitutes genuine understanding is a separate philosophical debate that the article doesn’t address.

A Maverick for the People 

I’ve been following Judy Shelton online for years. I’m also reading her new book “Good as Gold: How to Unleash the Power of Sound Money.” For decades she’s been articulating a perfectly reasonable monetary policy that works for everyone who works. Here’s her latest interview in Gold Telegraph: The Authentic Judy Shelton: A Maverick Economist Takes on Washington. It’s one of the better interviews recently so I figured I’d do a quick writeup. The content below is drawn from the interview itself, and I weave in my own comments as well.

Judy Shelton is a serious economic thinker. She’s been arguing forever that money should actually mean something and that it should hold its value over time. In that sense, the average American on the street would surely agree. And she thinks that governments should not be free to just print money at will, which only destroys the property of anyone who must work for a living. It’s utterly immoral what’s been done to us in the name of colonialist policies implemented intentionally like free trade, deindustrialization, outsourcing, endless rehypothecation, constant price inflation, and much more. For the last few decades, official Washington thought Shelton was eccentric at best, but it’s also clear that she’s been a threat to them all along. So, what do we have now? Economic confidence has collapsed throughout the Western world while stable monetary collateral assets like gold, silver, and Bitcoin have hit record highs denominated in every major currency globally. More and more people are finally starting to realize Shelton has been right from the beginning.

How It Started: The Soviet Union

Although she talks about gold constantly, Shelton’s path to sound monetary policy didn’t begin with gold but instead started with the collapse of the Soviet Union. Back then she was a post-doctoral fellow at the Hoover Institution studying the internal monetary and financial condition of the USSR. She noticed something that others had missed. The Soviet Communists were running a massive internal budget deficit, financing losing enterprises, and printing money to cover the gap. But because prices were fixed, the inflation didn’t show up as rising costs. The problem, however, was visible as empty shelves, lack of growth and prosperity, and long lines to purchase necessary daily goods.

“I ended up thinking with like a green eyeshade accountant,” she said, “that the country was going bankrupt.” Some of her colleagues at Hoover disagreed, such as Condoleezza Rice, who was focused on Soviet military capabilities and felt that it would be militarization that would take down the government. Shelton and Rice used to argue about the issue. Shelton stuck to her view that economics would destroy the Soviet Union. She was right. Her book from that period, The Coming Soviet Crash, caught the attention of former President Richard Nixon, who reportedly kept rereading it after 1991. He even began sending her handwritten letters, which she displays in her home and showed the audience during the interview with Gold Telegraph. In one letter, Nixon described her as “a star being blessed with both beauty and brain.” Sounds like Nixon.

But what mattered more than Nixon’s flattery was his candor about the monetary system itself. When Shelton told him her next book would be about Bretton Woods, the economic agreement that tied the dollar to gold after World War II until Nixon ended the policy in 1971, he wrote back: “I know very little about monetary policy.” She found that extraordinary because he was the man who ended the system and said so himself. Although a well-known expert in foreign policy and geopolitics, Nixon was never skilled in financial matters. What’s interesting is that he felt confident enough to state that directly.

What Was Lost in 1971

Nixon’s August 1971 speech in which he directed the Treasury to “suspend temporarily the convertibility of the dollar into gold” was supposed to be a short-term fix. Others outside the power elite at the time knew differently. However, years later Shelton met former Fed Chair Paul Volcker in 1994 at a conference marking the 50th anniversary of the Bretton Woods agreements, and he largely supported Nixon’s policy. Volcker said he thought they might need to reprice gold from $35 to perhaps $38 or $40 an ounce, and then reinstate the old system. That’s obviously not what happened.

“Did we essentially trade discipline for flexibility when we ended the gold standard?” Shelton was asked. Her answer was careful and pointed. “Flexibility is kind of a weasel word that can sound good,” she said. “What it really means is the flexibility to not be disciplined, and then that translates into the flexibility to reduce purchasing power, to incur inflation, to debase the currency.” In other words, it was intentional. She invoked James Madison, who argued in the early years of the country’s founding that depreciating the currency is the same as stealing property and that’s unconstitutional. The founders, she said, would be appalled. “Madison was so clear on that. He said a depreciating currency is just like stealing property.” She didn’t mention Hamilton, but you have to think that he would have been equally outraged about how the system he largely created eroded over the generations.

The early republic, she pointed out, treated the mint as the first priority. Jefferson saw a common currency as something that would bind the new country together, strengthen its commerce, and honor the work of its people. When citizens earn money, she argued, that money is property. Inflation expropriates it without due process. It’s theft, basically. They’ve stolen our money. They’ve destroyed our future.

The Nomination Fight: Pundits and the Washington Machine

In 2020, Shelton was nominated by President Trump to the Federal Reserve’s Board of Governors. The confirmation hearing in the Senate that followed was unlike anything she had expected, she said. I remember watching the hearings myself. It was clear that it was a hit job from both Democrats and Republicans to make sure that Shelton didn’t end up on the Federal Reserve. Given the politics in 2020, that result was expected. But I was struck by the obvious ignorance being articulated by the Senators. They just don’t know very much at all and certainly don’t deserve to be seen as leaders of anything.

Things weren’t much better in the financial or general media, though. “I was amazed at the power of pundits who I knew didn’t know as much as I knew about monetary systems and history,” she said. The attacks were personal and ideological. She was called a “gold bug” among other things. Her advocacy for sound money was described as a “dog whistle” to right-wing extremists, a common attack that represents nothing more than idiocy. Senator John Kennedy called her ideas “nutty,” which for him only serves as self-reflection. Several other Republicans announced they were concerned. The nomination failed in November 2020 and the broken system continued.

“I had a hard time understanding that the Washington machine, built explicitly to protect politicians’ ever-growing spending demands, quickly closed ranks,” she said. Her family was in the chamber for the hearings. Her mother came from Los Angeles in her nineties. Her husband was quiet and supportive. Everyone knew what was going on.

But she draws a direct line between the failure of her nomination and what followed. The Federal Reserve then unleashed a level of money printing that contributed to the worst inflation in a generation. She didn’t mention COVID during the interview, but that was “scary virus” time as you may recall. Seems shutting down the world and gutting millions of jobs may have had some consequences, eh? By the summer of 2022, inflation was running at 9.1 percent. Federal Reserve Chairman Jerome Powell called it transitory. It wasn’t. And nobody was fired. Nobody resigned. That’s the way it always works for people who hold power.

“Not only did Powell not apologize,” she said, “but he refuses to resign. I think that’s outrageous, and it’s cheap grace to say you take responsibility when nobody gets fired.” Price stability, she noted, is the stated mandate of the Federal Reserve. “We didn’t get price stability. We still don’t have it.”

The Fed’s Footprint and the Case for Reform

Calls for Federal Reserve reform are now coming from the highest levels of the Trump administration, including from Treasury Secretary Scott Bessent. Shelton understands why, although she would go further than most reformers.

“Inflation continues to be a lead issue for people,” she said. But her objection runs deeper than the inflation numbers. She’s troubled by the sheer size of the Fed’s presence in economic life and the way every financial decision is now made in reference to what the Fed might do next with respect to interest rate manipulation.

“We’d have a healthier economic system if people could just take for granted that the money is not going to depreciate, that a unit of account can figure into your planning for your whole future, and that you’re not just trying to keep up constantly with inflation and forced to put your money at risk.” She paused. “I think that would be a much better world.” Of course, she’s right. Her position is supported by people who need to make ends meet on a weekly basis, but certainly not the oligarchs hell-bent on using the masses as their own little wage slaves.

As for who owns the Federal Reserve, she was way too careful. The legal structure is genuinely hybrid. There are twelve district reserve banks that are close to the private institutions in their regions. There are seven members of the Board of Governors, appointed by the president and confirmed by the Senate. Together, the seven governors and the twelve reserve bank presidents make up the 19 participants in the Federal Open Market Committee, although only twelve of them vote at any given time. “It’s a quasi-public, quasi-private institution,” she said. “And that’s the arrangement that’s being tested now” under this second Trump administration. It’s clear she knows more. It would have been nice to see her cut loose on the ownership issues like others regularly do.

Fort Knox, Treasury Trust Bonds, and a Live Video

One proposal that has attracted growing attention involves the gold held at Fort Knox. The last full audit was conducted in 1953. Shelton thinks an audit is long overdue, and not merely for symbolic reasons.

“There are a lot of Americans who don’t even trust the government to accept that the gold is there,” she said. “I think it’s needed.” But the audit would need to go beyond confirming the physical presence of the gold. It would also need to address whether any of it is encumbered, an issue no one ever talks about.

When asked whether she would support Elon Musk’s idea of a live video tour of Fort Knox, she didn’t hesitate. “I would love it!” she said. Can you imagine such a real-time demonstration of reality? What if the vaults are empty? But what if they are overflowing with more gold than previously expected? Either way, it could be shocking to markets around the world and especially to Americans who have watched their life savings evaporate over the last few decades. We’ll never know, though.

But Shelton’s larger goal is to establish that whatever gold is actually in Fort Knox be held as official collateral for what she calls Treasury Trust Bonds, which she describes as long-term government obligations with a gold convertibility feature. The bonds would give holders the option at maturity to redeem the asset either at the nominal dollar value or in a pre-established amount of gold. She believes this kind of financial instrument would be massively popular. She also wants to prevent any future administration from simply selling the gold to capture a temporary windfall profit. Locking the gold reserves in as long-term collateral, she argues, prevents exactly that.
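To make the redemption choice concrete, here is a minimal sketch of how a holder might compare the two options at maturity. It is only an illustration of the description above, not Shelton’s actual design. The function name, the face value, the gold amount, and the gold prices are all hypothetical numbers picked for the example.

```python
def redemption_value(face_value_usd, gold_ounces, gold_price_usd_per_oz):
    """Return the dollar value of the better redemption choice at maturity.

    Assumption: the holder simply picks whichever option is worth more
    in dollars on the maturity date. All inputs are hypothetical.
    """
    # Dollar value of the pre-established amount of gold at the maturity price
    gold_value_usd = gold_ounces * gold_price_usd_per_oz
    # The holder can always fall back on the nominal dollar value
    return max(face_value_usd, gold_value_usd)


# Hypothetical bond: $1,000 face value, convertible into 0.5 ounces of gold
print(redemption_value(1000.0, 0.5, 1800.0))  # gold worth $900, so take dollars
print(redemption_value(1000.0, 0.5, 2400.0))  # gold worth $1,200, so take gold
```

If the dollar holds its value against gold, the holder just takes the dollars. If the dollar depreciates against gold, the gold option pays more, which is the protection the convertibility feature is meant to provide.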

The bonds, she believes, could inspire other sovereign nations to act accordingly as well. And the bonds could also become a condition of trade arrangements and a way to address currency manipulation without relying exclusively on tariffs. Shelton asks: “What does your currency do relative to gold, and what does our currency do relative to gold?” If one country depreciates more, that gap should be quantifiable. That, she argues, helps level the global playing field.

The Battle Never Ends

Toward the end of the interview, Shelton was asked about the risks of ongoing poor monetary policy. “I think what’s at risk is this sense of people increasingly [feeling] that they are victims of monetary favoritism, that the Fed maintains policies that increase the inequality of wealth and income, that they reward people who are already wealthy enough to have financial assets.” The people who cannot protect themselves are the people who work for wages, who save in dollars, whose property is quietly depreciated every year through pervasive inflation.

“I think we need a revolution of valuing honest work, honest government, and honest money,” she said. “And by that I mean celebrate people who actually make goods, who produce goods and services, not just people who arbitrage the anomalies of financial markets.” That’s an interesting comment. It clearly reflects the intention of the current Trump government as they implement policies to re-industrialize the United States after so many decades of willful decline.

She was then asked whether she would potentially join Trump’s new Board of Peace, which is focused on economic development as a tool of diplomacy to build global stability. She said she would join if she were invited. And that’s possible under the current administration given their similar positions.

And then, as the interview ended, she offered one last thought. It wasn’t a summary. It was a reminder of what we’re really facing. “The battle itself,” she said, “it never ends.” That’s sobering. It’s a battle. It’s a war. Instead of being passive, we have to actually get active and fight our own leaders to save our lives and build a future for our children. Just saying that feels reprehensible on every level. But that’s reality.

Imagine a world where Shelton’s simple, practical, sound monetary policies had been fully implemented. We’d all be thriving now. Perhaps that’s the problem. We’re not supposed to.

Bruno Borges at JavaOne 2026

Duke’s Corner Java Podcast — Bruno Borges at JavaOne 2026

Jim Grisanzio from Oracle Java Developer Relations talks with Bruno Borges from Microsoft at JavaOne 2026. Bruno works on GitHub’s Core AI developer relations team. The conversation covers the future of Java in a world of AI, the value of learning core computer science fundamentals in school, the shifting role for software developers from just writing code to architecting higher-level systems, the new business value opportunities for developers as they leverage AI technologies, and Bruno’s new AI-assisted website called Java Evolved that visually compares old and new Java code patterns.

Pain Invading the Mind

Dealing with physical pain — Thanissaro Bhikkhu, also known as Ajahn Geoff.

This Dhamma talk is worth a listen from time to time since pain is often our savage enemy that can easily grind you into dust and ruin your life. So, dealing with it takes some skill, which Ajahn Geoff from the Thai forest Buddhist tradition talks about all the time. Listen during meditation. Try the technique. It’s all about where you set your awareness. This takes practice. But don’t worry. Since the pain isn’t going away you have plenty of time to learn and get it right.

The Buddha’s core instruction on physical pain is pretty brief. The most important part is to keep the pain from invading your mind and letting it take up residence. The goal is not necessarily to eliminate pain but instead to change your relationship with it.

Ajahn Geoff talks about the famous forest teacher Ajahn Lee who offers a practical guide. When pain arises, first focus on the comfortable parts of the body and breathe into those areas. That comfort can grow into a kind of foundation or a place to stand and eventually into a resource to direct through the pain itself. Rather than walling off the pain, you breathe through it because walls often maintain the very thing they were meant to contain.

The next step is to examine the stories and perceptions we build around pain. We tell ourselves the pain has been here for a long time and will continue, and in doing so we drag past and future suffering into the present moment. Dropping the stories can lighten the pain considerably.

You can also question whether the pain is as solid or permanent as it seems. Move your awareness toward the sharpest point of the pain, rather than away from it. This can sometimes reveal that the pain shifts and dissolves under close attention. The fact that it moves at all is often revealing. I’ve noticed this many times. It’s obvious. I’ve also questioned whether I’m feeling the pain when my attention is directed entirely out of my body to other matters, such as a phone call, a story from a friend, or a TV program.

The larger point is that while pain may arise from causes beyond our control, what the mind does with it is very much under our control. With practice, that distinction can be seen and understood. Remember, everything in Buddhist meditation is a skill. You get good only with many years of practice.

That’s just one of Ajahn Geoff’s Dhamma talks. Here are hundreds more. It’s an absolutely amazing archive of Pāli Canon content. I visit daily.


Full Transcript

Pain is one of those things the Buddha says we have to learn how to endure. But he gives remarkably few instructions on how to endure it.

It’s not like painful words. We’re supposed to endure painful words, too. And we do get instructions on how to depersonalize the words, how to think about them in such a way that they don’t invade the mind and remain.

As for physical pains, the Buddha says that similarly we should try to keep the pains from invading the mind and remaining there. That should be our intention with regard to them. In other words, we don’t want them to go away necessarily. At least we don’t make that our purpose in dealing with them.

We want to understand what does it mean for them to invade, what does it mean for them to remain. This is something we have to figure out. And how does it happen?

The Buddha gives only a sketch on how we should deal with pain. It comes in as instructions on breath meditation under the section on feelings. Try to breathe in and out in a way that gives rise to rapture. Breathe in and out in a way that gives rise to pleasure. Breathe in and out sensitive to mental fabrication — feelings and perceptions. Perceptions are the labels we put on things, the images we use to tell ourselves what something is, what it means. And finally, calming mental fabrications as we breathe in and breathe out.

That’s about it.

For more detailed instructions in these areas, we have to look to the forest tradition. And Ajahn Lee specializes in those first two steps. Breathing in a way that gives rise to rapture and pleasure.

As he says, when there’s a pain in part of the body, first you focus on other parts of the body that you can make comfortable by the way you breathe. This serves several functions. One, it gives you a foundation to stand on as you deal with the pain. It also gives you a place to retreat. And it gives you some ammunition to use against the pain.

Because as we’ve noted, sometimes the simple fact that there is a pain there is something that you’ve brought into being yourself by the way you’ve worked with the raw material that comes from past karma. And by breathing in a comfortable way, you’ve got some ammunition to use against the physical pain, in that you can imagine whatever that raw material is being permeated by the comfortable breath.

Once the comfortable breath is established in the other part of the body, send it through the pain. See if that undoes some of the subconscious things you’re doing to aggravate the pain or even maintain the pain.

Sometimes we put a wall up around the pain in hopes of confining it, but that actually maintains it. The source of the pain may have gone away, but the wall is still there and it’s still painful. So breathe through the pain as the comfortable breath goes through. Make sure it goes through and doesn’t stay stopped at the wall formed by the pain.

You may sense that as you breathe, you’re using the painful parts of the body to do the breathing. They’re the most obvious parts when there’s pain in different sections. So think of the more comfortable parts doing the breathing. The painful parts get a free ride. Think of the breath permeating them and going out to the other side. See what that does.

At the same time, you’re changing the balance of power. Instead of running away from the pain, you face it, and you’ve got some ammunition to face it with.

Then the next steps are to try to understand what are the perceptions that you have around the pain. This is where Ajahn Maha Bua was good. When he talks about perceptions, that’s mental fabrication. But mental fabrications come together with verbal fabrications. In other words, the way you talk to yourself. It also comes together with the act of attention and the factor of name and form in dependent co-arising. So attention has to do with the questions you ask. Verbal fabrication has to do with the stories you tell yourself.

You start questioning your perceptions, questioning your stories around the pain. One of the big stories you may have is that the pain has been here for such an amount of time and it’s going to continue being here for such an amount of time. All of a sudden you’ve got the present moment weighed down by past and future pain. So that’s a story you’ve got to stop.

The past pain is gone. Tell yourself it’s nowhere to be found. You can’t go rummaging back in the past to find it. It’s gone. It’s no longer there to weigh you down. Future pain hasn’t come yet. Why do you have this story where you stitch things together — the pain was here and it’s going to be there? Why do you do that? Just try to be with the pain as it is right now.

Then you can question your perceptions. Do you see the pain as being the same thing as the body? As I said, sometimes we use the painful parts of the body to do the breathing, as if they were one and the same thing. So can you see that the body is made of the properties of earth, water, wind, fire — solidity, liquidity, energy, warmth? The pain is something else. It may seem to be solid or it may seem to be hot. But those are perceptions that we’ve attached to the pain. Can you separate them out?

Sometimes you try to separate them and they go on their own. So you get at it in a more indirect way. Just ask yourself, where is the sharpest point of the pain right now? Asking that question and following through changes the balance of power again. Instead of running away from the pain, you run at it.

Like the stories of the forest ajahns doing walking meditation at night and getting more and more convinced that there’s a tiger crouching beside the path. Instead of running away from the tiger, they run at it. It turns out there was nothing there.

So maybe there’s nothing to run away from in the pain. And you find that if you start tracing down where the sharpest point of the pain is, it moves. It avoids you. Especially if you learn to make your focus the kind of focus that doesn’t bear down on things. The focus is more open.

That’s when you’ve been working with the breath, you’ve learned that if you want to keep the breath comfortable, you don’t put too much pressure on it. You stay steadily with one spot as your center, but you think of that spot as being wide open, connecting with everything else. The same way you follow the sharpest point of the pain with that soft but steady focus. And it’ll move, because you’re giving it a chance to move.

You find that there will come a point where it suddenly separates out on its own like cream separating out of milk when it hasn’t been homogenized.

You can ask yourself if the pain is a solid block. What’s its shape? What’s its color in your mind? Remind yourself pain has no shape, it has no color. Those are just perceptions. Drop those perceptions. See what happens.

Is it just momentary flashes of pain arising, passing away? If you have that perception, ask yourself when it comes flashing in, does it come at you or does it go away from you? Try to hold in mind the perception that as soon as you sense it, it’s already going away. So you’re not a target.

These are calming perceptions. At the same time you’re asking questions that calm things down. You’re telling yourself stories that calm things down. All based on that intention. Trying to see how to keep the mind from being invaded by the pain. Or if it has invaded, not allowing it to remain in the mind.

That means you allow the pain to stay in the body. You’re not making it your purpose to make it go away. But you want to be aware of it. So you’re not sitting here waiting, when will the pain go away? When will the pain go away? You’re asking yourself now, how can I be right here, next to the pain, but not pained by it? Be with the pain, but not suffer from it.

Sometimes when you separate things out like this, the pain will go away. Again, that’s because what you’ve been doing around the pain has actually been continuing to create the pain. Other times it’ll still be there. That’s the raw material coming in from your past karma right now. But it’s there in the body. It doesn’t have to be in the mind.

Remember the Buddhist comment about wisdom — it’s not seeing the oneness of all things, it’s seeing things as separate. Things that you’ve held together in the past, you begin to realize they are separate things. Because they’re separate, they don’t have to weigh things down.

So remember, you want to maintain that original intention. Not that the pain go away, but simply that it not invade the mind and remain. And you’re here to find perceptions that calm the effect of the pain on the mind. Ways of talking to yourself, questions that you want to ask that calm the effect on the mind.

That puts you in the driver’s seat. Because the pain may come and go based on past karma. But the effect it has on the mind is something you can change right now.

We’ve got these tools. The Buddha points them out to you. The forest ajahns give you some ideas on how you can use them. It’s up to you now to develop these skills.

When you have these skills, it puts you in a much better position. You can treat the pain with a little less fear. And when you don’t have fear of pain, that’s one less thing that the world can use against you.

We see how people are driven, driven, driven by pain. And how people take advantage of other people’s fear of pain. When the mind is not invaded by pain, it’s not only for your well-being right now, but it’s also for your greater independence at large.