    The Last Arms Race (1/2)

    On Tuesday, just one day after President Donald Trump was inaugurated to his second term in office, he held a press conference alongside Softbank CEO Masayoshi Son, OpenAI CEO Sam Altman, and Oracle Executive Chairman Larry Ellison.

    At the press conference they announced a $500 billion investment in Artificial Intelligence over the next four years. By any standard, half a trillion dollars is an incredible amount of money to raise and deploy in such a short amount of time.

    The money is going toward what they’re calling The Stargate Project. In addition to the companies whose CEOs were at that conference, Arm, Microsoft, NVIDIA, and MGX are involved, either in a technical capacity or by contributing funding. A partnership of this scale is unprecedented, even among government-sponsored projects.

    The Stargate Project, however, is far from the only large-scale AI investment happening globally today. Google recently signed a deal to power its AI data centers with nuclear reactors. In addition to its Stargate involvement, Microsoft is spending $3.3 billion to build out a data center in Wisconsin. Meta (Facebook) spent north of $10 billion on its AI capabilities in 2024 alone and is expected to increase that figure.

    China is also increasing its investment in AI, though at a slower rate than the U.S. for now: roughly $35-40 billion a year, at least based on what it reports publicly.

    So what gives? The world’s leading companies and governments surely wouldn’t deploy this much capital, this quickly, unless they expected it to pay off, and the expected payoff must be massive to justify it.

    In this post I want to (ambitiously) cover the origins of AI, how it became popular, what it is, how it works, the urgency to build it, and what the future might look like.


    “Please take whatever precautions are necessary to prevent this terrible disaster. Your friend, Marty”

    “Artificial intelligence is the future, not only for Russia, but for all humankind. It comes with colossal opportunities, but also threats that are difficult to predict. Whoever becomes the leader in this sphere will become the ruler of the world.” – Vladimir Putin, 2017.

    It’s widely understood by technologists, scientists, and world leaders that a truly artificially intelligent computer would be the last technology humans ever need to invent (we’ll get to exactly why later).

    Both Sam Altman and Dario Amodei (CEO of Anthropic) have recently gone on the record with their timelines to Artificial General Intelligence (AGI), which you can think of as a computer roughly as smart and capable as the smartest human alive, and Artificial Superintelligence (ASI), a computer that, not to be hyperbolic, is equivalent to a god.

    In the most conservative case, both believe we will reach these levels of AI within the decade—and most likely sooner.

    The country that develops such a capability will, overnight, have the ability to shape civilization as it sees fit, with relatively little resistance. The urgency of this issue, though understood in elite circles around the world, has yet to permeate mainstream discourse about artificial intelligence.

    To many, these projections about AI probably sound fantastical, even absurd. And I get it. History is full of overhyped “world-changing” technologies that never quite delivered. Plus, pop culture’s depiction of AI often misses just how fast and dramatically it could reshape society. So, I suppose it’s no surprise there’s some hesitance to fully grasp the scale of what’s coming.


    “Roads? Where we’re going, we don’t need roads.”

    If you haven’t studied computer science or the history of these ideas, you might be wondering how we got here. It was only about 2.5 years ago that the AI system most people know—ChatGPT—was released. So, how could it be that we’re already talking about reaching the endgame in the next 5 or so years?

    In 1906, the Spanish physician and scientist Santiago Ramón y Cajal won the Nobel Prize in Physiology or Medicine for his work on the human nervous system. Santiago identified the distinct parts of neuronal cells and theorized that they were part of an interconnected network responsible for processing information.

    It was his initial discoveries, and several more that came in the following decades (like Alan Turing’s theory of computation), that eventually gave rise to the idea that if the human mind works by sending electrical signals between clusters of cells, then perhaps a digital one could do the same.

    For decades, of course, every attempt to realize this theorized phenomenon failed. Nearly 70 years of advances in neurology, physics, and computer science were still needed before we could take a real crack at building an AI system. That didn’t stop people from trying, though.


    Quick Aside on Technological Progress

    It’s impossible to tell the story of artificial intelligence without taking a little detour into computing in general. Here’s a quick primer so we can get back to the good bits.

    The first semi-modern computer was invented in 1871. It was a mechanical machine—more like a severely limited calculator than anything else. Then, in 1946, we took a massive leap in computing when the first general-purpose computer, ENIAC, went live.

    General-purpose computers resemble the ones we use today: they can accomplish many different tasks depending on how they’ve been programmed, and they share broadly similar hardware architectures.

    The next leap came in 1947, when Bell Labs invented the transistor*, ushering in the digital age; it’s where the story of modern technological progress really begins. Nearly two decades later, in 1965, Gordon Moore (then at Fairchild Semiconductor, and later a co-founder of Intel) made a simple observation with profound implications:

    The number of transistors we could fit on integrated circuits (computer chips) was doubling roughly every two years.

    This observation, now known as Moore’s Law, suggested that the processing power of computer chips would keep doubling at that pace as we packed smaller and smaller transistors onto them. The first Bell Labs transistor was about a centimeter long. Today, the chips inside your devices are built on what the industry calls a 3-nanometer process, with features measured in single-digit nanometers, and 2-nanometer chips are already well underway. To give you a sense of scale, atoms are typically 0.1 to 0.3 nanometers across.
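    To get a feel for that kind of compounding, here’s a quick back-of-the-envelope sketch in Python. The starting point (a few thousand transistors, roughly the scale of an early-1970s microprocessor) and the two-year doubling period are illustrative assumptions, not a historical dataset:

    ```python
    # Back-of-the-envelope illustration of Moore's Law-style doubling.
    # The starting count and doubling period are illustrative assumptions.
    START_YEAR = 1971
    START_TRANSISTORS = 2_300          # roughly the scale of an early microprocessor
    DOUBLING_PERIOD_YEARS = 2

    for year in range(START_YEAR, 2026, 10):
        doublings = (year - START_YEAR) / DOUBLING_PERIOD_YEARS
        count = START_TRANSISTORS * 2 ** doublings
        print(f"{year}: ~{count:,.0f} transistors per chip")
    ```

    Run it and the numbers climb from a few thousand to tens of billions over five decades, which is roughly the trajectory real chips have followed.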

    We’ve gone from a world that once believed it would only need a handful of computers to one where digital computers are embedded in nearly every product we create, thanks to this rapid progress.


    “I had a horrible nightmare. I dreamed that I went… back in time. It was terrible.”

    Alright, back to the reason we’re here… (I can’t believe I’m still writing).

    The need to transform data into information is as old as humanity. Our brains are constantly working to do just that—mapping our surroundings, cataloging experiences, and connecting the dots to form ideas about the world around us.

    A teacher I had at some point in life (I’m paraphrasing here, thanks to my selective memory) would say: “Data is useless without context or insight. Interpreting data is how we transform it into information.”

    At some point, humans discovered math and invented ways to analyze and interpret data, which we now call statistical modeling. We use statistical modeling for three main purposes (a tiny example of the first follows the list):

    • Predictions
    • Extracting Information
    • Learning about datasets that appear random
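    As a concrete (and deliberately trivial) illustration of the first purpose, here’s a minimal least-squares fit in Python. The data points are invented purely for the example:

    ```python
    # Minimal example of statistical modeling for prediction:
    # fit a straight line to a few invented data points, then
    # use the fitted line to predict an unseen value.
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])       # inputs (made up)
    y = np.array([30.0, 35.0, 42.0, 48.0, 55.0])  # observed outputs (made up)

    slope, intercept = np.polyfit(x, y, deg=1)    # ordinary least squares, degree 1
    prediction = slope * 6.0 + intercept          # predict for an input we never observed

    print(f"fitted line: y = {slope:.2f}x + {intercept:.2f}")
    print(f"predicted value at x = 6: {prediction:.1f}")
    ```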

    These methods worked well enough until the late 20th century, when an explosion in data volume, complexity, and ambitious use cases outpaced what traditional modeling could handle. That’s when computational modeling and machine learning became practical solutions.

    The desire for a thinking machine traces back to the dawn of classical computing. And as computers grew more capable, the allure of using machine learning to achieve that goal only grew stronger.


    Machine Learning

    According to Wikipedia: “Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without explicit instructions.”

    The foundational studies conducted by Santiago Ramón y Cajal at the end of the 19th century, and the many that followed, became crucially important when computer scientists began designing and leveraging neural networks as components of machine learning model architectures.

    By the 2010s, the field of ML had shifted from relying primarily on traditional statistical modeling to a focus on data-driven learning. The most popular kinds of learning methods used to “teach” a model (the process of training) were as follows; a minimal sketch of both appears after the list:

    • Supervised Learning
      • Provide the model with a labeled input dataset along with an output dataset: the predictions the model is expected to produce for each element of the input.
    • Unsupervised Learning
      • Provide the model with input data only. The model discovers whatever patterns exist in the data as it runs its learning algorithm over it.
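    To make the distinction concrete, here’s a minimal sketch using scikit-learn; the tiny dataset is invented purely for illustration, and real training sets are vastly larger:

    ```python
    # Minimal contrast between supervised and unsupervised learning.
    # The toy data is invented for illustration only.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LogisticRegression

    X = np.array([[1.0], [2.0], [3.0], [8.0], [9.0], [10.0]])  # inputs
    y = np.array([0, 0, 0, 1, 1, 1])                           # human-provided labels

    # Supervised: the model is shown inputs *and* the expected outputs.
    clf = LogisticRegression().fit(X, y)
    print(clf.predict([[2.5], [9.5]]))        # predicted labels for unseen inputs

    # Unsupervised: the model is shown inputs only and finds structure itself.
    km = KMeans(n_clusters=2, n_init=10).fit(X)
    print(km.labels_)                         # cluster assignments it discovered
    ```

    The supervised model is judged against the labels we supplied; the clustering step has no notion of a “right” answer and simply groups similar inputs together.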

    “Doc, do you have a 75-ohm matching transformer?”

    In 2017, I was two years into my career when Google released a paper titled “Attention Is All You Need”. If you read I Am Very Dumb?, you can probably guess that while I came across this paper and the discussion of how groundbreaking it was, I didn’t go much deeper than the surface.

    It turns out that this paper fundamentally changed the field of machine learning and artificial intelligence forever.

    Through “Attention Is All You Need”, Google introduced a new neural network architecture called the Transformer. By fully leveraging what’s known as the attention mechanism (a system that assigns weights to each element of the input, letting the model focus on the most important parts), this architecture enabled machine learning models to process and understand complex sequences of data far more effectively.
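    If you’re curious what that attention mechanism boils down to, here’s a stripped-down sketch of scaled dot-product self-attention in plain NumPy. Real Transformers add learned projection matrices, multiple attention heads, masking, and positional information on top of this:

    ```python
    # Stripped-down scaled dot-product self-attention (the core idea
    # behind the Transformer), written in plain NumPy for illustration.
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def attention(Q, K, V):
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)      # how strongly each token attends to every other token
        weights = softmax(scores, axis=-1)   # the attention weights (rows sum to 1)
        return weights @ V                   # each output is a weighted mix of the values

    # Three "tokens", each a 4-dimensional vector; random numbers stand in for real embeddings.
    rng = np.random.default_rng(0)
    tokens = rng.normal(size=(3, 4))
    out = attention(tokens, tokens, tokens)  # self-attention: Q, K, V all come from the input
    print(out.shape)                         # (3, 4): one updated vector per token
    ```

    Each output row is just a weighted average of the inputs, with the weights determined by how relevant each token looks to the others; that simple idea, learned and stacked at scale, is what the paper showed could replace earlier sequence-processing approaches.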

    Seemingly overnight, this new method brought breakthroughs in the fields of natural language processing, computer vision, and generative AI.

    It’s through this new paradigm that we find ourselves where we are today, on the precipice of creating an artificial mind equal in capability to the smartest human.


    I naively thought I would be able to cover the history of this topic, get to the modern state (ChatGPT, Claude, ElizaOS, Digital Agents, etc.), and detail the capabilities of AGI & ASI and how society might be impacted... but I’ll have to pause here and do a part two.

    *Transistors control the flow of electricity on computer chips. If you’ve ever heard of binary, or seen references to zeroes and ones with regard to computers, transistors are where that concept arises: each one is either a 1 (allowing electricity to flow through it) or a 0 (blocking it). The rapid switching of these on and off states is what makes a computer a computer.
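    If you’d like to see that idea in miniature, here’s a tiny Python illustration of how a string of on/off states can be read as a number (and, by a further convention like ASCII, as a character):

    ```python
    # Eight on/off states, written as bits, interpreted as one number.
    bits = "01000001"         # imagine eight transistors: off, on, off, off, off, off, off, on
    print(int(bits, 2))       # -> 65
    print(chr(int(bits, 2)))  # -> "A" (65 is the ASCII code for "A")
    ```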