I agree with a lot of what you're saying, but the one thing I would question is how much more advanced an AI the government has than what is publicly available. Again, I don't think this really matters. Sentience is not necessary for "AI" to become hugely disruptive. The smartphone is not sentient, and look what an enormous impact it's had on society. The same could be said for contraception, the internal combustion engine, electricity, and a myriad of other technologies. Even if AI advances only a relatively small degree beyond its current public releases (which are well behind the most advanced closed-door models), it will still have a profound destabilizing impact on society over both the short and long term, if only from the economic dislocation of the millions of knowledge workers who currently inhabit largely make-work white-collar jobs.
I encourage you to at least read the first chapter of the Situational Awareness paper, to gain some understanding of the rapid pace of progress LLM/AI models have made over the past couple of years. A few excerpts:
The pace of deep learning progress in the last decade has simply been extraordinary. A mere decade ago it was revolutionary for a deep learning system to identify simple images. Today, we keep trying to come up with novel, ever harder tests, and yet each new benchmark is quickly cracked. It used to take decades to crack widely-used benchmarks; now it feels like mere months.
We’re literally running out of benchmarks. As an anecdote, my friends Dan and Collin made a benchmark called MMLU a few years ago, in 2020. They hoped to finally make a benchmark that would stand the test of time, equivalent to all the hardest exams we give high school and college students. Just three years later, it’s basically solved: models like GPT-4 and Gemini get ~90%.
More broadly, GPT-4 mostly cracks all the standard high school and college aptitude tests.
(And even the one year from GPT-3.5 to GPT-4 often took us from well below median human performance to the top of the human range.)
Or consider the MATH benchmark, a set of difficult mathematics problems from high-school math competitions. When the benchmark was released in 2021, the best models only got ~5% of problems right. And the original paper noted: “Moreover, we find that simply increasing budgets and model parameter counts will be impractical for achieving strong mathematical reasoning if scaling trends continue […]. To have more traction on mathematical problem solving we will likely need new algorithmic advancements from the broader research community”—we would need fundamental new breakthroughs to solve MATH, or so they thought. A survey of ML researchers predicted minimal progress over the coming years; and yet within just a year (by mid-2022), the best models went from ~5% to 50% accuracy; now, MATH is basically solved, with recent performance over 90%.
Over and over again, year after year, skeptics have claimed “deep learning won’t be able to do X” and have been quickly proven wrong.
If there’s one lesson we’ve learned from the past decade of AI, it’s that you should never bet against deep learning.
Now the hardest unsolved benchmarks are tests like GPQA, a set of PhD-level biology, chemistry, and physics questions. Many of the questions read like gibberish to me, and even PhDs in other scientific fields spending 30+ minutes with Google barely score above random chance. Claude 3 Opus currently gets ~60%, compared to in-domain PhDs who get ~80%—and I expect this benchmark to fall as well, in the next generation or two.
I have enough exposure to government procurement bids for AI functionality to believe that the government is not privy to a more advanced class of AI.
The analogy I would make is this: when automobile technology emerged, the private sector was ahead of the government. Likewise with airplane technology. I think computer technology was like this as well in the era when PCs were developed.
Sure, the government got interested early on, and started issuing contracts to apply these technologies for military purposes, but they did not have some kind of secret next generation capability. I think this is the case with AI.
I would say they are spending money to apply recent advances in AI, but that hasn't put them on a whole different level. The whole field is advancing too fast for the government to get out in front of it. The government procurement process simply does not have that kind of foresight and organizational efficiency. They're too busy observing Juneteenth and making sure they meet their DEI quotas.