I've started running some LLMs on my computer locally, as I want to see how they compare to ChatGPT and Copilot, and I've been pleasantly surprised. That being said, the new stuff coming out, like the below, is getting crazy:
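One minimal way to try a model locally in Python is the Hugging Face transformers pipeline; the tiny "gpt2" model below is just a quick-to-download placeholder, not necessarily what anyone in this thread is actually running:

# Minimal local text generation with the Hugging Face transformers library.
# "gpt2" is only a small placeholder that downloads quickly; swap in whatever
# local model you actually want to compare against ChatGPT or Copilot.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Running language models locally is", max_new_tokens=30, do_sample=True)
print(result[0]["generated_text"])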
As a programmer who's been out of work for 8 months now, the market is very, very tight and salaries are down 20%.
This AI stuff is just hype. Silicon Valley types desperately trying to be the next Jobs.
As far as programmers are concerned, they are safe for at least the next decade or two.
The current models developed very quickly to the point of being moderately useful to programmers, but they are still miles away from replacing them. From here on, it's going to take massive amounts of money and time to make tiny gains.
I predict that in a few years the hype will die down and everyone will find their lost minds.
I think this is the normal ebb and flow of the job market. These lost jobs weren't replaced by AI, were they? Too many people entered the job market during the coof. It's going to swing upwards again in a year or two, I believe.
I also work in a technical field, and it took me over 6 months to find my current position. The interview process is currently extremely burdensome. For one company offering mediocre pay, I had to sit through five regular interviews, a technical interview, and an exam, only to be denied in the final round.
Compare the computer technology of when Windows 95 was released to today’s, a time span of three decades.
The Casio calculator you had in the '80s is just as "inert" and without an "inner life" as the ML tools of today.
Moore's Law has already been disproven. You can't predict the future rate of development based on the past. Every technology makes fast initial gains, but then you hit a plateau. Hardware progress has already slowed down significantly and will continue to slow even more, even if all the nerds of the world unite to prevent it. Unless another technology for making CPUs is found, like quantum computing, but I've been hearing about that for 25 years, and it seems it was all hype, just like AI.
Those 30 years saw more technological change than from the foundation of the Roman Republic to the end of the Western Roman Empire, a period of a thousand years.
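For a sense of scale, here is what the classic doubling-every-two-years rule would imply over those three decades, taken purely as arithmetic (whether that rate actually held is exactly what's being argued above):

# Back-of-the-envelope: what doubling every two years implies over 30 years
# (roughly 1995 to 2025). Pure arithmetic; it says nothing about whether the
# doubling rate actually continues to hold.
years = 30
doubling_period = 2
multiplier = 2 ** (years // doubling_period)
print(multiplier)  # 2**15 = 32768, i.e. a ~32,768x increase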
“Artificial Intelligence” is no intelligence at all. It is a very high speed matching and prediction algorithm. Developers, marketing gurus, and other geeks will come up with futuristic terms such as “Neural Networks”, “Training Datasets”, “Machine Learning”, “Mixture of Experts”, and so forth, to give the illusion that it’s a magical and sentient system.
Let me ask you this: Have you ever wondered why this is all done on a GPU and not a CPU? You could do it on a CPU, but it would be extremely slow... why is that? It all boils down to high speed floating point calculations, primarily used – in this specific case – to solve lots and lots of matrices.
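To give a rough feel for that workload, here is a tiny sketch in Python (NumPy on the CPU standing in for the GPU; the matrix size is arbitrary):

# A taste of the core workload: multiplying matrices of floating point numbers.
# A single 1024x1024 by 1024x1024 multiply already takes roughly
# 2 * 1024**3, i.e. about 2.1 billion, floating point multiply-adds.
import numpy as np

n = 1024
a = np.random.rand(n, n).astype(np.float32)
b = np.random.rand(n, n).astype(np.float32)

c = a @ b  # one matrix multiply: ~2 * n**3 floating point operations
print(c.shape, f"~{2 * n**3:,} float ops for a single multiply")

A GPU is fast at this precisely because it can spread those multiply-adds across thousands of floating point units at once.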
CPUs have generally always dealt in integers (i.e. a negative or positive number without a decimal point). Calculations with decimals would – in the past – usually take additional CPU cycles, because the unit itself wasn’t directly handling the decimal numbers. This is why anyone who was a teenager interested in computers in the late 80s and early 90s may remember the advent of FPU co-processors.
That was a big deal back then: all of a sudden, specialized software could offload floating point calculations (arithmetic with decimal numbers) to the co-processor (either on the motherboard or an expansion card). An FPU (Floating Point Unit) was not integrated into Intel consumer/workstation processors until the 486 series. It was improved from there onward with the Pentium, Pentium II, and so forth.
Why is all this important to understand? Because it brings us back to the original question: why GPU and not CPU? The GPU is perfect for such calculations because geometry and physics calculations are inherent to its nature.
Enter Nvidia and marketing. A CUDA “Core” (short for Compute Unified Device Architecture) is effectively a high speed, glorified FPU. So when you see Nvidia advertise that their GPU has 3072 CUDA cores, those are not cores at all – rather 3072 Floating Point Units. That doesn’t mean it’s not a big deal. It’s a very big deal, hardware-wise. What used to be a single FPU built into the Pentium processor in 1995, is now available 3072 times inside a consumer GPU.
There are also Nvidia "Tensor Cores" which, unlike CUDA cores, can do multiple operations at once, making them even better for solving matrices. At the end of the day, it's just one massive math problem being solved, over and over again.
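To make "one massive math problem being solved over and over again" concrete, here is a toy sketch of what a single pass through a tiny network boils down to (the shapes and numbers are made up; a real model does the same kind of arithmetic with enormous matrices, billions of times):

# Toy illustration: a "forward pass" is just matrix multiplies and adds, repeated.
# Real models do this same arithmetic, only with vastly bigger matrices.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(256).astype(np.float32)          # made-up input vector

# four made-up layers, each a weight matrix plus a bias vector
layers = [(rng.standard_normal((256, 256)).astype(np.float32),
           rng.standard_normal(256).astype(np.float32)) for _ in range(4)]

for w, b in layers:                 # each layer: multiply, add, clamp
    x = np.maximum(w @ x + b, 0.0)  # matrix-vector multiply + bias + ReLU

print(x.shape)  # still just a vector of floats at the end

Every line of that loop is exactly the kind of floating point multiply-add that CUDA and Tensor cores are built to churn through in parallel.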
Combine all of this with cheaper storage/memory, high speed transfer rates (between SSD + RAM + CPU + GPU), and the massive number of FPUs, and you have the perfect candidate for a prediction machine. All of this “AI” was conceivable back in the early 90s, but when you take into account the hardware and storage required, it becomes apparent that even a “supercomputer” of that day couldn’t pull off what you and I can on a high-end desktop system (some laptops, even).
What’s the point of all the above? That “AI” is a “smart” prediction mechanism. In the case of the language models, for example, it is simply predicting the next word (or few words) to follow. That is why we are able to see the “streaming” output from some chat bots (i.e. as they type the response word by word). All the language model has to do is predict the wrong word at some point, and if the mistake is material, it will build the rest of its response on that one error. Inherently, it is unable to “learn” or “understand” anything.
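A toy sketch of that next-word loop (the "model" here is just a hard-coded lookup table, invented purely to show the mechanics: predict one word, print it, feed it back in):

# Toy next-word predictor: pick the most likely next word, print it ("streaming"),
# append it to the context, repeat. The probability table is invented for
# illustration; a real model computes these numbers from billions of learned weights.
next_word_probs = {
    "the":    {"cat": 0.6, "dog": 0.4},
    "cat":    {"sat": 0.7, "ran": 0.3},
    "dog":    {"barked": 0.8, "sat": 0.2},
    "sat":    {"down": 0.9, "quietly": 0.1},
    "ran":    {"away": 1.0},
    "barked": {"loudly": 1.0},
}

word = "the"
print(word, end=" ", flush=True)
for _ in range(3):
    choices = next_word_probs.get(word)
    if not choices:
        break
    word = max(choices, key=choices.get)  # greedy pick of the "most likely" word
    print(word, end=" ", flush=True)      # streamed out one word at a time
print()

If the pick at any step is wrong, every later step is conditioned on that wrong word, which is exactly the compounding-error problem described above.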
The same concept can then be applied to images/videos (diffusion techniques), and physics calculations. Once all three (language, visual, and physics) are combined and aligned together in a high speed fashion, it gives the illusion of intelligence, wowing the masses.
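Very roughly, the image side is the same flavor of repeated floating point arithmetic: start from noise and nudge it toward an image over many small steps. The sketch below cheats by using a known target in place of a learned noise predictor, purely to show the shape of the loop; it is not how any real diffusion model works or is trained:

# Toy "denoising" loop: start from pure noise and repeatedly remove a little of
# the estimated noise. The estimate is cheated here (we already know the target),
# just to show that it is iterative float arithmetic; real models learn the estimator.
import numpy as np

rng = np.random.default_rng(1)
target = np.zeros((8, 8), dtype=np.float32)   # pretend this is the "image"
target[2:6, 2:6] = 1.0                        # a little bright square

x = rng.standard_normal((8, 8)).astype(np.float32)  # start from pure noise
for _ in range(50):
    estimated_noise = x - target    # a real model would have to predict this
    x = x - 0.1 * estimated_noise   # remove a small fraction of it each step

print(np.round(x, 2))  # after enough steps, the square emerges from the noise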
Can it be useful in speeding up certain things? Sure. But it's important to grasp that these systems are neither intelligent nor truly learning in the traditional sense of the word. There are not, and can never be, any desires, emotions, or consciousness.
As I've said many times before: "Remember, everything comes down to marketing."
Indeed. These companies started calling these things AI because it was just an obviously good marketing move. Serious people don't think it's actually intelligent. A lot of normies do think that, but they also think AI can take over the world like in that movie they watched featuring the former governor of California, because they will frankly just believe anything.
Great post.
This writer captures the essence.
Just another load of fakery (after NASA etc) to fool the masses.