Phi-2: The Small Model That Could Change the Landscape of AI in 2024
In the world of generative AI, 2023 was winding down with more focus on mistletoe and merry-making than on models and the MMLU. But, as it turns out, this was the week of powerful new small models that could change the landscape of AI in 2024. This video is about Phi-2 and what it means, as well as the madness of MMLU brinkmanship, with a reminder of all of that exam's mistakes, plus Optimus Gen 2, Imagen 2, and much more.
Table of Contents
1. Introduction
2. What is Phi-2?
3. Phi-2's Benchmark Performance
4. Phi-1, Phi-1.5, and Phi-1.5-web
5. The Importance of Synthetic Data
6. The Potential of Small Models
7. Comparisons to Gemini Nano and Llama 2
8. The Flaws of Benchmarks
9. Imagen 2: Text-to-Image Model
10. Optimus Gen 2: The 10 kg Lighter Generation 2 Humanoid Robot
11. Touch, Temperature, and Pressure Sensitivity
12. AI Insiders: Supporting the Channel
13. The Mistakes of the MMLU
14. Conclusion
What is Phi-2?
Phi-2 was announced by Satya Nadella, the CEO of Microsoft, last month. In a nutshell, Phi-2 is a 2.7-billion-parameter model, which is small by today's standards; in fact, it is so small that it could fit locally on a smartphone. According to the benchmarks, Phi-2 outperforms models of comparable size, like ones trained using Mamba, as well as Google's new Gemini Nano. If that wasn't enough, it also outperforms models 20 to 25 times its size.
Phi-2's Benchmark Performance
Phi-2's benchmark performance is impressive. It was trained in just 14 days on fewer than 100 A100 GPUs. The amount of data, including re-reads (epochs), was 1.4 trillion tokens, roughly five times more than Phi-1.5-web saw. More parameters mean more connections can be made, and with more compute, of course, the team can feed in more data and go over that data more times.
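To put those figures in perspective, a common rule of thumb estimates training compute as roughly 6 x parameters x tokens. Here is a minimal sketch of that arithmetic using the numbers quoted above; the 6ND heuristic and the utilization figure are assumptions of mine, not anything stated in the video:

```python
# Back-of-the-envelope training-compute estimate using the common
# C ~= 6 * N * D heuristic (FLOPs ~= 6 x parameters x training tokens).
params = 2.7e9   # Phi-2 parameter count
tokens = 1.4e12  # training tokens, including repeated passes (epochs)

flops = 6 * params * tokens
print(f"Estimated training compute: {flops:.2e} FLOPs")  # ~2.27e+22

# Rough wall-clock check against "14 days on fewer than 100 A100s".
# 96 GPUs, 312 TFLOPs dense BF16 per A100, 60% utilization: all assumed.
sustained_flops = 96 * 312e12 * 0.60
days = flops / sustained_flops / 86400
print(f"Implied training time: {days:.1f} days")  # ~14.6 days
```

Under those assumed numbers the arithmetic lines up with the quoted two-week run, which is why small, carefully-fed models are so much cheaper to train than frontier ones.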
Phi-1, Phi-1.5, and Phi-1.5-web
Phi-1, Phi-1.5, and Phi-1.5-web are the predecessors of Phi-2. The researchers retrieved a pile of permissively licensed open code from what's appropriately called The Stack. They extracted just the Python code from the many programming languages in that dataset and also filtered out duplicates. For Phi-1, they then gave GPT-4 the task of filtering for textbook-quality code. They swapped in a tiny classifier to finish the job that GPT-4 started, with that classifier essentially imitating the labeling that GPT-4 had kicked off. They then got GPT-3.5 to generate its own diverse, textbook-quality data and synthetic Q&A exercises.
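Here is a minimal sketch of that curation pipeline. The loaders load_the_stack_python and gpt4_label_quality are hypothetical stand-ins (the team's actual tooling isn't described in the video), and the classifier is an ordinary TF-IDF model chosen for illustration; only the overall flow (filter to Python, deduplicate, label a seed set with GPT-4, then train a small classifier to imitate those labels at scale) comes from the description above:

```python
import hashlib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def dedupe(snippets):
    """Drop exact duplicates by content hash, a crude stand-in for real dedup."""
    seen, unique = set(), []
    for code in snippets:
        digest = hashlib.sha256(code.encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(code)
    return unique

# 1. Python files pulled from The Stack (hypothetical loader, not a real API).
snippets = dedupe(load_the_stack_python())

# 2. GPT-4 labels a small seed set for "textbook quality" (hypothetical call).
seed = snippets[:10_000]
seed_labels = [gpt4_label_quality(code) for code in seed]  # 1 = textbook quality

# 3. A tiny classifier imitates GPT-4's labels so the filter can scale cheaply.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(seed, seed_labels)
textbook_quality = [code for code in snippets if clf.predict([code])[0] == 1]
```

The design point is cost: GPT-4 is far too expensive to label billions of files, so it only seeds the labels, and a cheap imitator does the rest.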
The Importance of Synthetic Data
A side benefit of training on synthetic data is that it tends to be less toxic. Toxicity scores went down across the board for Phi-2, and this was before any reinforcement learning from human feedback. As one of the researchers on the project said, Phi-2 shows that we have been wasting enormous amounts of compute on rather ineffective training data.
The Potential of Small Models
Phi-2's small size is significant because it means the model could run locally on a smartphone. This opens up a whole new world of possibilities for AI on mobile devices. It also means that smaller companies and startups can compete with larger companies that have more resources.
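As a rough sanity check on that smartphone claim, here is a back-of-the-envelope estimate of the weight memory for a 2.7-billion-parameter model at a few precisions; the quantization levels are illustrative assumptions, not deployment details from the video:

```python
# Approximate weight-memory footprint of a 2.7B-parameter model.
# Real on-device usage also needs room for activations and the KV cache,
# so treat these figures as lower bounds.
PARAMS = 2.7e9
GIB = 1024**3

for name, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{name}: {PARAMS * bytes_per_param / GIB:.1f} GiB")

# fp16: 5.0 GiB   int8: 2.5 GiB   int4: 1.3 GiB
# At 4-bit precision the weights fit comfortably in a modern phone's RAM.
```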
Comparisons to Gemini Nano and Llama 2
Phi-2 outperforms models of comparable size, like ones trained using Mamba, as well as Google's new Gemini Nano. It also outperforms models 20 to 25 times its size, like Llama 2, which has 70 billion parameters.
The Flaws of Benchmarks
The MMLU is a flawed benchmark in many respects. It has innumerable factual errors, misspellings, grammatical ambiguities, and formatting ambiguities throughout the test.
Imagen 2: Text-to-Image Model
Imagen 2 is a diffusion model that generates images from text. It is available via API, and Google indemnifies its customers against copyright claims over its outputs. All generations are watermarked, and the quality looks stunning.
Optimus Gen 2: The 10 kg Lighter Generation 2 Humanoid Robot
Optimus Gen 2 is Tesla's generation-2 humanoid robot, now 10 kg lighter than its predecessor. Watching the demo video makes me think of touch, temperature, and pressure sensitivity as whole new modalities yet to be fully explored.
AI Insiders: Supporting the Channel
AI Insiders is a Patreon tier that supports the channel. It features classic AI Explained videos, bonus content, the AI Insiders podcast, tutorials, and the Insiders Arena. The Insiders Arena is where you and any other member with a passion for AI can submit explainers, and the best of the bunch will feature in a cameo on AI Explained.
The Mistakes of the MMLU
The MMLU has hundreds of questions that are ambiguous or erroneous, and this type of question was massively prevalent in the Global Facts category. There are also many potential lessons here from Phi-2, as one of the researchers on the project said.
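For anyone who wants to poke at this themselves, here is a minimal sketch that pulls the Global Facts subset of MMLU from the Hugging Face Hub and flags possible formatting problems; the heuristics are illustrative assumptions of mine, and genuinely erroneous facts still need human review:

```python
from datasets import load_dataset

# MMLU's "global facts" subject, as hosted on the Hugging Face Hub.
ds = load_dataset("cais/mmlu", "global_facts", split="test")

flagged = []
for i, row in enumerate(ds):
    choices = [c.strip() for c in row["choices"]]
    # Crude heuristics: duplicated answer options or empty option text
    # often signal the formatting ambiguity described above.
    if len(set(choices)) < len(choices) or any(not c for c in choices):
        flagged.append(i)
    elif not row["question"].strip().endswith(("?", ".", ":")):
        flagged.append(i)  # possibly truncated or garbled question stem

print(f"Flagged {len(flagged)} of {len(ds)} questions for manual review.")
```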
Conclusion
Phi-2 is a small model that could change the landscape of AI in 2024. It outperforms models of comparable size, like ones trained using Mamba, as well as Google's new Gemini Nano. It also outperforms models 20 to 25 times its size, like Llama 2, which has 70 billion parameters. Phi-2's small size is significant because it means the model could run locally on a smartphone, and that opens up a whole new world of possibilities for AI on mobile devices.