AI models that can think and reason are becoming easier and cheaper to create.
NovaSky, a team of researchers from UC Berkeley's Sky Computing Lab, has released a new reasoning model called Sky-T1-32B-Preview. It is competitive with an early version of OpenAI's o1 reasoning model on several key benchmarks. What sets Sky-T1 apart is that it is open-source in a way that lets anyone reproduce it from scratch: NovaSky released both the data and the code used to train it.
Sky-T1-32B-Preview was trained for under $450, a steep drop from past models of comparable performance, which could cost millions of dollars to train. The lower cost is partly due to synthetic training data, meaning data generated by other AI models. For comparison, Palmyra X 004, another model trained mostly on synthetic data, cost $700,000 to develop.
Reasoning models, unlike most other AI models, effectively check their own work as they go. This helps them avoid some common mistakes and makes them more reliable in areas like science and math. The trade-off is that they usually take longer to arrive at an answer.
NovaSky used another reasoning model, Alibaba's QwQ-32B-Preview, to generate the initial training data for Sky-T1. They then used OpenAI's GPT-4o-mini to refine that data into a more usable form. Sky-T1 has 32 billion parameters (a model's parameter count roughly corresponds to its problem-solving ability), and training it took about 19 hours on eight Nvidia H100 GPUs.
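To put the training figures in perspective, here is a small back-of-the-envelope calculation. It combines the 19 hours and eight GPUs above with the "under $450" total reported earlier. The resulting hourly rate is only an implied upper bound worked out from those numbers, not a price quoted by NovaSky, and the function names are made up for illustration.

```python
# Figures reported in the article.
TRAINING_HOURS = 19      # approximate wall-clock training time
NUM_GPUS = 8             # Nvidia H100 GPUs used
COST_CEILING_USD = 450   # training cost was "under $450"


def gpu_hours(hours: float, gpus: int) -> float:
    """Total GPU-hours consumed: wall-clock hours times number of GPUs."""
    return hours * gpus


def implied_rate_ceiling(cost_usd: float, hours: float, gpus: int) -> float:
    """Upper bound on the price per GPU-hour implied by a cost ceiling."""
    return cost_usd / gpu_hours(hours, gpus)


total = gpu_hours(TRAINING_HOURS, NUM_GPUS)
rate = implied_rate_ceiling(COST_CEILING_USD, TRAINING_HOURS, NUM_GPUS)

print(f"Total GPU-hours: {total}")
print(f"Implied ceiling: ${rate:.2f} per GPU-hour")
```

In other words, the run used roughly 152 GPU-hours, which works out to just under $3 per H100 per hour if the full $450 was spent on compute.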
Sky-T1 performs better than an early version of OpenAI’s o1 model on math challenges and coding problems but isn’t as strong on science questions. However, OpenAI's final version of o1 is still stronger than Sky-T1 in some areas, and OpenAI is working on an even more advanced model, o3.
NovaSky sees Sky-T1 as just the beginning of its journey to create more powerful and efficient open-source reasoning models. They plan to continue improving the performance and accuracy of their models.
NovaSky's official announcement is available at https://novasky-ai.github.io/posts/sky-t1/.