Microsoft launches Phi-3, its smallest AI model yet

Microsoft launched the next version of its lightweight AI model, Phi-3 Mini, the first of three small models the company plans to release.

Phi-3 Mini has 3.8 billion parameters and is trained on a data set that is small relative to those used for large language models like GPT-4. It is now available on Azure, Hugging Face, and Ollama. Microsoft plans to release Phi-3 Small (7B parameters) and Phi-3 Medium (14B parameters). Parameter count is a rough measure of how many complex instructions a model can understand.
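To give a sense of what those parameter counts mean for hardware, here is a minimal back-of-the-envelope sketch. The parameter counts come from the figures above. The precisions (16-bit, 8-bit, and 4-bit weights) and the helper names are illustrative assumptions, not details Microsoft has published for Phi-3. The estimate covers weight storage only and ignores runtime overhead such as activations.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Parameter counts are taken from the article; the precisions below are
# illustrative assumptions, not published Phi-3 deployment settings.
PARAM_COUNTS = {
    "Phi-3 Mini": 3.8e9,
    "Phi-3 Small": 7e9,
    "Phi-3 Medium": 14e9,
}
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params, bytes_per_param):
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


def memory_table():
    """Map each model name to its estimated weight size per precision."""
    return {
        name: {
            precision: round(weight_memory_gb(count, size), 1)
            for precision, size in BYTES_PER_PARAM.items()
        }
        for name, count in PARAM_COUNTS.items()
    }


if __name__ == "__main__":
    for name, sizes in memory_table().items():
        print(name, sizes)
```

By this estimate, Phi-3 Mini's weights take about 7.6 GB at 16-bit precision and under 2 GB at 4-bit, which is the kind of footprint that makes running on a laptop or phone plausible.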

The company released Phi-2 in December, which performed just as well as bigger models like Llama 2. Microsoft says Phi-3 performs better than the previous version and can provide responses close to those of a model 10 times its size.

Eric Boyd, corporate vice president of Microsoft Azure AI Platform, told The Verge that Phi-3 Mini is as capable as LLMs like GPT-3.5 "just in a smaller form."

Compared to their larger counterparts, smaller AI models are often cheaper to run and perform better on personal devices like phones and laptops. Earlier this year it was reported that Microsoft was creating a team focused specifically on lightweight AI models. Along with Phi, the company also created Orca-Math, a model focused on solving mathematics problems.

Microsoft's competitors have small AI models of their own, most of which target simpler tasks like document summarization or coding assistance. Google's Gemma 2B and 7B are good for simple chatbots and language-related work. Anthropic's Claude 3 Haiku can read dense research papers with graphs and summarize them quickly, while the recently released Llama 3 8B from Meta can be used for some chatbots and for coding assistance.

Boyd says the developers have trained Phi-3 with a "curriculum." They were inspired by how children learn from bedtime stories, books with simple words and sentence structures that talk about big topics.

"There aren't enough children's books out there, so we took a list of more than 3,000 words and asked an LLM to make 'children's books' to teach Phi," Boyd says.

He added that Phi-3 simply builds on what previous iterations learned. While Phi-1 focused on coding and Phi-2 began to learn to reason, Phi-3 is better at both coding and reasoning. Although the Phi-3 family of models has some general knowledge, it cannot beat GPT-4 or another large LLM in breadth: there is a big difference between the answers you can get from an LLM trained on the entire internet and those from a small model like Phi-3.

Boyd says companies often find that smaller models like Phi-3 work better for their custom applications, because for many companies, their internal data sets will be small anyway. And because these models use less computing power, they are often far more affordable.