Microsoft Phi-3 Vision-the first Multimodal model By Microsoft, a multimodal model that brings together language and vision capabilities. the multimodal version comes with 128K context length (in tokens) it can support. The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
-
Notifications
You must be signed in to change notification settings - Fork 0
Phi-3-Vision-128K-Instruct Demo
License
divakarkumarp/Phi-3-Vision-MS-Multimodal
About
Phi-3-Vision-128K-Instruct Demo
Topics
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published