openai/whisper-large-v3 Automatic Speech Recognition β’ 2B β’ Updated Aug 12, 2024 β’ 4.75M β’ β’ 5.56k
google/pix2struct-widget-captioning-large Visual Question Answering β’ 1B β’ Updated Apr 10, 2024 β’ 30 β’ 20
Salesforce/blip-image-captioning-large Image-to-Text β’ 0.5B β’ Updated Feb 3, 2025 β’ 1.7M β’ 1.47k
Runtime error Featured 272 Edit Video By Editing Text β 272 Audio-based video editing using AI-generated transcription
Runtime error Featured 307 AudioLDM2 Text2Audio Text2Music Generation π 307 Generate audio and waveform video from text