Furkan Gözükara

SOTA Image Captioning Model Kosmos-2 Added to Our Scripts Arsenal (Patreon)

Published:

2024-02-15 01:42:40

Content

Patreon exclusive posts index

Join our Discord and tell me your Discord username to get a special rank: SECourses Discord

Our scripts arsenal is here: https://www.patreon.com/posts/90744385

We have captioners_clip_interrogator, LLaVA_auto_install, CogVLM, Qwen-VL, blip2_captioning, and Kosmos-2.

You can see the full post here: scripts_arsenal_full_screenshot.png

Kosmos-2: Grounding Multimodal Large Language Models to the World > https://github.com/microsoft/unilm/tree/master/kosmos-2

I have modified it and added batch processing too.
Download Kosmos-2_v1.zip and extract it into any folder where you want to install it.
All our scripts generate a new, separate venv, so they will never conflict with other apps.
Double-click install_windows.bat to install.
It will install everything fully automatically.
Then use the run_kosmos.bat file to start the app.
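
For readers curious how a separate venv keeps an app from conflicting with others, here is a minimal sketch using Python's standard venv module. It is not the content of install_windows.bat; the function name create_isolated_env and the "venv" folder name are assumptions made for illustration.

```python
import sys
import venv
from pathlib import Path


def create_isolated_env(app_dir: Path, with_pip: bool = True) -> Path:
    """Create <app_dir>/venv and return the path to its Python interpreter.

    Hypothetical helper: the folder name "venv" is an assumption,
    not something taken from the actual installer.
    """
    env_dir = app_dir / "venv"
    # EnvBuilder creates a self-contained environment with its own site-packages.
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    # The interpreter lives in Scripts\ on Windows and bin/ elsewhere.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"
```

Packages installed by running that interpreter with "-m pip install" land inside the app's own folder, so they do not touch the system Python or any other app's environment.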

Our Gradio app supports both single-image and batch caption generation, as can be seen below.
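
As a rough idea of what batch captioning involves, the sketch below loops over a folder and saves one caption per image. It is not the script shipped in the zip. It assumes the Hugging Face transformers port of Kosmos-2 (model id microsoft/kosmos-2-patch14-224) rather than the modified microsoft/unilm code, and the helper names, the extension list, and the choice to save captions as .txt files next to the images are all assumptions.

```python
from pathlib import Path
from typing import Callable

# Assumed set of extensions; adjust to your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def find_images(folder: Path) -> list[Path]:
    """Return image files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def caption_folder(folder: Path, caption_fn: Callable[[Path], str]) -> dict[str, str]:
    """Caption every image in `folder` and save each caption as <image stem>.txt."""
    results = {}
    for image_path in find_images(folder):
        caption = caption_fn(image_path).strip()
        image_path.with_suffix(".txt").write_text(caption, encoding="utf-8")
        results[image_path.name] = caption
    return results


def load_kosmos2_captioner(model_id: str = "microsoft/kosmos-2-patch14-224"):
    """Build a caption function from the Hugging Face port of Kosmos-2 (assumption)."""
    # Imported lazily so the folder logic above works without these packages.
    from PIL import Image
    from transformers import AutoModelForVision2Seq, AutoProcessor

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForVision2Seq.from_pretrained(model_id)

    def caption(image_path: Path) -> str:
        image = Image.open(image_path).convert("RGB")
        inputs = processor(
            text="<grounding>An image of", images=image, return_tensors="pt"
        )
        generated_ids = model.generate(
            pixel_values=inputs["pixel_values"],
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            image_embeds=None,
            image_embeds_position_mask=inputs["image_embeds_position_mask"],
            max_new_tokens=128,
        )
        raw_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
        # post_process_generation strips grounding tags and returns (text, entities).
        text, _entities = processor.post_process_generation(raw_text)
        return text

    return caption
```

A typical call would be caption_folder(Path("my_images"), load_kosmos2_captioner()). Passing the caption function in as an argument keeps the folder loop independent of the model, so the same loop could drive any other captioner.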
