

Patreon exclusive posts index

Join our Discord and tell me your Discord username to get a special rank: SECourses Discord

Our scripts arsenal is here: https://www.patreon.com/posts/90744385

We have captioners_clip_interrogator, LLaVA_auto_install, CogVLM, Qwen-VL, blip2_captioning, and Kosmos-2

You can see the full post here: scripts_arsenal_full_screenshot.png

Kosmos-2: Grounding Multimodal Large Language Models to the World > https://github.com/microsoft/unilm/tree/master/kosmos-2

  • I have modified it and added batch processing
  • Download Kosmos-2_v1.zip and extract it into the folder where you want to install it
  • All our scripts create a new, separate venv, so they will never conflict with other apps
  • Double-click install_windows.bat to install
  • It installs everything fully automatically
  • Then run run_kosmos.bat to start the app
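
For readers curious how a per-app venv keeps installs isolated, here is a minimal sketch using Python's standard venv module. It is not the contents of install_windows.bat; the function name create_isolated_env and the folder name "venv" are assumptions made for illustration.

```python
import os
import venv
from pathlib import Path


def create_isolated_env(app_dir, env_name="venv", with_pip=True):
    """Create a dedicated virtual environment inside app_dir.

    Hypothetical helper: the name and the "venv" folder are illustrative,
    not taken from the actual installer script.
    Returns the path where the environment's Python interpreter lives.
    """
    env_dir = Path(app_dir) / env_name
    # venv.create builds the environment; with_pip=True bootstraps pip into it
    # so packages install there instead of into the system Python.
    venv.create(env_dir, with_pip=with_pip)
    # The interpreter location differs between Windows and POSIX layouts.
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"
```

Packages installed with the returned interpreter (for example via `python -m pip install ...`) stay inside that folder, which is why one app's dependencies do not affect another's.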

Our Gradio app supports both single-image caption generation and batch image caption generation




Comments

Steve Bruno

This looks awesome thank you

Tech Meowpunk

LLaVA seems more accurate to me. I've tested a lot of bugs, though maybe I didn't specify the accuracy controls.

John Dopamine

This has been very helpful. If you ever make an update, one thing I'd love added is an option to skip captioning any files that already have .txt files generated. I often run captioners on folders with many images, and sometimes add new images after a set has already been captioned. If images that already have .txt files were skipped, running the captioner would be super quick and it would just add captions for the few new ones. As it is now, I have to put anything new in a separate folder, run the captioner on that, and then move the files over; otherwise it redoes the whole set (which I may have edited to improve in some cases). Not a huge thing, but if you do revisit any of the captioners you've coded, it'd be cool to have that type of option/toggle implemented.
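
The skip option requested above could be sketched roughly as follows. This is not code from the Kosmos-2 app: the names images_needing_captions and caption_folder, the caption_fn callback, and the extension list are all hypothetical, and it assumes captions are saved as a lowercase .txt file next to each image.

```python
from pathlib import Path

# Assumed set of image types; the real app may accept a different list.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def images_needing_captions(folder, skip_existing=True):
    """Return image files in folder, optionally omitting ones already captioned."""
    pending = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # An image counts as captioned when a same-named .txt file sits beside it.
        if skip_existing and path.with_suffix(".txt").exists():
            continue
        pending.append(path)
    return pending


def caption_folder(folder, caption_fn, skip_existing=True):
    """Caption pending images with caption_fn (a stand-in for the real model call).

    Returns the number of caption files written.
    """
    written = 0
    for image_path in images_needing_captions(folder, skip_existing):
        caption = caption_fn(image_path)
        image_path.with_suffix(".txt").write_text(caption, encoding="utf-8")
        written += 1
    return written
```

With skip_existing=True, hand-edited captions are left untouched and only newly added images are sent to the model; setting it to False reproduces the current behavior of redoing the whole set.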