[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
A true multimodal LLaMA derivative -- on Discord!
Implementation of the Q-Former from BLIP-2 in Zeta Lego blocks.
[ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives
Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
An offline AI-powered video analysis tool with object detection (YOLO), image captioning (BLIP), speech transcription (Whisper), audio event detection (PANNs), and AI-generated summaries (LLMs via Ollama). It runs fully offline to preserve privacy and provides a user-friendly GUI.
A multimodal model for Vietnamese Visual Question Answering (ViVQA).
CLIP Interrogator, fully in HuggingFace Transformers 🤗, with LongCLIP & CLIP's own words and/or *your* own words!
Modifying LAVIS's BLIP-2 Q-Former with models pretrained on Japanese datasets.
Caption images across your datasets with state of the art models from Hugging Face and Replicate!
This repository is for profiling, extracting, visualizing, and reusing generative AI weights, with the aim of building more accurate AI models and auditing/scanning weights at rest to identify knowledge domains and risks.
Caption generator using LAVIS and Argos Translate.
Skin cancer ranks among the most prevalent cancers globally, and early identification is crucial for improving patient outcomes. This study presents a strategy for skin cancer detection using a fine-tuned BLIP-2 (Bootstrapping Language-Image Pre-training) model, optimized via Weight-Decomposed Low-Rank Adaptation (DoRA).
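The entry above describes DoRA-based parameter-efficient fine-tuning of BLIP-2. A minimal sketch of that kind of setup, assuming Hugging Face Transformers plus the peft library's `use_dora` option, is shown below; the checkpoint name, rank, and target modules are illustrative assumptions and are not taken from the repository itself.

```python
# Minimal sketch (assumptions): DoRA-style parameter-efficient fine-tuning of BLIP-2
# with Hugging Face Transformers + peft. Checkpoint, rank, and target_modules are
# illustrative choices, not details taken from the repository above.
import torch
from transformers import Blip2ForConditionalGeneration, Blip2Processor
from peft import LoraConfig, get_peft_model

checkpoint = "Salesforce/blip2-opt-2.7b"  # assumed base checkpoint
processor = Blip2Processor.from_pretrained(checkpoint)
model = Blip2ForConditionalGeneration.from_pretrained(checkpoint, torch_dtype=torch.float16)

# DoRA is toggled through the standard LoRA config in recent peft releases (use_dora=True).
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    use_dora=True,                          # weight-decomposed low-rank adaptation
    target_modules=["q_proj", "v_proj"],    # assumed attention projections in the OPT language model
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the DoRA adapter weights remain trainable
```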
Too lazy to organize my desktop, make GPT + BLIP-2 do it /s
Implementation of Q-Former pre-training.
ContextVision is an AI-powered real-time scene understanding assistant that helps visually impaired individuals interpret their surroundings through live video analysis, speech interaction, and AI-driven insights.
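Several of the entries above center on BLIP-2 image captioning through Hugging Face Transformers. The snippet below is a minimal, self-contained sketch of that workflow; the checkpoint and example image URL are assumptions for illustration and are not tied to any specific repository listed here.

```python
# Minimal sketch (assumptions): BLIP-2 image captioning with Hugging Face Transformers.
# The checkpoint and example image URL are illustrative only.
import requests
import torch
from PIL import Image
from transformers import Blip2ForConditionalGeneration, Blip2Processor

device = "cuda" if torch.cuda.is_available() else "cpu"
checkpoint = "Salesforce/blip2-opt-2.7b"  # assumed captioning checkpoint
processor = Blip2Processor.from_pretrained(checkpoint)
model = Blip2ForConditionalGeneration.from_pretrained(checkpoint).to(device)

# Load an example image (any RGB PIL image works here).
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Preprocess, generate, and decode a caption.
inputs = processor(images=image, return_tensors="pt").to(device)
generated_ids = model.generate(**inputs, max_new_tokens=30)
caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
print(caption)
```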