Generative AI

This course provides hands-on training in generative AI models, including GANs, VAEs, diffusion models, and LLMs. It covers text, image, video, and multi-modal applications, along with deployment and real-world projects.

Foundations of Generative AI

  • Introduction, Key Concepts, and Evolution

  • Applications in Healthcare, Finance, Media, Education, Entertainment

  • Types of Generative Models: VAEs, GANs, Diffusion Models, LLMs

  • Tools & Frameworks: PyTorch, TensorFlow, Hugging Face, LangChain

  • Hands-On: Setting up a development environment


Generative Adversarial Networks (GANs)

  • Fundamentals: Generator & Discriminator

  • Loss Functions, Training Dynamics

  • Variants: DCGAN, CycleGAN, StyleGAN

  • Applications: Deepfakes, Art, Style Transfer, Image-to-Image Translation

  • Hands-On: Build a GAN; Experiment with StyleGAN
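
As a rough illustration of the generator and discriminator loss functions listed above, the sketch below computes the standard binary cross-entropy GAN losses in plain Python. It assumes the discriminator outputs a probability that a sample is real, and it uses the common non-saturating generator loss; the function names are hypothetical and not tied to any framework.

```python
import math


def bce(prediction: float, target: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one probability, clamped away from 0 and 1."""
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real: list[float], d_fake: list[float]) -> float:
    """The discriminator wants real samples scored as 1 and fakes scored as 0."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake: list[float]) -> float:
    """Non-saturating loss: the generator wants its fakes scored as 1."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)


if __name__ == "__main__":
    # Hypothetical discriminator outputs for a batch of real and generated samples.
    d_real = [0.9, 0.8, 0.95]
    d_fake = [0.2, 0.1, 0.3]
    print("discriminator loss:", discriminator_loss(d_real, d_fake))
    print("generator loss:", generator_loss(d_fake))
```

In a real training loop these two losses are minimized in alternation, which is where the training-dynamics issues covered in this module (such as mode collapse) tend to show up.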


Variational Autoencoders (VAEs)

  • Encoder-Decoder Architecture & Latent Space

  • Applications: Anomaly Detection, Data Compression, Image Generation

  • Hands-On: Create a VAE; Explore multi-modal VAEs (text-to-image)
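
To make the latent-space idea above more concrete, here is a minimal sketch of two pieces a typical VAE uses: the reparameterization trick for sampling a latent vector, and the closed-form KL divergence between a diagonal Gaussian and a standard normal prior. It assumes the encoder outputs a mean and a log-variance per latent dimension; the function names are illustrative only.

```python
import math
import random


def reparameterize(mu: list[float], log_var: list[float], rng: random.Random) -> list[float]:
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), where sigma = exp(0.5 * log_var)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu: list[float], log_var: list[float]) -> float:
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical encoder outputs for a 2-dimensional latent space.
    mu, log_var = [0.5, -1.0], [0.0, -0.5]
    print("sampled z:", reparameterize(mu, log_var, rng))
    print("KL term:", kl_to_standard_normal(mu, log_var))
```

The full VAE objective adds a reconstruction term from the decoder to this KL term; that part depends on the data type and is left out of the sketch.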


Large Language Models (LLMs) & NLP

  • Transformer Architecture & Attention

  • Pre-training vs. Fine-tuning

  • Tokenization, Embeddings

  • Applications: Chatbots, Summarization, Translation, Code Generation

  • LangChain for Prompting, Chaining, and Agent Workflows

  • Retrieval-Augmented Generation (RAG) with Vector DBs (Pinecone, Weaviate)

  • Hands-On: Build a GPT-like model; Fine-tune LLMs; RAG-based Chatbot
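
The attention mechanism named in this module can be sketched in a few lines. The example below is a plain-Python version of single-head scaled dot-product attention, written for readability and not speed; it assumes queries, keys, and values are given as lists of equal-length vectors, and it omits masking and learned projections.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(
    queries: list[list[float]],
    keys: list[list[float]],
    values: list[list[float]],
) -> list[list[float]]:
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[10.0, 0.0], [0.0, 10.0]]
    print(attention(q, k, v))
```

Because the query in the usage example matches the first key more closely, the output leans toward the first value vector.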


Diffusion Models

  • Theory: Forward & Reverse Diffusion

  • Tools: Stable Diffusion, DALL-E, ControlNet

  • Applications: Text-to-Image, Image-to-Image, Video Generation (Sora)

  • Hands-On: Image/Video generation with Stable Diffusion & ControlNet
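
The forward diffusion process mentioned above has a convenient closed form: a noisy sample at step t can be drawn directly from the clean input. The sketch below assumes a DDPM-style linear noise schedule (the start and end values are a commonly used choice, not something this syllabus specifies) and treats the input as a flat list of numbers standing in for pixel values.

```python
import math
import random


def linear_beta_schedule(steps: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> list[float]:
    """Noise variances beta_t increasing linearly from beta_start to beta_end."""
    if steps == 1:
        return [beta_start]
    return [beta_start + (beta_end - beta_start) * t / (steps - 1) for t in range(steps)]


def cumulative_alphas(betas: list[float]) -> list[float]:
    """alpha_bar_t = product of (1 - beta_s) for s <= t."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(x0: list[float], t: int, alpha_bars: list[float], rng: random.Random) -> list[float]:
    """Draw x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    a = alpha_bars[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for x in x0]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    x0 = [1.0, -1.0, 0.5]  # stand-in for normalized pixel values
    print("early step:", q_sample(x0, 10, alpha_bars, rng))
    print("late step:", q_sample(x0, 999, alpha_bars, rng))
```

At early steps the sample stays close to the input; by the last step it is almost pure Gaussian noise. The reverse process trains a network to undo this corruption one step at a time.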


Image, Video & Multi-Modal AI

  • Photorealistic Image Generation

  • Video Models for Gaming/Entertainment

  • Multi-modal AI: Combining Text, Image, and Audio (e.g., CLIP and ALIGN for text-image alignment)

  • Hands-On: Image generator, AI video creation, multi-modal captioning


Training & Optimization

  • Training Techniques: Regularization, Dropout, Early Stopping

  • Hyperparameter Tuning: Grid, Random, Bayesian Optimization

  • Fine-Tuning Pretrained Models: Transfer Learning, PEFT (LoRA, QLoRA)

  • Evaluation Metrics: FID, Inception Score, BLEU, ROUGE

  • Hands-On: Fine-tune open-source models; PEFT on custom dataset
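
Of the training techniques listed above, early stopping is the easiest to show in isolation. The sketch below assumes a list of per-epoch validation losses is already available and uses hypothetical `patience` and `min_delta` parameters; in practice the same check runs inside the training loop and restores the best checkpoint.

```python
def early_stopping_epoch(
    val_losses: list[float], patience: int = 3, min_delta: float = 0.0
) -> tuple[int, int]:
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve by more than
    min_delta for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Ran out of epochs before the patience budget was used up.
    return len(val_losses) - 1, best_epoch


if __name__ == "__main__":
    # Made-up validation losses for illustration only.
    losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]
    stop, best = early_stopping_epoch(losses, patience=3)
    print(f"stop at epoch {stop}, keep checkpoint from epoch {best}")
```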


Deployment & AI Agents

  • Web & Mobile Integration

  • Serverless Deployment (AWS Lambda, Google Cloud Functions)

  • API Development (Flask, FastAPI, Streamlit, Gradio)

  • LangChain for AI Agents & Tool Integration

  • Monitoring: Drift, Feedback Loops, Retraining

  • Hands-On: Deploy a GenAI app; Build a LangChain-powered agent


Ethics & Limitations

  • Deepfakes, Copyright, and Bias in AI

  • Security, Privacy, and Regulations (GDPR, CCPA)

  • Intellectual Property Issues

  • Guest Lecture: AI Ethics & Policy Expert


Capstone Projects

  • Build a full Generative AI application

  • Example Projects:

    • RAG-powered Chatbot

    • Marketing Content Generator

    • Multi-modal AI System (text + image + audio)

  • Hands-On: End-to-end project, documentation, and presentation


Course Outcomes

  • Gain strong knowledge of GANs, VAEs, Diffusion Models, and LLMs.

  • Build, fine-tune, and deploy generative models for text, image, video, and multi-modal tasks.

  • Use tools like PyTorch, TensorFlow, Hugging Face, and LangChain for real-world AI solutions.

  • Apply ethical, legal, and responsible AI practices.

  • Create industry-ready projects for roles like AI Engineer and LLM Developer.