Generative AI

This course provides hands-on training in generative AI models such as GANs, VAEs, diffusion models, and LLMs. It covers text, image, video, and multi-modal applications, along with deployment and real-world projects.

Category: Artificial Intelligence
Tags: Diffusion Models, GANs, Generative AI, LangChain, LLMs

Description

Foundations of Generative AI

Introduction, Key Concepts, and Evolution
Applications in Healthcare, Finance, Media, Education, Entertainment
Types of Generative Models: VAEs, GANs, Diffusion Models, LLMs
Tools & Frameworks: PyTorch, TensorFlow, Hugging Face, LangChain
Hands-On: Setting up development environment

Generative Adversarial Networks (GANs)

Fundamentals: Generator & Discriminator
Loss Functions, Training Dynamics
Variants: DCGAN, CycleGAN, StyleGAN
Applications: Deepfakes, Art, Style Transfer, Image-to-Image Translation
Hands-On: Build a GAN; Experiment with StyleGAN
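
To make the loss-function topic concrete, here is a minimal sketch of the standard GAN objectives in plain Python. It is an illustration, not course material: the function names and the sample discriminator outputs are hypothetical, and the generator uses the common non-saturating loss rather than the original minimax form.

```python
import math

def bce(prediction, target):
    """Binary cross-entropy for a single probability."""
    eps = 1e-12  # guard against log(0)
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """Mean BCE with real samples labelled 1 and generated samples labelled 0."""
    real_terms = [bce(p, 1.0) for p in d_real]
    fake_terms = [bce(p, 0.0) for p in d_fake]
    return (sum(real_terms) + sum(fake_terms)) / (len(real_terms) + len(fake_terms))

def generator_loss(d_fake):
    """Non-saturating loss: low when the discriminator rates fakes as real."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs: probability that each sample is real.
d_real = [0.9, 0.8]
d_fake = [0.2, 0.1]
d_loss = discriminator_loss(d_real, d_fake)
g_loss = generator_loss(d_fake)
print(f"discriminator loss: {d_loss:.4f}, generator loss: {g_loss:.4f}")
```

With these numbers the discriminator is winning, so its loss is small and the generator's is large; training alternates updates to push the two toward balance.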

Variational Autoencoders (VAEs)

Encoder-Decoder Architecture & Latent Space
Applications: Anomaly Detection, Data Compression, Image Generation
Hands-On: Create a VAE; Explore multi-modal VAEs (text-to-image)
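
The latent-space idea can be sketched with the two pieces that distinguish a VAE from a plain autoencoder: the reparameterization trick and the KL term of the loss. This is a hedged illustration in plain Python; the encoder outputs `mu` and `log_var` are made-up values, and a real model would compute them with a neural network in PyTorch or TensorFlow.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one draw per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

rng = random.Random(0)       # fixed seed so the sketch is repeatable
mu = [0.5, -1.0]             # hypothetical encoder means
log_var = [0.0, -2.0]        # hypothetical encoder log-variances
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
print("latent sample:", z)
print(f"KL term: {kl:.4f}")
```

Sampling through `mu` and `log_var` rather than drawing `z` directly is what lets gradients flow back into the encoder during training.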

Large Language Models (LLMs) & NLP

Transformer Architecture & Attention
Pre-training vs. Fine-tuning
Tokenization, Embeddings
Applications: Chatbots, Summarization, Translation, Code Generation
LangChain for Prompting, Chaining, and Agent Workflows
Retrieval-Augmented Generation (RAG) with Vector DBs (Pinecone, Weaviate)
Hands-On: Build a GPT-like model; Fine-tune LLMs; RAG-based Chatbot
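
As a small illustration of the attention topic, the sketch below implements single-head scaled dot-product attention on Python lists. It is an assumption-laden toy, not the course's code: there is no masking, batching, or learned projection, and the input matrices are invented for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """softmax(Q K^T / sqrt(d_k)) V for one head, without masking."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs

# Hypothetical inputs: one query, two identical keys, two value vectors.
queries = [[1.0, 1.0]]
keys = [[1.0, 0.0], [1.0, 0.0]]
values = [[0.0, 2.0], [4.0, 6.0]]
out = attention(queries, keys, values)
print(out)
```

Because both keys are identical, the query attends to them equally and the output is the average of the two value vectors.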

Diffusion Models

Theory: Forward & Reverse Diffusion
Tools: Stable Diffusion, DALL-E, ControlNet
Applications: Text-to-Image, Image-to-Image, Video Generation (Sora)
Hands-On: Image/Video generation with Stable Diffusion & ControlNet
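
The forward half of the diffusion process has a closed form, which makes it easy to sketch. The example below assumes a DDPM-style linear noise schedule (1000 steps, beta from 1e-4 to 0.02); those values and all function names are illustrative choices, not something specified by the course. The reverse process, which needs a trained denoising network, is not shown.

```python
import math
import random

def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances, one per diffusion step."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    a = alpha_bars[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for x in x0]

rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_betas(1000))
x0 = [1.0, -1.0, 0.5]                            # stand-in for image pixels
x_early = q_sample(x0, 10, alpha_bars, rng)      # still close to x0
x_late = q_sample(x0, 999, alpha_bars, rng)      # almost pure noise
print(f"signal kept at t=10: {math.sqrt(alpha_bars[10]):.4f}")
print(f"signal kept at t=999: {math.sqrt(alpha_bars[999]):.4f}")
```

The shrinking `alpha_bars` values show why the final step is nearly indistinguishable from Gaussian noise, which is the starting point for generation.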

Image, Video & Multi-Modal AI

Photorealistic Image Generation
Video Models for Gaming/Entertainment
Multi-modal AI: Combining Text, Image, Audio (e.g., CLIP and ALIGN for text-image alignment)
Hands-On: Image generator, AI video creation, multi-modal captioning

Training & Optimization

Training Techniques: Regularization, Dropout, Early Stopping
Hyperparameter Tuning: Grid, Random, Bayesian Optimization
Fine-Tuning Pretrained Models: Transfer Learning, PEFT (LoRA, QLoRA)
Evaluation Metrics: FID and Inception Score (images); BLEU and ROUGE (text)
Hands-On: Fine-tune open-source models; PEFT on custom dataset
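
To show why PEFT methods such as LoRA are cheap to train, the sketch below applies a low-rank update to a frozen weight matrix and counts the trainable parameters. It is a plain-Python illustration under stated assumptions: the helper names, the tiny matrices, and the 4096-wide layer are hypothetical, and in practice a library such as Hugging Face PEFT handles this.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def lora_effective_weight(w, lora_a, lora_b, alpha):
    """Return W + (alpha / r) * B @ A; W stays frozen, only A and B train."""
    rank = len(lora_a)              # A has shape (r, d_in)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # B has shape (d_out, r)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def trainable_params(d_out, d_in, rank):
    """Parameters in the two LoRA factors for one weight matrix."""
    return rank * (d_out + d_in)

# Hypothetical 2x3 frozen weight with a rank-1 adapter.
w = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
lora_b = [[1.0], [2.0]]
lora_a = [[0.5, 0.0, -0.5]]
w_eff = lora_effective_weight(w, lora_a, lora_b, alpha=2.0)
print(w_eff)

full = 4096 * 4096
lora = trainable_params(4096, 4096, rank=8)
print(f"trainable share for one 4096x4096 layer at rank 8: {lora / full:.2%}")
```

LoRA typically initializes `B` to zeros so the adapted model starts out identical to the pretrained one; nonzero values are used here only to make the update visible.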

Deployment & AI Agents

Web & Mobile Integration
Serverless Deployment (AWS Lambda, Google Cloud Functions)
API & Demo UI Development (Flask, FastAPI; Streamlit, Gradio)
LangChain for AI Agents & Tool Integration
Monitoring: Drift, Feedback Loops, Retraining
Hands-On: Deploy a GenAI app; Build LangChain-powered agent

Ethics & Limitations

Deepfakes, Copyright, and Bias in AI
Security, Privacy, and Regulations (GDPR, CCPA)
Intellectual Property Issues
Guest Lecture: AI Ethics & Policy Expert

Capstone Projects

Build a full Generative AI application
Example Projects:
- RAG-powered Chatbot
- Marketing Content Generator
- Multi-modal AI System (text + image + audio)
Hands-On: End-to-end project, documentation, and presentation

Course Outcomes

Gain strong knowledge of GANs, VAEs, Diffusion Models, and LLMs.
Build, fine-tune, and deploy generative models for text, image, video, and multi-modal tasks.
Use tools like PyTorch, TensorFlow, Hugging Face, and LangChain for real-world AI solutions.
Apply ethical, legal, and responsible AI practices.
Create industry-ready projects for roles like AI Engineer and LLM Developer.