TL;DR
Alright, so this generative AI mini-course is all about getting you up to speed with the fundamentals, diving into LangChain (a Python framework for building GenAI apps), and then rolling up your sleeves for two end-to-end projects. The first project uses a commercial GPT model to build an equity news research tool, and the second uses an open-source LLM to create a Q&A tool for the retail sector.
- GenAI fundamentals
- LangChain framework
- Building 2 GenAI projects
Overview [0:00]
This course will cover Gen AI fundamentals, LangChain, and building two end-to-end Gen AI projects. The first project involves building an equity news research tool using a commercial GPT model. The second project focuses on creating a Q&A tool for the retail industry using an open-source LLM.
What is Gen AI or Generative AI? [0:29]
AI is divided into generative and non-generative categories. Non-generative AI involves decision-making based on existing data, like diagnosing pneumonia from chest x-rays or assessing loan eligibility. Generative AI, on the other hand, creates new content, such as text, images, video, or audio. A prime example of generative AI is ChatGPT, which can be used for tasks like resume writing, trip planning, and image creation.
Gen AI evolution [1:23]
The evolution of AI started with statistical machine learning, which predicted outcomes like home prices based on features such as area, bedrooms, and age. Image recognition tasks, like identifying cats or dogs, involved complex features due to unstructured image data. Neural networks then led to deep learning, followed by recurrent neural networks for language translation. Language models, trained on large text corpora, predict the next word in a sequence, as seen in Gmail's autocomplete feature. Self-supervised learning allows training language models without extensive labeled data, using text from sources like Wikipedia. Large language models (LLMs) like GPT-4, which powers ChatGPT, have billions of parameters and are based on the Transformer architecture introduced in the "Attention is All You Need" paper. Various models exist, including Google's BERT, OpenAI's GPT, and image models like DALL-E and Stable Diffusion. OpenAI's SORA is an example of a text-to-video model.
What is LLM (Large Language Model)? [10:00]
LLMs are computer programs that use neural networks to predict the next word in a sequence, similar to a stochastic parrot mimicking conversations. These models are trained on vast datasets like Wikipedia articles and Google News, enabling them to complete sentences on various topics. ChatGPT uses LLMs like GPT-3 or GPT-4, while other examples include Google's PaLM 2 and Meta's LLaMA. LLMs also use reinforcement learning from human feedback (RLHF) to reduce toxic language, with human intervention ensuring less harmful outputs. Despite their power, LLMs lack subjective experience, emotions, and consciousness, relying solely on trained data.
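The next-word-prediction idea above can be sketched with a toy bigram model: count which word follows which in a tiny corpus, then predict the most frequent successor. Real LLMs do the same job, predicting the next token, but with billions of learned parameters instead of raw counts. The corpus below is made up for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real model trains on billions of words.
corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# For each word, count every word observed immediately after it.
successors = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    successors[current_word][next_word] += 1

def predict_next(word: str) -> str:
    """Return the most common word seen after `word` in the corpus."""
    return successors[word].most_common(1)[0][0]

print(predict_next("sat"))  # "sat" is always followed by "on" here
```

The same completion trick, scaled up, is what powers Gmail's autocomplete and ChatGPT's responses.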
Embeddings, Vector Database [13:55]
An embedding is a numeric representation of text as a vector, capturing its meaning and enabling mathematical operations with words and sentences. Vector databases store these embeddings, allowing efficient semantic searches that understand the intent behind user queries. Semantic search uses embeddings to find similarities between words and sentences, even without exact keyword matches. Word embedding techniques, like word2vec, represent words numerically, enabling complex arithmetic. Transformer-based embedding techniques are popular in the era of ChatGPT. Vector databases use indexing techniques like locality-sensitive hashing (LSH) to speed up searches, creating buckets of similar-looking embeddings.
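The embedding-based search described above can be sketched in plain Python. The 3-dimensional vectors here are hand-made for illustration; real embeddings (word2vec, transformer-based) have hundreds of dimensions and are learned, but the ranking logic is the same.

```python
import math

# Illustrative toy embeddings; not from any real model.
embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def semantic_search(query_vector, k=1):
    """Return the k words whose embeddings are closest to the query."""
    ranked = sorted(embeddings,
                    key=lambda w: cosine_similarity(query_vector, embeddings[w]),
                    reverse=True)
    return ranked[:k]

# A query vector near "king"/"queen" matches them, not "apple",
# even though no keyword is shared -- that is semantic search.
print(semantic_search([0.85, 0.75, 0.15], k=2))
```

A vector database does exactly this ranking, but over millions of embeddings, using indexes like LSH to avoid comparing against every stored vector.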
Retrieval Augmented Generation [21:24]
Retrieval Augmented Generation (RAG) pairs a large language model (LLM) with a retrieval step: relevant content is first fetched from a data source, such as Excel files, PDF documents, or SQL databases, and passed to the LLM as context so it can ground its answer in that data. This approach is used when organizations want to build ChatGPT-like solutions on their specific data. RAG takes a user's question, retrieves matching content from private internal organizational data or public data sources, and has the LLM compose the answer from it.
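The retrieve-then-generate flow can be sketched end to end in a few lines. Here the retriever is a naive word-overlap scorer and `fake_llm` is a stand-in for a real model call; in a real RAG system the retriever queries a vector database and the LLM is GPT-4 or similar. The documents and question are made up.

```python
# Illustrative document store; a real system holds embedded chunks.
documents = [
    "Refund policy: customers may return items within 30 days.",
    "Shipping: orders ship within 2 business days.",
]

def retrieve(question: str) -> str:
    """Pick the document sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def fake_llm(prompt: str) -> str:
    # Stand-in for a real LLM API call.
    return "Answer based on: " + prompt.split("Context: ")[1]

def rag_answer(question: str) -> str:
    context = retrieve(question)                          # retrieval step
    prompt = f"Question: {question}\nContext: {context}"  # augmentation step
    return fake_llm(prompt)                               # generation step

print(rag_answer("within how many days may customers return items"))
```

The key design point is that the LLM never answers from its training data alone: every answer is grounded in whatever the retriever found.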
Tooling for Gen AI [28:16]
Gen AI tooling requires a model (LLM), a cloud service, and a framework. Commercial models like GPT-4 and open-source models like LLaMA or Mistral are available. Cloud services such as Azure OpenAI, Amazon Bedrock, and Google Cloud can be used. Frameworks like LangChain and the Hugging Face Transformers library are also essential, along with deep learning libraries like PyTorch and TensorFlow.
Langchain Fundamentals [29:14]
LangChain is a framework for building applications on top of large language models (LLMs). It addresses limitations of directly calling the OpenAI API, such as cost, limited knowledge, and lack of access to internal data. LangChain provides integration with various models, search engines, and databases, and offers plug-and-play support for swapping models. Key components include prompt templates, chains, and sequential chains. Sequential chains connect multiple chains so that the output of one chain becomes the input of the next. Agents connect with external tools and use LLM reasoning capabilities to perform tasks. Memory is also a crucial aspect, allowing chatbots to remember past conversations.
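The prompt-template and sequential-chain ideas can be sketched in plain Python, with `fake_llm` standing in for a real model call. The restaurant example mirrors the kind of two-step chain the section describes; the templates are illustrative, not LangChain's actual API.

```python
def fake_llm(prompt: str) -> str:
    # Stand-in for a real LLM call.
    return f"<LLM output for: {prompt}>"

def make_chain(template: str):
    """A 'chain': format a prompt template, then call the LLM."""
    def chain(value: str) -> str:
        return fake_llm(template.format(input=value))
    return chain

def sequential_chain(*chains):
    """Pipe each chain's output into the next, like a sequential chain."""
    def run(value: str) -> str:
        for chain in chains:
            value = chain(value)
        return value
    return run

name_chain = make_chain("Suggest a restaurant name for {input} cuisine.")
menu_chain = make_chain("Suggest menu items for the restaurant {input}.")

pipeline = sequential_chain(name_chain, menu_chain)
print(pipeline("Italian"))
```

In LangChain proper, `PromptTemplate` plus a chain class plays the role of `make_chain`, and a sequential chain class wires the steps together the same way.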
End-to-End Project 1: Equity Research Tool [1:14:49]
This project builds a news research tool for equity research analysts, using LangChain, OpenAI, and Streamlit. The tool allows users to input news article URLs and ask questions, retrieving answers based on those articles. The technical architecture involves a document loader, text splitting, and a vector database. The tool uses semantic search to find relevant chunks of text, improving efficiency and saving on OpenAI API costs. The project is built in Streamlit as a proof of concept, with a long-term architecture involving a data-ingestion system and a chatbot UI. Text loaders, such as the text loader, CSV loader, and unstructured URL loader, are used to load data from various sources. Text splitting is performed using the CharacterTextSplitter and RecursiveCharacterTextSplitter classes. Vector databases, like FAISS, are used for faster search. The RetrievalQAWithSourcesChain is used to combine chunks and generate answers.
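The chunking step above can be illustrated with a simplified splitter: split text on a separator, then greedily pack pieces into chunks under a size limit. LangChain's RecursiveCharacterTextSplitter does this with a cascade of separators and optional overlap; this sketch handles a single separator and no overlap.

```python
def split_text(text: str, chunk_size: int, separator: str = " ") -> list[str]:
    """Greedily pack separator-delimited pieces into chunks <= chunk_size."""
    pieces = text.split(separator)
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece  # a single long piece may still exceed the limit
    if current:
        chunks.append(current)
    return chunks

chunks = split_text("news articles are split into small chunks before embedding", 20)
print(chunks)
```

Each resulting chunk is then embedded and stored in FAISS, so that only the few most relevant chunks, rather than whole articles, are sent to the LLM.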
End-to-End Project 2: Retail Q&A Tool [2:28:25]
This project builds a Q&A tool for a t-shirt store, using a MySQL database and Google PaLM. The tool converts natural language questions into SQL queries, providing answers from the database. The technical architecture involves Google PaLM for SQL conversion, the SQLDatabaseChain class, and few-shot learning. Few-shot learning uses training data with sample questions and SQL queries, converted into embedding vectors and stored in a vector database. A UI is built in Streamlit. Google PaLM is used for its free access, and a custom MySQL prompt is used to guide the LLM. The project demonstrates how to set up the API key for Google PaLM and the MySQL database.
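The few-shot step can be sketched as prompt assembly: sample question/SQL pairs are prepended so the LLM sees worked examples before the new question. (In the project, the most similar examples are first selected from a vector database; here they are simply all included.) The schema, examples, and questions below are illustrative.

```python
# Made-up few-shot examples for a hypothetical `t_shirts` table.
few_shot_examples = [
    {"question": "How many white t-shirts are in stock?",
     "sql": "SELECT SUM(stock_quantity) FROM t_shirts WHERE color = 'White';"},
    {"question": "What is the price of the cheapest t-shirt?",
     "sql": "SELECT MIN(price) FROM t_shirts;"},
]

def build_prompt(user_question: str) -> str:
    """Assemble a few-shot prompt: instructions, examples, then the question."""
    parts = ["You are a MySQL expert. Convert the question to a SQL query.\n"]
    for example in few_shot_examples:
        parts.append(f"Question: {example['question']}\nSQL: {example['sql']}\n")
    parts.append(f"Question: {user_question}\nSQL:")
    return "\n".join(parts)

prompt = build_prompt("How many black t-shirts do we have?")
print(prompt)
```

The LLM completes the trailing `SQL:` line, and the generated query is then executed against the MySQL database to produce the final answer.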