
2023-08-10

Agenda

  • Generative AI and LLMs
  • Embeddings
  • Vector databases
  • Semantic search
  • RAG (Retrieval-Augmented Generation)

How large are LLMs

  • The human brain has $10^{14}$ connections
  • GPT-4 has around $10^{12}$ parameters
  • Supervised models are limited by the size of their labelled datasets; LLMs learn from unlabelled text (self-supervised next-word prediction), so their training sets can be massive

Next-word prediction is how most LLMs are trained

  • On-the-fly dataset creation: the labels come from the text itself (see the sketch after this list)
  • A window slides over the text: the words inside the window are the input, and the word that follows is the output

    Input 1   Input 2   Output
    Thou      shall     not

  • This enables parallelization of training
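
A minimal Python sketch of this sliding-window dataset creation; the continuation of the example sentence is illustrative, not from the notes:

```python
# Every window of `context_size` words is an input; the word after it is the label.
def sliding_window_pairs(text, context_size=2):
    words = text.split()
    return [
        (words[i : i + context_size], words[i + context_size])
        for i in range(len(words) - context_size)
    ]

for inputs, output in sliding_window_pairs("Thou shall not make a machine"):
    print(inputs, "->", output)
# ['Thou', 'shall'] -> not
# ['shall', 'not'] -> make
# ...
```

Each (input, output) pair is independent of the others, which is what makes training parallelizable.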

Embedding

  • Converts non-numeric data into numbers (vectors)
  • E.g.:
    • Audio to audio vectors
    • Text to text vectors
    • Video to video vectors, etc.
  • Text vectorization is complicated
  • Older methods like Bag of Words and TF-IDF can’t capture semantic meaning (word order and context are lost)
  • Word2Vec and GloVe overcome these issues by mapping similar words to nearby vectors
  • Embeddings can become very large
  • Vector databases are used to store and search embeddings efficiently
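
A sketch of text embedding plus a similarity check; the sentence-transformers library and the all-MiniLM-L6-v2 model are example choices, not something the notes prescribe:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(["a cat sat on the mat", "a kitten rests on a rug"])

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Similar meaning despite different wording gives a high score; Bag of Words
# or TF-IDF would score this pair much lower because few words overlap.
print(cosine(vectors[0], vectors[1]))
```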

Vector databases

  • A database that stores vectors and indexes them for fast similarity search
  • Vector databases are used in recommendation engines too
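
A sketch of storing and searching vectors using FAISS as one concrete example (the notes don't name a specific vector DB); IndexFlatL2 does exact search, while production vector databases typically use ANN indexes at scale:

```python
import faiss
import numpy as np

dim = 384                                                  # e.g. MiniLM embedding size
index = faiss.IndexFlatL2(dim)                             # L2-distance index
doc_vectors = np.random.rand(1000, dim).astype("float32")  # stand-in embeddings
index.add(doc_vectors)

query_vector = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query_vector, 5)             # 5 nearest neighbours
print(ids[0])                                              # row ids point back to the stored text
```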

LLM Flow

  • $\text{Private Data} \rightarrow \text{Data Chunks} \rightarrow \text{Embedding model (LLM)} \rightarrow \text{Vector Database}$
  • Break the data into chunks before embedding (see the sketch below)
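
A sketch of the chunking step, assuming fixed-size word windows with overlap so a sentence is not cut off from its context (the sizes are arbitrary choices):

```python
def chunk_text(text, chunk_size=200, overlap=50):
    words = text.split()
    step = chunk_size - overlap  # overlap keeps neighbouring chunks connected
    return [
        " ".join(words[start : start + chunk_size])
        for start in range(0, len(words), step)
    ]
```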

Search Space

  • Find documents relevant to the query
  • https://cohere.ai
  • Cohere gives a trial API key that can be used on Colab
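
A sketch of embedding-based search with the Cohere Python SDK; the model name and parameters here are assumptions and vary between SDK versions, so check the current docs. A trial key is enough to run this on Colab:

```python
import numpy as np
import cohere

co = cohere.Client("YOUR_TRIAL_API_KEY")  # trial key from cohere.ai
docs = ["RAG feeds external knowledge to an LLM.",
        "Vector databases store embeddings."]

# Embed the documents and the query with the same model (assumed name).
doc_vecs = np.array(
    co.embed(texts=docs, model="embed-english-v3.0",
             input_type="search_document").embeddings)
query_vec = np.array(
    co.embed(texts=["How can an LLM use new knowledge?"],
             model="embed-english-v3.0",
             input_type="search_query").embeddings)[0]

# Rank documents by cosine similarity to the query.
scores = doc_vecs @ query_vec / (
    np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
print(docs[int(np.argmax(scores))])
```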

Retrieval Augmented Generation

  • Data sources that are not in the LLM's training data are used to generate results in real time
  • New knowledge sources are pulled in at runtime
  • Suppose we prompt for something that is not in the LLM's training data; we then use external knowledge sources to satisfy the request
  • First the external data is converted into embeddings with the appropriate model
  • Then the documents retrieved via those embeddings are passed to the LLM along with the query

Steps

  1. Store all internal docs in a format suitable for querying
    1. Split the corpus into chunks
    2. Embed each chunk (use the same embedding model that will embed the queries)
    3. Store the vectors in the vector DB
    4. Save the original text with pointers to its embedding
  2. Embed the query
    1. Use the embedded query to fetch the relevant docs from the vector DB; the vector DB uses approximate nearest neighbour (ANN) search
    2. Send those docs, along with the query, to the LLM
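
A sketch of the query-time flow in step 2; `embed`, `index`, `chunks`, and `generate` are hypothetical stand-ins for the embedding model, the vector DB (e.g. a FAISS index as above), the saved chunk texts, and the LLM call:

```python
import numpy as np

def answer(query, embed, index, chunks, generate, k=3):
    query_vec = np.asarray(embed([query])[0], dtype="float32")
    _, ids = index.search(query_vec.reshape(1, -1), k)  # 2.1: ANN search in the vector DB
    context = "\n\n".join(chunks[i] for i in ids[0])    # pointers back to the saved text
    prompt = ("Answer the question using only the context below.\n\n"
              f"Context:\n{context}\n\nQuestion: {query}")
    return generate(prompt)                             # 2.2: docs + query go to the LLM
```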