Machine Learning

What is an LLM, Really?

Aven

April 21, 2026 • Study Resource

Large Language Models (LLMs): A Practical Deep Dive (with Code)

Large Language Models (LLMs) have shifted how we write software, search for information, and build products. But behind the impressive demos lies a stack of concepts, trade-offs, and patterns worth understanding if you want to use them effectively.

This guide walks through how LLMs work, how to use them in real systems, and how to write production-grade code around them.

1. What is an LLM, Really?

An LLM is a neural network trained to predict the next token (a word or a piece of a word) in a sequence, given all the tokens that came before it.

At scale, this simple objective leads to surprisingly powerful behavior: reasoning, summarization, translation, and even code generation.

Under the hood:

  • Input text is split into tokens, and each token is mapped to an integer ID

  • The token IDs pass through a Transformer network

  • The model outputs a probability distribution over the next token

  • Tokens are generated one at a time and decoded back into text
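The four steps above can be sketched with a toy stand-in for the model. Here the "model" is just a hand-written table of next-word probabilities, which is an assumption made for illustration and not how a real network stores what it has learned. The names `VOCAB`, `BIGRAM_PROBS`, `tokenize`, and `generate` are all hypothetical.

```python
import random

# Toy vocabulary: each token gets an integer ID (real tokenizers use subwords).
VOCAB = ["<eos>", "the", "cat", "sat", "down"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}

# Stand-in for a trained network: P(next token | previous token).
BIGRAM_PROBS = {
    "the": {"cat": 1.0},
    "cat": {"sat": 0.9, "<eos>": 0.1},
    "sat": {"down": 0.8, "<eos>": 0.2},
    "down": {"<eos>": 1.0},
}

def tokenize(text):
    """Map whitespace-separated words to token IDs."""
    return [TOKEN_TO_ID[word] for word in text.lower().split()]

def next_token_distribution(token_ids):
    """Return one probability per vocabulary entry, given the context."""
    last = VOCAB[token_ids[-1]]
    probs = BIGRAM_PROBS.get(last, {"<eos>": 1.0})
    return [probs.get(tok, 0.0) for tok in VOCAB]

def generate(prompt, max_new_tokens=10, seed=0):
    """Sample one token at a time until <eos> or the length limit."""
    rng = random.Random(seed)
    token_ids = tokenize(prompt)
    for _ in range(max_new_tokens):
        dist = next_token_distribution(token_ids)
        next_id = rng.choices(range(len(VOCAB)), weights=dist, k=1)[0]
        if VOCAB[next_id] == "<eos>":
            break
        token_ids.append(next_id)
    return " ".join(VOCAB[i] for i in token_ids)

print(generate("the"))
```

A real LLM replaces the lookup table with a Transformer that conditions on the whole context, but the outer loop (predict a distribution, sample, append, repeat) has the same shape.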

2. The Transformer Core (Intuition)

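The breakthrough behind LLMs is the Transformer, introduced in the paper “Attention Is All You Need” (Vaswani et al., 2017).

Key idea: attention. Every token can look at the other tokens in its context and weigh how relevant each one is. (In the decoder-only models behind most LLMs, a token attends only to the tokens before it.)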

This allows the model to understand context like:

“The bank near the river” vs “The bank approved the loan”

Same word, different meaning — resolved using surrounding context.
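To make the intuition concrete, here is a minimal sketch of scaled dot-product attention in plain Python. The vectors are invented toy values. A real model learns separate query, key, and value projections and runs many attention heads in parallel, so treat this as an illustration of the arithmetic only.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # The output is a weighted average of the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Invented 2-d vectors for the tokens "the", "bank", "river".
tokens = ["the", "bank", "river"]
keys = [[0.1, 0.0], [0.5, 0.5], [1.0, 0.9]]
values = keys  # reuse the keys as values to keep the toy small
query = [1.0, 1.0]  # hypothetical query vector for "bank"

weights, output = attention(query, keys, values)
for token, weight in zip(tokens, weights):
    print(token, round(weight, 3))
```

With these toy numbers, "river" receives the largest weight, which is the mechanism that lets the representation of "bank" shift toward its riverside meaning.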

3. Basic LLM Usage (Python Example)

Here’s a minimal example using an API-style interface:

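from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-5.3",
    messages=[
        # The system message sets overall behavior; the user message is the request.
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain black holes simply."}
    ],
    temperature=0.7  # moderate randomness in token sampling
)

# The generated text is in the first choice's message.
print(response.choices[0].message.content)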

Key Parameters:

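  • temperature: randomness of sampling (low = more predictable, near-deterministic output; high = more varied output)

  • max_tokens: upper limit on the number of tokens generated in the response

  • top_p: nucleus sampling; the model samples only from the most likely tokens whose combined probability reaches top_p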
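What temperature and top_p do to the next-token distribution can be shown with a small sketch. The logits below are invented, and `apply_temperature` and `top_p_filter` are hypothetical helper names; real APIs apply these steps internally.

```python
import math

def apply_temperature(logits, temperature):
    """Softmax over logits divided by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Zero out everything else and renormalize the survivors.
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered

logits = [2.0, 1.0, 0.1]  # invented scores for three candidate tokens

cold = apply_temperature(logits, 0.2)   # sharpens the distribution
hot = apply_temperature(logits, 2.0)    # flattens the distribution
nucleus = top_p_filter(apply_temperature(logits, 1.0), 0.8)

print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
print([round(p, 3) for p in nucleus])
```

Low temperature concentrates almost all probability on the top token, high temperature spreads it out, and top_p removes the unlikely tail before sampling.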


4. Prompt Engineering: The Real Skill

The quality of an LLM's answer depends heavily on how you phrase the request.

Weak Prompt:

Explain photosynthesis

Strong Prompt:

Explain photosynthesis in 5 bullet points suitable for a Class 10 student.
Include a real-life example.

Pattern:

  • Role

  • Task

  • Constraints

  • Output format
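One way to apply this pattern consistently is to assemble prompts from the four parts in code. This is a minimal sketch; `build_prompt` and its exact wording are assumptions, not a standard template.

```python
def build_prompt(role, task, constraints, output_format):
    """Assemble a prompt from role, task, constraints, and output format."""
    lines = [f"You are {role}.", f"Task: {task}"]
    if constraints:
        lines.append("Constraints:")
        lines.extend(f"- {item}" for item in constraints)
    lines.append(f"Output format: {output_format}")
    return "\n".join(lines)

prompt = build_prompt(
    role="a science teacher for Class 10 students",
    task="Explain photosynthesis.",
    constraints=["Use simple language", "Include a real-life example"],
    output_format="5 bullet points",
)
print(prompt)
```

Keeping the parts separate makes it easy to change one of them (say, the output format) without rewriting the whole prompt.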

5. Few-Shot Learning

You can show the model what you want by including a few worked examples in the prompt:

messages = [
    {"role": "system", "content": "Convert English to Hinglish."},
    {"role": "user", "content": "I am going to school."},
    {"role": "assistant", "content": "Main school ja raha hoon."},
    {"role": "user", "content": "She is reading a book."}
]

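The model picks up the pattern from the examples in the prompt and tends to answer the final message in the same style. No retraining happens; the "learning" lasts only for that request.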
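When you have more than one example, it helps to build the message list from data. A small sketch, where `build_few_shot_messages` is a hypothetical helper and the example pair is the one from above:

```python
def build_few_shot_messages(instruction, examples, query):
    """Turn (input, output) pairs into alternating user/assistant messages."""
    messages = [{"role": "system", "content": instruction}]
    for source, target in examples:
        messages.append({"role": "user", "content": source})
        messages.append({"role": "assistant", "content": target})
    # The real question goes last, so the model continues the pattern.
    messages.append({"role": "user", "content": query})
    return messages

examples = [("I am going to school.", "Main school ja raha hoon.")]

messages = build_few_shot_messages(
    "Convert English to Hinglish.",
    examples,
    "She is reading a book.",
)

for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list can be passed as the `messages` argument of a chat completion call.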

6. Embeddings + Semantic Search

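Language models are not limited to generating text. Embedding models map a piece of text to a vector of numbers, so that texts with similar meaning end up close together.

Example: Creating an Embedding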

from openai import OpenAI
client = OpenAI()

embedding = client.embeddings.create(
    model="text-embedding-3-large",
    input="What is photosynthesis?"
)

print(embedding.data[0].embedding)

Use cases:

  • Search engines

  • Recommendation systems

  • Clustering similar documents
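Semantic search then comes down to comparing vectors. The sketch below ranks documents by cosine similarity to a query vector. The 3-dimensional vectors are invented stand-ins; real embeddings come from an embedding model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vector, index, top_k=2):
    """Return the top_k texts whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in index.items()]
    scored.sort(reverse=True)
    return [text for _, text in scored[:top_k]]

# Invented toy vectors standing in for real embeddings.
index = {
    "Plants make food from sunlight.": [0.9, 0.1, 0.0],
    "Chloroplasts contain chlorophyll.": [0.8, 0.3, 0.1],
    "The bank approved the loan.": [0.0, 0.2, 0.9],
}
query_vector = [1.0, 0.2, 0.0]  # pretend embedding of "What is photosynthesis?"

results = search(query_vector, index)
print(results)
```

A vector database does the same comparison, but with indexing structures that avoid scoring every document on every query.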

7. Retrieval-Augmented Generation (RAG)

An LLM's knowledge is frozen at training time and is unreliable for specific or recent facts. So we supply the relevant context in the prompt.

Flow:

  1. Embed your documents and store them in a vector database

  2. Convert the query into an embedding

  3. Retrieve the documents most similar to the query

  4. Pass them to the LLM as context, along with the question

Example (Simplified):

from openai import OpenAI

client = OpenAI()

# In a real system this text comes from the retrieval step.
context = "Photosynthesis occurs in chloroplasts using sunlight."

prompt = f"""
Answer the question using the context below.

Context:
{context}

Question:
Where does photosynthesis happen?
"""

response = client.chat.completions.create(
    model="gpt-5.3",
    messages=[{"role": "user", "content": prompt}]
)

print(response.choices[0].message.content)
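The example above hard-codes the context. The sketch below adds a retrieval step in front of the prompt. To stay self-contained it ranks documents by shared words rather than by embedding similarity, which is a deliberate simplification; `retrieve` and `build_rag_prompt` are hypothetical names.

```python
def tokenize(text):
    """Lowercase words with surrounding punctuation removed."""
    return {word.strip(".,?!").lower() for word in text.split()}

def retrieve(question, documents, top_k=1):
    """Rank documents by word overlap (a stand-in for vector search)."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_rag_prompt(question, documents, top_k=1):
    """Put the retrieved documents into the prompt as context."""
    context = "\n".join(retrieve(question, documents, top_k))
    return (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question:\n{question}\n"
    )

documents = [
    "Photosynthesis occurs in chloroplasts using sunlight.",
    "The Transformer was introduced in the paper Attention Is All You Need.",
]

prompt = build_rag_prompt("Where does photosynthesis happen?", documents)
print(prompt)
```

In production you would swap `retrieve` for a query against a vector database and leave the prompt assembly unchanged.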

8. Function Calling (Tool Use)

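LLMs can interact with external tools. The model does not run a tool itself: you describe each tool with a JSON schema, the model returns a structured request (a function name plus JSON arguments), and your code executes it.

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            # The description helps the model decide when to call the tool.
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"}
                },
                "required": ["city"]
            }
        }
    }
]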

This lets the model work with:

  • APIs

  • Databases

  • Automation systems
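On your side, a tool call arrives as a function name and a JSON string of arguments, and your code is responsible for running it. Here is a minimal sketch of that dispatch step. The `get_weather` body is a hypothetical stub that returns fixed data, and `TOOL_REGISTRY` and `dispatch_tool_call` are illustrative names.

```python
import json

def get_weather(city):
    """Hypothetical stub; a real version would call a weather service."""
    return {"city": city, "forecast": "sunny"}

# Map tool names the model may request to local Python functions.
TOOL_REGISTRY = {"get_weather": get_weather}

def dispatch_tool_call(name, arguments_json):
    """Run the requested tool and return its result as a JSON string."""
    if name not in TOOL_REGISTRY:
        raise ValueError(f"Unknown tool: {name}")
    arguments = json.loads(arguments_json)
    result = TOOL_REGISTRY[name](**arguments)
    return json.dumps(result)

# Simulated request, shaped like what a model would send back.
result = dispatch_tool_call("get_weather", '{"city": "Mumbai"}')
print(result)
```

Rejecting unknown tool names matters here: the arguments come from model output, so they should be treated as untrusted input.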

9. Streaming Responses

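For a real-time UX (like the ChatGPT typing effect), stream the response as it is generated:

stream = client.chat.completions.create(
    model="gpt-5.3",
    messages=[{"role": "user", "content": "Tell a story"}],
    stream=True
)

for chunk in stream:
    # Some chunks carry no text (for example the final one), so guard before printing.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)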
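You usually want the complete text as well, for logging or saving to chat history. A sketch of collecting fragments while still displaying them live, using a fake generator in place of a real API stream (`fake_stream` and `collect_stream` are hypothetical names):

```python
def fake_stream():
    """Stand-in for an API stream: text fragments, then an empty final chunk."""
    yield from ["Once ", "upon ", "a ", "time", None]

def collect_stream(fragments, on_fragment=None):
    """Forward each fragment to a callback and return the full text."""
    parts = []
    for fragment in fragments:
        if not fragment:  # skip empty or missing content
            continue
        parts.append(fragment)
        if on_fragment is not None:
            on_fragment(fragment)
    return "".join(parts)

shown = []
full_text = collect_stream(fake_stream(), on_fragment=shown.append)
print(full_text)
```

With a real stream, you would pass each chunk's text content as the fragment and use the callback to update the UI.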

10. Guardrails & Safety

LLMs can hallucinate, that is, produce fluent and confident statements that are false.

Mitigation strategies:

  • Provide context (RAG)

  • Use system prompts

  • Validate outputs

  • Add post-processing checks
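"Validate outputs" can be as simple as asking the model for JSON and refusing anything that does not match the expected shape. A minimal sketch, assuming you prompted for an object with `answer` and `source` fields (both field names are hypothetical):

```python
import json

def validate_answer(raw_text, required_keys=("answer", "source")):
    """Return (data, None) if the output is usable, else (None, reason)."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        return None, "not valid JSON"
    if not isinstance(data, dict):
        return None, "expected a JSON object"
    missing = [key for key in required_keys if key not in data]
    if missing:
        return None, "missing keys: " + ", ".join(missing)
    empty = [key for key in required_keys if not str(data[key]).strip()]
    if empty:
        return None, "empty values: " + ", ".join(empty)
    return data, None

good = '{"answer": "In chloroplasts.", "source": "context"}'
bad = "Photosynthesis happens in chloroplasts."

print(validate_answer(good))
print(validate_answer(bad))
```

When validation fails, common options are to retry the request, fall back to a safe default, or route the case to a human.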

11. Real-World Use Cases

  • AI tutors

  • Code assistants

  • Customer support bots

  • Content generation

  • Data analysis


12. Limitations (Important)

  • Hallucinations (confident wrong answers)

  • Context window limits

  • Bias in training data

  • Cost at scale
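Context limits and cost both depend on token counts, so long conversations usually need trimming. The sketch below keeps the system message plus the most recent messages that fit a budget. The 4-characters-per-token estimate is only a rough rule of thumb for English text; a real system would count with the model's own tokenizer. `estimate_tokens` and `trim_history` are hypothetical names.

```python
def estimate_tokens(text):
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)

def trim_history(messages, max_tokens):
    """Keep system messages plus the newest messages that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in system)
    kept = []
    # Walk backwards so the most recent messages are kept first.
    for message in reversed(rest):
        cost = estimate_tokens(message["content"])
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return system + list(reversed(kept))

history = [
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "a" * 40},
    {"role": "assistant", "content": "b" * 40},
    {"role": "user", "content": "c" * 40},
]

trimmed = trim_history(history, max_tokens=25)
print([m["role"] for m in trimmed])
```

Dropping the oldest turns is the simplest policy; summarizing them instead is a common alternative when earlier context still matters.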

13. Where This Is Going

Based on observed trends:

  • Smaller, efficient models on-device

  • Better reasoning capabilities

  • Deeper integration with software systems

  • AI agents handling multi-step workflows

Final Thought

LLMs are not magic — they are prediction machines trained at massive scale.

Their real power doesn’t come from the model alone, but from how you:

  • design prompts

  • structure data

  • integrate systems

The future belongs to people who can combine LLMs with real-world workflows, not just call an API.

 
