Learning Objectives:
From Session 0 (Practical LLM Primer):
Key insight: Everything in the context window influences the output
Basic prompt:
Annotate this sequence: ATCGATCG
Problems:
System prompt = instructions that set the LLM’s role and behavior
system_prompt = {
    "role": "system",
    "content": (
        "You are an expert molecular biologist "
        "specializing in gene annotation. You provide "
        "detailed, accurate information based on current "
        "genomic databases."
    )
}
System prompts are “sticky” - they influence all subsequent responses
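A minimal sketch of how the system prompt travels with every request (the model name matches the demo code below; the user message is a placeholder):

from litellm import completion

messages = [
    system_prompt,  # the dict defined above; resent with every call
    {"role": "user", "content": "Annotate this sequence: ATCGATCG"}
]

response = completion(
    model="anthropic/claude-sonnet-4-20250514",
    messages=messages
)
print(response.choices[0].message.content)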
DO:
DON’T:
Few-shot learning = providing examples in the prompt
Why it works:
Tradeoff: Uses more tokens but increases accuracy
user_prompt = """
Classify variants as benign, pathogenic, or VUS.
Examples:
Input: BRCA1 c.5266dupC (p.Gln1756Profs*74)
Output: {"variant": "BRCA1 c.5266dupC",
"classification": "pathogenic",
"reasoning": "Frameshift leading to truncation"}
Input: TP53 c.215C>G (p.Pro72Arg)
Output: {"variant": "TP53 c.215C>G",
"classification": "benign",
"reasoning": "Common polymorphism, no functional impact"}
Now classify:
Input: CFTR c.350G>A (p.Arg117His)
Output:
"""
Rule of thumb: Start with zero-shot, add examples if accuracy is insufficient
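An alternative to packing the examples into one long prompt is to supply them as prior user/assistant turns in the message list; a sketch under that assumption (model name and wording are illustrative):

from litellm import completion

few_shot_messages = [
    {"role": "system", "content": "Classify variants as benign, pathogenic, or VUS. Reply in JSON."},
    {"role": "user", "content": "BRCA1 c.5266dupC (p.Gln1756Profs*74)"},
    {"role": "assistant", "content": '{"classification": "pathogenic", "reasoning": "Frameshift leading to truncation"}'},
    {"role": "user", "content": "TP53 c.215C>G (p.Pro72Arg)"},
    {"role": "assistant", "content": '{"classification": "benign", "reasoning": "Common polymorphism, no functional impact"}'},
    {"role": "user", "content": "CFTR c.350G>A (p.Arg117His)"}
]

response = completion(
    model="anthropic/claude-sonnet-4-20250514",
    messages=few_shot_messages
)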
Chain-of-Thought = asking the model to show its reasoning
Standard prompt:
Is this mutation likely pathogenic?
Mutation: ATM c.5762-1G>A
CoT prompt:
Is this mutation likely pathogenic? Let's think step by step:
1. What is the mutation type?
2. Where is it located?
3. What is known about this gene?
4. What does the literature say?
Mutation: ATM c.5762-1G>A
Theory:
The attention mechanism needs intermediate tokens to “work with”
More tokens → more computation → better answers for complex reasoning
Practice:
CoT significantly improves accuracy on multi-step problems
Tradeoff: Uses more tokens and takes longer
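A small helper can build the CoT scaffold programmatically (the function is illustrative, not part of the demo code):

def cot_prompt(question: str, steps: list[str]) -> str:
    # Number the guiding questions and append them to the task
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, 1))
    return f"{question} Let's think step by step:\n{numbered}"

prompt = cot_prompt(
    "Is this mutation likely pathogenic?\nMutation: ATM c.5762-1G>A",
    [
        "What is the mutation type?",
        "Where is it located?",
        "What is known about this gene?",
        "What does the literature say?"
    ]
)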
Great for:
Less useful for:
Problem: LLMs generate free text, but we often need structured data
Solution: Request specific formats (JSON, XML, tables)
Benefits:
user_prompt = """
Extract gene information from this text and return as JSON.
Text: "The TP53 gene on chromosome 17p13.1 encodes
a 393 amino acid tumor suppressor protein involved
in cell cycle regulation."
Return format:
{
"gene_symbol": "...",
"chromosome": "...",
"location": "...",
"protein_length": ...,
"function": "..."
}
"""
Some APIs support JSON schema to guarantee format:
from litellm import completion

schema = {
    "type": "object",
    "properties": {
        "gene_symbol": {"type": "string"},
        "chromosome": {"type": "string"},
        "protein_length": {"type": "integer"},
        "function": {"type": "string"}
    },
    "required": ["gene_symbol", "chromosome"]
}

response = completion(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": user_prompt}],
    # OpenAI-style structured-output format; litellm translates it per provider
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "gene_info", "schema": schema}
    }
)
Challenge: Context windows have limits (e.g., 4K, 8K, 128K tokens)
Strategies:
Most APIs charge by token:
Example costs (approximate):
Lesson: Be precise but not verbose
For a 128K token window:
System prompt: ~500 tokens (0.4%)
Few-shot examples: ~2000 tokens (1.6%)
Background data: ~50000 tokens (39%)
User query: ~500 tokens (0.4%)
Response space: ~75000 tokens (58.6%)
Leave room for the response!
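A rough pre-flight check keeps this budget honest; the 4-characters-per-token ratio below is a crude heuristic, and the placeholder strings stand in for the real prompt components:

def rough_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text
    return len(text) // 4

CONTEXT_WINDOW = 128_000
RESPONSE_BUDGET = 75_000

# Placeholders for the actual prompt components
system_text = "You are an expert molecular biologist..."
examples = "Few-shot examples go here..."
background = "Retrieved background data goes here..."
query = "Classify CFTR c.350G>A (p.Arg117His)"

used = sum(rough_tokens(part) for part in (system_text, examples, background, query))
if used + RESPONSE_BUDGET > CONTEXT_WINDOW:
    print("Trim the background data before sending")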
When outputs are wrong:
1. Write initial prompt
↓
2. Test on examples
↓
3. Identify failure modes
↓
4. Refine prompt (add examples, clarify, restructure)
↓
5. Test again
↓
6. Repeat until satisfactory
Remember: Prompting is empirical, not theoretical
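Step 2 ("test on examples") can be as simple as scoring the prompt against a few hand-labelled cases; a minimal sketch reusing the variants from the few-shot example (model name and wording are placeholders):

from litellm import completion

labelled = [
    ("BRCA1 c.5266dupC (p.Gln1756Profs*74)", "pathogenic"),
    ("TP53 c.215C>G (p.Pro72Arg)", "benign")
]

def classify(variant: str) -> str:
    response = completion(
        model="anthropic/claude-sonnet-4-20250514",
        messages=[
            {"role": "system", "content": "Classify the variant as benign, pathogenic, or VUS. Answer with one word."},
            {"role": "user", "content": variant}
        ]
    )
    return response.choices[0].message.content.strip().lower()

correct = sum(classify(v) == label for v, label in labelled)
print(f"Accuracy: {correct}/{len(labelled)}")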
We’ll work through three examples:
Let’s see theory in practice!
Task: Given a gene symbol, provide functional annotation
Iterations:
Let’s see the evolution…
Task: Classify DNA motifs by regulatory function
Challenge:
Approach: Use few-shot learning with biological examples
Task: Extract structured data from PubMed abstracts
Requirements:
Technique: Chain-of-thought + JSON schema
Theory: Attention mechanism weighs all tokens in context
Practice: Prompt structure and order matter significantly
Theory: Context windows are finite (N tokens)
Practice: Strategic information prioritization is crucial
Theory: LLMs are trained on next-token prediction
Practice: Examples (few-shot) leverage this training directly
Next topic: Retrieval-Augmented Generation (RAG)
The problem we’ll solve:
What if your knowledge doesn’t fit in the context window?
What if the model wasn’t trained on your specific data?
Solution: Retrieve relevant information dynamically and augment the context
Practice datasets:
Further reading:
Next session: Retrieval-Augmented Generation (RAG)
Demo code available in: lectures/demos/session_2/