Skip to content

LeetCode API

Status: Canonical Reference
Scope: LeetCode API cache module in tools/leetcode-api/
Last Updated: December 28, 2025 20:39:00
Created: December 27, 2025 01:15:58
Purpose: SQLite-backed cache and unified API for LeetCode question data
Location: tools/leetcode-api/

This module provides a local cache and unified API for accessing LeetCode question data, reducing network requests and improving performance for tools that need problem metadata.


Features

  • SQLite Cache: Persistent local storage for question data
  • LeetScrape Integration: Automatic fallback to network fetch when not cached
  • Transparent API: Same Question interface whether from cache or network
  • Bulk Import: Import all questions from LeetScrape JSON dump

Quick Start

Using the API

from tools.leetcode_api import get_question

# Get question (from cache or network)
q = get_question("two-sum")

# Access question data (LeetScrape-compatible interface)
print(q.title)           # "Two Sum"
print(q.QID)             # 1
print(q.difficulty)      # "Easy"
print(q.topicTags)       # "array,hash-table"
print(q.Body)            # HTML problem description
print(q.Hints)           # ["hint1", "hint2", ...]
print(q.SimilarQuestions)  # [15, 18, 167, ...]

Bulk Import from LeetScrape Data

# Import all questions (auto-detect source: json or csv)
python tools/leetcode-api/import_all_question.py

# Explicitly use all.json
python tools/leetcode-api/import_all_question.py --source json

# Use CSV files (questions.csv + questionBody.pickle + questionTopics.csv)
python tools/leetcode-api/import_all_question.py --source csv

# Dry run (preview without saving)
python tools/leetcode-api/import_all_question.py --dry-run

# Include paid-only questions
python tools/leetcode-api/import_all_question.py --include-paid

Architecture

tools/leetcode-api/
โ”œโ”€โ”€ __init__.py              # Package exports
โ”œโ”€โ”€ question_api.py          # Unified public API
โ”œโ”€โ”€ question_serializer.py   # LeetScrape โ†” SQLite conversion
โ”œโ”€โ”€ question_store.py        # SQLite storage layer
โ”œโ”€โ”€ import_all_question.py   # Bulk import script
โ”œโ”€โ”€ db/
โ”‚   โ””โ”€โ”€ leetcode.db          # SQLite database
โ””โ”€โ”€ data/
    โ”œโ”€โ”€ all.json             # Complete question data (primary)
    โ”œโ”€โ”€ questions.csv        # Basic metadata
    โ”œโ”€โ”€ questionBody.pickle  # Problem statements
    โ”œโ”€โ”€ questionTopics.csv   # QID to topic mapping
    โ”œโ”€โ”€ topicTags.csv        # Topic tag details
    โ””โ”€โ”€ companies.csv        # Company information

Data Flow

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                    get_question(slug)                       โ”‚
โ”‚                           โ”‚                                 โ”‚
โ”‚         โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”               โ”‚
โ”‚         โ–ผ                                   โ–ผ               โ”‚
โ”‚   [Cache Hit]                        [Cache Miss]           โ”‚
โ”‚   SQLite ่ฎ€ๅ–                        LeetScrape ๆŠ“ๅ–         โ”‚
โ”‚         โ”‚                                   โ”‚               โ”‚
โ”‚         โ–ผ                                   โ–ผ               โ”‚
โ”‚   deserialize()                     serialize() + save()    โ”‚
โ”‚         โ”‚                                   โ”‚               โ”‚
โ”‚         โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜               โ”‚
โ”‚                           โ–ผ                                 โ”‚
โ”‚                     Question ็‰ฉไปถ                            โ”‚
โ”‚               (่ˆ‡ LeetScrape ๅฎŒๅ…จ็›ธๅฎน)                        โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

API Reference

get_question(slug, force_refresh=False)

Get a LeetCode question by slug.

from tools.leetcode_api import get_question

q = get_question("two-sum")
q = get_question("two-sum", force_refresh=True)  # Bypass cache

Parameters: - slug (str): Question slug (e.g., "two-sum") - force_refresh (bool): If True, bypass cache and fetch from network

Returns: Question object or None if not found

get_question_by_id(qid)

Get a question by LeetCode QID (cache only).

from tools.leetcode_api import get_question_by_id

q = get_question_by_id(1)      # By int
q = get_question_by_id("0001") # By string (padded)

Parameters: - qid (int | str): Question ID

Returns: Question object or None if not in cache

Question Object

The Question dataclass matches LeetScrape JSON format:

Attribute Type Description
QID int Question ID (e.g., 1)
title str Problem title
titleSlug str URL slug (e.g., "two-sum")
difficulty str "Easy", "Medium", "Hard"
acceptanceRate float Acceptance rate percentage
topicTags str Comma-separated tags (e.g., "array,hash-table")
categorySlug str Category (e.g., "algorithms")
Hints List[str] List of hint strings
Companies List[str] or None Company tags
SimilarQuestions List[int] List of related QIDs
Code str Code template string
Body str HTML problem description
isPaidOnly bool Premium-only flag
paidOnly bool Same as isPaidOnly

Cache metadata: - _from_cache (bool): True if loaded from SQLite - _cached_at (str): Timestamp when cached


SQLite Schema

Database location: tools/leetcode-api/db/leetcode.db

CREATE TABLE questions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    qid INTEGER NOT NULL,
    title TEXT NOT NULL,
    title_slug TEXT NOT NULL UNIQUE,
    difficulty TEXT,
    acceptance_rate REAL,
    paid_only INTEGER DEFAULT 0,
    topic_tags TEXT,
    category_slug TEXT,
    hints TEXT,              -- JSON array
    companies TEXT,          -- JSON array or null
    similar_questions TEXT,  -- JSON array of QIDs
    code TEXT,
    body TEXT,
    is_paid_only INTEGER DEFAULT 0,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP
);

CLI Tools

import_all_question.py

Bulk import questions from LeetScrape data files.

# Basic usage (auto-detect source)
python tools/leetcode-api/import_all_question.py

# Options
python tools/leetcode-api/import_all_question.py --help
Option Description
--source {json,csv,auto} Data source: json (all.json), csv (combine CSVs), auto (default)
--data-dir PATH Path to data directory
--dry-run Parse data without saving
--include-paid Include paid-only questions
--include-empty Include questions with empty Body

Data Sources: - json: Uses all.json which contains complete question data - csv: Combines questions.csv, questionBody.pickle, and questionTopics.csv - auto: Tries json first, falls back to csv

Example output:

Reading: tools/leetcode-api/data/all.json
Found 3245 questions
  Progress: 100/3245 (3%)
  Progress: 200/3245 (6%)
  ...

==================================================
Import Summary
==================================================
  Total in JSON:    3245
  Imported:         2890
  Skipped (paid):   312
  Skipped (empty):  43
  Errors:           0
==================================================

Database now has: 2890 questions

Integration with docstring module

The docstring domain module uses this API for fetching question data:

# tools/docstring/formatter.py
from question_api import get_question

def get_full_docstring_data(slug: str) -> dict:
    q = get_question(slug)
    # ... extract description, examples, constraints

This enables: - Caching: Avoid repeated LeetCode API calls - Offline support: Work with cached data without network - Faster processing: Batch operations use local cache

See Docstring Formatter for the design and API details.


Important: LeetCode Problem ID Types

LeetCode uses two different ID systems which can cause confusion when looking up problems:

Field Description Example
question_id Internal database ID 958
frontend_question_id Problem number displayed on website 922

Why This Matters

Solution filenames in this project use the frontend_question_id (the number shown on LeetCode's website):

0922_sort_array_by_parity_ii.py  โ†’ frontend_question_id = 922

However, the same problem has question_id = 958 internally.

Cache Lookup

The fix_docstring.py tool uses tools/leetcode-api/crawler/.cache/leetcode_problems.json which contains both IDs:

{
  "0958": {
    "question_id": 958,
    "frontend_question_id": 922,
    "title": "Sort Array By Parity II",
    "slug": "sort-array-by-parity-ii"
  }
}

When looking up by problem number (from filename), always use frontend_question_id, not question_id:

# โœ… Correct: Use frontend_question_id
for key, value in cache.items():
    if value.get('frontend_question_id') == problem_id:
        return value['slug']

# โŒ Wrong: Using question_id would return wrong problem!
for key, value in cache.items():
    if value.get('question_id') == problem_id:
        return value['slug']  # May return wrong slug!

Quick Reference

Filename frontend_question_id question_id Correct Slug
0001_two_sum.py 1 1 two-sum
0922_sort_array_by_parity_ii.py 922 958 sort-array-by-parity-ii
0886_possible_bipartition.py 886 922 possible-bipartition

Data Source

Question data schema follows LeetScrape: