CodeGen

Status: Canonical Reference
Scope: src/codegen/ - Solution skeleton generation with test extraction
Related: Package README

CodeGen generates solution and practice skeleton files for LeetCode problems, providing the infrastructure needed for a LeetCode-like practice experience. It also extracts example test cases from problem descriptions and validates test file consistency.

Table of Contents

- Overview
- Scope
- Interfaces
- CLI Reference
- How It Fits in the System
- Typical Workflows
- Test Generation
- IO Schema Inference
- Format Migration
- Key Design Decisions
- Configuration
- Failure Modes and Constraints
- Related Documentation

Overview

CodeGen serves as the code generation engine for the NeetCode practice framework. Its primary purpose is to:

- Generate reference skeleton files to solutions/
- Generate practice skeleton files to practices/
- Extract example test cases from LeetCode problem descriptions
- Provide a consistent structure that integrates with runner/ for testing

The module is designed as a stateless generator: it produces output based on input without maintaining internal state.

Goals

| Goal | Description |
| --- | --- |
| Reference Generation | Generate solution skeletons conforming to Solution Contract |
| Practice Generation | Generate practice skeletons that reuse reference infrastructure |
| Test Extraction | Extract example input/output from LeetCode HTML |
| solve() Inference | Auto-generate solve() based on method signature |
| Focus on Solution | Users only write class Solution; infrastructure is provided |
| Reusable Components | solution_header, Helper Catalog available for other modules |

Non-Goals

| Non-Goal | Reason |
| --- | --- |
| ❌ Auto-generate complete solutions | Only generates skeleton; users implement solutions |
| ❌ Execute tests | Handled by runner/ |
| ❌ Manage practice history | Handled by practice_workspace |
| ❌ Fetch problem data | Uses leetcode_datasource |

Scope

What this module handles

- ✅ Rendering file-level docstrings (solution_header)
- ✅ Parsing LeetCode code stubs
- ✅ Detecting and emitting helper classes (ListNode, TreeNode, etc.)
- ✅ Assembling complete module files
- ✅ Generating SOLUTIONS dict structure
- ✅ Creating solve() interface (placeholder or inferred)
- ✅ Extracting examples from HTML (example_parser)
- ✅ Inferring IO schema from signatures (io_schema)
- ✅ Generating test files (test_generator)
- ✅ Checking test consistency (checker)
- ✅ Migrating test formats (migrator)

What this module explicitly avoids

- ❌ Test execution (handled by runner/)
- ❌ Practice file versioning (handled by practice_workspace)
- ❌ Network requests for problem data (handled by leetcode_datasource)
- ❌ CLI argument parsing for tools (handled by tools/)

Interfaces

High-level summary of public APIs. For the complete API reference, see the Package README.

Core Generation

| Interface | Purpose |
| --- | --- |
| generate_reference_skeleton() | Generate skeleton to solutions/ |
| generate_practice_skeleton() | Generate skeleton to practices/ |
| render_solution_header() | Render file-level docstring |
| parse_code_stub() | Parse LeetCode code stub |
| assemble_module() | Assemble complete file from parts |
| detect_required_helpers() | Detect needed helper classes |

Test Generation

| Interface | Purpose |
| --- | --- |
| generate_tests_from_datasource() | Generate .in/.out files from examples |
| parse_examples() | Extract examples from HTML |
| infer_io_schema() | Infer IO format from signature |
| generate_solve_function() | Auto-generate solve() code |

Validation & Migration

| Interface | Purpose |
| --- | --- |
| TestChecker | Check test consistency |
| migrate_problem() | Migrate single problem's tests |
| migrate_all() | Migrate all tests |

CLI Reference

Generate Reference Skeleton

# Basic generation
python -m codegen new <problem_id>
# With test files from examples
python -m codegen new <problem_id> --with-tests
# With auto-generated solve()
python -m codegen new <problem_id> --solve-mode infer
# Combined
python -m codegen new <problem_id> --with-tests --solve-mode infer --force
# Preview without writing
python -m codegen new <problem_id> --dry-run

| Flag | Description |
| --- | --- |
| --with-tests | Generate .in/.out files from LeetCode examples |
| --solve-mode | placeholder (default), infer (auto-generate), or tiered (for Tier-1/1.5 problems) |
| --force | Overwrite existing test files |
| --dry-run | Preview without writing files |
| --header-level | minimal, standard, or full |

Generate Practice Skeleton

python -m codegen practice <problem_id>
python -m codegen practice <problem_id> --all-solutions

Check Test Consistency

# Check single problem
python -m codegen check <problem_id>
python -m codegen check <problem_id> -v
# Check all problems
python -m codegen check --all
python -m codegen check --all --limit 10
# JSON output
python -m codegen check --all --report json

| Status | Meaning |
| --- | --- |
| match | Test files match examples |
| mismatch | Test files differ from parsed examples |
| missing_tests | No test files exist |
| parse_error | Could not parse examples from HTML |

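
The four statuses above can be read as a small decision procedure. The sketch below is illustrative only: `classify_status` and the `(input, output)` tuple representation are hypothetical names, not the actual TestChecker API, and the order in which the conditions are checked is an assumption.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: one (input text, output text) pair per example.
Example = Tuple[str, str]


def classify_status(parsed_examples: Optional[List[Example]],
                    test_files: List[Example]) -> str:
    """Map a problem's parsed examples and on-disk tests to a status string.

    Assumption: `parsed_examples` is None when the HTML could not be parsed,
    and `test_files` is empty when no .in/.out files exist.
    """
    if parsed_examples is None:
        return "parse_error"
    if not test_files:
        return "missing_tests"
    # Compare after trimming surrounding whitespace so trailing newlines
    # in files do not cause a spurious mismatch.
    expected = [(i.strip(), o.strip()) for i, o in parsed_examples]
    actual = [(i.strip(), o.strip()) for i, o in test_files]
    return "match" if expected == actual else "mismatch"


if __name__ == "__main__":
    examples = [("[2,7,11,15]\n9", "[0,1]")]
    print(classify_status(examples, [("[2,7,11,15]\n9\n", "[0,1]\n")]))
    print(classify_status(examples, []))
```

The real checker may compare values semantically rather than as text; this sketch only shows how the four outcomes relate to each other.
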
# Preview migration
python -m codegen migrate <problem_id> --dry-run -v
# Migrate single problem
python -m codegen migrate <problem_id>
# Migrate all problems
python -m codegen migrate --all --dry-run
# Migrate without backup
python -m codegen migrate --all --no-backup

How It Fits in the System

┌────────────────────────┐
│  leetcode_datasource   │ ← Problem metadata + HTML
└───────────┬────────────┘
            │
            ▼
┌────────────────────────┐     ┌───────────────────────┐
│        codegen         │ ──► │  practice_workspace   │
│                        │     └───────────────────────┘
│  ┌──────────────────┐  │                 │
│  │  test_generator  │  │                 │ manages history
│  │  solve_generator │  │                 ▼
│  │  checker         │  │        practices/_history/
│  │  migrator        │  │
│  └──────────────────┘  │
└───────────┬────────────┘
            │ generates
            ▼
       solutions/
       practices/
       tests/
            │
            ▼
┌────────────────────────┐
│         runner         │ ← Executes tests
└────────────────────────┘

Module Relationships

| Module | Relationship |
| --- | --- |
| leetcode_datasource | Uses - Fetches problem metadata and HTML |
| practice_workspace | Uses - Calls save_to_history() when practice exists |
| runner | Used by - Runs generated files and tests |
| tools/ | Used by - CLI wrappers invoke codegen |

Typical Workflows

Workflow: Generate Reference with Tests

When codegen new <problem_id> --with-tests is invoked:

1. Check existence - If solutions/<id>_<slug>.py exists, stop
2. Fetch metadata - Get problem info and HTML from leetcode_datasource
3. Parse stub - Extract method signature, parameters, return type
4. Detect helpers - Determine if ListNode, TreeNode, etc. are needed
5. Infer IO schema - Map parameter types to input formats
6. Generate solve() - Based on --solve-mode (placeholder or infer)
7. Assemble module - Combine header, imports, helpers, SOLUTIONS, Solution, solve()
8. Write solution - Output to solutions/<id>_<slug>.py
9. Parse examples - Extract examples from HTML
10. Generate tests - Create .in/.out files for each example

Workflow: Check and Migrate

# 1. Check current state
python -m codegen check --all
# 2. Preview migration
python -m codegen migrate --all --dry-run
# 3. Migrate with backup
python -m codegen migrate --all
# 4. Verify
python -m codegen check --all

Test Generation

All generated test files use the JSON literal format:

Input File (.in):
Output File (.out):

| Type | Format | Example |
| --- | --- | --- |
| Integer | Plain number | 42 |
| Float | Plain number | 3.14 |
| Boolean | Lowercase | true, false |
| String | Quoted | "hello" |
| Array | JSON literal | [1,2,3] |
| 2D Array | JSON literal | [[1,2],[3,4]] |
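
Every row in the format table above matches what Python's standard `json` module produces with compact separators. The following is a minimal sketch, not the test_generator implementation: `to_json_literal` and `render_input` are hypothetical helper names, and the one-parameter-per-line layout is taken from the solve() docstring in the appendix.

```python
import json
from typing import Any, Iterable


def to_json_literal(value: Any) -> str:
    """Serialize one value as a compact JSON literal (no spaces)."""
    return json.dumps(value, separators=(",", ":"))


def render_input(params: Iterable[Any]) -> str:
    """Render the body of a .in file: one JSON literal per parameter line."""
    return "\n".join(to_json_literal(p) for p in params)


if __name__ == "__main__":
    for sample in (42, 3.14, True, "hello", [1, 2, 3], [[1, 2], [3, 4]]):
        print(to_json_literal(sample))
    # Two Sum style input: nums on the first line, target on the second.
    print(render_input([[2, 7, 11, 15], 9]))
```

Because each line is valid JSON, a reader can recover the original value with `json.loads(line)` without knowing the parameter type in advance.
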

Type Support Tiers

| Tier | Types | solve() Generation |
| --- | --- | --- |
| Tier 0 | int, str, List[int], List[str] | ✅ Fully auto-generated |
| Tier 1 | List[List[int]], float | ✅ Fully auto-generated |
| Tier 2 | ListNode, TreeNode | ⚠️ Placeholder with TODOs |

IO Schema Inference

Data Flow

Question.Code (stub)
    ↓ parse_code_stub() → StubInfo
    ↓ infer_io_schema() → IOSchema
    ↓ generate_solve_function() → solve() code

IOSchema Structure

@dataclass
class IOSchema:
    method_name: str
    params: List[ParamSchema]    # [(name, type, format, separators)]
    return_type: str
    return_format: ParamFormat   # SCALAR, ARRAY_1D, ARRAY_2D, etc.
    needs_helpers: Set[str]      # {"ListNode", "TreeNode"}

| Format | Type Hints | Description |
| --- | --- | --- |
| SCALAR | int, float, bool | Single value |
| STRING | str | String value |
| ARRAY_1D | List[int], List[str] | 1D array |
| ARRAY_2D | List[List[int]] | 2D matrix |
| LINKED_LIST | Optional[ListNode] | Linked list |
| TREE | Optional[TreeNode] | Binary tree |

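
To make the mapping in the table above concrete, here is a small sketch that classifies a type-hint string into one of the listed format names. It is not the real infer_io_schema(): the function name `classify_format`, the string-based matching, and the "UNKNOWN" fallback are all assumptions made for illustration.

```python
# Direct lookups for non-container hints, following the table above.
_SIMPLE_FORMATS = {
    "int": "SCALAR",
    "float": "SCALAR",
    "bool": "SCALAR",
    "str": "STRING",
    "ListNode": "LINKED_LIST",
    "TreeNode": "TREE",
}


def classify_format(type_hint: str) -> str:
    """Return the format name for a type-hint string (hypothetical helper)."""
    hint = type_hint.replace(" ", "")
    # Unwrap Optional[...] so Optional[ListNode] is treated as ListNode.
    if hint.startswith("Optional[") and hint.endswith("]"):
        hint = hint[len("Optional["):-1]
    # Check the nested list first; otherwise it would match the 1D rule.
    if hint.startswith("List[List["):
        return "ARRAY_2D"
    if hint.startswith("List["):
        return "ARRAY_1D"
    # "UNKNOWN" is an assumed fallback, not a documented format.
    return _SIMPLE_FORMATS.get(hint, "UNKNOWN")


if __name__ == "__main__":
    for hint in ("int", "str", "List[int]", "List[List[int]]",
                 "Optional[ListNode]", "Optional[TreeNode]"):
        print(f"{hint:22} -> {classify_format(hint)}")
```
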
Format Migration

Purpose

The migrator converts existing test files from legacy formats (space-separated, comma-separated) to the canonical JSON literal format.

| Format | Example | Converted To |
| --- | --- | --- |
| space_sep | 1 2 3 4 | [1,2,3,4] |
| comma_sep | 1,2,3,4 | [1,2,3,4] |
| canonical | [1,2,3,4] | (no change) |

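
The three conversions above can be sketched in a few lines. This is an illustration under assumptions, not the migrator's actual logic: `detect_format` and `to_canonical` are hypothetical names, detection relies on whether the line already parses as JSON, and every token in a legacy line is assumed to be a JSON scalar such as a number.

```python
import json


def detect_format(line: str) -> str:
    """Guess the format of a single test-file line (hypothetical heuristic)."""
    text = line.strip()
    try:
        json.loads(text)
        return "canonical"
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        pass
    return "comma_sep" if "," in text else "space_sep"


def to_canonical(line: str) -> str:
    """Convert a legacy line to a compact JSON array; leave canonical lines alone."""
    text = line.strip()
    fmt = detect_format(text)
    if fmt == "canonical":
        return text
    tokens = text.split(",") if fmt == "comma_sep" else text.split()
    # Assumption: each token is itself a JSON scalar (for example an integer).
    values = [json.loads(token.strip()) for token in tokens]
    return json.dumps(values, separators=(",", ":"))


if __name__ == "__main__":
    for legacy in ("1 2 3 4", "1,2,3,4", "[1,2,3,4]"):
        print(f"{legacy!r:14} -> {to_canonical(legacy)}")
```

A single bare value such as `42` already parses as JSON, so this heuristic treats it as canonical and leaves it unchanged.
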
Migration Report

============================================================
MIGRATION REPORT
============================================================
Problems processed: 45
Total files: 218
Migrated: 93
Skipped (already canonical): 125
Errors: 0

Key Design Decisions

| Decision | Rationale |
| --- | --- |
| Stateless design | CodeGen has no internal state; outputs depend purely on inputs |
| Parser doesn't guess | stub_parser.py only parses; detection logic is separate |
| Centralized assembly | assemble.py handles file composition to avoid duplication |
| Inline helpers by default | Helper classes embedded in file for portability |
| No template engine | Pure Python string composition; no Jinja2 dependency |
| Reuse over regenerate | Practice skeletons reuse reference infrastructure when available |
| JSON literal format | Unambiguous, parseable, compatible with LeetCode examples |
| Tiered type support | Start with simple types, add complex types incrementally |

Design Philosophy

┌─────────────────────────────────────────────────────────┐
│  codegen = stateless     → Only generates, no state     │
│  workspace = stateful    → Manages history/restore      │
│  runner = execution      → Runs tests, no generation    │
│                                                         │
│  stub_parser: parse only → Separation of concerns       │
│  io_schema: infer format → Type-driven generation       │
│  helpers: centralized    → Single source of truth       │
│  assemble.py: unified    → Avoid duplication            │
└─────────────────────────────────────────────────────────┘

Configuration

Config File Location

.neetcode/codegen.toml

Configuration Options

| Setting | Default | Description |
| --- | --- | --- |
| header.level | "full" | Header detail: minimal, standard, full |
| helpers.mode | "inline" | Helper emit: inline, import, none |
| skeleton.solve_mode | "placeholder" | solve() mode: placeholder, infer, tiered |
| practice.multi_solution_mode | "single" | Practice mode: single, all |

Priority Order

CLI flag > .neetcode/codegen.toml > package defaults

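
The priority order can be read as a layered lookup: the first layer that defines a setting wins. The sketch below is a minimal illustration, assuming settings are addressed by the dotted keys from the options table; `resolve_setting` and the flat-dict representation are hypothetical and do not describe how config.py actually loads or stores values.

```python
from typing import Any, Dict

# Defaults copied from the Configuration Options table.
PACKAGE_DEFAULTS: Dict[str, Any] = {
    "header.level": "full",
    "helpers.mode": "inline",
    "skeleton.solve_mode": "placeholder",
    "practice.multi_solution_mode": "single",
}


def resolve_setting(key: str,
                    cli_flags: Dict[str, Any],
                    file_config: Dict[str, Any]) -> Any:
    """Resolve one setting: CLI flag first, then the TOML file, then defaults.

    Assumption: both layers are flat dicts keyed by dotted setting names,
    and a missing or None entry means "not specified at this layer".
    """
    for layer in (cli_flags, file_config):
        value = layer.get(key)
        if value is not None:
            return value
    return PACKAGE_DEFAULTS[key]


if __name__ == "__main__":
    cli = {"skeleton.solve_mode": "infer"}           # e.g. --solve-mode infer
    toml = {"skeleton.solve_mode": "tiered", "header.level": "minimal"}
    print(resolve_setting("skeleton.solve_mode", cli, toml))
    print(resolve_setting("header.level", cli, toml))
    print(resolve_setting("helpers.mode", cli, toml))
```
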

Failure Modes and Constraints

| Constraint | Behavior |
| --- | --- |
| Problem not found | Raises exception from leetcode_datasource |
| Reference already exists | Returns early with message (for codegen new) |
| Invalid code stub | Raises ParseError with details |
| Missing TOML config | Uses defaults |
| Example parse failure | Skips example, logs warning, continues |
| Test file exists | Skips (unless --force specified) |
| Unsupported type | Generates placeholder solve() with TODOs |

Exit Codes

| Code | Condition |
| --- | --- |
| 0 | Success |
| 1 | Metadata fetch failed or validation error |
| 2 | --strict-tests enabled + 0 tests generated (reserved) |

Appendix: Output File Structure

Reference skeleton output follows the Solution Contract:

"""
Problem: Two Sum
Link: https://leetcode.com/problems/two-sum/
...
"""
from typing import List, Optional
from _runner import get_solver

# Helper classes (if detected)
class ListNode:
    ...

# SOLUTIONS dict
SOLUTIONS = {
    "default": {
        "class": "Solution",
        "method": "twoSum",
        "complexity": "TODO: O(?)",
        "description": "TODO: describe your approach",
    },
}

# Solution class
class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        # TODO: Implement your solution
        pass

# solve() interface (auto-generated with --solve-mode infer)
def solve():
    """
    Input format (JSON literal, one per line):
        nums: List[int]
        target: int
    Output: List[int]
    """
    import sys
    import json

    data = sys.stdin.read().strip().split('\n')
    nums = json.loads(data[0].strip())
    target = int(data[1].strip())

    solver = get_solver(SOLUTIONS)
    result = solver.twoSum(nums, target)
    print(json.dumps(result, separators=(',', ':')))

if __name__ == "__main__":
    solve()

Appendix: Module Structure

codegen/
├── __init__.py                     # Public API re-exports
├── __main__.py                     # python -m codegen
├── cli.py                          # CLI: new / practice / check / migrate
├── checker.py                      # Test consistency checker
├── analyzer.py                     # Mismatch analysis and reporting
├── migrator.py                     # Format migration tool
├── core/
│   ├── __init__.py
│   ├── solution_header.py          # Header rendering
│   ├── stub_parser.py              # LeetCode stub parsing
│   ├── assemble.py                 # Module assembly
│   ├── config.py                   # Configuration management
│   ├── io_schema.py                # IO format inference
│   ├── example_parser.py           # HTML example extraction
│   ├── solve_generator.py          # solve() auto-generation
│   ├── tiered_solve_generator.py   # Tiered solve() for Tier-1/1.5 problems
│   ├── problem_support.py          # Problem-specific config loading
│   ├── test_generator.py           # Test file generation
│   ├── catalog/                    # Problem catalog utilities
│   │   └── __init__.py
│   └── helpers/
│       ├── __init__.py
│       ├── catalog.py              # Canonical helper definitions
│       ├── detect.py               # Helper detection logic
│       └── emit.py                 # Helper code emission
├── reference/
│   ├── __init__.py
│   └── generator.py                # Reference skeleton generation
└── practice/
    ├── __init__.py
    ├── generator.py                # Practice skeleton generation
    └── reuse.py                    # Reuse from reference

Tiered Solve Generation

For problems involving complex types (TreeNode, ListNode), use --solve-mode tiered:

python -m codegen new 104 --solve-mode tiered
This generates solve() functions with codec support for serialization/deserialization of complex types. See Problem Support Boundary for tier definitions.
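
As a rough picture of what a codec for a complex type involves, the sketch below round-trips a linked list through the JSON array form used in test files. It is an assumption-laden illustration, not the generated code: `decode_linked_list` and `encode_linked_list` are hypothetical names, and the ListNode definition is the conventional LeetCode one rather than a copy of the Helper Catalog entry.

```python
import json
from typing import Optional


class ListNode:
    """Conventional LeetCode singly linked list node (assumed definition)."""

    def __init__(self, val: int = 0, next: "Optional[ListNode]" = None):
        self.val = val
        self.next = next


def decode_linked_list(line: str) -> Optional[ListNode]:
    """Deserialize a JSON array such as "[1,2,3]" into a linked list."""
    dummy = ListNode()  # sentinel node so appending needs no special case
    tail = dummy
    for value in json.loads(line):
        tail.next = ListNode(value)
        tail = tail.next
    return dummy.next


def encode_linked_list(head: Optional[ListNode]) -> str:
    """Serialize a linked list back to a compact JSON array."""
    values = []
    while head is not None:
        values.append(head.val)
        head = head.next
    return json.dumps(values, separators=(",", ":"))


if __name__ == "__main__":
    head = decode_linked_list("[1,2,3]")
    print(encode_linked_list(head))
```

A tree codec would follow the same decode, solve, encode shape, with a level-order traversal in place of the simple loop.
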
January 9, 2026 10:49:40 December 31, 2025 16:42:23