wk_

Geonosis

Dec 5, 2025 · 7 min read
python opensource llm

The Backbone of the Separatist Army

Not really, just an LLM download engine.

Most of you have heard of open-source models, but I’ve found just as many people don’t grasp the sheer volume of what’s available. Hugging Face hosts hundreds of thousands of them. Which models are actually production-ready? Which ones are worth the terabytes of storage? Where is the download button?

Classics.

Geonosis is a Python utility that answers these questions, hopefully. Inside you will find a rotating catalog of open-source LLMs and an automated download system that handles batch acquisition from Hugging Face. This post walks through how it works and what it has to offer.


The Problem

If you’re running local inference for research, fine-tuning, or production deployment you need models on disk. The workflow usually looks like:

  1. Research which models exist and are worth trying
  2. Find the correct Hugging Face repo IDs
  3. Figure out which file formats to download (safetensors vs. PyTorch)
  4. Download them one at a time, hoping nothing interrupts
  5. Clean up leftover cache files
  6. Repeat monthly when new models drop

This sucks.

It’s also error-prone: download the wrong format and you waste hundreds of gigabytes (maybe I’m the only one who does this, but still). Miss a new release and you’re benchmarking against last month’s state of the art. Fashion crime.

Geonosis automates the entire pipeline for you, cheers.


Architecture

The project is intentionally simple — two main components:

geonosis/
├── droid_factory.py              # CLI download tool
├── model_lists/
│   ├── master_list.py            # Curated catalog (184+ models)
│   └── archive/                  # Monthly snapshots
├── model_docs/
│   ├── master_list.md            # Detailed model documentation
│   └── archive/                  # Previous documentation versions
├── requirements.txt
└── .env                          # HF_TOKEN and OUTPUT_BASE

master_list.py defines the catalog for browsing: this is your magazine. droid_factory.py consumes it and handles the downloads: the order sheet.

That’s all there is to it.


The Model Catalog

The catalog is the core of the project. It’s a structured module containing every model as a dictionary with five fields:

{
    "id": "meta-llama/Llama-3.2-3B-Instruct",
    "name": "Llama-3.2-3B-Instruct",
    "size_gb": 6.4,
    "category": "general",
    "description": "Edge deployment champion"
}

id is the exact Hugging Face repository path, what you’d pass to snapshot_download(). name is a human-readable label used in the CLI. size_gb is the estimated download size in safetensors format. category and description provide classification and context.

Organization by Size and Category

Models are grouped into collections that encode both size tier and specialization:

ALL_MODEL_COLLECTIONS = {
    # Small models (1-3B)
    "small_general": SMALL_GENERAL,
    "small_coding": SMALL_CODING,
    "small_math": SMALL_MATH,

    # Medium models (7-14B)
    "medium_general": MEDIUM_GENERAL,
    "medium_coding": MEDIUM_CODING,
    "medium_math": MEDIUM_MATH,
    "medium_multimodal": MEDIUM_MULTIMODAL,
    "medium_long_context": MEDIUM_LONG_CONTEXT,

    # Large models (30-34B)
    "large_general": LARGE_GENERAL,
    "large_coding": LARGE_CODING,
    "large_multimodal": LARGE_MULTIMODAL,

    # Extra large models (40-72B)
    "xlarge_general": XLARGE_GENERAL,
    "xlarge_coding": XLARGE_CODING,
    "xlarge_math": XLARGE_MATH,
    "xlarge_multimodal": XLARGE_MULTIMODAL,
    "xlarge_moe": XLARGE_MOE,

    # Massive models (100B+)
    "massive_dense": MASSIVE_DENSE,
    "massive_moe": MASSIVE_MOE,

    # Specialized
    "safety": SAFETY_MODELS,
}

Need all coding models regardless of size? Filter by category. Need everything that fits on a single GPU? Filter by size.

Categories

The catalog spans eight categories:

Category      What It Covers
general       Instruction-following, chat, general reasoning
coding        Code generation, software engineering tasks
math          Mathematical and scientific reasoning
reasoning     Deep reasoning, complex multi-step problems
multimodal    Vision, audio, video understanding
safety        Guardrails, content classification, prompt injection detection
reward        RLHF reward and preference models
frontier      Cutting-edge models pushing state of the art

Size Tiers

The catalog is divided into five size tiers:

SIZE_CATEGORIES = {
    "small": "1-3B Parameters",
    "medium": "7-14B Parameters",
    "large": "30-34B Parameters",
    "xlarge": "40-72B Parameters",
    "massive": "100B+ Parameters",
}

The total catalog weighs in at roughly 26,500 GB across all 184 models.

Helper Functions

The catalog exports a few helper functions for more specific needs:

from model_lists.master_list import (
    get_all_models,
    get_models_by_category,
    get_models_by_size,
    get_collection,
)

# All 184 models
all_models = get_all_models()

# Just the coding specialists
coding_models = get_models_by_category("coding")

# Everything that fits on a single high-end GPU
medium_models = get_models_by_size("medium")

# A specific sub-collection
xlarge_moe = get_collection("xlarge_moe")

get_models_by_size() works by prefix-matching collection names. get_models_by_category() works in the same way. Crazy.

def get_models_by_size(size_category: str):
    all_models = []
    for collection_name, collection in ALL_MODEL_COLLECTIONS.items():
        if collection_name.startswith(size_category):
            all_models.extend(collection)
    return all_models

Helpful for querying collections even after the list updates.
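One practical use is budgeting disk space before a batch. A minimal sketch, using a toy stand-in for ALL_MODEL_COLLECTIONS (the entries and sizes here are hypothetical; the real catalog lives in model_lists/master_list.py):

```python
# Toy stand-in for ALL_MODEL_COLLECTIONS -- hypothetical entries.
ALL_MODEL_COLLECTIONS = {
    "medium_general": [
        {"id": "org/model-a", "name": "model-a", "size_gb": 15.2,
         "category": "general", "description": "example"},
    ],
    "medium_coding": [
        {"id": "org/model-b", "name": "model-b", "size_gb": 15.2,
         "category": "coding", "description": "example"},
    ],
    "large_general": [
        {"id": "org/model-c", "name": "model-c", "size_gb": 65.5,
         "category": "general", "description": "example"},
    ],
}

def get_models_by_size(size_category: str):
    # Prefix-match collection names, same shape as the catalog helper.
    models = []
    for name, collection in ALL_MODEL_COLLECTIONS.items():
        if name.startswith(size_category):
            models.extend(collection)
    return models

# Estimate the disk budget for a tier before downloading anything.
medium_gb = sum(m["size_gb"] for m in get_models_by_size("medium"))
print(f"medium tier: ~{medium_gb:.1f} GB")
```

Since size_gb is stored on every entry, the same one-liner works against any helper's output.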


The Download Engine

droid_factory.py is the CLI that turns catalog entries into models on disk. It’s built on huggingface_hub’s snapshot_download and adds batch orchestration, format filtering, and progress tracking on top.

Overkill to some, but we need a serious factory if we want to hit quota.

Configuration

Two environment variables control everything:

HF_TOKEN=hf_your_token_here       # For gated models (Llama, etc.)
OUTPUT_BASE=/Volumes/robots/models  # Where models land

Format Filtering

Not every file in a Hugging Face repo is worth downloading, trust me. Models are often published in multiple formats: safetensors, PyTorch .bin, ONNX, and others. Downloading all of them wastes storage on duplicate weights.

The download function explicitly selects what to keep and what to skip:

snapshot_download(
    repo_id=model_id,
    local_dir=local_dir,
    max_workers=4,
    allow_patterns=[
        "*.safetensors",
        "*.json",
        "*.txt",
        "*.md",
        "*.model",
        "tokenizer*",
        "LICENSE*",
        "*.tiktoken"
    ],
    ignore_patterns=[
        "*.bin",
        "*.pt",
        "*.pth",
        "*.msgpack",
        "*.h5",
        "*.ckpt"
    ]
)

safetensors is the target format: it’s faster to load, supports memory mapping, and is the modern standard. The allow list also grabs tokenizer files, configs, and documentation. The ignore list blocks every legacy weight format.

The max_workers=4 parameter enables parallel file downloads within a single model, which helps with repos that shard weights across dozens of files.

Cache Cleanup

huggingface_hub leaves behind a .cache directory with metadata and partial downloads. After a successful download, the factory cleans it up:

cache_dir = os.path.join(local_dir, ".cache")
if os.path.exists(cache_dir):
    shutil.rmtree(cache_dir)

Without this your storage calculation WILL NOT be accurate.

Progress and Error Handling

Each download prints structured status information: model name, repo ID, category, estimated size, and timestamps. Failed downloads don’t stop the batch:

results = []
total = len(selected_models)

for i, model in enumerate(selected_models, 1):
    success = download_model(model, i, total)
    results.append((model["name"], model["size_gb"], success))
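The body of download_model isn’t shown above, but for failures not to stop the batch it has to swallow exceptions and return a bool. A plausible shape, with the actual transfer stubbed out by a hypothetical fetch parameter so the sketch runs without network access:

```python
def download_model(model: dict, index: int, total: int, fetch=None) -> bool:
    """Attempt one download; never raise, so a failure can't kill the batch.

    `fetch` stands in for the real snapshot_download call -- a hypothetical
    hook added here so the sketch is testable offline.
    """
    print(f"[{index}/{total}] {model['name']} ({model['size_gb']} GB)")
    try:
        if fetch is not None:
            fetch(model["id"])
        return True
    except Exception as exc:  # report, don't crash the batch
        print(f"  FAILED: {exc}")
        return False

# One bad model in the middle doesn't stop the others.
models = [
    {"id": "ok/one", "name": "one", "size_gb": 1.0},
    {"id": "bad/two", "name": "two", "size_gb": 2.0},
]

def fake_fetch(repo_id):
    if repo_id.startswith("bad/"):
        raise RuntimeError("simulated network error")

results = [(m["name"], download_model(m, i, len(models), fetch=fake_fetch))
           for i, m in enumerate(models, 1)]
print(results)  # [('one', True), ('two', False)]
```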

At the end, a summary table shows what succeeded and what failed:

DOWNLOAD SUMMARY
========================================
✓ SUCCESS    Qwen2.5-72B-Instruct        (144 GB)
✓ SUCCESS    Phi-4                        (28 GB)
✗ FAILED     Llama-3.2-3B-Instruct       (6 GB)

Successful: 2/3 models
Failed:     1/3 models
Downloaded: ~172 GB

This batch-then-report pattern means you can kick off a large download, walk away, and review the results when you come back.
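Rendering that summary is just a pass over the results list the loop collected; a toy version, with the exact formatting guessed rather than copied from the tool:

```python
# (name, size_gb, success) tuples -- same shape the download loop collects.
results = [
    ("Qwen2.5-72B-Instruct", 144, True),
    ("Phi-4", 28, True),
    ("Llama-3.2-3B-Instruct", 6, False),
]

succeeded = [r for r in results if r[2]]
downloaded_gb = sum(size for _, size, success in results if success)

print("DOWNLOAD SUMMARY")
print("=" * 40)
for name, size, success in results:
    mark = "SUCCESS" if success else "FAILED "
    print(f"{mark:<10} {name:<28} ({size} GB)")
print(f"\nSuccessful: {len(succeeded)}/{len(results)} models")
print(f"Downloaded: ~{downloaded_gb} GB")
```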

This is my personal favorite feature. These downloads fail quite often, and when they do you’re gonna want to know why.


CLI Interface

The tool supports several usage modes through argparse:

Interactive Mode

Running without arguments opens an interactive menu:

python droid_factory.py

This displays every model in the catalog grouped by category, then prompts for selection by number. You can pick individual models or download everything.

Nostalgic CS101 terminal menu.
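In the same spirit, the selection parsing can be as simple as numbers or "all"; a toy version (model names here are placeholders, and the real menu groups by category):

```python
# Toy interactive picker: numbered choices or "all".
models = ["model-one", "model-two", "model-three"]

def pick(raw: str):
    """Parse a selection like '1 3' or 'all' into model names."""
    if raw.strip().lower() == "all":
        return list(models)
    return [models[int(tok) - 1] for tok in raw.split()]

print(pick("1 3"))   # ['model-one', 'model-three']
print(pick("all"))   # every model in the list
```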

Targeted Downloads

# Specific models by name
python droid_factory.py --models "Qwen2.5-72B-Instruct" "Phi-4"

# All models in a category
python droid_factory.py --category coding

# All models in a size tier
python droid_factory.py --size small

# Everything
python droid_factory.py --all --non-interactive

The --non-interactive flag skips confirmation prompts, making it suitable for scripted or unattended downloads.
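The flag surface above maps onto a small argparse setup; a hedged sketch, since the real parser in droid_factory.py may differ in defaults and help text:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirrors the flags shown above; help strings are my wording.
    parser = argparse.ArgumentParser(description="Geonosis model downloader")
    parser.add_argument("--models", nargs="+", help="specific model names")
    parser.add_argument("--category", help="download a whole category")
    parser.add_argument("--size", help="download a whole size tier")
    parser.add_argument("--all", action="store_true", help="download everything")
    parser.add_argument("--non-interactive", action="store_true",
                        help="skip confirmation prompts")
    return parser

args = build_parser().parse_args(["--category", "coding", "--non-interactive"])
print(args.category, args.non_interactive)  # coding True
```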

Discovery

# See all categories and model counts
python3 droid_factory.py --list-categories

# See all size tiers
python3 droid_factory.py --list-sizes

These output formatted tables with model counts and total sizes per group, so you can estimate storage needs before committing to a download.
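Those tables boil down to a group-by over the catalog; a minimal sketch of the aggregation, with hypothetical stand-in entries:

```python
from collections import defaultdict

# Hypothetical catalog entries -- the real ones come from get_all_models().
models = [
    {"name": "a", "size_gb": 6.4, "category": "general"},
    {"name": "b", "size_gb": 15.2, "category": "coding"},
    {"name": "c", "size_gb": 32.8, "category": "coding"},
]

stats = defaultdict(lambda: [0, 0.0])  # category -> [count, total_gb]
for m in models:
    stats[m["category"]][0] += 1
    stats[m["category"]][1] += m["size_gb"]

for category, (count, total_gb) in sorted(stats.items()):
    print(f"{category:<12} {count:>3} models  ~{total_gb:.1f} GB")
```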


Monthly Update Cycle

The catalog isn’t static. New models ship constantly*, so the project follows a monthly cycle:

  1. New models are evaluated and added to master_list.py
  2. Detailed documentation (benchmarks, architecture, licenses) goes into model_docs/master_list.md
  3. The previous month’s catalog is archived in model_lists/archive/ and model_docs/archive/

The archive provides a historical record. You can see what was available in December 2025 versus February 2026 — useful for reproducing experiments or understanding how the landscape has shifted.

*Constantly = whenever I feel like it, usually not often if I’m being honest.


Stack

Big stuff here, you won’t see a dependency library this advanced anywhere.

Dependency        Purpose
huggingface_hub   Model downloading and HF authentication
python-dotenv     Environment variable management

That’s it.


Thanks for reading this far and I hope you get some sort of use out of this tool. I just hope there’s no droid attack on the Wookiees, good relations with them, I have.

The project is open source at geonosis.