Training and Custom Models

Create specialized models for your domain and stack without complex infrastructure: efficient fine-tuning (PEFT), adapters, and model merging, with reproducible evaluation. Register and deploy in one click.

"Verified compatibility" indicates internal testing until 09/2025. Some advanced functions appear as In development or Planned.

Why it's different

  • Production-oriented: models by domain and by stack (Technology/Domain Packs), not generic.
  • Real efficiency: PEFT (LoRA/QLoRA) to reduce cost/VRAM and accelerate iteration.
  • Automatic evaluation before deployment: quality, security, robustness, latency and cost.
  • Complete traceability: datasets, parameters, weights and versions with MLflow.

Training modes

  • PEFT (LoRA/QLoRA) [GA]: efficient fine-tuning on modest GPUs or accelerated CPUs; ideal for specific domains (see the sketch after this list).
  • Adapters [GA]: activate/deactivate specializations per stack/domain without retraining the core model.
  • Model Merge [Planned]: combines base models and specializations (e.g., "stack + domain") with stability controls.
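
To make the PEFT and Adapters modes concrete, here is a minimal sketch using the Hugging Face transformers and peft libraries. The base model name, rank, and adapter paths are illustrative assumptions, not platform defaults.

```python
# Minimal LoRA fine-tuning sketch with Hugging Face transformers + peft.
# Base model name, rank, and adapter paths are illustrative, not platform defaults.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, PeftModel, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

lora = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                 # low-rank dimension of the update matrices
    lora_alpha=32,                        # scaling applied to the LoRA update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()        # typically <1% of the base weights

# Adapters mode: attach saved specializations to a frozen base and switch
# between them at runtime (hypothetical local paths).
frozen = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
serving = PeftModel.from_pretrained(frozen, "adapters/python-stack",
                                    adapter_name="python-stack")
serving.load_adapter("adapters/banking", adapter_name="banking")
serving.set_adapter("banking")            # activate the banking specialization
```

QLoRA follows the same pattern with the base model loaded in 4-bit (e.g., via bitsandbytes), which is where most of the VRAM savings come from.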

Training Workflow

Diagram: domain model training workflow, a guided process from data to deployed model (steps detailed under "Guided flow" below).

Training packs

  • ⚙️ Technology Packs: models fine-tuned to stacks and versions (frameworks, libraries, and conventions).
  • 🎯 Domain Packs: models fine-tuned to verticals (support, banking, retail, health) with specific metrics and evaluations.

Guided flow (from data to production)

  1. Data: load your dataset (QA/code/tickets/manuals); validation and balancing.
  2. Configuration: choose the base model and method (LoRA/QLoRA/Adapter).
  3. Job: launch training with hardware presets (GPU/CPU); monitoring and logs.
  4. Evaluation: run the suite (quality, security/robustness, latency/cost).
  5. Registration: save weights and metadata in MLflow and version them (see the sketch after this list).
  6. Deployment: one click to the Serving module; define policies and metrics.
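
As an illustration of step 5, registration could look like the following with the MLflow client; the experiment name, metric keys, and artifact paths here are hypothetical.

```python
# Sketch of step 5 (Registration) with the MLflow tracking client.
# Experiment name, metric keys, and artifact paths are hypothetical.
import mlflow

mlflow.set_experiment("banking-support-qlora")
with mlflow.start_run(run_name="qlora-r16") as run:
    mlflow.log_params({"method": "QLoRA", "rank": 16, "base": "mistral-7b"})
    mlflow.log_metrics({"f1": 0.87, "exact_match": 0.74, "p95_latency_ms": 210})
    mlflow.log_artifacts("outputs/adapter")  # adapter weights + tokenizer files
    mlflow.register_model(f"runs:/{run.info.run_id}/adapter",
                          "banking-support-assistant")
```

MLflow's model registry then gives each trained model a name and version that downstream serving can resolve instead of a loose file path.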

Automatic evaluation (reproducible)

  • 🧠 General LLM: exact match, F1/accuracy, BLEU/ROUGE/BERTScore (depending on the task).
  • 💻 Code: compiles / tests pass, pass@k (see the sketch after this list).
  • 🔍 RAG: recall@k, MRR/nDCG, groundedness / % of responses with a source.
  • 🛡️ Security and robustness: jailbreak/PII/toxicity, prompt injection, adversarial variations.
  • Operation: P50/P95 latency, throughput, cost per query / per 1k tokens.
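
Two of these metrics are simple enough to show inline: the standard unbiased pass@k estimator (Chen et al., 2021) and a plain MRR computation. These are the textbook forms, not necessarily the platform's exact implementation.

```python
# Textbook implementations of two metrics above; not necessarily the
# platform's exact code.
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k (Chen et al., 2021): probability that at least one of
    k sampled generations passes, given n samples of which c passed."""
    if n - c < k:
        return 1.0  # fewer failures than draws: a passing sample is guaranteed
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

def mean_reciprocal_rank(first_hit_ranks: list[int | None]) -> float:
    """MRR over queries; ranks are 1-based, None means nothing relevant retrieved."""
    return sum(1.0 / r for r in first_hit_ranks if r is not None) / len(first_hit_ranks)

print(pass_at_k(n=20, c=3, k=5))              # ≈ 0.60
print(mean_reciprocal_rank([1, 3, None, 2]))  # (1 + 1/3 + 0 + 1/2) / 4 ≈ 0.458
```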

Evaluation adapters map to recognized methodologies; we can add new ones within 48–72 hours if you need them.

Hardware and backends (verified compatibility)

  • GA: CPU and NVIDIA GPUs (vLLM/Transformers/ONNX Runtime), plus OpenVINO on Intel CPUs.
  • Beta: AMD ROCm, Intel iGPU/NPU, Habana.
  • Presets by memory/speed trade-off (balanced, performance, memory_efficient); a hypothetical mapping is sketched below.
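
As an illustration only, a preset name could resolve to a set of training knobs like the ones below; the schema and values are hypothetical, not the platform's actual configuration.

```python
# Hypothetical preset table: knob names and values are illustrative only,
# not the platform's actual configuration schema.
PRESETS = {
    "balanced":         {"precision": "bf16", "batch_size": 8,  "grad_checkpointing": False},
    "performance":      {"precision": "bf16", "batch_size": 16, "grad_checkpointing": False},
    "memory_efficient": {"precision": "int4", "batch_size": 2,  "grad_checkpointing": True},
}

def resolve_preset(name: str) -> dict:
    """Look up a preset by name, failing loudly on an unknown key."""
    if name not in PRESETS:
        raise ValueError(f"unknown preset {name!r}; choose from {sorted(PRESETS)}")
    return PRESETS[name]
```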

Key metrics you'll see

  • Improvement vs. base in target metric (e.g., +F1, +exact match, +pass@k).
  • Latency/cost per query compared (before/after).
  • Adapter size/VRAM and total training time.
  • Global health score (quality + security + operation).

Availability

Available today (v0.10 – 10/10/2025)

  • PEFT (LoRA/QLoRA) and Adapters with guided flow and hardware presets.
  • Basic evaluation (quality/operation) and MLflow registration.
  • 1-click deployment to the Serving module.

In development (v0.11 – Q4 2025)

  • Expanded suites (security/robustness), comparative dashboards and more Packs.

Planned (2026)

  • Advanced Model Merge, distributed multi-node jobs and expanded sector templates.

Ready to train your first domain model?