PRODUCTION PIPELINE · KAGGLE TESLA P100
Fine-Tuned Nemotron-120B
Intelligence Platform
Multi-agent RAG system backed by FAISS vector index over 923 DigitalOcean documents.
ClassicalEnsemble achieves 98.55% validation accuracy across 4 doc classes.
98.55%
BEST VAL ACCURACY
ClassicalEnsemble · GBM+RF+LogReg
Val Accuracy
98.55
ClassicalEnsemble
↑ +13.77pp vs MLP
Corpus Size
923
HTML documents
FAISS indexed
LoRA Rank
16
α=32, dropout=0.05
7 target modules
Vector Dim
384
all-MiniLM-L6-v2
Cosine metric
Doc Classes
4
community/docs/support/tutorial
Stratified split
Train · 40% · 369 rows
Validation · 15% · 138 rows
Test · 15% · 138 rows
Holdout · 30% · 277 rows
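The 40/15/15/30 stratified split above can be sketched in NumPy alone. This is a minimal per-class slicing sketch, not the pipeline's actual code; the class counts (600/203/102/18) are back-derived from the routing percentages, and exact split sizes depend on per-class rounding:

```python
import numpy as np

def stratified_split(labels, fracs=(0.40, 0.15, 0.15, 0.30), seed=42):
    """Split indices into train/val/test/holdout, preserving class ratios."""
    rng = np.random.default_rng(seed)
    splits = [[] for _ in fracs]
    for cls in np.unique(labels):
        idx = rng.permutation(np.where(labels == cls)[0])
        # cumulative cut points for this class, proportional to fracs
        cuts = np.cumsum([round(f * len(idx)) for f in fracs[:-1]])
        for part, chunk in zip(splits, np.split(idx, cuts)):
            part.extend(chunk.tolist())
    return [np.array(sorted(p)) for p in splits]

# tutorial/docs/support/community ≈ 600/203/102/18 of the 923 docs
labels = np.repeat([0, 1, 2, 3], [18, 203, 102, 600])
train, val, test, hold = stratified_split(labels)
```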
MODEL ARCHITECTURE
Primary Model: NVIDIA Nemotron-120B-FP8
Architecture: CausalLM (NemotronH)
Quantisation: 4-bit NF4 → bf16 → fp32
Fine-tuning: LoRA r=16, α=32, drop=0.05
LoRA Targets: q/k/v/o/gate/up/down_proj
Fallback Embedder: all-MiniLM-L6-v2
Re-ranker: ms-marco-MiniLM-L-6-v2
Vector Backend: FAISS IndexFlatIP
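The LoRA settings above map directly onto a Hugging Face PEFT configuration. A sketch, assuming the `peft` library and an already-loaded quantised base model (`base_model` is a placeholder, not a name from this pipeline):

```python
from peft import LoraConfig, get_peft_model

lora_cfg = LoraConfig(
    r=16,                 # LoRA rank
    lora_alpha=32,        # scaling α
    lora_dropout=0.05,
    target_modules=[      # the 7 projection modules listed above
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
# model = get_peft_model(base_model, lora_cfg)  # base_model: quantised Nemotron
```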
DOCUMENT TYPE ROUTING
tutorial · 65%
docs · 22%
support · 11%
community · 2%
ANTI-OVERFITTING STACK
◆ Mixup augmentation (α=0.2)
◆ Label smoothing (ε=0.1)
◆ Dropout 40% per MLP layer
◆ Gaussian input noise std=0.02
◆ Early stopping patience=8
◆ ReduceLROnPlateau factor=0.5
◆ WeightedRandomSampler
◆ AdamW weight decay + grad clip
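Two of the regularisers above, sketched in NumPy with the α and ε values from this config. These are hypothetical standalone helpers for illustration, not the training-loop code itself:

```python
import numpy as np

def mixup(x, y_onehot, alpha=0.2, seed=0):
    """Blend each example with a shuffled partner; labels blend identically."""
    rng = np.random.default_rng(seed)
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    return (lam * x + (1 - lam) * x[perm],
            lam * y_onehot + (1 - lam) * y_onehot[perm])

def smooth_labels(y_onehot, eps=0.1):
    """Label smoothing: move ε of the probability mass to a uniform prior."""
    k = y_onehot.shape[1]
    return (1 - eps) * y_onehot + eps / k

y = np.eye(4)[[0, 2, 3]]          # one-hot labels over the 4 doc classes
x = np.random.default_rng(1).normal(size=(3, 384))
xm, ym = mixup(x, y)
ys = smooth_labels(y)             # hard 1.0 → 0.925, zeros → 0.025
```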
ML PIPELINE EXPLORER · 8 stages · tap to expand
MODEL ACCURACY
ANTI-OVERFITTING CONFIG
Mixup α: 0.2
Label smoothing ε: 0.1
MLP Dropout: 0.40 per layer
Input noise std: 0.02 Gaussian
Early stopping: patience=8, δ=1e-4
LR scheduler: ReduceLROnPlateau ×0.5
Sampler: WeightedRandomSampler
Gradient clipping: max_norm=1.0
Weight decay: AdamW, 1e-4
FAISS VECTOR SEARCH · 923 docs · 384d · cosine similarity
FAISS INDEX
INDEX TYPE
IndexFlatIP
VECTORS
923
DIMENSIONS
384d
RE-RANK K
20 → 5
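IndexFlatIP scores by raw inner product, so embeddings are L2-normalised first to make that inner product equal cosine similarity; "20 → 5" means 20 candidates are retrieved and the re-ranker keeps the best 5. A NumPy stand-in for the index, with a placeholder scorer where the ms-marco cross-encoder would go:

```python
import numpy as np

rng = np.random.default_rng(7)
docs = rng.normal(size=(923, 384)).astype(np.float32)
docs /= np.linalg.norm(docs, axis=1, keepdims=True)  # normalise, as faiss.normalize_L2 would

def search(query_vec, k=20):
    """Inner product over unit vectors == cosine similarity (IndexFlatIP behaviour)."""
    q = query_vec / np.linalg.norm(query_vec)
    scores = docs @ q
    top = np.argsort(-scores)[:k]
    return top, scores[top]

def rerank(candidates, scorer, final_k=5):
    """Keep the final_k candidates with the highest cross-encoder score."""
    order = np.argsort([-scorer(c) for c in candidates])
    return candidates[order[:final_k]]

ids, sims = search(rng.normal(size=384), k=20)
top5 = rerank(ids, scorer=lambda i: -i, final_k=5)   # placeholder scorer
```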
QUERY ROUTING
Run a search to see routing, embedding, and retrieval metadata.
TOP DOCUMENTS · default ranking · 8 results
ANALYTICS & CHARTS · Interactive visualizations
ACCURACY ACROSS SPLITS
MODEL RADAR COMPARISON
DOC TYPE DISTRIBUTION
CONFUSION MATRIX (VAL)
MLP TRAINING CURVES · epoch-by-epoch accuracy
FULL EVALUATION RESULTS · All splits · all models
CLASSICALENSEMBLE · ALL SPLITS
GENERALIZATION GAP · train − holdout
FULL METRICS TABLE
DEMO SEARCH RESULTS · query: 'ssh keys ubuntu'
AGENT DEMO OUTPUTS
GENERATED PLOTS CATALOGUE · 13 visualisations
SETTINGS
API CONFIGURATION
Anthropic API Key
Required for live Agent Chat responses
Model
claude-sonnet-4-20250514
Max Tokens
Max response length
AGENT OPTIONS
PII Scrubbing
Redact emails, phones, CC numbers
Auto Escalation
Create ticket when agent escalates
Show Citations
Display KB sources in chat
RAG Re-ranking
ms-marco cross-encoder reranker
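The PII-scrubbing toggle can be illustrated with a minimal regex redactor. The patterns below are illustrative stand-ins, not the app's actual redaction rules:

```python
import re

# Order matters: redact card numbers before the looser phone pattern.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "CC":    re.compile(r"\b(?:\d[ -]?){13,16}\b"),   # crude card-number match
    "PHONE": re.compile(r"\+?\d[\d -]{8,}\d"),
}

def scrub(text):
    """Replace each PII match with a [LABEL] placeholder."""
    for label, pat in PII_PATTERNS.items():
        text = pat.sub(f"[{label}]", text)
    return text

print(scrub("Mail ops@example.com or call +1 555 123 4567"))
```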
MODEL PIPELINE
Base Model: NVIDIA Nemotron-120B-FP8
LoRA Rank: r=16, α=32
Quantisation: 4-bit NF4 bitsandbytes
Embedder: all-MiniLM-L6-v2
Vector Index: FAISS IndexFlatIP
Corpus: 923 DigitalOcean HTML docs
Training env: Kaggle · Tesla P100 · CUDA 12.2
ABOUT
DigitalGradient AI is an end-to-end fine-tuning and RAG pipeline built on NVIDIA Nemotron-120B-FP8 with LoRA adapters trained on 923 DigitalOcean documentation files.
The pipeline includes a ClassicalEnsemble (GBM+RF+LogReg, val_acc=98.55%) and a DeepMLP (256→128→64, val_acc=84.78%) for classifying documents into four classes (community, docs, support, tutorial).
FAISS IndexFlatIP provides millisecond vector retrieval over the 384-dimensional embedding space, re-ranked by a cross-encoder for precision.
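The GBM+RF+LogReg combination described above matches scikit-learn's soft-voting pattern. A sketch on synthetic data standing in for the document features; the real pipeline's hyperparameters are not shown here:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# synthetic 4-class stand-in for the document-feature matrix
X, y = make_classification(n_samples=300, n_features=20, n_classes=4,
                           n_informative=8, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("gbm", GradientBoostingClassifier(random_state=0)),
        ("rf",  RandomForestClassifier(random_state=0)),
        ("lr",  LogisticRegression(max_iter=1000)),
    ],
    voting="soft",        # average the three models' predicted probabilities
)
ensemble.fit(X_tr, y_tr)
acc = ensemble.score(X_val, y_val)
```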
Nemotron-120B-FP8
LoRA r=16
FAISS · 384d
val_acc 98.55%