v1.0 · MIT License · Python ≥ 3.8

Track every
ML experiment.
Locally.

Log, compare, and visualize your model runs in a rich Streamlit dashboard — zero cloud, zero setup, just a SQLite file on your machine.

$ pip install pymlens

Everything you need.
Nothing you don't.

PyMLens wraps around your existing sklearn code with minimal changes and handles all the bookkeeping automatically.

🔒

Fully local

All data lives in a SQLite DB at ~/.pymlens/experiments.db. Nothing ever leaves your machine.
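Because the store is an ordinary SQLite file, it can be opened with Python's built-in sqlite3 module. The sketch below lists the tables in a database file. The helper name list_tables is hypothetical and not part of PyMLens, and the default path is the one quoted above.

```python
import sqlite3
from pathlib import Path

# Default location quoted in the text above; adjust if your setup differs.
DEFAULT_DB = Path.home() / ".pymlens" / "experiments.db"


def list_tables(db_path):
    """Return the names of all tables in a SQLite file (hypothetical helper)."""
    conn = sqlite3.connect(str(db_path))
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling list_tables(DEFAULT_DB) after at least one experiment has run should show which tables exist on your machine.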

Minimal code changes

Wrap your existing training loop in a with block. That's it. No refactoring required.

📊

Classification & Regression

Full metric suites for both problem types — accuracy, F1, precision, recall, R², MSE, RMSE, and more.

🔁

Cross-validation built-in

3-fold CV is enabled by default for classification and is opt-in for regression through the cross_val parameter.
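To make "3-fold" concrete, here is a minimal pure-Python sketch of how k-fold splitting partitions sample indices. It is an illustration only: how PyMLens performs its splits is not stated here, and scikit-learn commonly uses stratified folds for classifiers. The function name k_fold_indices is hypothetical.

```python
def k_fold_indices(n_samples, k=3):
    """Yield (train_idx, val_idx) pairs for plain, unshuffled k-fold splitting."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size
```

Each sample lands in the validation fold exactly once, and the CV score is the mean of the per-fold scores.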

📈

Overfitting detection

Automatically tracks train vs. validation accuracy side-by-side so you can spot overfitting at a glance.
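The comparison behind this check is simple arithmetic. The sketch below shows one way to read the two numbers. The function names and the 0.10 threshold are illustrative assumptions, not PyMLens defaults.

```python
def accuracy_gap(train_accuracy, val_accuracy):
    """Train minus validation accuracy; a large positive gap hints at overfitting."""
    return train_accuracy - val_accuracy


def looks_overfit(train_accuracy, val_accuracy, threshold=0.10):
    # threshold is an arbitrary illustrative cut-off, not a library setting
    return accuracy_gap(train_accuracy, val_accuracy) > threshold
```

A model that scores 1.0 on training data but 0.8 on validation data would be flagged, while a gap of a few points would not.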

🧬

AI DNA Report

Groq-powered LLaMA 3.1 analysis: per-model score interpretation, improvement suggestions, and a best-model verdict.


Two lines to
track a model.

Classification or regression — the API is identical. Wrap, add models, done.

Classification
from pymlens import Classification_Experiment
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

x, y = load_iris(return_X_y=True)
xtrain, xval, ytrain, yval = train_test_split(
    x, y, test_size=0.2, random_state=42
)

with Classification_Experiment(
    "Iris_Classification", xtrain, ytrain, xval, yval
) as exp:
    exp.Start_experiment(
        model=LogisticRegression(),
        exp_keyword="Logistic_reg",
        cross_val=True
    )
    exp.Start_experiment(
        model=RandomForestClassifier(),
        exp_keyword="RF_baseline"
    )
    exp.Start_experiment(
        model=SVC(),
        exp_keyword="SVM_rbf"
    )

Start_experiment params

All parameters available on the context manager's Start_experiment call.

model required

Any scikit-learn compatible estimator

exp_keyword str · None

Custom label for this run. Falls back to class name.

cross_val bool

Runs 3-fold cross-validation. Defaults to True for classification and False for regression.
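The exp_keyword fallback described above can be pictured as follows. This is a sketch of the documented behaviour, not the library's code. resolve_label and DummyModel are hypothetical names introduced for illustration.

```python
def resolve_label(model, exp_keyword=None):
    """Use the custom label when given, else the estimator's class name."""
    if exp_keyword is not None:
        return exp_keyword
    return type(model).__name__


class DummyModel:
    """Stand-in for any scikit-learn compatible estimator (illustration only)."""
```

With this logic, a run started without exp_keyword on a RandomForestClassifier would be labelled "RandomForestClassifier".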

🗄️

How data is stored

All results persist in a local SQLite database. Re-running with the same experiment name upserts — no duplicates.

Tables: Experiments, Scores, Regression_Scores
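For readers unfamiliar with upserts, the sketch below shows the general SQLite pattern using an in-memory database. The runs table, its columns, and the accuracy values are placeholders invented for illustration. The actual PyMLens schema is not documented here. The ON CONFLICT clause requires SQLite 3.24 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per (experiment, model) pair.
conn.execute(
    "CREATE TABLE runs (experiment TEXT, model TEXT, accuracy REAL, "
    "PRIMARY KEY (experiment, model))"
)


def save_run(conn, experiment, model, accuracy):
    """Insert a run, or overwrite the stored score if the key already exists."""
    conn.execute(
        "INSERT INTO runs (experiment, model, accuracy) VALUES (?, ?, ?) "
        "ON CONFLICT(experiment, model) DO UPDATE SET accuracy = excluded.accuracy",
        (experiment, model, accuracy),
    )
    conn.commit()


# Saving the same key twice leaves a single row holding the latest value.
save_run(conn, "Iris_Classification", "RF_baseline", 0.93)  # placeholder score
save_run(conn, "Iris_Classification", "RF_baseline", 0.97)  # placeholder score
rows = conn.execute("SELECT experiment, model, accuracy FROM runs").fetchall()
```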

Everything tracked
automatically.

Comprehensive metric coverage for both classification and regression, stored per run in SQLite.

Classification

Metrics

Metric            Description
Accuracy          Validation accuracy
Train Accuracy    Training accuracy (overfitting check)
Precision         Weighted precision
Recall            Weighted recall
F1 Score          Weighted F1
CV Score          Mean 3-fold CV score
Confusion Matrix  Stored as JSON, visualized as a heatmap
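"Weighted" here means each class's score is averaged in proportion to its number of true samples. The sketch below derives the three weighted metrics from a confusion matrix in pure Python and serializes the matrix as JSON. The function name and the example matrix are made up for illustration. PyMLens itself relies on scikit-learn for metric computation.

```python
import json


def weighted_scores(cm):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    precision = recall = f1 = 0.0
    for k in range(n_classes):
        tp = cm[k][k]
        support = sum(cm[k])                                 # true samples of class k
        predicted = sum(cm[i][k] for i in range(n_classes))  # predictions of class k
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"precision": precision, "recall": recall, "f1": f1}


example_cm = [[2, 0], [1, 1]]      # made-up 2-class matrix
encoded = json.dumps(example_cm)   # a nested list serializes directly to JSON
scores = weighted_scores(example_cm)
```

Note that weighted recall always equals overall accuracy, which is a quick sanity check on stored results.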
Regression

Metrics

Metric    Description
MSE       Mean Squared Error
MAE       Mean Absolute Error
RMSE      Root Mean Squared Error
R²        Coefficient of determination
CV Score  neg_mean_squared_error CV (opt-in)
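The regression metrics follow standard definitions, sketched below in pure Python for reference. The function name is hypothetical. As for the CV score, neg_mean_squared_error is the scikit-learn scoring name for negated MSE, so values are negative and closer to zero is better.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, MAE, RMSE and R² from two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R² is undefined when every target value is identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mse": mse, "mae": mae, "rmse": math.sqrt(mse), "r2": r2}
```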

Three pages.
Total visibility.

Launch the Streamlit dashboard with one command and explore every experiment interactively.

$ pymlens dashboard
page 01

📊 Model Comparison

  • Leaderboard sorted by F1 / R²
  • Grouped bar chart across all metrics
  • Radar / spider comparison chart
  • Precision vs. Recall scatter plot
  • Cross-validation stability bars
  • Confusion matrix heatmap
  • Copy hyperparameters as JSON
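A leaderboard of this kind amounts to a descending sort on the chosen metric. The sketch below assumes each run is a dict with a metric key such as "f1" or "r2". That row shape and the function name are assumptions for illustration, not the dashboard's internals.

```python
def leaderboard(rows, key="f1"):
    """Return runs ordered best-first by the given metric (higher is better)."""
    return sorted(rows, key=lambda row: row[key], reverse=True)
```

For regression experiments the same call with key="r2" would give the R² ordering.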
page 02

🌀 Sunburst Explorer

  • Drill-down: All Exps → Exp → Model → Metric
  • Hover to see all metric values
  • Supports classification and regression
  • Fully interactive Plotly chart
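The drill-down hierarchy can be expressed as parallel lists of node ids, labels, and parents, which is the shape Plotly's Sunburst trace accepts through its ids, labels, and parents arguments. The sketch below builds those lists from a nested dict. The input layout and the function name are assumptions, not the dashboard's actual code.

```python
def sunburst_nodes(results, root="All Exps"):
    """results: {experiment: {model: {metric: value}}} -> (ids, labels, parents)."""
    ids, labels, parents = [root], [root], [""]   # the root has an empty parent
    for exp, models in results.items():
        exp_id = f"{root}/{exp}"
        ids.append(exp_id)
        labels.append(exp)
        parents.append(root)
        for model, metrics in models.items():
            model_id = f"{exp_id}/{model}"
            ids.append(model_id)
            labels.append(model)
            parents.append(exp_id)
            for metric in metrics:
                ids.append(f"{model_id}/{metric}")
                labels.append(metric)
                parents.append(model_id)
    return ids, labels, parents
```

Path-style ids keep nodes unique even when two experiments contain a model with the same label.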
page 03

🧬 AI DNA Report

  • Powered by LLaMA 3.1 via Groq API
  • Score interpretation per model
  • Overfitting & CV stability analysis
  • One specific improvement per model
  • Final VERDICT — best model + reason

Up and running
in 3 steps.

01

Install the package

Requires Python ≥ 3.8. Install via pip; all dependencies are pulled in automatically.

pip install pymlens
02

Run your experiments

Wrap your training code in a with block and call Start_experiment for each model.

with Classification_Experiment("my_exp", xtrain, ytrain, xval, yval) as exp:
    exp.Start_experiment(model=RandomForestClassifier(), exp_keyword="RF")
03

Launch the dashboard

One command opens the full Streamlit dashboard in your browser.

pymlens dashboard
04

Optional: enable AI Critics

Get a free Groq API key and save it using the settings utility for LLaMA-powered analysis.

from pymlens import Pymlens_settings

settings = Pymlens_settings()
settings.add_api_key()  # prompts for your Groq key

Container-ready
out of the box.

Run with Docker

The Docker image persists your experiment database across restarts using a named volume. The Groq key is passed as an environment variable at run time and is never baked into the image.

-p 8501:8501 Maps container port to your machine
-v pymlens_data:/root/.pymlens Persists experiment DB across restarts
-e GROQ_API_KEY=... Passes your Groq key securely (optional)
# Build
docker build -t pymlens .

# Run
docker run -p 8501:8501 \
  -v pymlens_data:/root/.pymlens \
  -e GROQ_API_KEY=your_key_here \
  pymlens

# Or with docker-compose (recommended)
docker compose up
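Since the key is optional, it can help to check whether the variable reached the container before opening the AI DNA Report page. The sketch below tests for a non-empty GROQ_API_KEY, the variable name used in the command above. The helper name is hypothetical, and how PyMLens itself reads the key is not documented here.

```python
import os


def groq_key_available(environ=None):
    """Return True if GROQ_API_KEY is set to a non-blank value."""
    env = os.environ if environ is None else environ
    return bool(env.get("GROQ_API_KEY", "").strip())
```

Passing a plain dict instead of relying on os.environ makes the check easy to exercise without touching the real environment.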

Minimal,
purposeful stack.

scikit-learn
Model training & metric computation
streamlit
Dashboard UI
plotly
Interactive charts
groq
AI DNA Report via LLaMA 3.1
pandas
Data manipulation
numpy
Numerical operations
sqlite3
Local storage (stdlib)
python ≥ 3.8
Minimum runtime requirement