Merged
49 changes: 49 additions & 0 deletions .github/workflows/publish.yml
@@ -0,0 +1,49 @@
# Publish to PyPI when a release is created
# See: https://packaging.python.org/en/latest/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/

name: Publish to PyPI

on:
  release:
    types: [published]

jobs:
  build:
    name: Build distribution
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v4

      - name: Build package
        run: uv build

      - name: Store the distribution packages
        uses: actions/upload-artifact@v4
        with:
          name: python-package-distributions
          path: dist/

  publish-to-pypi:
    name: Publish to PyPI
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: pypi
      url: https://pypi.org/p/splinator
    permissions:
      id-token: write  # Required for trusted publishing

    steps:
      - name: Download distributions
        uses: actions/download-artifact@v4
        with:
          name: python-package-distributions
          path: dist/

      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1

32 changes: 16 additions & 16 deletions .github/workflows/python-package.yml
@@ -10,32 +10,32 @@ on:
    branches: [ "main" ]

jobs:
  build:

  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.7", "3.8", "3.9"]
        python-version: ["3.9", "3.10", "3.11", "3.12"]

    steps:
      - uses: actions/checkout@v3
      - uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}
        run: uv python install ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          python -m pip install flake8 pytest
          python -m pip install pdm
          pdm install
        run: uv sync --dev

      - name: Lint with flake8
        run: |
          uv run pip install flake8
          # stop the build if there are Python syntax errors or undefined names
          flake8 ./src --count --select=E9,F63,F7,F82 --show-source --statistics
          uv run flake8 ./src --count --select=E9,F63,F7,F82 --show-source --statistics
          # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
          flake8 ./src --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
          uv run flake8 ./src --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics

      - name: Test with pytest
        run: |
          pdm run -v pytest tests
        run: uv run pytest tests -v
18 changes: 15 additions & 3 deletions README.md
@@ -4,7 +4,7 @@

[scikit-learn](https://scikit-learn.org) compatible

[![pdm-managed](https://img.shields.io/badge/pdm-managed-blueviolet)](https://pdm.fming.dev)
[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv)
[![Documentation Status](https://readthedocs.org/projects/splinator/badge/?version=latest)](https://splinator.readthedocs.io/en/latest/)
[![Build](https://img.shields.io/github/actions/workflow/status/affirm/splinator/.github/workflows/python-package.yml)](https://github.com/affirm/splinator/actions)

@@ -45,9 +45,21 @@ Regression](https://github.com/Affirm/splinator/wiki/Linear-Spline-Logistic-Regr

## Development

The dependencies are managed by [pdm](https://pdm.fming.dev/latest/)
The dependencies are managed by [uv](https://github.com/astral-sh/uv).

To run tests, run `pdm run -v pytest tests`
```bash
# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create virtual environment and install dependencies
uv sync --dev

# Run tests
uv run pytest tests -v

# Run type checking
uv run mypy src/splinator
```

## Example Usage

41 changes: 30 additions & 11 deletions pyproject.toml
@@ -4,31 +4,50 @@ build-backend = "hatchling.build"

[project]
name = "splinator"
version = "0.2.0"
description = "Python library for fitting linear-spine based logistic regression for calibration."
dynamic = ["version"]
description = "Python library for fitting linear-spline based logistic regression for calibration."
authors = [
{name = "Jiarui Xu", email = "[email protected]"},
]
dependencies = [
"scipy<2.0.0,>=1.6.0",
"scikit-learn<2.0.0,>=1.0.0",
"numpy<2.0.0,>=1.19.0",
"pandas<2.0.0,>=1.3.0",
"scipy>=1.6.0",
"scikit-learn>=1.0.0",
"numpy>=1.19.0",
"pandas>=1.3.0",
]
requires-python = ">=3.7.1,<4.0"
requires-python = ">=3.9"
readme = "README.md"
license = {text = "BSD-3-Clause"}
keywords = ["calibration", "logistic", "spline", "regression"]
classifiers = ["Programming Language :: Python :: 3", "License :: OSI Approved :: BSD License", "Operating System :: OS Independent"]
classifiers = [
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"License :: OSI Approved :: BSD License",
"Operating System :: OS Independent",
]

[project.optional-dependencies]
dev = [
"pytest<8.0.0,>=7.1.2",
"matplotlib<4.0.0,>=3.5.3",
"mypy<1.0,>=0.971",
"pytest>=7.1.2",
"matplotlib>=3.5.3",
"mypy>=0.971",
"jupyter>=1.0.0",
]

[tool.hatch.version]
path = "src/splinator/_version.py"

[tool.hatch.build.targets.wheel]
packages = ["src/splinator"]

[tool.uv]
dev-dependencies = [
"pytest>=7.1.2",
"matplotlib>=3.5.3",
"mypy>=0.971",
"jupyter>=1.0.0",
]
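
With `dynamic = ["version"]` and `[tool.hatch.version]` pointing at `src/splinator/_version.py`, the package version is taken from that file at build time instead of being hard-coded in `pyproject.toml`. The lookup can be pictured roughly as below. This is only an illustrative sketch: `read_version` is a hypothetical helper, it assumes the file holds a plain `__version__ = "..."` assignment, and it is not how hatchling itself is implemented.

```python
import ast


def read_version(source: str) -> str:
    """Return the string assigned to __version__ in a module's source text.

    Hypothetical helper for illustration; assumes a top-level assignment
    of a string literal, e.g. __version__ = "0.3.0".
    """
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "__version__":
                # literal_eval only accepts literals, so no code is executed.
                return ast.literal_eval(node.value)
    raise ValueError("no top-level __version__ assignment found")


if __name__ == "__main__":
    print(read_version('__version__ = "0.3.0"\n'))
```

Keeping the version in one file means `splinator.__version__` and the published package metadata cannot drift apart, which is what happened before this change (`0.2.0` in `pyproject.toml`, `0.0.1` in `_version.py`).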

11 changes: 11 additions & 0 deletions src/splinator/__init__.py
@@ -1 +1,12 @@
from ._version import __version__
from .estimators import LinearSplineLogisticRegression
from .metrics import expected_calibration_error, spiegelhalters_z_statistic
from .monotonic_spline import Monotonicity

__all__ = [
"__version__",
"LinearSplineLogisticRegression",
"expected_calibration_error",
"spiegelhalters_z_statistic",
"Monotonicity",
]
2 changes: 1 addition & 1 deletion src/splinator/_version.py
@@ -1 +1 @@
__version__ = "0.0.1"
__version__ = "0.3.0"
104 changes: 84 additions & 20 deletions src/splinator/estimators.py
@@ -2,7 +2,7 @@
from scipy.optimize import LinearConstraint, minimize
from scipy.special import expit, log_expit
from sklearn.base import BaseEstimator, RegressorMixin, TransformerMixin
from sklearn.utils.validation import check_array, check_random_state, _check_sample_weight, check_consistent_length
from sklearn.utils.validation import check_random_state, validate_data
from sklearn.exceptions import DataConversionWarning, NotFittedError
from warnings import warn
from splinator.monotonic_spline import (
@@ -70,15 +70,46 @@ def grad(self, coefs):


class LinearSplineLogisticRegression(RegressorMixin, TransformerMixin, BaseEstimator):
    """ Piecewise Logistic Regression with Linear Splines
    """Piecewise Logistic Regression with Linear Splines.

    For more information regarding how to build your own estimator, read more
    in the :ref:`User Guide <user_guide>`.
    A scikit-learn compatible estimator that fits a piecewise linear function
    in log-odds space for probability calibration. Supports monotonicity
    constraints and L2 regularization.

    Parameters
    ----------
    demo_param : str, default='demo_param'
        A parameter used for demonstation of how to pass and store paramters.
    input_score_column_index : int, default=0
        Index of the column in X to use as the primary input score for spline fitting.
    n_knots : int or None, default=100
        Number of knots to automatically place at quantiles of the input distribution.
        Only one of `knots` and `n_knots` should be provided.
    knots : array-like of float or None, default=None
        Explicit knot positions. Only one of `knots` and `n_knots` should be provided.
    monotonicity : str, default='none'
        Monotonicity constraint: 'none', 'increasing', or 'decreasing'.
    intercept : bool, default=True
        Whether to include an intercept term in the model.
    method : str, default='SLSQP'
        Optimization method for scipy.optimize.minimize. Supports 'SLSQP' or 'trust-constr'.
    minimizer_options : dict or None, default=None
        Additional options passed to the scipy minimizer.
    C : float, default=100
        Inverse of regularization strength (larger values = weaker regularization).
    two_stage_fitting_initial_size : int or None, default=None
        If provided, performs initial fit on a subsample of this size for faster convergence.
    random_state : int, default=31
        Random seed for reproducibility.
    verbose : bool, default=False
        Whether to print verbose output during fitting.

    Attributes
    ----------
    coefficients_ : ndarray
        Fitted coefficients after training.
    knots_ : ndarray
        The knot positions used in the model.
    n_features_in_ : int
        Number of features seen during fit.

    Examples
    --------
@@ -98,7 +129,7 @@ def __init__(
        monotonicity: str = Monotonicity.none.value,
        intercept: bool = True,
        method: str = MinimizationMethod.slsqp.value,
        minimizer_options: Dict[str, Any] = None,
        minimizer_options: Optional[Dict[str, Any]] = None,
        C: int = 100,
        two_stage_fitting_initial_size: int = None,
        random_state: int = 31,
@@ -195,7 +226,7 @@ def _fit(self, X, y, initial_guess=None):
            jac=lgh.grad,
            method=self.method,
            constraints=constraint,
            options=self.minimizer_options,
            options=self.minimizer_options or {},
        )
        self.result_ = result
        optimization_message = "The minimization failed with message: '{message}'".format(message=result.message)
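
The `Optional[Dict[str, Any]] = None` annotation together with `self.minimizer_options or {}` follows the usual pattern for dict-valued parameters: keep `None` as the stored default and substitute an empty dict only at the point of use, so the constructor argument is left exactly as the caller passed it. A minimal sketch of the same idea, where `resolve_options` is a hypothetical helper that does not exist in the library:

```python
from typing import Any, Dict, Optional


def resolve_options(minimizer_options: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Turn an optional options dict into a concrete dict for the optimizer.

    Hypothetical helper: None (or an empty dict) becomes {}, and a copy is
    returned so the caller's dict is never mutated downstream.
    """
    return dict(minimizer_options or {})


if __name__ == "__main__":
    print(resolve_options())                 # no options supplied
    print(resolve_options({"maxiter": 50}))  # options passed through
```

Resolving the default at call time rather than in `__init__` also keeps `get_params()` and `clone()` reporting the value the user actually supplied.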
@@ -225,23 +256,26 @@ def fit(self, X, y):
        We use two_stage_fitting_size as the sampling size.
        """
        self.random_state_ = check_random_state(self.random_state)
        check_params = dict(accept_sparse=False, ensure_2d=False)

        X = check_array(X, dtype=[np.float64, np.float32], **check_params)
        self.n_features_in_ = 1 if X.ndim == 1 else X.shape[1]
        # Validate X and y, this sets n_features_in_ automatically
        X, y = validate_data(
            self,
            X,
            y,
            accept_sparse=False,
            ensure_2d=True,
            dtype=[np.float64, np.float32],
            y_numeric=True,
            multi_output=False,
        )

        if y is None:
            raise ValueError("y should be a 1d array")
        y = check_array(y, dtype=X.dtype, **check_params)
        if y.ndim > 1:
            warn(
                "A column-vector y was passed when a 1d array was expected.",
                DataConversionWarning,
            )
            y = y[:, 0]

        check_consistent_length(X, y)

        if self.n_knots and self.knots is None:
            # only n_knots given so we create knots
            self.n_knots_ = min([self.n_knots, X.shape[0]])
@@ -253,7 +287,7 @@ def fit(self, X, y):
            raise ValueError("knots and n_knots cannot be both null or non-null")

        if self.method not in ['SLSQP', 'trust-constr']:
            raise ValueError("optimization method can only be either 'SLSQP' or 'trust-contr'")
            raise ValueError("optimization method can only be either 'SLSQP' or 'trust-constr'")

        if self.two_stage_fitting_initial_size is None:
            self._fit(X, y, initial_guess=None)
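
The `'trust-contr'` typo fixed in this hunk existed because the allowed method names were spelled out twice, once in the membership check and once in the error message. One way to avoid that class of mistake is to derive both from a single enum. The sketch below uses a stand-in `MinimizationMethod`: the diff only shows the `slsqp` member, so the `trust_constr` member name and the `check_method` helper are assumptions, not the library's code.

```python
from enum import Enum


class MinimizationMethod(Enum):
    # Stand-in for the library's enum. Only `slsqp` is visible in the diff;
    # the `trust_constr` member name is assumed for illustration.
    slsqp = "SLSQP"
    trust_constr = "trust-constr"


def check_method(method: str) -> str:
    """Validate `method` against the enum and return it unchanged."""
    allowed = [m.value for m in MinimizationMethod]
    if method not in allowed:
        # The message is built from the same list used for the check,
        # so the two cannot disagree.
        raise ValueError(
            "optimization method must be one of {}, got {!r}".format(allowed, method)
        )
    return method


if __name__ == "__main__":
    print(check_method("SLSQP"))
```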
@@ -279,8 +313,16 @@ def transform(self, X):
            raise NotFittedError(
                "predict or transform is not available if the estimator was not fitted"
            )
        check_params = dict(accept_sparse=False, ensure_2d=False)
        X = check_array(X, dtype=[np.float64, np.float32], **check_params)

        # Validate X and check n_features_in_ consistency
        X = validate_data(
            self,
            X,
            accept_sparse=False,
            ensure_2d=True,
            dtype=[np.float64, np.float32],
            reset=False,
        )

        design_X = _get_design_matrix(
            inputs=self.get_input_scores(X),
@@ -304,9 +346,31 @@ def is_fitted(self) -> bool:
        """
        return hasattr(self, 'coefficients_')

    def __sklearn_tags__(self):
        """
        Define sklearn tags for scikit-learn >= 1.6.

        Returns
        -------
        tags : sklearn.utils.Tags
        """
        from sklearn.utils import Tags, TargetTags, InputTags, RegressorTags

        tags = super().__sklearn_tags__()
        tags.target_tags = TargetTags(
            required=True,
            one_d_labels=True,
            two_d_labels=False,
            positive_only=False,
            multi_output=False,
            single_output=True,
        )
        tags.regressor_tags = RegressorTags(poor_score=True)
        return tags

    def _more_tags(self) -> Dict[str, bool]:
        """
        Override default sklearn tags (sklearn.utils._DEFAULT_TAGS)
        Override default sklearn tags for scikit-learn < 1.6.

        Returns
        -------
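
The estimator keeps `_more_tags` for scikit-learn < 1.6 and adds `__sklearn_tags__` for >= 1.6, but the module-level import of `validate_data` is unconditional, and that public function also first appeared in scikit-learn 1.6. Since `pyproject.toml` still declares `scikit-learn>=1.0.0`, an install with an older release would presumably fail at import time. A version gate for this could look like the sketch below; `has_public_validate_data` is a hypothetical helper and the 1.6 threshold is the only library fact it relies on.

```python
import re


def has_public_validate_data(sklearn_version: str) -> bool:
    """Return True if this scikit-learn version ships the public validate_data.

    Hypothetical helper: compares only the leading "major.minor" numbers,
    so strings like "1.7.dev0" are handled as well.
    """
    match = re.match(r"(\d+)\.(\d+)", sklearn_version)
    if match is None:
        raise ValueError("unrecognised version string: {!r}".format(sklearn_version))
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (1, 6)


if __name__ == "__main__":
    for version in ("1.5.2", "1.6.0"):
        print(version, has_public_validate_data(version))
```

An equivalent and arguably simpler option is to wrap the import in `try`/`except ImportError` with a fallback, or to raise the declared lower bound to `scikit-learn>=1.6` so that the metadata matches what the code needs.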
1 change: 1 addition & 0 deletions tests/common.py
@@ -3,6 +3,7 @@
from typing import TYPE_CHECKING, NamedTuple
import itertools
import numpy as np
import pandas as pd

from splinator.monotonic_spline import Monotonicity
from splinator.estimators import LinearSplineLogisticRegression, MinimizationMethod