Each function is tested for speed and quality using a modified version of
scipy.optimize.curve_fit as the gold standard. curve_fit was chosen
because it is widely used and readily accessible. If this package
provides benefits to users of curve_fit, it will likely improve the
experience for users of other packages like lmfit as well.
Each test computes a quality metric that is stored in a global dictionary keyed
by test ID. The dictionary is set up by the
tests.conftest.quality_metric fixture. Quality metrics are aggregated
and published in an HTML file upon completion.
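The bookkeeping described above could be sketched roughly as follows, using only the standard library. The names `QUALITY_METRICS`, `record_metric`, and `render_report` are hypothetical and are not taken from the package; in the actual suite the dictionary is set up by the tests.conftest.quality_metric fixture, whose implementation is not reproduced here.

```python
from html import escape

# Hypothetical module-level registry: test ID -> quality metric.
QUALITY_METRICS = {}


def record_metric(test_id, value):
    """Store the quality metric computed by one test under its ID."""
    QUALITY_METRICS[test_id] = float(value)


def render_report(metrics):
    """Aggregate the recorded metrics into a small HTML table."""
    rows = "\n".join(
        f"<tr><td>{escape(test_id)}</td><td>{value:.3g}</td></tr>"
        for test_id, value in sorted(metrics.items())
    )
    return (
        "<table>\n<tr><th>Test ID</th><th>Quality metric</th></tr>\n"
        f"{rows}\n</table>"
    )


# Made-up test IDs and values, for illustration only.
record_metric("test_exp_fit[noisy]", 1.02)
record_metric("test_gauss_fit[noisy]", 0.97)
report = render_report(QUALITY_METRICS)
```

Keying by test ID means that rerunning a test simply overwrites its earlier entry, so the report always reflects the latest run.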
The following tests are run for every fitting function in the scikit:
- A simple benchmark of the fit against the modified curve_fit.
- Checks that curve_fit converges faster using the fit as an initial guess than it would with default parameters.
- Total Speed
- Benchmarks the fit + curve_fit with the fit as an initial guess against just curve_fit with the default guess. Passing this test is optional.
- The RMS of the data about the function fitted with the fitting
routine must be within an acceptable threshold of the RMS of the
data about the result of curve_fit.
- The parameters computed by the fitting function must be within
acceptable thresholds of the parameters computed by curve_fit.
- Pathological Cases
- Each function has pathological cases that it cannot handle properly. Tests for each function should be included on a case-by-case basis to ensure the contractual behavior in these cases. For example, exponential fits cannot be made to collinear data unless the data are horizontal. All distributions run into issues when they are no longer over-determined.
- An implicit test of curve_fit is done with each dataset to ensure that the RMS of the data about the fit is lower than the RMS of the data about the model with the initial fitting parameters (only for noisy datasets).
- Input Parameters
- Each input parameter should be tested individually to check for contractual behavior. Specifically, Weibull and power fits need to be careful about negative numbers in the input.
- If the source material on which the algorithm is based provides a sample, a “Paper Test” may be added to verify the results against what should be an independent implementation.
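The RMS and parameter comparisons in the list above might be expressed along the following lines. This is only a sketch under stated assumptions: the helper names and the tolerance values are placeholders rather than the thresholds the test suite actually uses, and the reference parameters are hard-coded here instead of being obtained from curve_fit.

```python
import math


def rms_residual(model, params, x, y):
    """Root-mean-square of the data about model(x, *params)."""
    squares = [(yi - model(xi, *params)) ** 2 for xi, yi in zip(x, y)]
    return math.sqrt(sum(squares) / len(squares))


def rms_within_threshold(rms_fit, rms_ref, rel_tol=0.1):
    """True if the fit's RMS is at most rel_tol worse than the reference.

    rel_tol is an assumed placeholder, not the suite's real threshold.
    """
    return rms_fit <= rms_ref * (1.0 + rel_tol)


def params_within_threshold(p_fit, p_ref, rel_tol=0.05):
    """True if every fitted parameter is close to its reference value."""
    return all(
        math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(p_fit, p_ref)
    )


def exponential(x, a, b):
    """Example model: a * exp(b * x)."""
    return a * math.exp(b * x)


# Made-up data and parameters, for illustration only. In a real test,
# p_ref would come from curve_fit and p_fit from the function under test.
x = [0.0, 0.5, 1.0, 1.5, 2.0]
y = [2.05, 2.60, 3.25, 4.30, 5.40]
p_ref = (2.0, 0.5)
p_fit = (2.02, 0.49)

rms_ref = rms_residual(exponential, p_ref, x, y)
rms_fit = rms_residual(exponential, p_fit, x, y)
rms_ok = rms_within_threshold(rms_fit, rms_ref)
params_ok = params_within_threshold(p_fit, p_ref)
```

A relative tolerance is used here so that the same check applies to parameters of very different magnitudes; parameters that may legitimately be zero would need an absolute tolerance as well.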
Add a partial domain test, e.g., a Gaussian without the peak portion. This is not always valid, e.g., for an exponential, which has no partial domain.
Add a test to show that it works with sorted=True and x-data reversed.
Test environments can be created under conda with the following commands:
conda create --name skg-testing-py3.6 --no-default-packages python=3.6 nomkl numpy scipy pytest sphinx sphinx_rtd_theme
conda create --name skg-testing-py3.7 --no-default-packages python=3.7 nomkl numpy scipy pytest sphinx sphinx_rtd_theme
...