Contributing
If you have an idea or would like to correct something, please submit an issue or a pull request.
If you have questions or are not sure if something is a bug, feel free to submit an issue, or ask a question tagged scikit-guess on Stack Overflow.
This scikit is still in early stages, so please provide all the criticism and advice you can. Any support at all is welcome. In particular, the following areas are under construction:
- Proper packaging (setup.py)
- Proper testing setup (e.g. TravisCI, Appveyor). See https://blog.ionelmc.ro/2014/05/25/python-packaging for reference.
- pandas support
- Adding new algorithms
If you are really interested in going down this path, please read on:
Project Structure
Each fitting algorithm resides in its own module. All the functions get
imported into the base skg namespace. Each module should contain a
function called model that applies the fitting parameters to a given set of
x-values, either raveled or along a particular axis (assuming the function is
1D). Multiple algorithms that fit to the same model can live in the same
module. There is no standard yet for resolving naming conflicts in such cases.
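As a rough illustration of the layout described above, here is a minimal sketch of what one such module might look like. The names line_fit, the axis keyword and the straight-line model are hypothetical and chosen only for this example; they are not taken from the skg code base, and the treatment of axis=None as "ravel everything" is an assumption.

```python
import numpy as np


def line_fit(x, y, axis=None):
    """Hypothetical fitting routine: closed-form least squares for y = a + b * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if axis is None:
        # Assumption: no axis means "treat all the data as one raveled 1D set".
        x, y, axis = x.ravel(), y.ravel(), 0
    xm = x.mean(axis=axis, keepdims=True)
    ym = y.mean(axis=axis, keepdims=True)
    b = ((x - xm) * (y - ym)).sum(axis=axis) / ((x - xm) ** 2).sum(axis=axis)
    a = ym.squeeze(axis=axis) - b * xm.squeeze(axis=axis)
    return a, b


def model(x, a, b, axis=None):
    """Apply the fitted parameters to x, either raveled or along one axis."""
    x = np.asarray(x, dtype=float)
    if axis is None:
        return a + b * x.ravel()
    # Re-insert the fitted axis so the parameters broadcast against x.
    return np.expand_dims(a, axis) + np.expand_dims(b, axis) * x


# Usage: one independent fit per row of a 2D array.
x2 = np.array([[0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]])
y2 = np.array([[1.0, 3.0, 5.0, 7.0], [0.0, -1.0, -2.0, -3.0]])
a2, b2 = line_fit(x2, y2, axis=1)
y2_hat = model(x2, a2, b2, axis=1)
```

Keeping the fitting function and model side by side in one module means a test harness can discover both by name and feed the output of one into the other.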
Testing
Testing is done using the pytest framework. A test module for every main
module exists in the skg.tests package.
Tests for new modules are generated in a semi-automated manner (still WIP). All the modules containing a fitting function and a model will be tested against randomly generated inputs, and checked for speed and quality. The quality of each algorithm will be assessed based on these tests. Quality has three categories: speed, accuracy and usefulness.
- Speed is a benchmark against scipy.optimize.curve_fit. An algorithm that is slower than a non-linear optimizer starting with default parameters is not deemed very useful.
- Accuracy is checked by making sure that the fit is within reasonable bounds of the values computed by scipy.optimize.curve_fit. Reasonableness is a function of the analytically derived partial derivatives of the model with respect to the parameters.
- Usefulness is a measure of how many iterations scipy.optimize.curve_fit saves by using the algorithm as an initial guess. Another informal metric is the combined runtime of the algorithm and curve_fit vs. the runtime of just curve_fit with default parameters. If the latter exceeds the former, that’s a win.
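The usefulness metric can be sketched roughly as follows. This is not the project's actual test harness: the exponential model, the log_linear_guess estimator (a plain log-linear fit standing in for a real skg algorithm) and the helper fit_and_count are all hypothetical. The number of model evaluations is used here as a proxy for iterations, and p0 = (1, 1) is passed explicitly because all ones is what curve_fit uses when no initial guess is given.

```python
import numpy as np
from scipy.optimize import curve_fit


def model(x, a, b):
    """Exponential model y = a * exp(b * x), used only for this sketch."""
    return a * np.exp(b * x)


def log_linear_guess(x, y):
    """Stand-in non-iterative estimator: straight-line fit to log(y); needs y > 0."""
    b, log_a = np.polyfit(x, np.log(y), 1)
    return np.exp(log_a), b


def fit_and_count(x, y, p0):
    """Run curve_fit from p0 and count how often the model gets evaluated."""
    calls = 0

    def counted(x, a, b):
        nonlocal calls
        calls += 1
        return model(x, a, b)

    popt, _ = curve_fit(counted, x, y, p0=p0)
    return popt, calls


rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 50)
y = model(x, 2.0, 0.5) + rng.normal(scale=0.01, size=x.size)

guess = log_linear_guess(x, y)
popt_default, calls_default = fit_and_count(x, y, p0=(1.0, 1.0))
popt_seeded, calls_seeded = fit_and_count(x, y, p0=guess)

print("evaluations from default start:", calls_default)
print("evaluations from seeded start: ", calls_seeded)
```

Wrapping each of the two paths in time.perf_counter calls would give the informal combined-runtime comparison in the same way. The accuracy check could likewise compare guess against popt_default, although the tolerance derived from partial derivatives is not reproduced here.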
See the Testing SKG page for information on how to run and modify the tests.