Bootstrap Aggregating in Javascript Math Toolkit


I had previously planned on converting most of my Actionscript/Flex stroke engine library to the Javascript Math Toolkit.  Of course, as life always seems to dictate, plans rarely work out as planned.  Over the last couple of months I’ve had numerous discussions about moving back-end or internal models to the client for performance reasons.  Two of these conversations involved analyses originally developed in R, and one person was interested in a server-side JS implementation.  Now, that was interesting :)

Data analysis or data science is a huge topic these days, so I changed my mind and have circled back around to the statistics and data analytics capability of the Javascript Math Toolkit.

I’ve always found bagging (bootstrap aggregating) interesting, especially the question of where it applies in practice.  I still leave that question to the professional statisticians; my job is to understand what they need and deliver it in high-quality mathematical software.

The JS Math Toolkit now contains a class for bagging and sub-bagging of one- and two-dimensional numerical datasets.  The linear least squares model has a bagged equivalent as well.
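To make the bag versus sub-bag distinction concrete, here is a minimal sketch of how a single bag might be drawn from paired data.  The function name drawBag and its arguments are hypothetical and are not the toolkit's API.  It assumes a bag is a sample drawn with replacement, while a sub-bag is a smaller sample drawn without replacement.

```javascript
// Hypothetical helper (not part of the toolkit): draw one bag from paired data.
// withReplacement = true  -> classic bootstrap bag (duplicates allowed)
// withReplacement = false -> sub-bag (distinct points, size clamped to x.length)
// rng is an optional function returning a number in [0, 1); defaults to Math.random.
function drawBag(x, y, size, withReplacement, rng) {
  rng = rng || Math.random;
  var n = x.length;
  var indices = [];
  var i, j, tmp;

  if (withReplacement) {
    for (i = 0; i < size; ++i) {
      indices.push(Math.floor(rng() * n));
    }
  } else {
    size = Math.min(size, n);

    // partial Fisher-Yates shuffle; the first `size` entries form the sample
    var pool = [];
    for (i = 0; i < n; ++i) {
      pool.push(i);
    }
    for (i = 0; i < size; ++i) {
      j = i + Math.floor(rng() * (n - i));
      tmp = pool[i];
      pool[i] = pool[j];
      pool[j] = tmp;
      indices.push(pool[i]);
    }
  }

  // keep each (x, y) pair together
  return {
    x: indices.map(function (k) { return x[k]; }),
    y: indices.map(function (k) { return y[k]; })
  };
}
```

Passing the generator as an argument keeps the sampling reproducible in tests, since a seeded generator can be swapped in for Math.random.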

Here is a screenshot of a hypothetical example with synthetic data.  The original linear model is plotted in blue and eight bags are shown in light yellow.  The aggregated fit is shown in red.

[Screenshot: bllsq]

This analysis is accomplished with a single method call.

var baggedFit = bllsq.bagFit(_x, _y, 8);
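The internals of bagFit are not shown in this post, so the following is only a sketch of the general idea and not the toolkit's implementation.  It assumes each bag is a bootstrap sample the same size as the data, fits a line to each bag with ordinary least squares, and averages the slopes and intercepts.  The names linearFit and bagLinearFit are hypothetical.

```javascript
// Ordinary least squares for y = a*x + b.
// Returns null when every x in the sample is identical (no unique line).
function linearFit(x, y) {
  var n = x.length;
  var sx = 0, sy = 0, sxx = 0, sxy = 0;
  for (var i = 0; i < n; ++i) {
    sx += x[i];
    sy += y[i];
    sxx += x[i] * x[i];
    sxy += x[i] * y[i];
  }
  var denom = n * sxx - sx * sx;
  if (Math.abs(denom) < 1e-12) {
    return null;
  }
  var a = (n * sxy - sx * sy) / denom;
  var b = (sy - a * sx) / n;
  return { a: a, b: b };
}

// Hypothetical sketch of a bagged linear fit (an assumption, not bllsq.bagFit itself).
// rng is an optional function returning a number in [0, 1).
function bagLinearFit(x, y, numBags, rng) {
  rng = rng || Math.random;
  var n = x.length;
  var fits = [];
  var sumA = 0, sumB = 0;

  for (var bag = 0; bag < numBags; ++bag) {
    // bootstrap sample: n draws with replacement, pairs kept together
    var bx = [], by = [];
    for (var i = 0; i < n; ++i) {
      var k = Math.floor(rng() * n);
      bx.push(x[k]);
      by.push(y[k]);
    }
    var fit = linearFit(bx, by);
    if (fit !== null) {          // skip degenerate bags
      fits.push(fit);
      sumA += fit.a;
      sumB += fit.b;
    }
  }

  if (fits.length === 0) {
    return null;
  }
  // aggregated line plus the per-bag fits (e.g. for plotting each bag)
  return { a: sumA / fits.length, b: sumB / fits.length, fits: fits };
}
```

For a straight-line model, averaging the coefficients gives the same line as averaging the predictions of the individual fits, because the prediction is linear in the slope and intercept.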

General polynomial least squares is next.

The DataStats class has been enhanced, but is still only about 70% finished.  After that, my inclination is to move directly to machine-learning algorithms (yes, with support for boosting).

But, perhaps that plan will get changed a bit as well.  I suppose it depends on who I talk to next :)

Update – Polynomial least squares is now in the toolkit.  The fit is computed by assembling the normal equations and solving them with the dense solver in the Matrix class, so it is suitable for low-order polynomials.  The normal equations can be poorly conditioned, so a future version will perform a quick condition estimate.  If the system is poorly conditioned, the problem will be reformulated and solved via SVD.  That will be a version 2.0 enhancement at the earliest.
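As a rough illustration of the normal-equations approach, here is a minimal sketch.  It is not the toolkit's code: the name polyFit is hypothetical, and plain Gaussian elimination with partial pivoting stands in for the Matrix class's dense solver.  The singularity threshold is an arbitrary assumption.

```javascript
// Hypothetical sketch: fit y ~ c[0] + c[1]*x + ... + c[degree]*x^degree
// by forming the normal equations (A^T A) c = A^T y and solving them directly.
// Returns the coefficient array, or null if the system looks singular.
function polyFit(x, y, degree) {
  var m = degree + 1;
  var n = x.length;
  var s = [], t = [], M = [], c = [];
  var i, j, k, p, xp;

  // s[p] = sum of x^p for p = 0..2*degree; t[p] = sum of y*x^p for p = 0..degree
  for (p = 0; p <= 2 * degree; ++p) s.push(0);
  for (p = 0; p < m; ++p) t.push(0);
  for (i = 0; i < n; ++i) {
    xp = 1;
    for (p = 0; p <= 2 * degree; ++p) {
      s[p] += xp;
      if (p < m) t[p] += xp * y[i];
      xp *= x[i];
    }
  }

  // augmented matrix [A^T A | A^T y]; entry (i, j) of A^T A is s[i + j]
  for (i = 0; i < m; ++i) {
    M.push([]);
    for (j = 0; j < m; ++j) M[i].push(s[i + j]);
    M[i].push(t[i]);
  }

  // forward elimination with partial pivoting
  for (k = 0; k < m; ++k) {
    var pivot = k;
    for (i = k + 1; i < m; ++i) {
      if (Math.abs(M[i][k]) > Math.abs(M[pivot][k])) pivot = i;
    }
    if (Math.abs(M[pivot][k]) < 1e-12) return null; // crude singularity check
    var tmp = M[k];
    M[k] = M[pivot];
    M[pivot] = tmp;
    for (i = k + 1; i < m; ++i) {
      var f = M[i][k] / M[k][k];
      for (j = k; j <= m; ++j) M[i][j] -= f * M[k][j];
    }
  }

  // back substitution
  for (i = m - 1; i >= 0; --i) {
    var sum = M[i][m];
    for (j = i + 1; j < m; ++j) sum -= M[i][j] * c[j];
    c[i] = sum / M[i][i];
  }
  return c;
}
```

The matrix entries are sums of powers of x up to twice the polynomial degree, which is one way to see why conditioning degrades quickly as the degree grows and why this approach is best kept to low-order fits.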

[Screenshot: pllsq]

Quadratic, cubic, and quartic fits are shown above.
