**New as of 2/21/01: Here is a link to a version that uses a Java
translation of a Minpack
nonlinear least squares routine to fit
multiple data sets. (This
version does not produce residual plots. It does, however, allow
mouse-based "graphical" changes in the starting values used to
estimate the parameters of the nonlinear least squares model.)**

**New as of 3/2/01: Here is a link to a version that uses Java
translations of some of the LINPACK numerical linear algebra routines
to handle a model that is more complex than a simple linear
regression. (In fact, the model at this link is just
a + b (x - c)^2, but the LINPACK
routines that are used can handle any general linear model.)**
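
One way to see why linear algebra routines are enough for this model is to expand the square. This is only a sketch, and it assumes that c is estimated along with a and b, which the page does not state; the symbols \beta_0, \beta_1, \beta_2 are introduced here for illustration.

$$
a + b\,(x - c)^2 = \beta_0 + \beta_1 x + \beta_2 x^2,
\qquad \beta_0 = a + b c^2, \quad \beta_1 = -2 b c, \quad \beta_2 = b .
$$

The right-hand side is linear in the coefficients, so it can be fit by linear least squares. When \beta_2 \neq 0, the original parameters are recovered as

$$
b = \beta_2, \qquad c = -\frac{\beta_1}{2 \beta_2}, \qquad a = \beta_0 - \frac{\beta_1^2}{4 \beta_2} .
$$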

Researchers sometimes encounter data sets that consist of many curves. The researchers may need to "reduce" these curves further before performing any statistical tests. Some examples:

- A researcher is interested in the effect of various preservative treatments on the modulus of elasticity of several grades of lumber. The design is a two-way analysis of variance (ANOVA). The data points in each cell are modulus of elasticity values obtained from stress-strain curves.
- A researcher is interested in the effect of several chemical inhibitors on the growth of several fungus species. Again, the design is a two-way ANOVA. The data points in each cell are 50% inhibition doses obtained from colony size versus dose curves.
- A researcher is interested in the effects of linerboard source, corrugating medium, and adhesive type on the creep behavior of corrugated fiberboard under cyclic humidity conditions. Here the design is a three-way ANOVA. The data points in each cell are steady-state creep rates obtained from strain versus time curves.

In all of these examples, before the ANOVAs can be performed, the data curves must be reduced to single numbers. Sometimes this process is straightforward and can be automated. In other cases, especially when the curves contain "glitches," human intervention is required: an investigator needs to look at a plot of each data curve, select a region to fit, look at the results of the fit, decide whether an improved fit is needed, and so on. When the data set consists of hundreds of curves, this can be a painful process. We hope that our program will ease the pain.

The program is available as both a Java applet and a Java application. Alpha source code is available in both compressed tar and zip form. Here is javadoc documentation for some of this material.

The version here fits simple linear regressions. Here is a demonstration applet.

[Here are Java translations of LINPACK Cholesky, QR, singular value, and LU decomposition routines. Here are Java translations of several high-quality public domain optimization packages, including Minpack nonlinear least squares routines. These were not used in this application, but they might be useful to someone who wants to produce their own regression applications.]

The program is designed to work as follows:

- A user provides it with a data file. The file consists of a
line that contains a curve id, followed by lines containing x,y pairs,
followed by the id of the next curve, followed by its x,y pairs, and
so on. (If you foresee that you might find a use for this program, but
would prefer a different form for the input, please contact me at the
e-mail address given below.) Currently, the demonstration applet reads a data set
consisting of a collection of stress versus strain curves.

**As of 10/26/96, you can provide your own data set. Upload it by anonymous ftp to www1.fpl.fs.fed.us, placing it in the pub/data directory. Then, after you start the demonstration applet, replace "demo_input" in the first applet window with the name of your data file.**
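
The input layout described above (a curve ID line, then lines of x,y pairs, then the next ID, and so on) could be read with something like the following. This is a minimal sketch, not the program's actual reader. The class name `CurveFileParser`, the comma-or-whitespace separator, the skipping of blank lines, and the rule that any line that does not parse as two numbers is a curve ID are all assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for the "ID line, then x,y lines" layout (not the applet's own code).
class CurveFileParser {

    // Returns the curves in file order, keyed by curve ID; each point is {x, y}.
    static Map<String, List<double[]>> parse(List<String> lines) {
        Map<String, List<double[]>> curves = new LinkedHashMap<>();
        List<double[]> current = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;                      // assumption: blank lines are ignored
            }
            double[] point = parsePoint(line);
            if (point == null) {               // not a pair, so treat the line as a curve ID
                current = new ArrayList<>();
                curves.put(line, current);
            } else if (current != null) {
                current.add(point);
            } else {
                throw new IllegalArgumentException("x,y pair before any curve ID: " + line);
            }
        }
        return curves;
    }

    // Assumption: a pair is two numbers separated by a comma and/or whitespace.
    private static double[] parsePoint(String line) {
        String[] parts = line.split("[,\\s]+");
        if (parts.length != 2) {
            return null;
        }
        try {
            return new double[] { Double.parseDouble(parts[0]), Double.parseDouble(parts[1]) };
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

The lines could come from `java.nio.file.Files.readAllLines`. Under these assumptions, a curve ID that itself looks like two numbers (for example `12 3`) would be misread as a data point, so a real reader would need a firmer rule for telling IDs from data.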

- When the program begins, the user is presented with a window
that contains text fields for the name of the input data file,
the prefix of the output data files, and labels for some of the plots.
After these fields have been filled, the user clicks on the
**Go** button to proceed.
- After the **Go** button is clicked, the program loads the first curve in the data set. A user can advance through the collection of data curves by pressing the **Next** button. Alternatively, a user can specify a particular curve by typing its ID into the text field next to the **Load** button, and then pressing the **Load** button.
- A user excludes rectangular regions of data by pushing a mouse
button down at one corner of the rectangle, holding the button down
while moving the pointer to the opposite corner of the rectangle, and
then releasing the button. While a user is performing this movement,
an animated red rectangle appears that outlines the current exclusion
region. After the button is released the outline of the rectangle
turns black, and it can no longer be moved. However the data in the
rectangle can be reactivated if the rectangle is "cleared." A user
clears a rectangle by clicking a mouse button in it (all rectangles
that contain this point will be cleared) and then clicking
on the **Clear** button at the bottom of the page. All rectangles that are shown on the current page (in the current "zoom state"; see below) will be cleared if the user clicks the **Clear All** button at the bottom of the page.
- If a user wants to look at only the data that has not been
excluded, the user should press the **Zoom** button. A sequence of zooms may be performed.
- Currently, the program does not permit a user to unzoom a single
zoom. However, a user can reactivate all of the data by pressing the **Reset** button at the bottom of the page. Alternatively, a user can see all of the inactivated rectangles by performing a fit (see below) and then clicking the appropriate recall button at the top of the page (see below).
- A user fits the active data by clicking the **Fit** button at the bottom of the page. In the demonstration program, the fit is a simple linear regression fit. In the future the program will be modified to perform certain nonlinear regression fits. If you would like to use this program, and have a particular nonlinear regression model in mind, please contact me at the e-mail address given below.
- After a fit has been performed, an appropriate summarizing value
appears in one of the boxes at the top of the page. For the
demonstration program, the summarizing value is the fitted slope of
the curve (the modulus of elasticity estimate). Up to 5 fits can be
performed on any one data set (of course, additional fits can be
performed by reloading the data set or by rerunning the program). If
*i* < 5 fits have been performed, and the **Fit** button is pushed, the summarizing value will appear in text field *i* + 1 at the top of the page. At any time prior to a push of the **Load**, **Next**, or **Stop** buttons, this fit can be recalled by pushing the button (buttons 1 to 5) associated with the text field at the top of the page.
- After a fit has been performed (actually, even before the **Fit** button has been pushed), a user can view various associated residual plots by pushing the **Residuals** button at the bottom of the page. This opens a second window that contains a different set of buttons at its bottom. The **Data** button displays the currently active data. The **Fit** button adds the fitted line. The **Res. vs.** button plots the residuals versus the corresponding *x* values. The **Res. vs. pred.** button plots the residuals versus the predicted *y* values. The **Histo** button produces a histogram of the residuals. The **NPP** button produces a normal probability plot of the residuals. The **Bye** button deletes the residuals window.
- The **Stop** button ends the execution of the program.
- *If a fit has been performed*, then the next time the **Load**, **Next**, or **Stop** button is pushed, information is written to two results files. The ID and summarizing value of the most recently displayed fit are written to a file titled xxx.summ (where xxx is provided by the user in the startup box). The ID, the number of inactivated rectangles associated with the most recently displayed fit, and the pairs of x,y values that define them are written to a file titled xxx.inact (with the same xxx). **As of 10/22/96, these files can be retrieved via anonymous ftp from the pub/data directory.**
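
The exclude-then-fit workflow above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the applet's source: the class `ActiveDataFit`, its `Rect` type, and the method names are hypothetical, points on a rectangle's border are assumed to count as excluded, and the fit uses the textbook closed-form least-squares formulas rather than the LINPACK or Minpack translations mentioned earlier.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: fit a line to the points left active after exclusion.
class ActiveDataFit {

    // Axis-aligned exclusion rectangle defined by two opposite corners, in either order.
    static final class Rect {
        final double xMin, xMax, yMin, yMax;

        Rect(double x1, double y1, double x2, double y2) {
            xMin = Math.min(x1, x2);
            xMax = Math.max(x1, x2);
            yMin = Math.min(y1, y2);
            yMax = Math.max(y1, y2);
        }

        // Assumption: points on the border count as inside.
        boolean contains(double x, double y) {
            return x >= xMin && x <= xMax && y >= yMin && y <= yMax;
        }
    }

    // Keeps only the points {x, y} that fall outside every exclusion rectangle.
    static List<double[]> activePoints(List<double[]> points, List<Rect> excluded) {
        List<double[]> active = new ArrayList<>();
        for (double[] p : points) {
            boolean inside = false;
            for (Rect r : excluded) {
                if (r.contains(p[0], p[1])) {
                    inside = true;
                    break;
                }
            }
            if (!inside) {
                active.add(p);
            }
        }
        return active;
    }

    // Returns {intercept, slope} of the ordinary least-squares line y = a + b x.
    static double[] fitLine(List<double[]> points) {
        int n = points.size();
        if (n < 2) {
            throw new IllegalArgumentException("need at least two active points");
        }
        double xMean = 0.0, yMean = 0.0;
        for (double[] p : points) {
            xMean += p[0];
            yMean += p[1];
        }
        xMean /= n;
        yMean /= n;
        double sxx = 0.0, sxy = 0.0;
        for (double[] p : points) {
            double dx = p[0] - xMean;
            sxx += dx * dx;
            sxy += dx * (p[1] - yMean);
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all x values are identical");
        }
        double slope = sxy / sxx;
        return new double[] { yMean - slope * xMean, slope };
    }

    // Residuals y - (a + b x) for each point, the quantities shown in the residual plots.
    static double[] residuals(List<double[]> points, double[] fit) {
        double[] res = new double[points.size()];
        for (int i = 0; i < res.length; i++) {
            double[] p = points.get(i);
            res[i] = p[1] - (fit[0] + fit[1] * p[0]);
        }
        return res;
    }
}
```

In the stress versus strain demonstration, the slope returned by `fitLine` would play the role of the summarizing value (the modulus of elasticity estimate), and the rectangles passed to `activePoints` correspond to the regions a user drags out with the mouse.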

For further information, please contact Steve Verrill at

Last modified on 3/2/01.
