We have investigated two approaches to obtaining a recommended
drying schedule. Both approaches predict schedules from
specific gravity.
These approaches are based on
a data set that contains 268 species. There are 40 distinct drying
schedules that are currently recommended for these species. (See the
USDA *Dry Kiln Operator's Manual*.) Some of
them are used quite frequently. For example, schedule 6 4 2 (temperature
schedule 6, moisture content schedule 4, wet bulb depression schedule 2), designated T6-D2, is recommended for 63 of
the 268 species. Others are used infrequently: eighteen of the schedules
are recommended for only one species each.

The
classification approach to obtaining a
recommended schedule takes into account
the *prior probability* of a schedule (e.g., 63 of the 268
species had schedule 6 4 2 while only 1 of the 268 had 5 2 2) and the
*distance in normalized specific gravity units* of a specimen to be classified
from the mean specific gravity for a particular schedule. (For example, the 63 species
for which the recommended schedule was 6 4 2 had a mean specific
gravity of .575 and associated standard deviation of .107. The 20
species for which the recommended schedule was 10 4 4 had a mean
specific gravity of .449 and associated standard deviation of .115.
Thus if a new specimen had specific gravity $x$, its distance in
normalized specific gravity units from the 6 4 2 schedule would be
$(x\; -\; .575)/.107$, and its distance from the 10 4 4 schedule would be
$(x\; -\; .449)/.115$.) The prior probability and distance measurements
are combined to yield a posterior probability that a specimen
belongs to a particular schedule. The details of this combination
are provided in the Appendix.
The "predicted" schedule is
taken to be the one with the highest posterior probability.
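
As a worked instance of the distance calculation above, take a hypothetical specimen with specific gravity $x = 0.50$ (a value chosen purely for illustration, not drawn from the data set). Its normalized distances from the 6 4 2 and 10 4 4 schedules would be approximately
\[ \frac{0.50 - 0.575}{0.107} \approx -0.70 \quad \text{and} \quad \frac{0.50 - 0.449}{0.115} \approx 0.44 . \]
The specimen is closer, in normalized units, to the 10 4 4 mean, but the 6 4 2 schedule carries the larger prior probability (63/268 against 20/268), so the distance alone does not settle the classification.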

The regression approach is based on three regressions: temperature
schedule number on specific gravity, moisture content schedule
number on specific gravity, and wet bulb depression schedule
number on specific gravity. Given the specific gravity
associated with a species, the regression equations are used to
predict appropriate temperature (temp), moisture content (mc), and
wet bulb depression (depr) schedule numbers. The predicted numbers are
rounded to the nearest integers. The resulting "schedules" are
then "rounded" again to the nearest of the 40 schedules that have
actually been observed.
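
The two rounding stages can be sketched as follows. This is only an illustration under stated assumptions: the predicted values are hypothetical (no fitted regression coefficients are reproduced here), only a handful of the observed schedules from Table 4 are listed, and "nearest" is taken to mean the smallest sum of absolute differences in the three schedule numbers, a choice the text does not specify.

```python
import math

# A few of the 40 observed (temp, mc, depr) schedules, copied from Table 4.
OBSERVED = [(3, 3, 2), (6, 4, 2), (6, 4, 4), (8, 4, 3), (8, 4, 4), (10, 4, 4)]


def round_half_up(value):
    """Round to the nearest integer, with halves rounded upward."""
    return math.floor(value + 0.5)


def first_rounding(predicted):
    """Round each predicted schedule number to the nearest integer."""
    return tuple(round_half_up(v) for v in predicted)


def second_rounding(schedule, observed=OBSERVED):
    """Snap a schedule to the nearest observed one.

    ASSUMPTION: distance is the sum of absolute differences of the three
    schedule numbers; ties go to the first candidate listed.
    """
    def distance(candidate):
        return sum(abs(a - b) for a, b in zip(schedule, candidate))

    return min(observed, key=distance)


predicted = (6.7, 3.6, 2.8)             # hypothetical regression outputs
stage_one = first_rounding(predicted)   # (7, 4, 3), not a schedule in Table 4
stage_two = second_rounding(stage_one)  # (8, 4, 3)
print(stage_one, stage_two)
```

A different distance (for example, one that weights the temperature number more heavily) could snap the same intermediate schedule to a different observed schedule, so the choice of metric is part of the method.
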

We applied the two methods to the 268 species for which we knew the
"correct" schedule.
In Table 1 we report the numbers of "matches" achieved by the
two methods. For the classification approach, a match was "exact"
if the schedule recommended by the classification algorithm
was the same as that typically used. For the
regression approach, a match was "exact" if the schedule
obtained after the initial rounding to the nearest integers was the
same as that typically used, and "quasi-exact" if the schedule
obtained after both the initial rounding and the subsequent
"rounding" to the nearest of the 40 standard schedules was the same
as that typically used. For the classification approach, there was a
"harshness match" if the recommended schedule and the schedule
typically used lay in the same "harshness class." Bill Simpson
created these classes based on his experience; they are presented in
Table 4. For the regression
approach, there was a harshness match if the schedule
obtained after both roundings lay in the
same harshness class as the schedule typically used.

| approach | exact matches | quasi-exact matches | harshness matches |
|---|---|---|---|
| regression | 13 | 45 | 88 |
| classification | 81 | -- | 120 |
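
The match counts reported in Table 1 can be tallied along the following lines. This is a minimal sketch: the harshness classes are copied from Table 4 for a few schedules only, and the (recommended, used) pairs are hypothetical. Because an exact match necessarily lies in the same harshness class, the harshness count here includes the exact matches.

```python
# Harshness class for a few (temp, mc, depr) schedules, copied from Table 4.
HARSHNESS = {
    (6, 4, 2): 3,
    (6, 4, 4): 3,
    (8, 4, 4): 4,
    (10, 4, 4): 5,
}


def tally_matches(pairs, harshness=HARSHNESS):
    """Count exact and harshness matches over (recommended, used) pairs."""
    exact = sum(1 for rec, used in pairs if rec == used)
    harsh = sum(1 for rec, used in pairs if harshness[rec] == harshness[used])
    return {"exact": exact, "harshness": harsh}


# Hypothetical (recommended, used) pairs, for illustration only.
pairs = [
    ((6, 4, 2), (6, 4, 2)),   # exact match, hence also a harshness match
    ((6, 4, 4), (6, 4, 2)),   # different schedules, same harshness class 3
    ((10, 4, 4), (8, 4, 4)),  # classes 5 and 4: no match
]
print(tally_matches(pairs))
```
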

Table 2 contains the average absolute differences between the
schedules recommended by the regression and classification approaches
and the schedules actually used.

| approach | temp | mc | depr | harshness |
|---|---|---|---|---|
| regression | 1.81 | .425 | .828 | .993 |
| classification | 1.80 | .422 | .784 | .985 |

Table 3 contains the maximum absolute differences between the
schedules recommended by the regression and classification approaches
and the schedules actually used.

| approach | temp | mc | depr | harshness |
|---|---|---|---|---|
| regression | 6 | 3 | 3 | 6 |
| classification | 7 | 3 | 3 | 6 |
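
The summaries in Tables 2 and 3 are component-wise averages and maxima of absolute differences in schedule numbers. A minimal sketch of that calculation follows. The (recommended, used) pairs are hypothetical and the function name is our own. The harshness column could be handled the same way by appending each schedule's harshness class as a fourth component.

```python
from statistics import mean

COMPONENTS = ("temp", "mc", "depr")


def difference_summary(pairs):
    """Return {component: (mean abs diff, max abs diff)} over all pairs."""
    summary = {}
    for k, name in enumerate(COMPONENTS):
        diffs = [abs(rec[k] - used[k]) for rec, used in pairs]
        summary[name] = (mean(diffs), max(diffs))
    return summary


# Hypothetical (recommended, used) schedules, for illustration only.
pairs = [
    ((6, 4, 2), (6, 4, 2)),
    ((6, 4, 4), (8, 4, 4)),
    ((10, 4, 4), (6, 4, 2)),
]
for name, (avg, worst) in difference_summary(pairs).items():
    print(f"{name}: mean {avg:.3f}, max {worst}")
```
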

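Table 1 suggests that the classification approach might be superior: it achieved 81 exact matches and 120 harshness matches, against 13 exact, 45 quasi-exact, and 88 harshness matches for the regression approach.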
Also, in 179 of the 268 cases, after the first rounding,
the schedule recommended by the regression approach was not one of the
40 schedules actually used.
Tables 2 and 3 do not detect much of a difference between the two
approaches. When the two approaches yield different schedules, it
might be wise to choose the milder of the two.

Table 4. Harshness classes and specific gravity (sg) statistics for the 40 schedules.

| harshness class | schedule # | temp | mc | depr | sg average | sg sd | number of species |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 2 | 1 | 0.91450 | 0.08273 | 2 |
| 1 | 2 | 2 | 2 | 2 | 0.65900 | 0.06921 | 1 |
| 1 | 3 | 2 | 3 | 2 | 0.75412 | 0.10557 | 25 |
| 1 | 4 | 2 | 4 | 2 | 0.53200 | 0.06921 | 1 |
| 1 | 5 | 2 | 4 | 4 | 0.67453 | 0.17456 | 19 |
| 2 | 6 | 3 | 3 | 1 | 0.86250 | 0.06718 | 2 |
| 2 | 7 | 3 | 4 | 1 | 0.61950 | 0.07000 | 2 |
| 2 | 8 | 3 | 3 | 2 | 0.64773 | 0.11637 | 40 |
| 2 | 9 | 3 | 4 | 2 | 0.68100 | 0.07114 | 9 |
| 2 | 10 | 4 | 2 | 2 | 0.58000 | 0.06921 | 1 |
| 3 | 11 | 5 | 2 | 2 | 0.74000 | 0.06921 | 1 |
| 3 | 12 | 5 | 4 | 2 | 0.53000 | 0.06921 | 1 |
| 3 | 13 | 5 | 3 | 3 | 0.41000 | 0.06921 | 1 |
| 3 | 14 | 6 | 4 | 2 | 0.57454 | 0.10735 | 63 |
| 3 | 15 | 6 | 1 | 3 | 0.66000 | 0.06921 | 1 |
| 3 | 16 | 6 | 2 | 3 | 0.40000 | 0.06921 | 1 |
| 3 | 17 | 6 | 3 | 3 | 0.62500 | 0.02121 | 2 |
| 3 | 18 | 6 | 6 | 3 | 0.40000 | 0.06921 | 1 |
| 3 | 19 | 6 | 1 | 4 | 0.51000 | 0.06921 | 1 |
| 3 | 20 | 6 | 4 | 4 | 0.50856 | 0.10029 | 18 |
| 4 | 21 | 7 | 2 | 3 | 0.75000 | 0.04243 | 2 |
| 4 | 22 | 7 | 2 | 6 | 0.36000 | 0.06921 | 1 |
| 4 | 23 | 8 | 3 | 2 | 0.56000 | 0.06921 | 1 |
| 4 | 24 | 8 | 2 | 3 | 0.61000 | 0.14000 | 3 |
| 4 | 25 | 8 | 4 | 3 | 0.60000 | 0.06921 | 1 |
| 4 | 26 | 8 | 2 | 4 | 0.51000 | 0.05657 | 2 |
| 4 | 27 | 8 | 3 | 4 | 0.49333 | 0.05508 | 3 |
| 4 | 28 | 8 | 4 | 4 | 0.47000 | 0.06782 | 4 |
| 5 | 29 | 10 | 3 | 4 | 0.48000 | 0.06921 | 1 |
| 5 | 30 | 10 | 4 | 4 | 0.44905 | 0.11489 | 20 |
| 5 | 31 | 10 | 5 | 4 | 0.39800 | 0.03704 | 3 |
| 5 | 32 | 10 | 6 | 4 | 0.34500 | 0.02121 | 2 |
| 5 | 33 | 10 | 4 | 5 | 0.44806 | 0.08762 | 16 |
| 5 | 34 | 10 | 6 | 5 | 0.34000 | 0.04243 | 2 |
| 6 | 35 | 11 | 4 | 4 | 0.40000 | 0.06921 | 1 |
| 7 | 36 | 12 | 4 | 5 | 0.22600 | 0.06921 | 1 |
| 7 | 37 | 12 | 5 | 5 | 0.46000 | 0.06921 | 1 |
| 8 | 38 | 12 | 5 | 7 | 0.33500 | 0.02121 | 2 |
| 9 | 39 | 13 | 3 | 4 | 0.35011 | 0.13640 | 9 |
| 10 | 40 | 14 | 3 | 6 | 0.33400 | 0.06921 | 1 |

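Appendix. The following argument is not completely rigorous, because we are really dealing with density functions rather than probabilities, but it can be made rigorous. Let $n$ denote the number of groups (here, the 40 schedules). We have

\[ P(\text{group}_j \mid sg = x) = \frac{P(\text{group}_j \text{ and } sg = x)}{P(sg = x)} = \frac{P(sg = x \mid \text{group}_j)\, P(\text{group}_j)}{\sum_{i=1}^{n} P(sg = x \mid \text{group}_i)\, P(\text{group}_i)} \]

The item is placed in the group $j$ that maximizes $P(\text{group}_j \mid sg = x)$. Because the denominator does not depend on $j$, this is the group $j$ that maximizes $P(sg = x \mid \text{group}_j)\, P(\text{group}_j)$. Taking specific gravity within group $j$ to be normally distributed with mean $\mu_j$ and standard deviation $\sigma_j$, we have

\[ P(sg = x \mid \text{group}_j) = \frac{1}{\sqrt{2\pi}} \, \frac{1}{\sigma_j} \exp\left( -\frac{(x - \mu_j)^2}{2\sigma_j^2} \right) \]

and

\[ P(\text{group}_j) = \frac{n_j}{268} \]

where $n_j$ is the number of species in the data set for which the $j$th schedule is the typical schedule. Taking logs and dropping the constant $-\ln\sqrt{2\pi}$, which is the same for every group, we see that we can maximize the posterior probability by maximizing

\[ -\ln(\sigma_j) - \frac{(x - \mu_j)^2}{2\sigma_j^2} + \ln\left( \frac{n_j}{268} \right) \]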
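
The rule above can be sketched in a few lines of Python. This is an illustration rather than the code used for the analysis: only five of the 40 schedules are included (means, standard deviations, and counts copied from Table 4), the function names are our own, and the posterior probabilities it reports are normalized over that subset only, so they are not the probabilities that would result from using all 40 groups.

```python
import math

TOTAL_SPECIES = 268

# (temp, mc, depr): (mean sg, sd of sg, number of species); rows from Table 4.
SCHEDULES = {
    (2, 3, 2): (0.75412, 0.10557, 25),
    (3, 3, 2): (0.64773, 0.11637, 40),
    (6, 4, 2): (0.57454, 0.10735, 63),
    (10, 4, 4): (0.44905, 0.11489, 20),
    (13, 3, 4): (0.35011, 0.13640, 9),
}


def log_score(x, mu, sigma, n_j):
    """Log posterior up to an additive constant shared by all groups."""
    return (-math.log(sigma)
            - (x - mu) ** 2 / (2 * sigma ** 2)
            + math.log(n_j / TOTAL_SPECIES))


def classify(x, schedules=SCHEDULES):
    """Return the schedule with the largest log score for specific gravity x."""
    return max(schedules, key=lambda s: log_score(x, *schedules[s]))


def posteriors(x, schedules=SCHEDULES):
    """Posterior probabilities, normalized over the listed schedules only."""
    scores = {s: log_score(x, *params) for s, params in schedules.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {s: math.exp(v - top) for s, v in scores.items()}
    norm = sum(weights.values())
    return {s: w / norm for s, w in weights.items()}


print(classify(0.575))  # the 6 4 2 schedule wins among the five listed
```

Working with the log score avoids evaluating the normalizing sum in the denominator, which is needed only when the posterior probabilities themselves are of interest.
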