QGIS/python/plugins/processing/algs/otb/description/doc/TrainImagesClassifier-bayes.html

<html><head>
<style type="text/css">
dl { border: 3px double #ccc; padding: 0.5em; } dt { float: left; clear: left; text-align: left; font-weight: bold; color: green; } dt:after { content: ":"; } dd { margin: 0 0 0 220px; padding: 0 0 0.5em 0; }
</style>
</head><body><h1>TrainImagesClassifier</h1><h2>Brief Description</h2>Train a classifier from multiple pairs of images and training vector data.<h2>Tags</h2>Learning<h2>Long Description</h2>This application trains a classifier from multiple pairs of input images and training vector data. Samples are composed of the pixel values in each band, optionally centered and reduced (mean subtracted, then divided by the standard deviation) using an XML statistics file produced by the ComputeImagesStatistics application.
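As an illustration of what "centered and reduced" means for a single sample, here is a minimal Python sketch. It assumes the statistics supply one mean and one standard deviation per band; the function name and the numeric values are hypothetical and are not part of the application.

```python
def center_and_reduce(pixel, means, stddevs):
    """Standardize one pixel: subtract each band mean, divide by each band standard deviation."""
    if not (len(pixel) == len(means) == len(stddevs)):
        raise ValueError("pixel, means and stddevs must have one entry per band")
    return [(value - mean) / std for value, mean, std in zip(pixel, means, stddevs)]


# Hypothetical 3-band pixel and per-band statistics (illustrative values only).
sample = center_and_reduce(
    [120.0, 80.0, 200.0],   # pixel values, one per band
    [100.0, 100.0, 150.0],  # per-band means
    [10.0, 20.0, 25.0],     # per-band standard deviations
)
```

After this step every band is expressed in units of its own standard deviation, so bands with large raw ranges do not dominate the classifier.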
The training vector data must contain polygons with a positive integer field representing the class label. The name of this field can be set using the "Class label field" parameter. Training and validation sample lists are built such that each class is equally represented in both lists. One parameter controls the ratio between the number of samples in the training and validation sets. Two parameters control the size of the training and validation sets per class and per image.
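The balanced split described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the application's actual sampling code: it assumes every class is capped at the size of the smallest class, that the ratio is the fraction of each class sent to validation, and that a fixed seed makes the split reproducible. The names split_samples and validation_ratio are hypothetical.

```python
import random


def split_samples(samples_by_class, validation_ratio, seed=0):
    """Split samples into class-balanced training and validation lists.

    samples_by_class maps a positive integer class label to a list of samples.
    validation_ratio is the (assumed) fraction of each class sent to validation.
    """
    rng = random.Random(seed)  # fixed seed gives a reproducible split
    # Assumption: cap every class at the size of the smallest class,
    # so each class contributes the same number of samples to both lists.
    per_class = min(len(samples) for samples in samples_by_class.values())
    n_validation = int(round(per_class * validation_ratio))
    training, validation = [], []
    for label in sorted(samples_by_class):
        picked = rng.sample(samples_by_class[label], per_class)
        validation.extend((label, s) for s in picked[:n_validation])
        training.extend((label, s) for s in picked[n_validation:])
    return training, validation


# Hypothetical data: class 1 has 10 samples, class 2 has only 6.
data = {1: list(range(10)), 2: list(range(100, 106))}
training, validation = split_samples(data, validation_ratio=0.5, seed=42)
```

With this input both lists contain three samples of each class, regardless of how many samples each class started with.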
Several classifier parameters can be set depending on the chosen classifier. In the validation process, the confusion matrix is organized as follows: rows = reference labels, columns = produced labels. In the header of the optional confusion matrix output file, the validation (reference) and predicted (produced) class labels are listed in the same order as the rows and columns of the confusion matrix.
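To make the row and column convention concrete, here is a minimal Python sketch that builds a confusion matrix with rows = reference labels and columns = produced labels. The function name and the label lists are hypothetical; the layout of the application's output file is not reproduced here.

```python
def confusion_matrix(reference, produced):
    """Build a confusion matrix with rows = reference labels, columns = produced labels."""
    labels = sorted(set(reference) | set(produced))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for ref, prod in zip(reference, produced):
        matrix[index[ref]][index[prod]] += 1  # row: reference, column: produced
    return labels, matrix


# Hypothetical validation labels and classifier outputs.
reference = [1, 1, 2, 2, 2, 3]
produced = [1, 2, 2, 2, 3, 3]
labels, matrix = confusion_matrix(reference, produced)
```

In this example matrix[0][1] counts the samples whose reference label is 1 but which were classified as 2, and the diagonal holds the correctly classified samples.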
This application is based on LibSVM and on OpenCV Machine Learning classifiers, and is compatible with OpenCV 2.3.1 and later.<h2>Parameters</h2><ul><li><b>[param] -io</b> &lt;string&gt; This group of parameters allows setting input and output data. Mandatory: True. Default Value: &quot;0&quot;</li><li><b>[param] -elev</b> &lt;string&gt; This group of parameters allows managing elevation values. Supported formats are SRTM, DTED or any GeoTIFF processed by the DEM import application. Mandatory: True. Default Value: &quot;0&quot;</li><li><b>[param] -sample</b> &lt;string&gt; This group of parameters allows setting the training and validation sample list parameters. Mandatory: True. Default Value: &quot;0&quot;</li><li><b>[param] -rand</b> &lt;int32&gt; Set a specific random seed with an integer value. Mandatory: False. Default Value: &quot;0&quot;</li><b>[choice] -classifier</b> Choice of the classifier to use for the training. Available choices: libsvm, svm, boost, dt, gbt, ann, bayes, rf, knn. Mandatory: True. Default Value: &quot;libsvm&quot;<ul><li><b>[group] -libsvm</b></li><ul><li><b>[param] -classifier.libsvm.k</b> &lt;string&gt; SVM Kernel Type. Mandatory: True. Default Value: &quot;linear&quot;</li><li><b>[param] -classifier.libsvm.c</b> &lt;float&gt; SVM models have a cost parameter C (1 by default) to control the trade-off between training errors and forcing rigid margins. Mandatory: True. Default Value: &quot;1&quot;</li><li><b>[param] -classifier.libsvm.opt</b> &lt;boolean&gt; SVM parameters optimization flag. Mandatory: False. Default Value: &quot;True&quot;</li></ul><li><b>[group] -svm</b></li><ul><li><b>[param] -classifier.svm.m</b> &lt;string&gt; Type of SVM formulation. Mandatory: True. Default Value: &quot;csvc&quot;</li><li><b>[param] -classifier.svm.k</b> &lt;string&gt; SVM Kernel Type. Mandatory: True.
Default Value: &quot;linear&quot;</li><li><b>[param] -classifier.svm.c</b> &lt;float&gt; SVM models have a cost parameter C (1 by default) to control the trade-off between training errors and forcing rigid margins. Mandatory: True. Default Value: &quot;1&quot;</li><li><b>[param] -classifier.svm.nu</b> &lt;float&gt; Parameter nu of an SVM optimization problem. Mandatory: True. Default Value: &quot;0&quot;</li><li><b>[param] -classifier.svm.coef0</b> &lt;float&gt; Parameter coef0 of a kernel function (POLY / SIGMOID). Mandatory: True. Default Value: &quot;0&quot;</li><li><b>[param] -classifier.svm.gamma</b> &lt;float&gt; Parameter gamma of a kernel function (POLY / RBF / SIGMOID). Mandatory: True. Default Value: &quot;1&quot;</li><li><b>[param] -classifier.svm.degree</b> &lt;float&gt; Parameter degree of a kernel function (POLY). Mandatory: True. Default Value: &quot;1&quot;</li><li><b>[param] -classifier.svm.opt</b> &lt;boolean&gt; SVM parameters optimization flag.
-If set to True, then the optimal SVM parameters will be estimated. Parameters are considered optimal by OpenCV when the cross-validation estimate of the test set error is minimal. Finally, the SVM training process is computed 10 times with these optimal parameters over subsets corresponding to 1/10th of the training samples using the k-fold cross-validation (with k = 10).
-If set to False, the SVM classification process will be computed once with the currently set input SVM parameters over the training samples.
-Thus, even with identical input SVM parameters and the same random seed, the output SVM models will differ depending on the method used (optimized or not), because the samples are not processed identically within OpenCV. Mandatory: False. Default Value: &quot;True&quot;</li></ul><li><b>[group] -boost</b></li><ul><li><b>[param] -classifier.boost.t</b> &lt;string&gt; Type of Boosting algorithm. Mandatory: True. Default Value: &quot;real&quot;</li><li><b>[param] -classifier.boost.w</b> &lt;int32&gt; The number of weak classifiers. Mandatory: True. Default Value: &quot;100&quot;</li><li><b>[param] -classifier.boost.r</b> &lt;float&gt; A threshold between 0 and 1 used to save computational time. Samples with summary weight &lt;= (1 - weight_trim_rate) do not participate in the next iteration of training. Set this parameter to 0 to turn off this functionality. Mandatory: True. Default Value: &quot;0.95&quot;</li><li><b>[param] -classifier.boost.m</b> &lt;int32&gt; Maximum depth of the tree. Mandatory: True. Default Value: &quot;1&quot;</li></ul><li><b>[group] -dt</b></li><ul><li><b>[param] -classifier.dt.max</b> &lt;int32&gt; The training algorithm attempts to split each node while its depth is smaller than the maximum possible depth of the tree. The actual depth may be smaller if the other termination criteria are met, and/or if the tree is pruned. Mandatory: True. Default Value: &quot;65535&quot;</li><li><b>[param] -classifier.dt.min</b> &lt;int32&gt; If the number of samples in a node is smaller than this parameter, then the node will not be split. Mandatory: True. Default Value: &quot;10&quot;</li><li><b>[param] -classifier.dt.ra</b> &lt;float&gt; If all absolute differences between an estimated value in a node and the values of the train samples in this node are smaller than this regression accuracy parameter, then the node will not be split. Mandatory: True. Default Value: &quot;0.01&quot;</li><li><b>[param] -classifier.dt.cat</b> &lt;int32&gt; Cluster possible values of a categorical variable into K &lt;= cat clusters to find a suboptimal split.
Mandatory: True. Default Value: &quot;10&quot;</li><li><b>[param] -classifier.dt.f</b> &lt;int32&gt; If cv_folds &gt; 1, the tree is pruned with K-fold cross-validation, where K is equal to cv_folds. Mandatory: True. Default Value: &quot;10&quot;</li><li><b>[param] -classifier.dt.r</b> &lt;boolean&gt; If true, pruning is harsher. This makes the tree more compact and more resistant to noise in the training data, but slightly less accurate. Mandatory: False. Default Value: &quot;True&quot;</li><li><b>[param] -classifier.dt.t</b> &lt;boolean&gt; If true, then pruned branches are physically removed from the tree. Mandatory: False. Default Value: &quot;True&quot;</li></ul><li><b>[group] -gbt</b></li><ul><li><b>[param] -classifier.gbt.w</b> &lt;int32&gt; Number "w" of boosting algorithm iterations, with w*K being the total number of trees in the GBT model, where K is the output number of classes. Mandatory: True. Default Value: &quot;200&quot;</li><li><b>[param] -classifier.gbt.s</b> &lt;float&gt; Regularization parameter. Mandatory: True. Default Value: &quot;0.01&quot;</li><li><b>[param] -classifier.gbt.p</b> &lt;float&gt; Portion of the whole training set used for each algorithm iteration. The subset is generated randomly. Mandatory: True. Default Value: &quot;0.8&quot;</li><li><b>[param] -classifier.gbt.max</b> &lt;int32&gt; The training algorithm attempts to split each node while its depth is smaller than the maximum possible depth of the tree. The actual depth may be smaller if the other termination criteria are met, and/or if the tree is pruned. Mandatory: True. Default Value: &quot;3&quot;</li></ul><li><b>[group] -ann</b></li><ul><li><b>[param] -classifier.ann.t</b> &lt;string&gt; Type of training method for the multilayer perceptron (MLP) neural network. Mandatory: True. Default Value: &quot;reg&quot;</li><li><b>[param] -classifier.ann.sizes</b> &lt;string&gt; The number of neurons in each intermediate layer (excluding input and output layers).
Mandatory: True. Default Value: &quot;&quot;</li><li><b>[param] -classifier.ann.f</b> &lt;string&gt; Neu