public class RandomForestRegressor extends Regressor<Vector,RandomForestRegressor,RandomForestRegressionModel> implements RandomForestRegressorParams, DefaultParamsWritable
Constructor and Description |
---|
RandomForestRegressor() |
RandomForestRegressor(String uid) |
Modifier and Type | Method and Description |
---|---|
BooleanParam |
bootstrap()
Whether bootstrap samples are used when building trees.
|
BooleanParam |
cacheNodeIds()
If false, the algorithm will pass trees to executors to match instances with nodes.
|
IntParam |
checkpointInterval()
Param for setting the checkpoint interval (>= 1) or disabling checkpointing (-1).
|
RandomForestRegressor |
copy(ParamMap extra)
Creates a copy of this instance with the same UID and some extra params.
|
Param<String> |
featureSubsetStrategy()
The number of features to consider for splits at each tree node.
|
Param<String> |
impurity()
Criterion used for information gain calculation (case-insensitive).
|
Param<String> |
leafCol()
Leaf indices column name.
|
static RandomForestRegressor |
load(String path) |
IntParam |
maxBins()
Maximum number of bins used for discretizing continuous features and for choosing how to split
on features at each node.
|
IntParam |
maxDepth()
Maximum depth of the tree (nonnegative).
|
IntParam |
maxMemoryInMB()
Maximum memory in MB allocated to histogram aggregation.
|
DoubleParam |
minInfoGain()
Minimum information gain for a split to be considered at a tree node.
|
IntParam |
minInstancesPerNode()
Minimum number of instances each child must have after split.
|
DoubleParam |
minWeightFractionPerNode()
Minimum fraction of the weighted sample count that each child must have after split.
|
IntParam |
numTrees()
Number of trees to train (at least 1).
|
static MLReader<T> |
read() |
LongParam |
seed()
Param for random seed.
|
RandomForestRegressor |
setBootstrap(boolean value) |
RandomForestRegressor |
setCacheNodeIds(boolean value) |
RandomForestRegressor |
setCheckpointInterval(int value)
Specifies how often to checkpoint the cached node IDs.
|
RandomForestRegressor |
setFeatureSubsetStrategy(String value) |
RandomForestRegressor |
setImpurity(String value) |
RandomForestRegressor |
setMaxBins(int value) |
RandomForestRegressor |
setMaxDepth(int value) |
RandomForestRegressor |
setMaxMemoryInMB(int value) |
RandomForestRegressor |
setMinInfoGain(double value) |
RandomForestRegressor |
setMinInstancesPerNode(int value) |
RandomForestRegressor |
setMinWeightFractionPerNode(double value) |
RandomForestRegressor |
setNumTrees(int value) |
RandomForestRegressor |
setSeed(long value) |
RandomForestRegressor |
setSubsamplingRate(double value) |
RandomForestRegressor |
setWeightCol(String value)
Sets the value of param weightCol.
|
DoubleParam |
subsamplingRate()
Fraction of the training data used for learning each decision tree, in range (0, 1].
|
static String[] |
supportedFeatureSubsetStrategies()
Accessor for supported featureSubsetStrategy settings: auto, all, onethird, sqrt, log2
|
static String[] |
supportedImpurities()
Accessor for supported impurity settings: variance
|
String |
uid()
An immutable unique ID for the object and its derivatives.
|
Param<String> |
weightCol()
Param for weight column name.
|
featuresCol, fit, labelCol, predictionCol, setFeaturesCol, setLabelCol, setPredictionCol, transformSchema
params
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
getBootstrap, getNumTrees
validateAndTransformSchema
getFeatureSubsetStrategy, getOldStrategy, getSubsamplingRate
getCacheNodeIds, getLeafCol, getMaxBins, getMaxDepth, getMaxMemoryInMB, getMinInfoGain, getMinInstancesPerNode, getMinWeightFractionPerNode, getOldStrategy, setLeafCol
extractInstances, extractInstances
getLabelCol, labelCol
featuresCol, getFeaturesCol
getPredictionCol, predictionCol
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
toString
getCheckpointInterval
getWeightCol
getImpurity, getOldImpurity
write
save
$init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize
public RandomForestRegressor(String uid)
public RandomForestRegressor()
public static final String[] supportedImpurities()
public static final String[] supportedFeatureSubsetStrategies()
public static RandomForestRegressor load(String path)
public static MLReader<T> read()
public final Param<String> impurity()
HasVarianceImpurity
impurity
in interface HasVarianceImpurity
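The only supported impurity for regression is variance, and it feeds the information gain calculation. The following is a minimal sketch of that idea, not Spark's implementation: the class `VarianceImpuritySketch` and its methods are hypothetical names, and the gain formula (parent variance minus the size-weighted variance of the two children) is the textbook definition, assumed here for illustration with unweighted instances.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Spark: variance impurity and the gain of a candidate split. */
final class VarianceImpuritySketch {

    /** Population variance of the labels; 0.0 for an empty array. */
    static double variance(double[] labels) {
        if (labels.length == 0) {
            return 0.0;
        }
        double mean = Arrays.stream(labels).average().orElse(0.0);
        double sumSq = 0.0;
        for (double y : labels) {
            sumSq += (y - mean) * (y - mean);
        }
        return sumSq / labels.length;
    }

    /** Parent variance minus the size-weighted variance of the two children. */
    static double infoGain(double[] left, double[] right) {
        int n = left.length + right.length;
        if (n == 0) {
            return 0.0;
        }
        // Rebuild the parent node's labels from the two children.
        double[] parent = new double[n];
        System.arraycopy(left, 0, parent, 0, left.length);
        System.arraycopy(right, 0, parent, left.length, right.length);
        double weightedChildren =
            (left.length * variance(left) + right.length * variance(right)) / n;
        return variance(parent) - weightedChildren;
    }

    public static void main(String[] args) {
        double[] left = {1.0, 1.0};
        double[] right = {3.0, 3.0};
        System.out.println("gain = " + infoGain(left, right));
    }
}
```

A split that separates the labels perfectly drives both child variances to zero, so the gain equals the parent's variance.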
public final IntParam numTrees()
RandomForestParams
Note: This param cannot be shared by both GBT and RF (i.e. placed in TreeEnsembleParams) because, in GBT, the param maxIter controls how many trees are built. The semantics in the two algorithms are a bit different.
numTrees
in interface RandomForestParams
public final BooleanParam bootstrap()
RandomForestParams
bootstrap
in interface RandomForestParams
public final DoubleParam subsamplingRate()
TreeEnsembleParams
subsamplingRate
in interface TreeEnsembleParams
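To make the interaction between bootstrap, subsamplingRate and seed concrete, here is a rough sketch of choosing the rows for one tree. It is an illustration under stated assumptions, not Spark's sampling code: the class `TreeSamplingSketch`, the method `sampleIndices`, and the choice of a fixed sample size of ceil(subsamplingRate * numRows) are all hypothetical, and Spark's internal sampling may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of Spark: picks the row indices used to train one tree. */
final class TreeSamplingSketch {

    static List<Integer> sampleIndices(int numRows, double subsamplingRate,
                                       boolean bootstrap, long seed) {
        if (numRows < 0) {
            throw new IllegalArgumentException("numRows must be nonnegative");
        }
        if (subsamplingRate <= 0.0 || subsamplingRate > 1.0) {
            throw new IllegalArgumentException("subsamplingRate must be in (0, 1]");
        }
        int sampleSize = (int) Math.ceil(subsamplingRate * numRows);
        Random rng = new Random(seed); // a fixed seed makes the sample reproducible
        List<Integer> picked = new ArrayList<>(sampleSize);
        if (bootstrap) {
            // With replacement: the same row may be drawn more than once.
            for (int i = 0; i < sampleSize; i++) {
                picked.add(rng.nextInt(numRows));
            }
        } else {
            // Without replacement: shuffle all indices and keep a prefix.
            List<Integer> all = new ArrayList<>(numRows);
            for (int i = 0; i < numRows; i++) {
                all.add(i);
            }
            Collections.shuffle(all, rng);
            picked.addAll(all.subList(0, sampleSize));
        }
        return picked;
    }

    public static void main(String[] args) {
        System.out.println(sampleIndices(10, 0.5, true, 42L));
        System.out.println(sampleIndices(10, 0.5, false, 42L));
    }
}
```

In a forest, each tree would typically get its own seed derived from the top-level seed so that the trees see different samples.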
public final Param<String> featureSubsetStrategy()
TreeEnsembleParams
These settings are based on the following references:
- log2: tested in Breiman (2001)
- sqrt: recommended by the Breiman manual for random forests
- The defaults of sqrt (classification) and onethird (regression) match the R randomForest package.
featureSubsetStrategy
in interface TreeEnsembleParams
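As a rough illustration of what the strategy names listed by supportedFeatureSubsetStrategies mean, the sketch below maps each name to a number of candidate features per node for a regression forest. The class `FeatureSubsetSketch` and its method are hypothetical. Rounding up, the floor of 1 for log2, and treating auto as all for a single tree and onethird otherwise are assumptions made for this example and are not taken from this page.

```java
import java.util.Locale;

/** Hypothetical helper, not part of Spark: features considered per node for regression. */
final class FeatureSubsetSketch {

    static int featuresPerNode(String strategy, int numFeatures, int numTrees) {
        if (numFeatures < 1 || numTrees < 1) {
            throw new IllegalArgumentException("numFeatures and numTrees must be at least 1");
        }
        switch (strategy.toLowerCase(Locale.ROOT)) {
            case "all":
                return numFeatures;
            case "sqrt":
                return (int) Math.ceil(Math.sqrt(numFeatures));
            case "log2":
                // Integer form of ceil(log2(n)), kept at a minimum of one feature.
                return Math.max(1, 32 - Integer.numberOfLeadingZeros(numFeatures - 1));
            case "onethird":
                return (int) Math.ceil(numFeatures / 3.0);
            case "auto":
                // Assumption: a single tree uses every feature, a forest uses onethird.
                return numTrees == 1
                    ? numFeatures
                    : featuresPerNode("onethird", numFeatures, numTrees);
            default:
                throw new IllegalArgumentException("Unsupported strategy: " + strategy);
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"auto", "all", "onethird", "sqrt", "log2"}) {
            System.out.println(s + " -> " + featuresPerNode(s, 10, 20));
        }
    }
}
```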
public final Param<String> leafCol()
DecisionTreeParams
leafCol
in interface DecisionTreeParams
public final IntParam maxDepth()
DecisionTreeParams
maxDepth
in interface DecisionTreeParams
public final IntParam maxBins()
DecisionTreeParams
maxBins
in interface DecisionTreeParams
public final IntParam minInstancesPerNode()
DecisionTreeParams
minInstancesPerNode
in interface DecisionTreeParams
public final DoubleParam minWeightFractionPerNode()
DecisionTreeParams
minWeightFractionPerNode
in interface DecisionTreeParams
public final DoubleParam minInfoGain()
DecisionTreeParams
minInfoGain
in interface DecisionTreeParams
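The three thresholds minInstancesPerNode, minWeightFractionPerNode and minInfoGain all constrain which candidate splits are acceptable. A minimal sketch of such a check follows. The class `SplitAcceptanceSketch`, its parameter names, and the exact comparison operators are assumptions for illustration and are not taken from Spark's source.

```java
/** Hypothetical helper, not part of Spark: decides whether a candidate split is acceptable. */
final class SplitAcceptanceSketch {

    static boolean isAcceptable(long leftCount, long rightCount,
                                double leftWeight, double rightWeight,
                                double totalWeight, double gain,
                                int minInstancesPerNode,
                                double minWeightFractionPerNode,
                                double minInfoGain) {
        // Each child must keep at least minInstancesPerNode instances.
        boolean enoughInstances =
            leftCount >= minInstancesPerNode && rightCount >= minInstancesPerNode;
        // Each child must keep at least the given fraction of the weighted sample count.
        double minWeight = minWeightFractionPerNode * totalWeight;
        boolean enoughWeight = leftWeight >= minWeight && rightWeight >= minWeight;
        // The split must improve impurity by at least minInfoGain.
        boolean enoughGain = gain >= minInfoGain;
        return enoughInstances && enoughWeight && enoughGain;
    }

    public static void main(String[] args) {
        // 40 vs 60 unweighted instances, gain 0.2, with illustrative thresholds.
        System.out.println(isAcceptable(40, 60, 40.0, 60.0, 100.0, 0.2, 5, 0.1, 0.05));
    }
}
```

Raising any of the three thresholds makes the check stricter, which tends to produce shallower trees.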
public final IntParam maxMemoryInMB()
DecisionTreeParams
maxMemoryInMB
in interface DecisionTreeParams
public final BooleanParam cacheNodeIds()
DecisionTreeParams
cacheNodeIds
in interface DecisionTreeParams
public final Param<String> weightCol()
HasWeightCol
weightCol
in interface HasWeightCol
public final LongParam seed()
HasSeed
public final IntParam checkpointInterval()
HasCheckpointInterval
checkpointInterval
in interface HasCheckpointInterval
public String uid()
Identifiable
uid
in interface Identifiable
public RandomForestRegressor setMaxDepth(int value)
public RandomForestRegressor setMaxBins(int value)
public RandomForestRegressor setMinInstancesPerNode(int value)
public RandomForestRegressor setMinWeightFractionPerNode(double value)
public RandomForestRegressor setMinInfoGain(double value)
public RandomForestRegressor setMaxMemoryInMB(int value)
public RandomForestRegressor setCacheNodeIds(boolean value)
public RandomForestRegressor setCheckpointInterval(int value)
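Specifies how often to checkpoint the cached node IDs. This is only used if cacheNodeIds is true and the checkpoint directory is set in SparkContext.
Must be at least 1.
(default = 10)
value
- (undocumented)
public RandomForestRegressor setImpurity(String value)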
public RandomForestRegressor setSubsamplingRate(double value)
public RandomForestRegressor setSeed(long value)
public RandomForestRegressor setNumTrees(int value)
public RandomForestRegressor setBootstrap(boolean value)
public RandomForestRegressor setFeatureSubsetStrategy(String value)
public RandomForestRegressor setWeightCol(String value)
Sets the value of param weightCol.
If this is not set or empty, we treat all instance weights as 1.0.
By default the weightCol is not set, so all instances have weight 1.0.
value
- (undocumented)
public RandomForestRegressor copy(ParamMap extra)
Params
Creates a copy of this instance with the same UID and some extra params. See defaultCopy().
copy
in interface Params
copy
in class Predictor<Vector,RandomForestRegressor,RandomForestRegressionModel>
extra
- (undocumented)