public class PrefixSpan
extends Object
implements org.apache.spark.internal.Logging, scala.Serializable
param: minSupport the minimal support level of the sequential pattern; any pattern that appears more than (minSupport * size-of-the-dataset) times will be output
param: maxPatternLength the maximal length of the sequential pattern
param: maxLocalProjDBSize the maximum number of items (including delimiters used in the internal storage format) allowed in a projected database before local processing. If a projected database exceeds this size, another iteration of distributed prefix growth is run.
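To make the minSupport description concrete, the following minimal sketch converts a fractional support level into an absolute occurrence count. It assumes the product minSupport * size-of-the-dataset is rounded up to a whole number and that a pattern qualifies once its count reaches that number; the exact boundary handling is an assumption and should be checked against the Spark version in use. MinSupportSketch and its methods are hypothetical names, not part of the Spark API.

```java
// Hypothetical helper for illustration only; not part of the Spark API.
class MinSupportSketch {

    // Assumption: the fractional support is turned into a whole-number
    // occurrence count by rounding the product up.
    static long minCount(double minSupport, long datasetSize) {
        return (long) Math.ceil(minSupport * datasetSize);
    }

    // Assumption: a pattern qualifies when its count reaches the threshold
    // (inclusive comparison).
    static boolean isFrequent(long patternCount, double minSupport, long datasetSize) {
        return patternCount >= minCount(minSupport, datasetSize);
    }

    public static void main(String[] args) {
        // With 10 sequences and minSupport = 0.25 the product is 2.5,
        // so under the rounding assumption a pattern needs 3 occurrences.
        System.out.println(minCount(0.25, 10));      // prints 3
        System.out.println(isFrequent(2, 0.25, 10)); // prints false
    }
}
```

Under these assumptions, lowering minSupport lowers the required count and typically yields more, and longer, patterns.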
Modifier and Type | Class and Description |
---|---|
static class | PrefixSpan.FreqSequence<Item> Represents a frequent sequence. |
static class | PrefixSpan.Postfix$ |
static class | PrefixSpan.Prefix$ |
Constructor and Description |
---|
PrefixSpan() Constructs a default instance with default parameters {minSupport: 0.1, maxPatternLength: 10, maxLocalProjDBSize: 32000000L}. |
Modifier and Type | Method and Description |
---|---|
long | getMaxLocalProjDBSize() Gets the maximum number of items allowed in a projected database before local processing. |
int | getMaxPatternLength() Gets the maximal pattern length. |
double | getMinSupport() Gets the minimal support. |
static void | org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1) |
static org.slf4j.Logger | org$apache$spark$internal$Logging$$log_() |
<Item,Itemset extends Iterable<Item>,Sequence extends Iterable<Itemset>> PrefixSpanModel<Item> | run(JavaRDD<Sequence> data) A Java-friendly version of run() that reads sequences from a JavaRDD and returns frequent sequences in a PrefixSpanModel. |
<Item> PrefixSpanModel<Item> | run(RDD<Object[]> data, scala.reflect.ClassTag<Item> evidence$1) Finds the complete set of frequent sequential patterns in the input sequences of itemsets. |
PrefixSpan | setMaxLocalProjDBSize(long maxLocalProjDBSize) Sets the maximum number of items (including delimiters used in the internal storage format) allowed in a projected database before local processing (default: 32000000L). |
PrefixSpan | setMaxPatternLength(int maxPatternLength) Sets the maximal pattern length (default: 10). |
PrefixSpan | setMinSupport(double minSupport) Sets the minimal support level (default: 0.1). |
Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging: $init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize
public PrefixSpan()
Constructs a default instance with default parameters {minSupport: 0.1, maxPatternLength: 10, maxLocalProjDBSize: 32000000L}.
public static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()
public static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)
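public double getMinSupport()
Gets the minimal support.
public PrefixSpan setMinSupport(double minSupport)
Sets the minimal support level (default: 0.1).
Parameters:
minSupport - (undocumented)
public int getMaxPatternLength()
Gets the maximal pattern length.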
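public PrefixSpan setMaxPatternLength(int maxPatternLength)
Sets the maximal pattern length (default: 10).
Parameters:
maxPatternLength - (undocumented)
public long getMaxLocalProjDBSize()
Gets the maximum number of items allowed in a projected database before local processing.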
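public PrefixSpan setMaxLocalProjDBSize(long maxLocalProjDBSize)
Sets the maximum number of items (including delimiters used in the internal storage format) allowed in a projected database before local processing (default: 32000000L).
Parameters:
maxLocalProjDBSize - (undocumented)
public <Item> PrefixSpanModel<Item> run(RDD<Object[]> data, scala.reflect.ClassTag<Item> evidence$1)
Finds the complete set of frequent sequential patterns in the input sequences of itemsets.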
Parameters:
data - sequences of itemsets.
evidence$1 - (undocumented)
Returns:
a PrefixSpanModel that contains the frequent patterns
public <Item,Itemset extends Iterable<Item>,Sequence extends Iterable<Itemset>> PrefixSpanModel<Item> run(JavaRDD<Sequence> data)
A Java-friendly version of run() that reads sequences from a JavaRDD and returns frequent sequences in a PrefixSpanModel.
Parameters:
data - ordered sequences of itemsets stored as Java Iterable of Iterables
Returns:
a PrefixSpanModel that contains the frequent sequential patterns
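
How minSupport and maxPatternLength shape the result of run() can be illustrated with a small, single-machine sketch of prefix growth over projected databases. This is not Spark's distributed implementation: it is simplified so that every element of a sequence is a single item rather than an itemset, it ignores maxLocalProjDBSize, and it assumes an inclusive threshold rounded up to a whole count. PrefixSpanSketch, mine and grow are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-machine sketch; NOT Spark's distributed implementation.
// Simplification: every element of a sequence is a single item, not an itemset.
class PrefixSpanSketch {

    // Maps each frequent pattern to the number of input sequences containing it.
    static Map<List<String>, Long> mine(List<List<String>> sequences,
                                        double minSupport, int maxPatternLength) {
        // Assumption: inclusive threshold, rounded up to a whole count.
        long minCount = (long) Math.ceil(minSupport * sequences.size());
        Map<List<String>, Long> result = new LinkedHashMap<>();
        grow(new ArrayList<>(), sequences, minCount, maxPatternLength, result);
        return result;
    }

    // Extends the prefix by one item at a time, using only the suffixes
    // (the "projected database") that follow the prefix in each sequence.
    private static void grow(List<String> prefix, List<List<String>> projected,
                             long minCount, int maxPatternLength,
                             Map<List<String>, Long> result) {
        if (prefix.size() >= maxPatternLength) {
            return; // longer patterns are not considered
        }
        // Count each item at most once per projected sequence.
        Map<String, Long> counts = new TreeMap<>();
        for (List<String> seq : projected) {
            for (String item : new HashSet<>(seq)) {
                counts.merge(item, 1L, Long::sum);
            }
        }
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            if (entry.getValue() < minCount) {
                continue; // not frequent enough
            }
            List<String> pattern = new ArrayList<>(prefix);
            pattern.add(entry.getKey());
            result.put(pattern, entry.getValue());
            // Project: keep what follows the first occurrence of the item.
            List<List<String>> next = new ArrayList<>();
            for (List<String> seq : projected) {
                int index = seq.indexOf(entry.getKey());
                if (index >= 0) {
                    next.add(seq.subList(index + 1, seq.size()));
                }
            }
            grow(pattern, next, minCount, maxPatternLength, result);
        }
    }

    public static void main(String[] args) {
        List<List<String>> data = Arrays.asList(
            Arrays.asList("a", "b", "c"),
            Arrays.asList("a", "c"),
            Arrays.asList("b", "c"),
            Arrays.asList("a", "b"));
        // Print each frequent pattern with its count.
        mine(data, 0.5, 2).forEach((p, n) -> System.out.println(p + ", " + n));
    }
}
```

With Spark itself on the classpath, the corresponding call would chain the setters documented above, for example `new PrefixSpan().setMinSupport(0.5).setMaxPatternLength(5).run(sequences)`, where `sequences` is a JavaRDD of Iterables of Iterables as described for the Java-friendly run().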