public class NameFinderME extends Object implements TokenNameFinder

Modifier and Type | Field and Description |
---|---|
protected NameContextGenerator | contextGenerator |
static String | CONTINUE |
static int | DEFAULT_BEAM_SIZE |
protected MaxentModel | model |
static String | OTHER |
static String | START |

Constructor and Description |
---|
NameFinderME(MaxentModel mod) Deprecated. Use the new model API! |
NameFinderME(MaxentModel mod, NameContextGenerator cg) Deprecated. |
NameFinderME(MaxentModel mod, NameContextGenerator cg, int beamSize) Deprecated. |
NameFinderME(TokenNameFinderModel model) |
NameFinderME(TokenNameFinderModel model, AdaptiveFeatureGenerator generator, int beamSize) |
NameFinderME(TokenNameFinderModel model, AdaptiveFeatureGenerator generator, int beamSize, SequenceValidator<String> sequenceValidator) Initializes the name finder with the specified model. |
NameFinderME(TokenNameFinderModel model, int beamSize) |

Modifier and Type | Method and Description |
---|---|
void | clearAdaptiveData() Forgets all adaptive data which was collected during previous calls to one of the find methods. |
static Span[] | dropOverlappingSpans(Span[] spans) Removes spans which are intersecting or crossing in any way. |
Span[] | find(String[] tokens) Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names. |
Span[] | find(String[] tokens, String[][] additionalContext) Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names. |
double[] | probs() Returns an array with the probabilities of the last decoded sequence. |
void | probs(double[] probs) Populates the specified array with the probabilities of the last decoded sequence. |
double[] | probs(Span[] spans) Returns an array with one probability per specified span, each being the arithmetic mean of the probabilities of the outcomes which make up that span. |
static GISModel | train(EventStream es, int iterations, int cut) Deprecated. |
static TokenNameFinderModel | train(String languageCode, String type, ObjectStream<NameSample> samples, AdaptiveFeatureGenerator generator, Map<String,Object> resources, int iterations, int cutoff) Trains a name finder model. |
static TokenNameFinderModel | train(String languageCode, String type, ObjectStream<NameSample> samples, byte[] generatorDescriptor, Map<String,Object> resources, int iterations, int cutoff) Deprecated. Use train(String, String, ObjectStream, TrainingParameters, byte[], Map) instead and pass in a TrainingParameters object. |
static TokenNameFinderModel | train(String languageCode, String type, ObjectStream<NameSample> samples, Map<String,Object> resources) |
static TokenNameFinderModel | train(String languageCode, String type, ObjectStream<NameSample> samples, Map<String,Object> resources, int iterations, int cutoff) Deprecated. Use train(String, String, ObjectStream, TrainingParameters, AdaptiveFeatureGenerator, Map) instead and pass in a TrainingParameters object. |
static TokenNameFinderModel | train(String languageCode, String type, ObjectStream<NameSample> samples, TrainingParameters trainParams, AdaptiveFeatureGenerator generator, Map<String,Object> resources) Trains a name finder model. |
static TokenNameFinderModel | train(String languageCode, String type, ObjectStream<NameSample> samples, TrainingParameters trainParams, byte[] featureGeneratorBytes, Map<String,Object> resources) Trains a name finder model. |

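The summary of probs(Span[] spans) describes each returned value as the arithmetic mean of the per-token outcome probabilities inside a span. The following is a minimal, self-contained sketch of that averaging, not the library's code. It assumes spans are given as hypothetical {start, end} index pairs with an exclusive end and that every span covers at least one token; the class and method names are illustrative only.

```java
public class SpanProbSketch {

    /**
     * Averages per-token probabilities over each span.
     * spans[i] is a hypothetical {start, end} pair, end exclusive, and must be non-empty.
     */
    static double[] spanProbs(double[] tokenProbs, int[][] spans) {
        double[] result = new double[spans.length];
        for (int i = 0; i < spans.length; i++) {
            int start = spans[i][0];
            int end = spans[i][1];
            double sum = 0.0;
            for (int t = start; t < end; t++) {
                sum += tokenProbs[t];
            }
            // Arithmetic mean over the tokens that make up the span.
            result[i] = sum / (end - start);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] tokenProbs = {0.5, 0.75, 0.25, 1.0};
        int[][] spans = {{1, 3}, {3, 4}};
        for (double p : spanProbs(tokenProbs, spans)) {
            System.out.println(p);
        }
    }
}
```

Because the mean is taken over tokens, a long span with one uncertain token can still score higher than a short span made of a single moderately confident token.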
public static final int DEFAULT_BEAM_SIZE
public static final String START
public static final String CONTINUE
public static final String OTHER
protected MaxentModel model
protected NameContextGenerator contextGenerator
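The START, CONTINUE and OTHER constants name the per-token outcomes from which the find methods build token spans. As a rough illustration, the sketch below turns a sequence of outcomes into spans. The string values ("start", "cont", "other"), the "type-" prefix on outcomes and the {start, end} pair representation are assumptions made for this example and are not stated on this page.

```java
import java.util.ArrayList;
import java.util.List;

public class TagsToSpansSketch {
    // Assumed values; this page lists only the constant names.
    static final String START = "start";
    static final String CONTINUE = "cont";
    static final String OTHER = "other";

    /** Returns {start, end} token index pairs with an exclusive end. */
    static List<int[]> toSpans(String[] outcomes) {
        List<int[]> spans = new ArrayList<>();
        int begin = -1; // index where the currently open name began, or -1
        for (int i = 0; i < outcomes.length; i++) {
            String outcome = outcomes[i];
            if (outcome.endsWith(START)) {
                // A new name begins; close any name that was still open.
                if (begin != -1) spans.add(new int[] {begin, i});
                begin = i;
            } else if (outcome.endsWith(OTHER)) {
                // A token outside any name closes the open name.
                if (begin != -1) spans.add(new int[] {begin, i});
                begin = -1;
            } else if (outcome.endsWith(CONTINUE)) {
                // The open name simply extends over this token.
            }
        }
        if (begin != -1) spans.add(new int[] {begin, outcomes.length});
        return spans;
    }

    public static void main(String[] args) {
        String[] outcomes = {"other", "person-start", "person-cont", "other", "location-start"};
        for (int[] span : toSpans(outcomes)) {
            System.out.println(span[0] + ".." + span[1]);
        }
    }
}
```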
public NameFinderME(TokenNameFinderModel model)
public NameFinderME(TokenNameFinderModel model, AdaptiveFeatureGenerator generator, int beamSize, SequenceValidator<String> sequenceValidator)
model -
beamSize -
public NameFinderME(TokenNameFinderModel model, AdaptiveFeatureGenerator generator, int beamSize)
public NameFinderME(TokenNameFinderModel model, int beamSize)
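Several constructors take a beamSize, and one also takes a SequenceValidator<String>. To show how these two settings can interact during decoding, here is a simplified beam search sketch. It is not the library's decoder: the per-token probabilities are supplied directly instead of coming from a model, the outcome names are assumed, and the valid method is a hypothetical stand-in for a sequence validator that only allows "cont" after "start" or "cont".

```java
import java.util.ArrayList;
import java.util.List;

public class BeamSearchSketch {
    // Assumed outcome names, in the column order of the probability rows.
    static final String[] OUTCOMES = {"start", "cont", "other"};

    /** A partial tag sequence with its accumulated log-probability. */
    static final class Hypothesis {
        final List<String> tags;
        final double logProb;
        Hypothesis(List<String> tags, double logProb) {
            this.tags = tags;
            this.logProb = logProb;
        }
    }

    /** Hypothetical validity rule: "cont" may only follow "start" or "cont". */
    static boolean valid(List<String> prefix, String next) {
        if (!next.equals("cont")) return true;
        if (prefix.isEmpty()) return false;
        String prev = prefix.get(prefix.size() - 1);
        return prev.equals("start") || prev.equals("cont");
    }

    /** probs[t][k] is the probability of OUTCOMES[k] at token t. */
    static List<String> decode(double[][] probs, int beamSize) {
        List<Hypothesis> beam = new ArrayList<>();
        beam.add(new Hypothesis(new ArrayList<>(), 0.0));
        for (double[] tokenProbs : probs) {
            List<Hypothesis> expanded = new ArrayList<>();
            for (Hypothesis h : beam) {
                for (int k = 0; k < OUTCOMES.length; k++) {
                    if (!valid(h.tags, OUTCOMES[k])) continue;
                    List<String> tags = new ArrayList<>(h.tags);
                    tags.add(OUTCOMES[k]);
                    expanded.add(new Hypothesis(tags, h.logProb + Math.log(tokenProbs[k])));
                }
            }
            // Keep only the beamSize most probable partial sequences.
            expanded.sort((a, b) -> Double.compare(b.logProb, a.logProb));
            beam = new ArrayList<>(expanded.subList(0, Math.min(beamSize, expanded.size())));
        }
        return beam.get(0).tags;
    }

    public static void main(String[] args) {
        double[][] probs = {{0.1, 0.6, 0.3}, {0.2, 0.7, 0.1}};
        System.out.println(decode(probs, 3));
        System.out.println(decode(probs, 1));
    }
}
```

With these inputs a beam of 3 keeps the initially less likely "start" hypothesis alive long enough for "start, cont" to win, while a beam of 1 commits to "other" after the first token. A wider beam trades decoding time for a better chance of finding the most probable valid sequence.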
@Deprecated public NameFinderME(MaxentModel mod)
mod - The model to be used to find names.
@Deprecated public NameFinderME(MaxentModel mod, NameContextGenerator cg)
mod - The model to be used to find names.
cg - The context generator to be used with this name finder.
@Deprecated public NameFinderME(MaxentModel mod, NameContextGenerator cg, int beamSize)
mod - The model to be used to find names.
cg - The context generator to be used with this name finder.
beamSize - The size of the beam to be used in decoding this model.
public Span[] find(String[] tokens)
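Description copied from interface: TokenNameFinder
Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
Specified by: find in interface TokenNameFinder
tokens - an array of the tokens or words of the sequence, typically a sentence.
public Span[] find(String[] tokens, String[][] additionalContext)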
tokens - an array of the tokens or words of the sequence, typically a sentence.
additionalContext - features which are based on context outside of the sentence but which should also be used.
public void clearAdaptiveData()
Specified by: clearAdaptiveData in interface TokenNameFinder
public void probs(double[] probs)
Populates the specified array with the probabilities of the last decoded sequence, as determined by the previous call to find. The specified array should be at least as large as the number of tokens in the previous call to find.
probs - An array used to hold the probabilities of the last decoded sequence.
public double[] probs()
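Returns an array with the probabilities of the last decoded sequence, as determined by the previous call to find. The array holds one probability per token passed to find when it was last called.
public double[] probs(Span[] spans)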
spans - The spans of the names for which probabilities are desired.
public static TokenNameFinderModel train(String languageCode, String type, ObjectStream<NameSample> samples, TrainingParameters trainParams, AdaptiveFeatureGenerator generator, Map<String,Object> resources) throws IOException
languageCode - the language of the training data
type - null or an override type for all types in the training data
samples - the training data
trainParams - machine learning train parameters
generator - null or the feature generator
resources - the resources for the name finder or null if none
Throws: IOException
public static TokenNameFinderModel train(String languageCode, String type, ObjectStream<NameSample> samples, TrainingParameters trainParams, byte[] featureGeneratorBytes, Map<String,Object> resources) throws IOException
languageCode - the language of the training data
type - null or an override type for all types in the training data
samples - the training data
trainParams - machine learning train parameters
featureGeneratorBytes - descriptor to configure the feature generation or null
resources - the resources for the name finder or null if none
Throws: IOException
public static TokenNameFinderModel train(String languageCode, String type, ObjectStream<NameSample> samples, AdaptiveFeatureGenerator generator, Map<String,Object> resources, int iterations, int cutoff) throws IOException
languageCode - the language of the training data
type - null or an override type for all types in the training data
samples - the training data
iterations - the number of iterations
cutoff -
resources - the resources for the name finder or null if none
Throws: IOException, ObjectStreamException
@Deprecated public static TokenNameFinderModel train(String languageCode, String type, ObjectStream<NameSample> samples, Map<String,Object> resources, int iterations, int cutoff) throws IOException
Deprecated. Use train(String, String, ObjectStream, TrainingParameters, AdaptiveFeatureGenerator, Map) instead and pass in a TrainingParameters object.
Throws: IOException
public static TokenNameFinderModel train(String languageCode, String type, ObjectStream<NameSample> samples, Map<String,Object> resources) throws IOException
Throws: IOException
@Deprecated public static TokenNameFinderModel train(String languageCode, String type, ObjectStream<NameSample> samples, byte[] generatorDescriptor, Map<String,Object> resources, int iterations, int cutoff) throws IOException
Deprecated. Use train(String, String, ObjectStream, TrainingParameters, byte[], Map) instead and pass in a TrainingParameters object.
Throws: IOException
@Deprecated public static GISModel train(EventStream es, int iterations, int cut) throws IOException
Throws: IOException
public static Span[] dropOverlappingSpans(Span[] spans)
The following rules are used to remove the spans:
Identical spans: The first span in the array after sorting it remains
Intersecting spans: The first span after sorting remains
Contained spans: All spans which are contained by another are removed
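The three rules above can be sketched in a few lines. This is an illustrative reimplementation, not the library's code: Range is a hypothetical stand-in for Span with an exclusive end index, and the sort order (by start, longer range first on ties) is an assumption about what "after sorting" means here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DropOverlappingSketch {

    /** Hypothetical stand-in for Span: a token range with an exclusive end. */
    static final class Range {
        final int start;
        final int end;
        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }

        /** True if the two ranges share at least one token (identical, contained or crossing). */
        boolean overlaps(Range other) {
            return start < other.end && other.start < end;
        }

        @Override
        public String toString() {
            return "[" + start + ".." + end + ")";
        }
    }

    static List<Range> dropOverlapping(Range[] ranges) {
        Range[] sorted = ranges.clone();
        // Assumed ordering: ascending start, longer range first when starts are equal.
        Arrays.sort(sorted, (a, b) -> a.start != b.start
                ? Integer.compare(a.start, b.start)
                : Integer.compare(b.end, a.end));
        List<Range> kept = new ArrayList<>();
        Range last = null; // most recently kept range
        for (Range r : sorted) {
            // Keep a range only if it does not overlap the last kept one.
            if (last == null || !last.overlaps(r)) {
                kept.add(r);
                last = r;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Range[] input = {
            new Range(0, 2), new Range(1, 3), new Range(0, 2),
            new Range(4, 6), new Range(4, 5)
        };
        System.out.println(dropOverlapping(input));
    }
}
```

In this input the duplicate of 0..2, the crossing range 1..3 and the contained range 4..5 are dropped, leaving 0..2 and 4..6.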
spans -
Copyright © 2016 The Apache Software Foundation. All rights reserved.