Train a name finder model.

tnf_train_(model, lang, data, feature.gen = NULL, name.types = NULL,
  sequence.codec = NULL, factory = NULL, resources = NULL,
  params = NULL, encoding = NULL, type = NULL)

tnf_train(model, lang, data, feature.gen = NULL, name.types = NULL,
  sequence.codec = NULL, factory = NULL, resources = NULL,
  params = NULL, encoding = NULL, type = NULL)

Arguments

model

Full path to output model file.

lang

Language of the text being processed, e.g. "en".

data

Training data: for tnf_train_, the full path to a file, usually .txt; in the Examples, tnf_train is passed the tagged text itself.

feature.gen

Path to the feature generator descriptor file.

name.types

Name types to use for training.

sequence.codec

Sequence codec used to code name spans.

factory

A sub-class of TokenNameFinderFactory.

resources

The resources directory.

params

Training parameters file.

encoding

Encoding for reading and writing text; if absent, the system default is used.

type

The type of the token name finder model.
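
The training data shown in the Examples wraps each name as <START:type> name <END>, with spaces around the tags. A minimal base R sketch of producing text in that form follows; tag_entity() is a hypothetical helper written for illustration and is not part of this package.

```r
# Hypothetical helper (not part of the package): wrap every literal,
# case-sensitive occurrence of `name` in `text` with
# <START:type> ... <END> tags.
tag_entity <- function(text, name, type) {
  tagged <- paste0("<START:", type, "> ", name, " <END>")
  gsub(name, tagged, text, fixed = TRUE)
}

raw <- "It is often referred to as Davos or the WEF ."

# Tag each name in turn; the lowercase type "wef" inside the tags is
# not re-matched by the case-sensitive search for "WEF".
tagged <- tag_entity(raw, "Davos", "wef")
tagged <- tag_entity(tagged, "WEF", "wef")
print(tagged)
```

Literal matching is only suitable for small illustrations like this one: it would also tag a name that appears inside a longer word.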

Value

Full path to the trained model file, returned for convenience.

Examples

# NOT RUN {
# Get the working directory:
# full paths must be passed
wd <- getwd()

# Training to find "WEF"
data <- paste("This organisation is called the <START:wef> World Economic Forum <END>",
  "It is often referred to as <START:wef> Davos <END> or the <START:wef> WEF <END> .")

# train model
tnf_train(model = paste0(wd, "/model.bin"), lang = "en", data = data)

# The same, training from a .txt file.
# Save the data above to a file
write(data, file = "input.txt")

# Trains the model and returns the full path to the model
model <- tnf_train_(model = paste0(wd, "/wef.bin"), lang = "en",
  data = paste0(wd, "/input.txt"), type = "wef")
# }
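
Since full paths are required, it can help to build them once and confirm that the training file exists before training. The sketch below uses only base R and a temporary directory; the variable names and the temporary location are illustrative assumptions, and the training call is left commented out because it needs the package and its dependencies installed.

```r
# Build full paths in a temporary directory (location is illustrative).
dir <- normalizePath(tempdir(), winslash = "/")
input_path <- file.path(dir, "input.txt")
model_path <- file.path(dir, "wef.bin")

# Same tagged text as in the example above.
data <- paste(
  "This organisation is called the <START:wef> World Economic Forum <END>",
  "It is often referred to as <START:wef> Davos <END> or the <START:wef> WEF <END> ."
)
writeLines(data, input_path)

# Confirm the training file was written before training.
stopifnot(file.exists(input_path))

# With the package loaded, training would then follow the example above:
# model <- tnf_train_(model = model_path, lang = "en",
#   data = input_path, type = "wef")
```

Using file.path() instead of paste0() avoids hand-written separators in the paths.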