tnf_train.Rd
Train a name finder model.
tnf_train_(model, lang, data, feature.gen = NULL, name.types = NULL, sequence.codec = NULL, factory = NULL, resources = NULL, params = NULL, encoding = NULL, type = NULL)

tnf_train(model, lang, data, feature.gen = NULL, name.types = NULL, sequence.codec = NULL, factory = NULL, resources = NULL, params = NULL, encoding = NULL, type = NULL)
Argument | Description |
---|---|
model | Full path to output model file. |
lang | Language which is being processed. |
data | Data to be used, full path to file, usually |
feature.gen | Path to the feature generator descriptor file. |
name.types | Name types to use for training. |
sequence.codec | Sequence codec used to encode name spans. |
factory | A sub-class of |
resources | The resources directory. |
params | Training parameters file. |
encoding | Encoding for reading and writing text; if absent, the system default is used. |
type | The type of the token name finder model. |
Full path to the model, returned for convenience.
# NOT RUN {
# get working directory
# need to pass full path
wd <- getwd()

# Training to find "WEF"
data <- paste("This organisation is called the <START:wef> World Economic Forum <END>",
  "It is often referred to as <START:wef> Davos <END> or the <START:wef> WEF <END> .")

# train model
tnf_train(model = paste0(wd, "/model.bin"), lang = "en", data = data)

# Same with .txt files.
# Save the above as file
write(data, file = "input.txt")

# Trains the model and returns the full path to the model
model <- tnf_train_(model = paste0(wd, "/wef.bin"), lang = "en",
  data = paste0(wd, "/input.txt"), type = "wef")
# }
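
The training data in the example is annotated by hand with `<START:wef> ... <END>` markers. For larger inputs it may be easier to generate those markers programmatically. The following is a minimal sketch in base R. `tag_entities` is a hypothetical helper and not part of this package, and writing one sentence per line is an assumption about the expected input file layout.

```r
# Hypothetical helper (not part of the package): wrap every literal occurrence
# of each phrase in <START:type> ... <END> markers.
# Limitation: phrases that contain one another would be tagged twice.
tag_entities <- function(text, phrases, type) {
  for (phrase in phrases) {
    text <- gsub(
      phrase,
      paste0("<START:", type, "> ", phrase, " <END>"),
      text,
      fixed = TRUE  # treat the phrase as a literal string, not a regex
    )
  }
  text
}

# Untagged sentences, tokens separated by spaces
sentences <- c(
  "This organisation is called the World Economic Forum",
  "It is often referred to as Davos or the WEF ."
)

tagged <- tag_entities(
  sentences,
  phrases = c("World Economic Forum", "Davos", "WEF"),
  type = "wef"
)

# Write one sentence per line (assumed layout) to a temporary file
input_file <- file.path(tempdir(), "input.txt")
writeLines(tagged, input_file)
```

The resulting `input_file` is a full path, so it could be passed as `data` to `tnf_train_` in the same way as `input.txt` in the example.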