createDict {Rwordseg} | R Documentation |
Read a corpus vector and generate the dictionary data frame.
createDict(trainvec, dicfile = NULL, wordsplit = "\\s+", natruesplit = "/")
trainvec |
A character vector containing the corpus text. |
dicfile |
The path of the output file. Default is NULL. |
wordsplit |
A regular expression used to split the corpus text into words. |
natruesplit |
A regular expression used to split the nature from each word. |
A data frame with the following columns:
word |
Word. |
freq |
Frequency. |
nature |
Nature. |
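The splitting controlled by wordsplit and natruesplit can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's actual implementation: the sample vector is hypothetical, each token is assumed to have the form word/nature, and freq is assumed to be the number of occurrences of each word and nature pair.

```r
# Hypothetical corpus in "word/nature" form (not taken from PD980105).
trainvec <- c("the/DT cat/NN sat/VB", "the/DT dog/NN")

# Split each element into tokens on whitespace (the default wordsplit).
tokens <- unlist(strsplit(trainvec, "\\s+"))
tokens <- tokens[nzchar(tokens)]

# Split each token into word and nature on "/" (the default natruesplit).
parts <- strsplit(tokens, "/")
word <- vapply(parts, function(p) p[1], character(1))
nature <- vapply(parts, function(p) p[2], character(1))

# Count occurrences of each (word, nature) pair.
# Assumption: freq is a plain occurrence count.
pairs <- data.frame(word = word, nature = nature, freq = 1L,
                    stringsAsFactors = FALSE)
dict <- aggregate(freq ~ word + nature, data = pairs, FUN = sum)

# Reorder the columns to match the documented layout.
dict <- dict[, c("word", "freq", "nature")]
```

Here dict holds one row per distinct word and nature pair. Tokens without a nature yield NA in the nature column and are dropped by aggregate in this sketch; createDict itself may treat them differently.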
Jian Li <rweibo@sina.com>
data(PD980105)
d1 <- createDict(PD980105[1:10])
head(d1)