Please check the latest news (change log) and keep this package updated.
BERT_vocab() and ICC_models().
summary.fmat(), FMAT_query(), and FMAT_run() (significantly faster because all [MASK] options for each unique query sentence are now estimated simultaneously, so running time depends only on the number of unique queries and not on the number of [MASK] options).
If you use the reticulate
package version ≥ 1.36.1, then FMAT should be updated to ≥ 2024.4. Otherwise, out-of-vocabulary [MASK] words may not be identified and marked. FMAT_run() now matches [MASK] words directly against the model vocabulary and token IDs. To check whether a [MASK] word is in the model vocabulary, use BERT_vocab().
BERT_download()
(downloading models to the local cache folder “%USERPROFILE%/.cache/huggingface”) to differentiate it from FMAT_load() (loading saved models from the local cache). Note, however, that FMAT_load() can also download models silently if they have not been downloaded yet.
gpu
parameter (see Guidance for GPU Acceleration) in FMAT_run() to specify the NVIDIA GPU device on which the fill-mask pipeline is allocated. A GPU runs the fill-mask pipeline roughly 3x faster than a CPU. By default, FMAT_run() automatically detects and uses any available GPU if a CUDA-supported Python torch package is installed; otherwise, it uses the CPU.
FMAT_run().
BERT_download(), FMAT_load(), and FMAT_run().
parallel in FMAT_run(): FMAT_run(model.names, data, gpu=TRUE) is the fastest.
progress in FMAT_run().
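
The FMAT_run() speed-up described above (running time depending on the number of unique queries, not on the number of [MASK] options) can be illustrated with a small base R sketch. This is a toy under stated assumptions, not FMAT's implementation: fake_fill_mask(), the four-word vocabulary, and the scores are hypothetical stand-ins, and nothing here calls the FMAT package or a real model.

```r
# Toy illustration: one "model call" per unique query, no matter how many
# [MASK] options are scored. Nothing here uses the FMAT package.

vocab <- c("he", "she", "they", "it")  # hypothetical toy vocabulary
n_calls <- 0                           # counts calls to the stand-in model

# Hypothetical stand-in for a fill-mask model: returns one probability
# per vocabulary word for a single query sentence.
fake_fill_mask <- function(query) {
  n_calls <<- n_calls + 1
  scores <- nchar(query) + seq_along(vocab)  # arbitrary deterministic scores
  setNames(scores / sum(scores), vocab)
}

# Each row pairs a query with one [MASK] option.
d <- data.frame(
  query = rep(c("[MASK] is a nurse.", "[MASK] is a doctor."), each = 3),
  mask  = rep(c("he", "she", "they"), times = 2),
  stringsAsFactors = FALSE
)

# Score the whole vocabulary once per unique query ...
unique_queries <- unique(d$query)
probs <- lapply(unique_queries, fake_fill_mask)
names(probs) <- unique_queries

# ... then look up each row's option in the cached distribution.
d$prob <- mapply(function(q, m) probs[[q]][[m]], d$query, d$mask,
                 USE.NAMES = FALSE)

print(d)
print(n_calls)  # 2: one call per unique query, although d has 6 rows
```

Adding more [MASK] options per query in this sketch adds rows to d but leaves n_calls unchanged, which is the property the changelog entry describes.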