R/1_1_textEmbed.R
textLayerAggregation.Rd
Select and aggregate layers of hidden states to form word embeddings.
textLayerAggregation( word_embeddings_layers, layers = 11:12, aggregate_layers = "concatenate", aggregate_tokens = "mean", tokens_select = NULL, tokens_deselect = NULL )
Argument | Description |
---|---|
word_embeddings_layers | Layers returned by textHuggingFace. |
layers | The numbers of the layers to be aggregated (e.g., c(11:12) to aggregate the eleventh and twelfth). Note that layer 0 is the input embedding to the transformer and should normally not be used; selecting 'all' therefore excludes layer 0. |
aggregate_layers | Method used to aggregate across the selected layers for each word/token: "min", "max" or "mean", which take the minimum, maximum or mean across each column; or "concatenate", which links together each layer of the word embedding to one long row. Default is "concatenate". |
aggregate_tokens | Method used to aggregate across the word embeddings of the words/tokens: "min", "max" or "mean", which take the minimum, maximum or mean across each column; or "concatenate", which links together the embedding of each word/token to one long row. Default is "mean". |
tokens_select | Option to only select embeddings linked to specific tokens such as "[CLS]" and "[SEP]" (default NULL). |
tokens_deselect | Option to deselect embeddings linked to specific tokens such as "[CLS]" and "[SEP]" (default NULL). |
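The aggregation options described in the table can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy matrices, the token labels and all object names (`layer_11`, `layer_12`, `concatenated`, `layer_mean`, `kept`, `text_embedding`) are hypothetical, and it only assumes that each layer can be viewed as a tokens-by-dimensions matrix.

```r
# Hypothetical hidden states: 3 tokens x 4 dimensions for each of two layers.
tokens <- c("[CLS]", "hello", "[SEP]")
layer_11 <- matrix(1:12, nrow = 3, ncol = 4, dimnames = list(tokens, NULL))
layer_12 <- matrix(13:24, nrow = 3, ncol = 4, dimnames = list(tokens, NULL))

# Like aggregate_layers = "concatenate": one long row per token (3 x 8).
concatenated <- cbind(layer_11, layer_12)

# Like aggregate_layers = "mean": element-wise mean across layers (3 x 4).
layer_mean <- (layer_11 + layer_12) / 2

# Like tokens_deselect = "[CLS]": drop the rows belonging to that token.
kept <- concatenated[!rownames(concatenated) %in% "[CLS]", , drop = FALSE]

# Like aggregate_tokens = "mean": column means across the remaining tokens.
text_embedding <- colMeans(kept)
```

Concatenation doubles the embedding width when two layers are selected, while "min", "max" and "mean" keep the width of a single layer.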
A tibble with word embeddings. Note that layer 0 is the input embedding to the transformer, which is normally not used.
See textHuggingFace and textEmbed.
if (FALSE) {
  wordembeddings <- textLayerAggregation(word_embeddings_layers)
}