[AI] Why are most LLMs decoder-only transformers, rather than encoder-only or encoder-decoder?
Basically, as far as I've read, different kinds of transformers serve different purposes:
-
A decoder-only model is used for generative text output (auto-completion): the model receives an input prompt and generates the words that follow it, one token at a time.
-
An encoder-only model is used for things such as classification and sentiment analysis: the model receives an input text and outputs a representation of it, which a small classification head turns into a vector of label scores.
-
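An encoder-decoder model is used for translating text from one language to another: the encoder first reads the whole source text and builds a representation of it (this is not really classification, just encoding), then the decoder generates the output words conditioned on that representation.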
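One way to make the three variants concrete is to look at which positions each one lets attention "see". The sketch below is only an illustration in plain Python, not code from any library: the helper name `attention_mask` and the sequence lengths are made up for the example. It assumes the usual setup where encoder self-attention is unmasked, decoder self-attention is causal, and cross-attention lets every target position look at every source position.

```python
def attention_mask(n_query, n_key, causal):
    """Return a 0/1 matrix where mask[i][j] == 1 means
    query position i is allowed to attend to key position j."""
    return [
        [1 if (not causal or j <= i) else 0 for j in range(n_key)]
        for i in range(n_query)
    ]

# Encoder-only: every token sees every other token (bidirectional).
encoder_self = attention_mask(4, 4, causal=False)

# Decoder-only: token i sees only tokens 0..i, so it can be trained
# to predict token i+1 without peeking at it.
decoder_self = attention_mask(4, 4, causal=True)

# Encoder-decoder: the decoder additionally cross-attends to the full
# encoded source (here 3 target positions over 4 source positions).
cross = attention_mask(3, 4, causal=False)

for name, mask in [("encoder self-attention", encoder_self),
                   ("decoder self-attention", decoder_self),
                   ("cross-attention", cross)]:
    print(name)
    for row in mask:
        print(" ".join(str(v) for v in row))
```

The lower-triangular pattern in the decoder case is what makes next-word generation possible; the rest of the block structure is largely shared between the variants.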
How LLMs can also translate and kind of “analyze the sentiment” of a given text, I’m not really sure (possibly the training data is so huge that decoders can take over the encoding function as well), but the “basic” generation of words is a decoder thing.
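A plausible way to think about it is that translation and sentiment analysis get rephrased as text continuation, so the decoder never needs a separate classification output. The sketch below only shows that framing. Everything in it is hypothetical: the prompt templates, the function names, and especially `toy_score`, which is a keyword counter standing in for a real model's "how likely is this continuation" score.

```python
def build_prompt(task, text):
    """Phrase a task as a text prefix for the model to continue."""
    templates = {
        "sentiment": "Review: {text}\nSentiment (positive or negative):",
        "translate": "English: {text}\nFrench:",
    }
    return templates[task].format(text=text)

def classify(text, labels, score_continuation):
    """Pick the label whose text the scorer rates as the best continuation."""
    prompt = build_prompt("sentiment", text)
    return max(labels, key=lambda label: score_continuation(prompt, " " + label))

def toy_score(prompt, continuation):
    """Stand-in scorer, NOT a real language model: counts a few positive
    keywords in the prompt and favours 'positive' when any are present."""
    positive_words = {"great", "love", "good"}
    hits = sum(w.strip(".,!").lower() in positive_words for w in prompt.split())
    return hits if continuation.strip() == "positive" else 0.5

labels = ["positive", "negative"]
print(classify("I love this phone, great battery!", labels, toy_score))
print(classify("It broke after a day.", labels, toy_score))
print(build_prompt("translate", "Hello"))
```

With a real decoder-only model, `score_continuation` would be replaced by the model's probability of the label tokens given the prompt, and the translation prompt would simply be continued token by token.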
-
They are? I was under the impression that they were full transformers.