GPT-3, the third model in OpenAI's GPT series, is distinguished by its unprecedented scale: at release it was the largest non-sparse language model available, outperforming its predecessor, GPT-2[1], and offering roughly ten times the capacity of Microsoft's Turing NLG. It is renowned for its ability to generate text, including news articles, and to aid in coding tasks, though it also poses misuse risks such as the propagation of misinformation or phishing. GPT-3 comes in several versions to accommodate different needs, the largest being davinci, with 175 billion parameters. The subsequent GPT-3.5 series introduced new models and abilities. GPT-3 plays a pivotal role in both industry and research, supporting products such as GitHub Copilot[2] and finding application in several Microsoft products. However, it also raises ethical and academic concerns.
Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020. Like its predecessor, GPT-2, it is a decoder-only transformer deep neural network, which supersedes recurrence- and convolution-based architectures with a technique known as "attention". This attention mechanism allows the model to selectively focus on the segments of input text it predicts to be most relevant. GPT-3 uses a 2,048-token-long context window, float16 (16-bit) precision, and a hitherto-unprecedented 175 billion parameters; at 2 bytes per parameter, the weights require 350 GB of storage. It has demonstrated strong "zero-shot" and "few-shot" learning abilities on many tasks.
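A minimal sketch may help make the attention mechanism concrete. The following illustrates scaled dot-product self-attention with the causal mask characteristic of decoder-only models, in which each token may attend only to earlier positions; all function names, matrix shapes, and values here are illustrative assumptions, not GPT-3's actual configuration.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product attention with a causal (decoder-only) mask.

    x:             (seq_len, d_model) input token representations
    w_q, w_k, w_v: (d_model, d_head) projection matrices (illustrative)
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v       # project to queries/keys/values
    scores = q @ k.T / np.sqrt(k.shape[-1])   # similarity of each query to each key
    # Causal mask: position i may only attend to positions <= i.
    mask = np.triu(np.ones_like(scores), k=1).astype(bool)
    scores = np.where(mask, -np.inf, scores)
    # Softmax over keys: masked positions get zero weight.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                        # weighted mix of value vectors

# Toy example: 4 tokens, model width 8, head width 4. (GPT-3 itself uses a
# 2,048-token context and far larger dimensions, with many heads per layer.)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = causal_self_attention(x, *(rng.normal(size=(8, 4)) for _ in range(3)))
print(out.shape)  # (4, 4)
```

Stacking many such attention heads and layers, with the context window scaled to 2,048 tokens, yields the overall decoder-only architecture described above.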
| Original author(s) | OpenAI |
|---|---|
| Initial release | June 11, 2020 (beta) |
| Predecessor | GPT-2 |
| Successor | GPT-3.5, GPT-4 |
| Website | openai |
On September 22, 2020, Microsoft announced that it had acquired an exclusive license to GPT-3. Others can still receive output from its public API, but only Microsoft has access to the underlying model.
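As an illustration of how such output was obtained in practice, the sketch below shows a few-shot completion request against the public API, using the legacy `openai` Python SDK of that era; the prompt (adapted from the few-shot translation examples in the GPT-3 paper), the placeholder key, and the sampling parameters are illustrative assumptions, not values from the source.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; obtain a key from OpenAI

# Few-shot prompting: a handful of in-context examples steer the model,
# with no gradient updates or fine-tuning involved.
prompt = (
    "Translate English to French:\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivrée\n"
    "cheese =>"
)

response = openai.Completion.create(
    engine="davinci",   # the largest (175-billion-parameter) GPT-3 engine
    prompt=prompt,
    max_tokens=8,
    temperature=0.0,    # near-deterministic output for translation
    stop=["\n"],        # stop at the end of the completed line
)
print(response.choices[0].text.strip())
```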