
GPT-2


Generative Pre-trained Transformer 2, or GPT-2, is an AI model engineered for natural language processing tasks. Launched by OpenAI in February 2019, it is known for its versatility in generating a wide range of text types, with capabilities that extend to answering queries and completing code automatically. GPT-2 was trained on a large online text corpus, WebText, and its full version has 1.5 billion parameters. Despite its resource-intensive nature, GPT-2 has found use in diverse and innovative applications such as text-based adventure games and subreddit simulations. OpenAI initially withheld the full model over fears of misuse and released it in stages; the complete model was published in November 2019 after those concerns did not materialize. To address resource constraints, a smaller distilled model, DistilGPT2, was also developed. The innovations and successes of GPT-2 set the stage for further progress in AI text generation.

GPT-2 (Wikipedia)

Generative Pre-trained Transformer 2 (GPT-2) is a large language model by OpenAI and the second in their foundational series of GPT models. GPT-2 was pre-trained on a dataset of 8 million web pages. It was partially released in February 2019, followed by a full release of the 1.5-billion-parameter model on November 5, 2019.

Generative Pre-trained Transformer 2 (GPT-2)
Original author(s): OpenAI
Initial release: 14 February 2019
Repository: https://github.com/openai/gpt-2
Predecessor: GPT-1
Successor: GPT-3
Type: Large language model
License: MIT
Website: openai.com/blog/gpt-2-1-5b-release/

GPT-2 was created as a "direct scale-up" of GPT-1, with a ten-fold increase in both its parameter count and the size of its training dataset. It is a general-purpose learner, and its ability to perform various tasks was a consequence of its general ability to accurately predict the next item in a sequence. This enabled it to translate texts, answer questions about a topic from a text, summarize passages from a larger text, and generate text output on a level sometimes indistinguishable from that of humans, although it could become repetitive or nonsensical when generating long passages. It was superseded by the GPT-3 and GPT-4 models, which are no longer open source.
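This next-token objective is also how the released model is used in practice: given a prompt, the model repeatedly samples the most likely continuation. As a minimal sketch (the Hugging Face transformers package and the "gpt2" checkpoint name are assumptions made here, not part of this glossary entry or the official openai/gpt-2 repository), text can be generated in a few lines of Python:

```python
# Illustrative sketch: assumes the Hugging Face `transformers` package is
# installed and the publicly hosted "gpt2" checkpoint is available.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator(
    "GPT-2 is a language model that",
    max_new_tokens=40,        # length of the sampled continuation
    do_sample=True,           # sample tokens instead of greedy decoding
    num_return_sequences=1,   # how many continuations to return
)
print(result[0]["generated_text"])
```

Where resources are limited, swapping in a smaller checkpoint such as "distilgpt2" (if available) is a common way to trade some output quality for lower memory and compute use, in line with the DistilGPT2 model mentioned above.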

Like its predecessor GPT-1 and its successors GPT-3 and GPT-4, GPT-2 has a generative pre-trained transformer architecture, implementing a deep neural network, specifically a transformer model, which uses attention instead of older recurrence- and convolution-based architectures. Attention mechanisms allow the model to selectively focus on the segments of input text it predicts to be most relevant. This architecture allows for greatly increased parallelization and outperforms previous benchmarks set by RNN-, CNN-, and LSTM-based models.
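To make the attention idea concrete, the following is a minimal NumPy sketch of causal scaled dot-product attention for a single head; the shapes and variable names are illustrative assumptions, and the real GPT-2 uses multi-head masked self-attention inside a full transformer stack rather than this stand-alone function:

```python
# Minimal sketch of causal scaled dot-product attention (single head).
import numpy as np

def causal_attention(Q, K, V):
    """Q, K, V: arrays of shape (seq_len, d_model)."""
    d_k = Q.shape[-1]
    # Similarity of every query position to every key position.
    scores = Q @ K.T / np.sqrt(d_k)
    # Causal mask: a position may only attend to itself and earlier positions,
    # matching the next-token prediction setup described above.
    seq_len = Q.shape[0]
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -1e9, scores)
    # Softmax over the key dimension turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted mix of the value vectors.
    return weights @ V

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))      # 4 tokens, 8-dimensional embeddings
out = causal_attention(x, x, x)  # self-attention over the sequence
print(out.shape)                 # (4, 8)
```

Because every position's weighted sum can be computed at once as matrix products, this formulation parallelizes across the whole sequence, which is the source of the speed advantage over recurrent models noted above.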

" Volver al índice del glosario
es_ESEspañol