Text-to-Video (T2V): Interpreting the Principles from a Different Angle, Taking You into a Different World
A discussion and summary of some technical approaches to text-to-image generation.
Attention Is All You Need: An In-Depth Analysis of the Classic Paper
The classic work on attention laid the cornerstone for taking text toward large models; since its publication, large models have flourished and bloomed, and all of that traces back to this paper.
Attention: A Comprehensive Collection, Explained Thoroughly
A comprehensive collection on attention, covering all aspects of attention knowledge.
DALL·E 2: Demystifying the Model from 0 to 1
The idea is to use the CLIP model to understand text descriptions as visual concepts, and then use diffusion models to generate high-quality images. This combination enables high-fidelity text-to-image generation: through CLIP, the text description is translated into a visual concept, and through the diffusion model, that concept is mapped into a realistic image.
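As a rough illustration of that two-stage pipeline, the following minimal Python sketch wires hypothetical placeholder modules together in the order described (a CLIP-style text encoder, a prior that maps the text embedding to an image embedding, and a diffusion-style decoder). None of the module internals reflect the real DALL·E 2 networks; only the data flow is shown.

```python
# Minimal sketch of the DALL·E 2-style data flow; every module is a toy placeholder.
import torch
import torch.nn as nn

EMB_DIM, IMG_SIZE = 512, 64  # assumed embedding size and output resolution

class TextEncoder(nn.Module):            # stand-in for CLIP's text encoder
    def __init__(self, vocab=1000):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab, EMB_DIM)
    def forward(self, token_ids):
        return self.embed(token_ids)      # (batch, EMB_DIM) text embedding

class Prior(nn.Module):                   # stand-in for the prior network
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(EMB_DIM, EMB_DIM), nn.GELU(),
                                 nn.Linear(EMB_DIM, EMB_DIM))
    def forward(self, text_emb):
        return self.net(text_emb)         # predicted CLIP-style image embedding

class DiffusionDecoder(nn.Module):        # stand-in for the diffusion decoder
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(EMB_DIM, 3 * IMG_SIZE * IMG_SIZE)
    def forward(self, image_emb):
        x = self.net(image_emb)
        return x.view(-1, 3, IMG_SIZE, IMG_SIZE)  # generated image tensor

if __name__ == "__main__":
    tokens = torch.randint(0, 1000, (2, 16))       # a toy batch of prompts
    text_emb = TextEncoder()(tokens)                # text -> CLIP-style text embedding
    image_emb = Prior()(text_emb)                   # text embedding -> image embedding
    images = DiffusionDecoder()(image_emb)          # image embedding -> pixels
    print(images.shape)                             # torch.Size([2, 3, 64, 64])
```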
DALL·E: An In-Depth Analysis of the Model from 0 to 1
Using 250 million image-text pairs collected from the internet, a 12-billion-parameter autoregressive Transformer image generation model was trained, enabling high-fidelity image generation under natural-language control.
DDPM: An In-Depth Analysis of the Probabilistic Model from 0 to 1
The paper establishes connections between diffusion models and variational inference for training Markov chains, denoising score matching, annealed Langevin dynamics, autoregressive models, and progressive lossy compression.
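For concreteness, below is a minimal sketch of the simplified DDPM training step (the noise-prediction loss that follows from the variational-inference view). The tiny placeholder network and the linear noise schedule are illustrative assumptions, not the paper's configuration; a real DDPM uses a U-Net conditioned on the timestep.

```python
# Minimal sketch of the simplified DDPM training objective (noise prediction).
import torch
import torch.nn as nn

T = 1000
betas = torch.linspace(1e-4, 0.02, T)                 # toy linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)        # cumulative product \bar{alpha}_t

# Hypothetical stand-in for the denoising network eps_theta(x_t, t).
model = nn.Sequential(nn.Linear(8 + 1, 64), nn.SiLU(), nn.Linear(64, 8))

def ddpm_loss(x0):
    """L_simple = E || eps - eps_theta( sqrt(a_bar_t) x0 + sqrt(1 - a_bar_t) eps, t ) ||^2."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,))                     # random timestep per sample
    eps = torch.randn_like(x0)                        # sampled Gaussian noise
    a_bar = alphas_bar[t].unsqueeze(-1)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps  # forward diffusion q(x_t | x_0)
    t_feat = (t.float() / T).unsqueeze(-1)               # crude timestep conditioning
    eps_pred = model(torch.cat([x_t, t_feat], dim=-1))
    return ((eps - eps_pred) ** 2).mean()

print(ddpm_loss(torch.randn(4, 8)))                   # one toy training loss value
```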
FLAN: Demystifying the Model from 0 to 1
Exploring a simple question about zero-shot prompting: FLAN improves performance over the non-fine-tuned model and outperforms zero-shot GPT-3 on most tasks.
GPT-1: Demystifying the Core Technology of the Large Model
This work proposes a framework in which a single task-agnostic model gains strong natural language understanding by first undergoing generative pre-training and then discriminative fine-tuning. The framework leverages unsupervised (pre-)training to improve performance on discriminative tasks, and this research spurred new exploration of unsupervised learning.
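A minimal sketch of that two-stage recipe is shown below, with a hypothetical toy backbone standing in for the actual GPT-1 Transformer decoder: stage 1 optimizes a next-token language-modeling loss, and stage 2 reuses the same backbone with a classification head for discriminative fine-tuning.

```python
# Minimal sketch: generative pre-training, then discriminative fine-tuning.
import torch
import torch.nn as nn

VOCAB, DIM, N_CLASSES = 100, 32, 2

embed = nn.Embedding(VOCAB, DIM)
block = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)  # toy backbone block
lm_head = nn.Linear(DIM, VOCAB)          # used during generative pre-training
cls_head = nn.Linear(DIM, N_CLASSES)     # added for discriminative fine-tuning

def backbone(token_ids, causal=False):
    h = embed(token_ids)
    L = token_ids.size(1)
    # Causal (left-to-right) mask for the generative stage.
    mask = torch.full((L, L), float("-inf")).triu(diagonal=1) if causal else None
    return block(h, src_mask=mask)

tokens = torch.randint(0, VOCAB, (8, 16))

# Stage 1: generative pre-training -- predict the next token at every position.
hidden = backbone(tokens[:, :-1], causal=True)
lm_loss = nn.functional.cross_entropy(lm_head(hidden).reshape(-1, VOCAB),
                                      tokens[:, 1:].reshape(-1))

# Stage 2: discriminative fine-tuning -- classify each sequence from its last state.
labels = torch.randint(0, N_CLASSES, (8,))
cls_loss = nn.functional.cross_entropy(cls_head(backbone(tokens)[:, -1]), labels)

print(lm_loss.item(), cls_loss.item())
```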
GPT-2: Demystifying the Core Technology of the Large Model
The primary goal is to reduce reliance on domain-specific data while avoiding complex fine-tuning.
GPT-3: Demystifying the Core Technology of the Large Model
With its major release, GPT-3 achieved remarkable success across nine major NLP areas, using less domain-specific data and no fine-tuning.
InstructGPT: An In-Depth Look at the Model
PPO is used to fine-tune the SFT model: given a prompt, the model is expected to produce a response, and a reward model assigns a score to each (prompt, response) pair. In addition, a KL-divergence penalty is added to reduce over-optimization against the reward model; this model is called PPO. The authors further mix pre-training gradients into the PPO gradients to mitigate performance regressions on public datasets; this variant is called PPO-ptx.
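Below is a minimal sketch of the quantity being maximized as described above: the reward-model score shaped by a KL penalty against the SFT policy, plus a pre-training log-likelihood term for the PPO-ptx variant. All tensors are toys, and the coefficients beta and gamma are illustrative placeholders rather than the paper's tuned values; the clipped PPO surrogate itself is omitted.

```python
# Minimal sketch of the KL-shaped reward plus the PPO-ptx pre-training term.
import torch

def ppo_ptx_objective(reward, logp_policy, logp_sft, pretrain_logp,
                      beta=0.02, gamma=1.0):
    # Shaped reward: R = r_theta(prompt, response) - beta * (log pi_RL - log pi_SFT)
    kl_term = logp_policy - logp_sft                  # per-token log-ratio
    shaped_reward = reward - beta * kl_term.sum(dim=-1)
    # PPO-ptx additionally maximizes the pre-training log-likelihood.
    ptx_term = gamma * pretrain_logp.mean()
    return shaped_reward.mean() + ptx_term            # objective to be maximized

reward = torch.tensor([0.7, -0.2])                    # reward-model scores
logp_policy = torch.randn(2, 16)                      # token log-probs under pi_RL
logp_sft = torch.randn(2, 16)                         # token log-probs under pi_SFT
pretrain_logp = torch.randn(2, 16)                    # log-probs on pre-training data
print(ppo_ptx_objective(reward, logp_policy, logp_sft, pretrain_logp))
```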
Learning Transferable Visual Models From Natural Language Supervision
The paper proposes a very simple method called CLIP, i.e., Contrastive Language-Image Pre-training.
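The core of CLIP is a symmetric contrastive loss over a batch of matched image-text pairs; the sketch below illustrates just that loss, with random embeddings standing in for the image and text encoders.

```python
# Minimal sketch of CLIP's symmetric contrastive objective.
import torch
import torch.nn.functional as F

def clip_loss(image_emb, text_emb, temperature=0.07):
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature   # (N, N) scaled cosine similarities
    targets = torch.arange(image_emb.size(0))         # matched pair i <-> i on the diagonal
    loss_i = F.cross_entropy(logits, targets)         # image -> text direction
    loss_t = F.cross_entropy(logits.t(), targets)     # text -> image direction
    return (loss_i + loss_t) / 2

# Random features in place of real encoder outputs, just to show the shapes.
print(clip_loss(torch.randn(8, 512), torch.randn(8, 512)))
```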
xDeepFM: Key Points of the eXtreme Deep Factorization Machine Model from 0 to 1
For a third-order FM, three embedding vectors are combined with a Hadamard product and the resulting vector is then summed. CIN, by contrast, performs high-order combinations at the vector level and then applies sum pooling over the combined results. This idea is reflected in the model's name, eXtreme Deep Factorization Machine (xDeepFM), which emphasizes deep combination of the embedded features.
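To make the contrast concrete, the sketch below computes a third-order Hadamard-and-sum interaction alongside one CIN-style layer with sum pooling; the shapes and the einsum formulation are illustrative simplifications rather than the paper's exact definition.

```python
# Minimal sketch: third-order FM-style interaction vs. one CIN-style layer.
import torch

B, F_FIELDS, D = 4, 5, 8                  # batch, number of fields, embedding dim
emb = torch.randn(B, F_FIELDS, D)         # field embeddings

# Third-order FM-style term: Hadamard product of three field embeddings, then a sum.
third_order = (emb[:, 0] * emb[:, 1] * emb[:, 2]).sum(dim=-1)        # (B,)

# One CIN-style layer: field-by-field interactions per embedding dimension,
# mixed by learned weights into H feature maps, then sum pooling over the dimension.
H = 3                                                                 # feature maps
w = torch.randn(H, F_FIELDS, F_FIELDS)                                # CIN weights
pairwise = torch.einsum('bid,bjd->bijd', emb, emb)                    # (B, F, F, D)
feature_maps = torch.einsum('hij,bijd->bhd', w, pairwise)             # (B, H, D)
cin_output = feature_maps.sum(dim=-1)                                 # sum pooling -> (B, H)

print(third_order.shape, cin_output.shape)
```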
InterHAt: A Ranking Prediction Model for CTR
This article presents InterHAt, which uses a Transformer with multi-head self-attention for feature learning. On top of that, hierarchical attention layers predict CTR while providing interpretable insights into the predictions. InterHAt captures high-order feature interactions through an efficient attention aggregation strategy with low computational complexity.
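The sketch below illustrates the general idea: multi-head self-attention over field embeddings, followed by an attentional aggregation whose weights can be read as field importances feeding a CTR head. It is a simplified placeholder, not the full hierarchical InterHAt architecture.

```python
# Minimal sketch of attention-based feature learning and aggregation for CTR.
import torch
import torch.nn as nn

B, F_FIELDS, D = 4, 6, 16
emb = torch.randn(B, F_FIELDS, D)                       # field embeddings

self_attn = nn.MultiheadAttention(D, num_heads=4, batch_first=True)
mixed, _ = self_attn(emb, emb, emb)                     # feature-learning step

score = nn.Linear(D, 1)                                 # attentional pooling scores
attn = torch.softmax(score(mixed), dim=1)               # (B, F, 1) field importances
agg = (attn * mixed).sum(dim=1)                         # aggregated representation

ctr_logit = nn.Linear(D, 1)(agg)                        # CTR prediction head
print(torch.sigmoid(ctr_logit).squeeze(-1))             # predicted click probabilities
print(attn.squeeze(-1))                                 # interpretable field weights
```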