Want to Build Your Own ChatGPT? This Tutorial Will Get You There with Half the Effort

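For AI enthusiasts, chatbots and GPT are no longer unfamiliar. But have you ever thought about developing your own chatbot? Several chatbot frameworks have become popular choices, and one of the most widely used is OpenAI's GPT (Generative Pretrained Transformer). To build an efficient chatbot, however, it is important to know how to use GPT and the distributed infrastructure that supports it, which makes the whole process simpler and faster.

Assuming you already understand the basic principles of GPT, the steps below show how to set up and train a GPT model on your own machine or server.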

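Step 1: Environment setup

Large models such as GPT-2 and GPT-3 need a GPU with plenty of memory, and a network connection of at least 100 Mbps is recommended. A powerful cloud host is the easiest option, for example on Google Cloud, AWS, Microsoft Azure, or Alibaba Cloud, with the data stored on the remote server so that training runs remotely. This tutorial uses GPT-2, whose weights are publicly available.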
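
To get a feel for why GPU memory matters, the following back-of-the-envelope sketch estimates the memory taken by the weights, gradients, and Adam optimizer state when training in 32-bit floats. It assumes the 117M-parameter figure quoted later in this tutorial and ignores activations, which grow with batch size and sequence length, so treat the result as a lower bound rather than a measurement. The function name is made up for this example.

```python
def training_memory_gib(n_params, bytes_per_value=4):
    """Rough lower bound on training memory in GiB.

    Counts four copies of every parameter: the weights, their gradients,
    and the two moment estimates Adam keeps. Activations are not included.
    """
    copies = 4  # weights + gradients + Adam first and second moments
    return n_params * bytes_per_value * copies / 1024 ** 3


if __name__ == "__main__":
    # 117M is the parameter count quoted for the small GPT-2 model.
    print(f"{training_memory_gib(117_000_000):.2f} GiB before activations")
```

Whatever is left on the card after this fixed cost is what the activations have to fit into, which is why shorter sequences or smaller batches help when memory is tight.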

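Step 2: Install the required libraries

Running GPT requires PyTorch, a machine learning library flexible enough for a wide range of experiments. We also need PyTorch-Transformers, a PyTorch library of state-of-the-art pretrained NLP models such as BERT and XLNet, which also supports converting and training models for production use. (The package has since been renamed `transformers`; the code in this tutorial uses the `pytorch_transformers` 1.x API.) Install both with the following commands: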

```
pip install torch
pip install pytorch-transformers
```
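
Before moving on, it can help to confirm that both packages are importable from the interpreter you will train with. The sketch below uses only the standard library; the helper name `missing_packages` is made up for this example.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Use import names, not pip names: pytorch-transformers imports
    # as pytorch_transformers.
    missing = missing_packages(["torch", "pytorch_transformers"])
    print("All set" if not missing else f"Still missing: {missing}")
```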

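Step 3: Data preparation

Next we prepare the data. The choice of data and its preprocessing matter a great deal for the training result. When an existing dataset is available, a large one helps, and a general-purpose open dataset such as BookCorpus is a good choice; download links for such datasets are usually given in the accompanying papers. This tutorial uses the smaller WikiText-2 dataset, which can be downloaded with the following commands (the leading `!` runs a shell command from a Jupyter notebook cell):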

```
!wget https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip
!unzip -q wikitext-2-v1.zip
!mkdir -p data && mv wikitext-2/ data/
```
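
As an illustration of the preprocessing mentioned above, here is a minimal sketch that drops blank lines and heading lines from WikiText-style text. It assumes headings are lines wrapped in `=` signs (for example ` = Title = `); check a few lines of your own file before relying on it. The function name is hypothetical.

```python
def clean_wikitext(raw_text):
    """Drop blank lines and `= Heading =` lines, keeping body paragraphs."""
    kept = []
    for line in raw_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank separator line
        if stripped.startswith("=") and stripped.endswith("="):
            continue  # article or section heading (assumed format)
        kept.append(stripped)
    return "\n".join(kept)


if __name__ == "__main__":
    sample = " = Title = \n \n First paragraph . \n = = Section = = \n Second paragraph . \n"
    print(clean_wikitext(sample))
```

Whether to drop headings at all is a design choice: keeping them teaches the model the article layout, while removing them gives it plain running text.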

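Step 4: Model training

We are now ready to train the model. The code below trains and saves the autoregressive GPT-2 model, starting from the pretrained `gpt2` checkpoint with 117M parameters.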

```python
import os
from time import time

import torch
from pytorch_transformers import GPT2Tokenizer, GPT2LMHeadModel


def get_data_from_file(filename):
    """Read the whole training file into one string."""
    with open(filename, 'r', encoding='utf-8') as f:
        return f.read()


def train(data_path="data/wikitext-2/wiki.train.tokens", save_dir="./models/"):
    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
    model = GPT2LMHeadModel.from_pretrained('gpt2')
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device)
    model.train()

    # Create the optimizer before the loop so its state persists across steps.
    optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)
    loss_fn = torch.nn.CrossEntropyLoss()

    token_ids = tokenizer.encode(get_data_from_file(data_path))
    # GPT-2 accepts at most n_ctx (1024) tokens per sequence, so split the
    # corpus into fixed-size blocks instead of feeding it in one piece.
    # If GPU memory runs out (check usage with nvidia-smi), use a smaller block.
    block_size = model.config.n_ctx

    for i in range(0, len(token_ids) - 1, block_size):
        block = token_ids[i:i + block_size]
        input_ids = torch.tensor(block).unsqueeze(0).to(device)

        optimizer.zero_grad()
        # Predict token t+1 from the tokens up to t: the inputs drop the
        # last token and the targets drop the first.
        logits = model(input_ids[:, :-1])[0]
        loss = loss_fn(logits.view(-1, logits.size(-1)),
                       input_ids[:, 1:].contiguous().view(-1))
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        optimizer.step()

    # save_pretrained needs an existing directory and returns nothing.
    os.makedirs(save_dir, exist_ok=True)
    model.save_pretrained(save_dir)
    return "Training successful! Model saved at: {}".format(save_dir)


start = time()
print(train())
print("Time elapsed: ", time() - start, " seconds")
```

Once these steps are in place, you have a GPT-2 training script. Run it to train the model and save the result under `./models/`.
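
The loss in the training code compares the model's prediction at each position with the token that actually comes next, which is why the inputs drop the last token and the targets drop the first. The pure-Python sketch below illustrates that shift and the cross-entropy it feeds, using made-up token ids and logits in place of real model output. The function names are invented for this example.

```python
import math


def shift_for_next_token(token_ids):
    """Split a sequence into (inputs, targets) offset by one position."""
    return token_ids[:-1], token_ids[1:]


def cross_entropy(logits_per_position, targets):
    """Mean negative log-probability of each target under softmax(logits)."""
    total = 0.0
    for logits, target in zip(logits_per_position, targets):
        peak = max(logits)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_norm - logits[target]
    return total / len(targets)


if __name__ == "__main__":
    inputs, targets = shift_for_next_token([5, 2, 7, 1])  # toy token ids
    print(inputs, targets)
    # Toy logits over a 2-token vocabulary, one row per input position.
    print(cross_entropy([[0.0, 0.0], [0.0, 0.0]], [0, 1]))
```

With uniform logits the loss equals the log of the vocabulary size, which is a handy sanity check for an untrained model.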

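Step 5: Predict with the trained model

Once the model is trained, it can be used to generate text or answer questions. Feed it some example text and it produces its predicted continuation.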

```python
import torch
from pytorch_transformers import GPT2Tokenizer, GPT2LMHeadModel


def predict(model, input_text, seq_len, temperature=0.8):
    """Sample up to seq_len new tokens that continue input_text."""
    n_ctx = model.config.n_ctx  # maximum context length (1024 for GPT-2)
    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    model.eval()

    # GPT-2 has no separator token; stop at its end-of-text token instead.
    eos_id = tokenizer.convert_tokens_to_ids(tokenizer.eos_token)
    banned_tokens = []  # ids of tokens that must never be sampled
    input_ids = torch.tensor(tokenizer.encode(input_text)).unsqueeze(0).to(device)

    with torch.no_grad():
        for _ in range(seq_len):
            outputs = model(input_ids[:, -n_ctx:])
            # Logits for the last position, scaled by the temperature.
            next_token_logits = outputs[0][0, -1, :] / temperature
            for token in banned_tokens:
                next_token_logits[token] = -float('Inf')
            next_token_probs = torch.softmax(next_token_logits, dim=-1)
            next_token = torch.multinomial(next_token_probs, num_samples=1)
            if next_token.item() == eos_id:
                break
            input_ids = torch.cat((input_ids, next_token.unsqueeze(0)), dim=1)

    # Decode once at the end so multi-byte characters are not split.
    return tokenizer.decode(input_ids[0].tolist())


def get_trained_model(model_dir='./models/'):
    """Load the weights saved by the training step."""
    return GPT2LMHeadModel.from_pretrained(model_dir)


model = get_trained_model()
text = "A little kitten"
print(predict(model=model, input_text=text, seq_len=50))
```

You can now run the code and load the model and code into your application.
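
The prediction loop divides the logits by 0.8 before applying softmax. This is temperature scaling: values below 1 sharpen the sampling distribution and values above 1 flatten it. A minimal standard-library sketch of the idea, with invented logits standing in for the model's output and function names chosen for this example:

```python
import math
import random


def softmax_with_temperature(logits, temperature=0.8):
    """Turn raw logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=0.8, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.1]  # illustrative values only
    print(softmax_with_temperature(toy_logits, temperature=0.8))
    print(softmax_with_temperature(toy_logits, temperature=2.0))
    print(sample_token(toy_logits))
```

Lower temperatures tend to give safer, more repetitive text, while higher ones give more varied but less coherent output.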

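Once you understand the process of building your own chatbot and can successfully prepare a dataset and train a model, building a GPT-based chatbot becomes straightforward. If you need more refined predictions, additional algorithms and further optimization can help.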
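
One way to turn a text generator into a chatbot is to keep the conversation history and feed it back as the prompt on every turn. The sketch below shows that idea only. The `User:`/`Bot:` prompt format, the `max_turns` limit, and the `generate` callback are assumptions made for this example and are not part of GPT-2 or any library. `generate` is assumed to take the prompt and return only the newly generated text.

```python
def build_prompt(history, user_message, max_turns=3):
    """Format the last few turns plus the new message as one prompt string."""
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = []
    for user_text, bot_text in recent:
        lines.append(f"User: {user_text}")
        lines.append(f"Bot: {bot_text}")
    lines.append(f"User: {user_message}")
    lines.append("Bot:")
    return "\n".join(lines)


def chat_turn(history, user_message, generate):
    """Run one turn: build the prompt, call the generator, record the reply."""
    prompt = build_prompt(history, user_message)
    # Keep only the first line so the model does not speak for the user too.
    reply = generate(prompt).strip().split("\n")[0]
    history.append((user_message, reply))
    return reply


if __name__ == "__main__":
    # Stand-in generator; replace it with a call into a real model.
    def echo(prompt):
        last_user_line = prompt.splitlines()[-2]
        return " You said: " + last_user_line[len("User: "):]

    history = []
    print(chat_turn(history, "hello", echo))
```

Limiting the history to a few turns keeps the prompt within the model's context window, at the cost of the bot forgetting older parts of the conversation.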

This article comes from the chatgptplus account purchase platform. When reposting, please cite the source: https://chatgpt.guigege.cn/jiaocheng/32830.html
