LLM - 基于 ChatGLM-6B 的工程配置搭建私有 ChatGPT 中文在线聊天-Toy模板网

这篇具有很好参考价值的文章主要介绍了LLM - 基于 ChatGLM-6B 的工程配置搭建私有 ChatGPT 中文在线聊天。希望对大家有所帮助。如果存在错误或未考虑完全的地方，请大家不吝赐教，您也可以点击"举报违法"按钮提交疑问。

欢迎关注我的CSDN：https://spike.blog.csdn.net/
本文地址：https://blog.csdn.net/caroline_wendy/article/details/131104546

Paper：GLM: General Language Model Pretraining with Autoregressive Blank Infilling

ChatGLM是通用的预训练语言模型（General Language Pretraining Model），基于自回归空格填充（Autoregressive Blank Infilling）的方法，可以兼容三种主流的预训练框架：自回归模型（如GPT）、自编码模型（如BERT）和编码器-解码器模型（如T5）。GLM 通过添加二维位置编码和允许任意顺序预测文本片段，提高了空格填充预训练的效果。同时，GLM可以通过调整空格的数量和长度，来适应不同类型的任务，包括自然语言理解、有条件和无条件的文本生成。GLM在多个任务上都超越了BERT、T5和GPT，展示了其通用性和强大性。

LLM - 基于 ChatGLM-6B 的工程配置搭建私有 ChatGPT 中文在线聊天

ChatGLM 已经升级到2.0版本 ChatGLM2-6B，相关文章：

ChatGLM v1.0: ChatGLM-6B (General Language Model) 的工程配置
ChatGLM v2.0: 第2版 ChatGLM2-6B (General Language Model) 的工程配置

1. 配置工程

GitHub 工程：GitHub - THUDM/ChatGLM-6B
HuggingFace 网页：https://huggingface.co/THUDM/chatglm-6b

下载 HuggingFace 工程 chatglm-6b，其中 git-lfs 相关的大文件并未下载，命令如下：

git clone https://huggingface.co/THUDM/chatglm-6b

下载参数工程 THU-Cloud-Downloader，用于快速下载清华云的模型参数，命令如下：

git clone https://github.com/chenyifanthu/THU-Cloud-Downloader

模型参数地址：https://cloud.tsinghua.edu.cn/d/fb9f16d6dc8f482596c2/

将模型参数直接保存至 chatglm-6b 替换已有文件，命令如下：

cd THU-Cloud-Downloader
python main.py --link https://cloud.tsinghua.edu.cn/d/fb9f16d6dc8f482596c2/ --save ../chatglm-6b/

等待11个文件，其中包括8个模型文件，下载完成，预计15分钟左右。

下载服务工程：

git clone git@github.com:THUDM/ChatGLM-6B.git ChatGLM-6B-official

建议使用 git 下载，预先设置 id_rsa ，如下载过慢，可直接下载zip文件，上传之后再解压。

配置 conda 环境，安装相关的 Python 包，命令如下：

conda create -n chatglm python=3.8
conda activate chatglm
cd ChatGLM-6B-official

pip install -r requirements.txt

验证 PyTorch 是否安装成功：

python

import torch
print(torch.__version__)  # 2.0.1+cu117
print(torch.cuda.is_available())  # True

2. 服务端测试

将 cli_demo.py 源码中的模型，修改为已下载的 chatglm-6b 路径：

...
tokenizer = AutoTokenizer.from_pretrained("../chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("../chatglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()
...

即可运行：

python3 cli_demo.py

测试对话，如下：

欢迎使用 ChatGLM-6B 模型，输入内容即可进行对话，clear 清空对话历史，stop 终止程序

用户：蒙特卡洛方法是什么呢?

ChatGLM-6B：蒙特卡洛方法是一种随机模拟方法，常用于研究随机现象和不确定性原理。在蒙特卡洛方法中，研究者使用随机数生成器生成一组随机样本，然后根据这些样本计算出一些物理量或数学模型的结果。

具体来说，蒙特卡洛方法的基本思想是通过模拟随机过程来估计估计物理量或数学模型的参数。在模拟过程中，研究者随机选择一个物理量或数学模型的参数，然后计算出一组模拟结果。这些模拟结果可以用于估计真实的物理量或数学模型的结果，并给出相应的误差。

蒙特卡洛方法的优点在于可以处理不确定性原理，例如在研究量子物理、化学反应、金融市场等领域中，由于存在大量随机因素，不确定性原理会导致结果的不准确性。而蒙特卡洛方法可以通过模拟大量的随机样本来消除不确定性，提高结果的准确性。

用户：

3. 网页端调用

修改 web_demo.py 源码中的模型，修改为已下载的 chatglm-6b 路径：

...
tokenizer = AutoTokenizer.from_pretrained("../chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("../chatglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()
...

修改 Gradio 插件的服务器 IP 地址 (server_name) 和端口号 (server_port)，进行端口透传：

...
demo.queue().launch(share=False, server_name='[Your Server IP]', server_port=[Your Server Port], inbrowser=True)
...

IP 可以设置为 0.0.0.0，端口设置为可用端口即可。

参考：基于Gradio可视化部署机器学习应用

即可运行：

CUDA_VISIBLE_DEVICES="1" nohup python3 web_demo.py > nohup.out &

访问地址如下：http://[Your Server IP]:[Your Server Port]

LLM - 基于 ChatGLM-6B 的工程配置搭建私有 ChatGPT 中文在线聊天

Bugfix

1. TCP connection reset by peer

当下载 HuggingFace 项目时，遇到错误：

fatal: unable to access 'https://huggingface.co/THUDM/chatglm-6b/': TCP connection reset by peer

可能是网络原因或Git版本较低，建议使用 ssh + git 路径下载，避免访问异常。

2. Permissions 0644 for id_rsa are too open

修改 .id_rsa 的权限，即可：

chmod 400 ~/.ssh/id_rsa

参考：Stackoverflow - SSH Key: “Permissions 0644 for ‘id_rsa.pub’ are too open.” on mac

3. 修改 Docker 环境的 pip 安装源

默认 pip 源的优先级，如下：

# This file has been autogenerated or modified by NVIDIA PyIndex.
# In case you need to modify your PIP configuration, please be aware that
# some configuration files may have a priority order. Here are the following 
# files that may exists in your machine by order of priority:
#
# [Priority 1] Site level configuration files
#       1. `/opt/conda/pip.conf`
#
# [Priority 2] User level configuration files
#       1. `/root/.config/pip/pip.conf`
#       2. `/root/.pip/pip.conf`
#
# [Priority 3] Global level configuration files
#       1. `/etc/pip.conf`
#       2. `/etc/xdg/pip/pip.conf`

全部删除，仅保留 /root/.pip/pip.conf，即可：

rm /opt/conda/pip.conf
rm /root/.config/pip/pip.conf
rm /etc/pip.conf
rm /etc/xdg/pip/pip.conf

修改 pip.conf，添加清华的pip源，与Nvidia的pip源共用，即：文章来源地址https://www.toymoban.com/news/detail-494180.html

vim ~/.pop/pip.conf

[global]
no-cache-dir = true
index-url = https://pypi.tuna.tsinghua.edu.cn/simple/
extra-index-url = https://pypi.ngc.nvidia.com
trusted-host = pypi.tuna.tsinghua.edu.cn pypi.ngc.nvidia.com

到了这里，关于LLM - 基于 ChatGLM-6B 的工程配置搭建私有 ChatGPT 中文在线聊天的文章就介绍完了。如果您还想了解更多内容，请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章，希望大家以后多多支持TOY模板网！