LLaMA Factory Installation and Deployment: Hands-On Notes (Part 2)


1. Project repository

GitHub - hiyouga/LLaMA-Factory: Easy-to-use LLM fine-tuning framework (LLaMA, BLOOM, Mistral, Baichuan, Qwen, ChatGLM)
https://github.com/hiyouga/LLaMA-Factory/

2. Download. It is best to pick the source of a tagged release. The archive is downloaded directly here because cloning over the network appeared to be unreliable.

wget https://github.com/hiyouga/LLaMA-Factory/archive/refs/tags/v0.4.0.tar.gz

Extract the archive:

tar -xzvf v0.4.0.tar.gz
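
For scripted setups, the same download and extraction can be done with Python's standard library. This is only an illustrative sketch: the helper names `tag_archive_url` and `download_and_extract` are hypothetical, and the URL pattern is simply generalized from the wget command above.

```python
import tarfile
import urllib.request

REPO_URL = "https://github.com/hiyouga/LLaMA-Factory"


def tag_archive_url(tag: str) -> str:
    """Build the tarball URL for a tagged release (pattern taken from the wget command)."""
    return f"{REPO_URL}/archive/refs/tags/{tag}.tar.gz"


def download_and_extract(tag: str, dest: str = ".") -> str:
    """Download the tagged tarball and unpack it under dest; returns the archive path."""
    archive_path, _ = urllib.request.urlretrieve(tag_archive_url(tag), f"{tag}.tar.gz")
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest)
    return archive_path
```

Calling `download_and_extract("v0.4.0")` would fetch the same archive as the wget command and unpack it in the current directory.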

3. Create the environment

conda create -n llama_0_4 python=3.10
conda activate llama_0_4
cd LLaMA-Factory-0.4.0/
pip install -r requirements.txt

4. Start the API. Other interfaces can be used here as well.

CUDA_VISIBLE_DEVICES=1 python src/api_demo.py \
    --model_name_or_path $qwen14b_chat_path \
    --template default \
    --finetuning_type lora \
    --checkpoint_dir path_model

To change the API port, edit src/api_demo.py. The relevant parameters are shown in the previous post:

LLama Factory Hands-On Notes (Part 1), CSDN blog

5. Test the API. The request body below was sent to http://192.168.0.133:8000/v1/chat/completions

API documentation page: http://192.168.0.133:8000/docs

{
  "model": "string",
  "messages": [
    {
      "role": "user",
      "content": "question"
    }
  ],
  "do_sample": true,
  "temperature": 0,
  "top_p": 0.5,
  "n": 1,
  "max_tokens": 2048,
  "stream": false
}
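
The same request can be sent from a script. The sketch below uses only the standard library and copies the fields from the body above. The helper names (`build_payload`, `build_request`, `ask`) are hypothetical, and the host and port are the ones from this post, so they need adjusting for any other server.

```python
import json
import urllib.request

# Assumption: host and port are the ones used in this post; adjust to your server.
API_URL = "http://192.168.0.133:8000/v1/chat/completions"


def build_payload(question: str, max_tokens: int = 2048) -> dict:
    """Return a request body with the same fields as the JSON above."""
    return {
        "model": "string",
        "messages": [{"role": "user", "content": question}],
        "do_sample": True,
        "temperature": 0,
        "top_p": 0.5,
        "n": 1,
        "max_tokens": max_tokens,
        "stream": False,
    }


def build_request(payload: dict, url: str = API_URL) -> urllib.request.Request:
    """Encode the payload as JSON and set the Content-Type header explicitly."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, url: str = API_URL, timeout: float = 60.0) -> dict:
    """POST the question and return the decoded JSON response."""
    request = build_request(build_payload(question), url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

`ask("question")` returns whatever JSON the server sends back; the response schema can be checked on the /docs page.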

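6. Error 1

422 Unprocessable Entity

A 422 response means the request is well-formed but contains semantic errors, so the server cannot process it. In other words, the server understands the content type of the request (otherwise it would return 415 Unsupported Media Type) and can parse the request entity (otherwise it would return 400 Bad Request), but it cannot act on what the entity says.

Most likely the problem is in the request content: send the body as JSON and check the field names.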
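
A quick client-side check can catch the most common causes before the request is sent. The sketch below is hypothetical: `check_chat_request` is not part of LLaMA-Factory. The rules it applies (a non-empty `messages` list whose items have string `role` and `content`) are assumptions inferred from the request body in step 5, not the server's actual schema, which is listed on the /docs page.

```python
import json


def check_chat_request(raw_body: str) -> list[str]:
    """Return a list of problems found in a raw request body (empty list means none found)."""
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        return [f"body is not valid JSON: {exc.msg}"]

    if not isinstance(body, dict):
        return ["top-level JSON value must be an object"]

    problems = []
    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("'messages' must be a non-empty list")
    else:
        for i, message in enumerate(messages):
            if not isinstance(message, dict):
                problems.append(f"messages[{i}] must be an object")
                continue
            # Assumed required keys, taken from the example body in step 5.
            for key in ("role", "content"):
                if not isinstance(message.get(key), str):
                    problems.append(f"messages[{i}].{key} must be a string")
    return problems
```

An empty result does not guarantee the server accepts the request. It only rules out the mistakes this check knows about, such as a misspelled `messages` key or a body that is not JSON.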

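7. Error 2

Calling the API endpoint fails with RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

The error is raised during inference: the probability tensor used for sampling contains inf, nan, or a negative element.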
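
The three conditions named in the message (inf, nan, negative element) describe a probability vector that can no longer be sampled from. As a framework-free illustration, and not the actual PyTorch code path, the sketch below shows how a single infinite logit turns a naive softmax into nan, then checks a vector against the same three conditions. Both function names are hypothetical.

```python
import math


def naive_softmax(logits: list[float]) -> list[float]:
    """Softmax without any numerical safeguards, to make the failure visible."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_valid_probability_vector(probs: list[float]) -> bool:
    """True if no element is inf, nan, or negative (the conditions named in the error)."""
    return all(math.isfinite(p) and p >= 0 for p in probs)


healthy = naive_softmax([1.0, 2.0, 3.0])
broken = naive_softmax([float("inf"), 0.0])  # inf / inf evaluates to nan

print(is_valid_probability_vector(healthy))  # True
print(is_valid_probability_vector(broken))   # False
```

This only illustrates what the message is complaining about. Which step of the real model produces the bad values (weights, precision, device placement) is a separate question, discussed in the two explanations that follow.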

(1) One explanation is that dual-GPU inference causes it. That matches what was observed here: inference works on an A100 but fails on dual RTX 4090s.

Baichuan2: inference fails after merging LoRA weights: RuntimeError: probability tensor contains either `inf`, `nan` or element < 0 · Issue #1618 · hiyouga/LLaMA-Factory
https://github.com/hiyouga/LLaMA-Factory/issues/1618
The issue reporter notes that deploying with api-for-llm produces the same error, which suggests the framework itself is not the cause. The base model there was Baichuan2-13B-Chat, fine-tuned with LoRA and merged; inference failed when the merged model was loaded with cli_demo.py.
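
One way to test the multi-GPU explanation is to launch the same command with only one GPU visible. The launch command in step 4 already sets CUDA_VISIBLE_DEVICES=1; the sketch below does the same from Python. `single_gpu_env` and `launch_on_single_gpu` are hypothetical helpers, and whether this avoids the error on a given machine is not confirmed by this post.

```python
import os
import subprocess


def single_gpu_env(gpu_index: int, base_env=None) -> dict:
    """Copy the environment and expose exactly one GPU to the child process."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def launch_on_single_gpu(command: list[str], gpu_index: int = 0) -> subprocess.Popen:
    """Start the command with only the chosen GPU visible to it."""
    return subprocess.Popen(command, env=single_gpu_env(gpu_index))


# Example (not executed here). "/path/to/model" is a placeholder, not a real path:
# launch_on_single_gpu(
#     ["python", "src/api_demo.py", "--model_name_or_path", "/path/to/model"],
#     gpu_index=1,
# )
```

Passing a copied environment keeps the parent process untouched, so the same script can start one run per GPU and compare the results.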

(2) Another explanation is that the code needs to be updated. The tests here used the v0.4.0 source without updating it. If you try a newer version, please share the result in the comments.

Deploying the web demo with `web_demo.py` fails with `RuntimeError: probability tensor contains either `inf`, `nan` or element < 0` · Issue #1642 · hiyouga/LLaMA-Factory
https://github.com/hiyouga/LLaMA-Factory/issues/1642
The reproduction in that issue runs python src/web_demo.py --model_name_or_path ~/model/ChatGLM2-6B --template chatglm2, and the expected behavior is a successful run on multiple GPUs.
