LLaMA-Factory fine-tuning of ChatGLM2 on a V100 fails with RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

Fine-tuning command

CUDA_VISIBLE_DEVICES=0 python /aaa/LLaMA-Factory/src/train_bash.py \
    --stage sft \
    --model_name_or_path /aaa/LLaMA-Factory/models/chatglm2-6b \
    --do_train \
    --dataset bbbccc \
    --template chatglm2 \
    --finetuning_type lora \
    --lora_target query_key_value \
    --output_dir output/dddeee/ \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 10 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --plot_loss 

The complete model had already been downloaded from Hugging Face and its path configured correctly, and the custom dataset had been registered in dataset_info.json following the format of alpaca_gpt4_data_zh.json.
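For reference, a minimal sketch (not from the original post) of what that entry in dataset_info.json might look like; the dataset name bbbccc and its file name are just the placeholders from the command above, and the exact keys can differ between LLaMA-Factory versions:

import json

# Hypothetical dataset_info.json entry in the alpaca style of alpaca_gpt4_data_zh.json.
# "bbbccc" and "bbbccc.json" are placeholders; keys may vary by LLaMA-Factory version.
entry = {
    "bbbccc": {
        "file_name": "bbbccc.json",
        "columns": {
            "prompt": "instruction",
            "query": "input",
            "response": "output",
        },
    }
}
print(json.dumps(entry, indent=2, ensure_ascii=False))

Even with the model path and the dataset configured this way, running the command above still failed with the following error: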


[INFO|training_args.py:1798] 2023-11-02 16:00:19,165 >> PyTorch: setting up devices
Detected kernel version 3.10.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
[INFO|trainer.py:1760] 2023-11-02 16:00:19,402 >> ***** Running training *****
[INFO|trainer.py:1761] 2023-11-02 16:00:19,402 >>   Num examples = 1,372
[INFO|trainer.py:1762] 2023-11-02 16:00:19,402 >>   Num Epochs = 3
[INFO|trainer.py:1763] 2023-11-02 16:00:19,402 >>   Instantaneous batch size per device = 4
[INFO|trainer.py:1766] 2023-11-02 16:00:19,402 >>   Total train batch size (w. parallel, distributed & accumulation) = 16
[INFO|trainer.py:1767] 2023-11-02 16:00:19,403 >>   Gradient Accumulation steps = 4
[INFO|trainer.py:1768] 2023-11-02 16:00:19,403 >>   Total optimization steps = 255
[INFO|trainer.py:1769] 2023-11-02 16:00:19,404 >>   Number of trainable parameters = 1,949,696
  0%|                                                                                                                       | 0/255 [00:00<?, ?it/s]/aaa/envs/bbb_llama_factory_py310/lib/python3.10/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
Traceback (most recent call last):
  File "/aaa/LLaMA-Factory/src/train_bash.py", line 14, in <module>
    main()
  File "/aaa/LLaMA-Factory/src/train_bash.py", line 5, in main
    run_exp()
  File "/aaa/LLaMA-Factory/src/llmtuner/tuner/tune.py", line 26, in run_exp
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/aaa/LLaMA-Factory/src/llmtuner/tuner/sft/workflow.py", line 67, in run_sft
    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/transformers/trainer.py", line 1591, in train
    return inner_training_loop(
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/transformers/trainer.py", line 1892, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/transformers/trainer.py", line 2776, in training_step
    loss = self.compute_loss(model, inputs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/transformers/trainer.py", line 2801, in compute_loss
    outputs = model(**inputs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/peft/peft_model.py", line 918, in forward
    return self.base_model(
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 94, in forward
    return self.model.forward(*args, **kwargs)
  File "/xxxcache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 937, in forward
    transformer_outputs = self.transformer(
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/xxxcache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 830, in forward
    hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/xxxcache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 631, in forward
    layer_ret = torch.utils.checkpoint.checkpoint(
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/_compile.py", line 24, in inner
    return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 328, in _fn
    return fn(*args, **kwargs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/_dynamo/external_utils.py", line 17, in inner
    return fn(*args, **kwargs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 451, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/autograd/function.py", line 539, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 230, in forward
    outputs = run_function(*args)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/xxxcache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 544, in forward
    attention_output, kv_cache = self.self_attention(
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/xxxcache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 376, in forward
    mixed_x_layer = self.query_key_value(hidden_states)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/envs/llama_factory_py310/lib/python3.10/site-packages/peft/tuners/lora.py", line 902, in forward
    result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
  0%|                                                                                  | 0/255 [00:00<?, ?it/s]

While the command was running, the model appeared to load successfully, so this looks like an error raised right at the start of the first training epoch. Adding --fp16 to the command above and running again also produced an error.
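The last frame of the traceback is the LoRA layer calling F.linear on half-precision tensors that are still on the CPU. A minimal sketch (not from the original post) that reproduces the same message on a PyTorch build whose CPU backend has no fp16 addmm kernel, as was the case in this environment:

import torch
import torch.nn.functional as F

# Half-precision tensors left on the CPU, mimicking what happens when
# torch.cuda.is_available() is False and the fp16 model never reaches the GPU.
x = torch.randn(2, 8, dtype=torch.float16)   # input
w = torch.randn(4, 8, dtype=torch.float16)   # weight
b = torch.zeros(4, dtype=torch.float16)      # bias

try:
    F.linear(x, w, b)   # same call as the last traceback frame (peft/tuners/lora.py)
except RuntimeError as e:
    print(e)            # "addmm_impl_cpu_" not implemented for 'Half'

In other words, the message has nothing to do with ChatGLM2 or LLaMA-Factory themselves; it means the fp16 matrix multiplication was dispatched to the CPU backend, which in this workflow points to CUDA not being available.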

Record of the exchange with the open-source community: https://github.com/hiyouga/LLaMA-Factory/issues/1359

  • Cause: a broken CUDA environment.
  • Solution: reinstall PyTorch with pip install torch==2.0.1.
  • Troubleshooting: log torch.cuda.is_available(); it printed False, which confirms the CUDA environment is broken (the fp16 model was actually running on the CPU, where half-precision matmul has no kernel). A quick sanity check is sketched below.
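
A minimal sanity-check sketch to run before launching training (assuming only the training Python environment is activated):

import torch

# If any of these look wrong, the model silently stays on the CPU and the
# fp16 matmul fails exactly as in the traceback above.
print(torch.__version__)                    # e.g. 2.0.1
print(torch.version.cuda)                   # None means a CPU-only wheel is installed
print(torch.cuda.is_available())            # must be True before starting training
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))    # e.g. Tesla V100-PCIE-32GB

Here it printed False, which is why reinstalling PyTorch with pip install torch==2.0.1 (a CUDA-enabled build) resolved the error.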
