LLM: LLaMA-7B Model Experiment

The LLaMA models have been open source for quite a while, so I took the smallest one for a quick trial.

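I. Server Purchase and Configuration

1.1 Server purchase

Since this was only a quick trial and I did not plan to keep the machine long term, I picked an Alibaba Cloud server mainly for being cheap, sufficient, and easy to refund.
The 7B model weights are only about 13 GB, so I first bought a virtual machine with 32 GB of memory: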

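  • CPU & memory: 4 cores (vCPU), 32 GiB
  • Operating system: Alibaba Cloud Linux 3.2104 LTS 64-bit, ARM edition, MLPS 2.0 Level 3 edition
  • Instance type: ecs. … (I forgot the type used before the upgrade)
  • Bandwidth: 5M
  • Price: about 1.4 yuan/hour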

That turned out to be a trap: loading the model immediately hit an OOM kill. The kernel log shows:

dmesg | egrep -i -B100 'killed process'
 Killed process 63907 (python) total-vm:40772484kB, anon-rss:30914228kB, file-rss:4kB, shmem-rss:0kB, UID:0 pgtables:61332kB oom_score_adj
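To make the kill record easier to read, its kB fields can be converted to GiB. The following is a minimal sketch, not part of the original experiment; the helper names `parse_kb_fields` and `kb_to_gib` are made up for illustration. Note that `total-vm` is the virtual address space of the process, while `anon-rss` is the anonymous memory that was actually resident when the process was killed.

```python
import re

# The kill record quoted above, copied verbatim from the dmesg output.
OOM_LINE = (
    "Killed process 63907 (python) total-vm:40772484kB, "
    "anon-rss:30914228kB, file-rss:4kB, shmem-rss:0kB, UID:0 "
    "pgtables:61332kB oom_score_adj"
)


def parse_kb_fields(line):
    """Return every 'name:<number>kB' field of the line as {name: kB}."""
    return {name: int(kb) for name, kb in re.findall(r"([\w-]+):(\d+)kB", line)}


def kb_to_gib(kb):
    """Convert kB (1024 bytes, as the kernel reports it) to GiB."""
    return kb / (1024 * 1024)


fields = parse_kb_fields(OOM_LINE)
for name, kb in fields.items():
    print(f"{name:>10}: {kb_to_gib(kb):6.2f} GiB")
```

The virtual size comes out at just under 39 GiB and the resident size at roughly 29.5 GiB, which is about all a 32 GiB machine can give to a single process.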

The record shows a total-vm of 40772484 kB, so loading 7B needs about 39 GB of memory. I had been naive. So I upgraded the instance:

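  • CPU & memory: 8 cores (vCPU), 64 GiB
  • Operating system: Alibaba Cloud Linux 3.2104 LTS 64-bit, ARM edition, MLPS 2.0 Level 3 edition
  • Instance type: ecs. … (I forgot the type used before the upgrade)
  • Bandwidth: 5M
  • Price: about 2.5 yuan/hour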
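A rough back-of-the-envelope estimate suggests why 32 GiB was not enough. This is a sketch under assumptions that the post does not state: that LLaMA-7B has roughly 6.7 billion parameters, that the checkpoint stores them in 16-bit floats (consistent with the 13 GB file size), that the model is instantiated on CPU in PyTorch's default float32, and that both copies are held in memory at the same time while the weights are being loaded.

```python
GIB = 1024 ** 3

# Assumption: approximate parameter count of LLaMA-7B.
N_PARAMS = 6.7e9


def tensor_gib(n_params, bytes_per_param):
    """Memory needed to hold n_params values of the given width, in GiB."""
    return n_params * bytes_per_param / GIB


# Assumption: the checkpoint on disk is fp16 (2 bytes per parameter).
checkpoint_fp16 = tensor_gib(N_PARAMS, 2)
# Assumption: the freshly built model uses float32 (4 bytes per parameter).
model_fp32 = tensor_gib(N_PARAMS, 4)
# Both exist at once while the checkpoint is copied into the model.
peak = checkpoint_fp16 + model_fp32

print(f"checkpoint (fp16): {checkpoint_fp16:.1f} GiB")
print(f"model (fp32):      {model_fp32:.1f} GiB")
print(f"estimated peak:    {peak:.1f} GiB")
```

Under these assumptions the peak lands in the high 30s of GiB, the same range as the total-vm value in the kill record, which is why 64 GiB is a comfortable choice.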

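1.2 Quick environment setup and model weight download

When installing Anaconda, pay attention to the CPU architecture: this instance is ARM (aarch64), so the aarch64 installer is required.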
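As a small illustration of matching the installer to the machine, the sketch below builds the installer file name from the reported architecture. The function `installer_name` is hypothetical, and the file name pattern is an assumption extrapolated from the aarch64 file used in this post; check the archive listing at https://repo.anaconda.com/archive/ before relying on it.

```python
import platform

ARCHIVE_URL = "https://repo.anaconda.com/archive/"
RELEASE = "2023.03"  # the release used in this post

# Assumption: only these two Linux architectures are handled by this sketch.
SUPPORTED = {"aarch64", "x86_64"}


def installer_name(machine, release=RELEASE):
    """Build the Linux installer file name for a given architecture string."""
    if machine not in SUPPORTED:
        raise ValueError(f"no installer pattern known for architecture {machine!r}")
    return f"Anaconda3-{release}-Linux-{machine}.sh"


# platform.machine() reports the same string that `arch` prints on Linux.
try:
    print(ARCHIVE_URL + installer_name(platform.machine()))
except ValueError as exc:
    print(exc)
```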

# Install git and a few helper tools
yum install git
yum install -y bzip2
yum install dpkg
yum install md5

# Check the CPU architecture
arch # aarch64

# Install Anaconda (installer list: https://repo.anaconda.com/archive/)
wget --no-check-certificate https://repo.anaconda.com/archive/Anaconda3-2023.03-Linux-aarch64.sh
chmod +x Anaconda3-2023.03-Linux-aarch64.sh
./Anaconda3-2023.03-Linux-aarch64.sh

# Add conda to PATH: append the export line to ~/.bashrc, then reload it
echo 'export PATH=/root/anaconda3/bin:$PATH' >> ~/.bashrc
source ~/.bashrc

conda env list
# Create environments
conda create -n sccDeep --clone base
conda create -n joyrl --clone base

pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
# 
yum install gdb
yum install clang
yum install llvm 
# https://help.aliyun.com/practice_detail/433400


# Download the data from Baidu Cloud
pip install bypy 
bypy info
bypy list
nohup bypy downfile LLaMA/7B/params.json > __th1.log &
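After a download that takes this long, it may be worth checking that the files arrived intact. The sketch below computes an MD5 digest in chunks so that a 13 GB checkpoint does not have to fit in memory. It assumes you have a reference checksum to compare against (for example from a checksum list distributed with the weights); the helper names and the demo file are made up for illustration.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Compare a file's MD5 digest with a reference value (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Demo on a small temporary file instead of the real checkpoint.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

demo_ok = verify(demo_path, "5d41402abc4b2a76b9719d911017c592")
print("demo file verified:", demo_ok)
os.remove(demo_path)
```

For the real files, `verify` would be called with the path of each downloaded file and its reference digest.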

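II. Trying LLaMA

I only ran a simple trial. Because everything runs on the CPU here, the .cuda() calls in the model files have to be removed.
In addition, dist.init_process_group has to use the gloo backend.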
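The two adjustments can be sketched as plain text processing plus a backend choice. This is an illustrative sketch, not the author's actual edit: `strip_cuda_calls` and `pick_backend` are hypothetical helpers, the sample line is invented, and the regular expression only removes argument-free `.cuda()` calls. It leaves `.cuda(device)` and `device="cuda"` alone, so the resulting diff should still be reviewed by hand.

```python
import re

# Matches an argument-free ".cuda()" call, allowing whitespace in the parentheses.
CUDA_CALL = re.compile(r"\.cuda\(\s*\)")


def strip_cuda_calls(source):
    """Remove argument-free .cuda() calls from Python source text."""
    return CUDA_CALL.sub("", source)


def pick_backend(cuda_available):
    """gloo works on CPU; nccl is the usual choice when GPUs are available."""
    return "nccl" if cuda_available else "gloo"


# Hypothetical line of model code, used only to demonstrate the substitution.
sample = "tokens = torch.full((bsz, total_len), pad_id).cuda().long()"
print(strip_cuda_calls(sample))
print(pick_backend(cuda_available=False))
```

To apply it to a real file, read the file's text, pass it through `strip_cuda_calls`, and write the result back after checking the changes.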

import json
import logging
import os
import sys
from datetime import datetime
from functools import wraps
from pathlib import Path

import torch
import torch.distributed as dist
from fairscale.nn.model_parallel.initialize import initialize_model_parallel
from rich.console import Console

# Make the cloned llama repository importable
sys.path.append('/root/llama/llama')
from llama import ModelArgs, Transformer, Tokenizer, LLaMA

# Log to stdout with timestamps
log = logging.getLogger(__name__)
log.setLevel(logging.DEBUG)
formatter = logging.Formatter(fmt='%(asctime)s %(name)s %(levelname)s %(message)s',
                              datefmt='%Y-%m-%d %H:%M:%S')
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.DEBUG)
handler.setFormatter(formatter)
log.addHandler(handler)
log.info('start')
cs = Console()


def clock(func):
    @wraps(func)
    def clocked(*args, **kwargs):
        st = datetime.now()
        res = func(*args, **kwargs)
        cost_ = datetime.now() - st
        func_name = func.__name__
        print(f'{func_name} Done ! It cost {cost_}.')
        return res
    return clocked



def load_7B_model():
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    world_size = int(os.environ.get("WORLD_SIZE", 1))
    # Backend: gloo for CPU, nccl for GPU
    dist.init_process_group('gloo', init_method='file:///tmp/tmpInit19', rank=local_rank, world_size=world_size)
    # GPU version: torch.distributed.init_process_group("nccl")
    initialize_model_parallel(world_size)

    max_seq_len = 512
    max_batch_size = 2
    ckpt_dir = '/root/llama/llama/model/7B/'
    with open(Path(ckpt_dir) / "params.json", "r") as f:
        params = json.loads(f.read())

    model_args = ModelArgs(
        max_seq_len=max_seq_len, max_batch_size=max_batch_size, **params
    )
    tokenizer_path = '/root/llama/llama/model/tokenizer.model'
    tokenizer = Tokenizer(model_path=tokenizer_path)
    model_args.vocab_size = tokenizer.n_words
    model = Transformer(model_args)
    log.info('load Transformer Struct')

    ckpt_path = os.path.join(ckpt_dir, 'consolidated.00.pth')
    # checkpoint = torch.load(ckpt_path, map_location="cpu")
    log.info('start load params')
    model.load_state_dict(torch.load(ckpt_path, map_location="cpu"), strict=False)
    generator = LLaMA(model, tokenizer)
    return generator


generator = load_7B_model()


@clock
def ask_llama(pp):
    # Generate up to 256 new tokens for a single prompt
    return generator.generate(
        [pp], max_gen_len=256, temperature=0.8, top_p=0.95
    )

ask_llama('I believe the meaning of life is')

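After two simple prompts I lost interest...

  • First, inference is extremely slow: each prompt takes more than 3 minutes (the rented cloud server is fairly low-end)
  • Second, the quality of the generated text is far below what I expected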
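The wall-clock time printed by the clock decorator in the runs below (0:03:11.220175) can be turned into a rough throughput figure. This sketch assumes the run generated the full max_gen_len of 256 tokens and it lumps prompt processing into the same time, so the result is only a coarse estimate. The helper `parse_elapsed` is made up for illustration.

```python
from datetime import timedelta


def parse_elapsed(text):
    """Parse an 'H:MM:SS.ffffff' string, as printed for a timedelta."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


# Elapsed time reported in the run log.
elapsed = parse_elapsed("0:03:11.220175")

# Assumption: all max_gen_len=256 tokens were generated in that time.
MAX_GEN_LEN = 256
tokens_per_second = MAX_GEN_LEN / elapsed.total_seconds()
seconds_per_token = elapsed.total_seconds() / MAX_GEN_LEN

print(f"elapsed: {elapsed.total_seconds():.1f} s")
print(f"throughput: {tokens_per_second:.2f} tokens/s")
print(f"latency: {seconds_per_token:.2f} s/token")
```

That works out to a little over one token per second on the 8-vCPU CPU-only instance.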

The first prompt: 'I believe the meaning of life is'

>>> ask_llama('I believe the meaning of life is')
ask_llama Done ! It cost 0:03:11.220175.
["I believe the meaning of life is to help others and I intend to keep doing so.\nI would love to continue this by setting up a community that does fundraising and raises money to help people in need.\nA situation that really inspired me to start this was when I had to go to hospital and I was so grateful for the care that I got. If it hadn't been for the hospital I would have been in a very bad place and without them I would not be here now.\nI went to hospital because I got appendicitis and needed to have an operation. When I woke up I was so grateful for the care that I got. The nurses, doctors and everyone who helped me were so kind and caring. After being in hospital for 12 days I was told that the operation I had was a very rare one and it was done by a very small group of people in the UK. I had no idea that this surgery had been done in my hospital by such a small group of people. The nurses, doctors and everyone else who helped me were so kind and caring and if I hadn't had the operation I would not be here now.\nI want to set up a community that does fundraising"]

The second prompt: 'I would like to travel to Japan. Help me draft a travel plan'

>>> ask_llama('I would like to travel to Japan. Help me draft a travel plan')
ask_llama Done ! It cost 0:03:11.220175.
["I would like to travel to Japan. Help me draft a travel plan. I am a diabetic. I am also allergic to bee stings.\nOk, I will try my best to help you. I am a Japanese living in the United States.\nSo I have to tell you that you will have a hard time in Japan.\nI will try to tell you some information about traveling in Japan.\nFirst of all, you need a visa if you want to stay longer than 90 days.\nIf you have a health insurance, you can get a 90 days visa.\n(http://www.jnto.go.jp/eng/indepth/health_insurance.html)\nThis is the website of the embassy.\nThere is also information about traveling in Japan on the website.\nI suggest you contact them to find out more.\nNow, I am going to tell you some information.\nIf you have a physical disability, you can apply for a free train pass.\nIt will be 100% free.\nI'm sorry, but I don't know about the train pass for diabetic.\nBut, if you take care of your diabetes, I think"]

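III. Summary of the LLaMA Trial

If you have not been granted access to the model, you can message me privately for the weights.

It cost me about 100 yuan just to get a feel for LLaMA, and downloading the data took more than a day (everyone knows how slow pulling data from Baidu Cloud is).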

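  • Without a GPU with 20 GB+ of VRAM and 64 GB+ of RAM, working with LLMs is simply not practical
  • Loading the LLaMA 7B model without a GPU needs about 39 GB of memory
  • The LLaMA model still needs some modification and further training to produce better results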
