Large Model Deployment Notes (8): LLaMA2 + Windows + llama.cpp + English Text Completion


1. Introduction:

Organization: Meta (Facebook)

Code repository: https://github.com/facebookresearch/llama

Model: llama-2-7b

Download: via the download.sh script (see the sketch after this list)

Hardware: HP OMEN 7 Plus (暗影精灵7Plus) laptop

Windows version: Windows 11 Home Chinese edition, Insider Preview 22H2

RAM: 32 GB

GPU: NVIDIA GeForce RTX 3080 Laptop GPU (16 GB)
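For reference, here is a rough sketch of how Meta's download.sh is normally used. It assumes you have requested access and received a presigned download URL by email; the exact prompts depend on the version of the script:

# inside the cloned facebookresearch/llama repository
bash download.sh
# when prompted, paste the presigned URL from the email,
# then enter the model sizes to fetch, e.g. 7B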


2. Downloading the Code and Model:

Clone the llama.cpp repository:

git clone https://github.com/ggerganov/llama.cpp


The original LLaMA model files need to be obtained and placed under the models directory.
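For orientation, the layout that llama.cpp's convert.py expects for the 7B weights is roughly the following (file names taken from the standard Meta release; this is an assumption, adjust to whatever you actually copied over):

models/tokenizer.model
models/7B/consolidated.00.pth
models/7B/params.json

(tokenizer.model may also sit inside models/7B/ itself.)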


Reference: https://blog.csdn.net/snmper/article/details/133578456

Copy the 7B model files that previously ran successfully on the Jetson AGX Orin into the models directory.


3. Setting Up the llama.cpp Build Environment:

Check the README to find out how llama.cpp is installed on Windows.


Open https://github.com/skeeto/w64devkit/releases


Find the latest Fortran build of w64devkit.


After the download finished, the system popped up a warning (the antivirus flagged the downloaded file).


Go back one release and try v1.19.0: https://github.com/skeeto/w64devkit/releases/tag/v1.19.0

Extract it to D:\w64devkit


Run w64devkit.exe


Switch to the D: drive, then change into the llama.cpp directory:

cd llama.cpp


python -V


Here Python is version 3.7.5.

Check the versions of make, cmake, gcc and g++:
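The standard version flags do the job inside the w64devkit shell (assuming cmake is also on the PATH; w64devkit itself mainly ships the GNU toolchain and make):

make --version
cmake --version
gcc --version
g++ --version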


Try building it:

make


Wait patiently for the build to finish (or fail with an error).


Is this actually malware or not?


Zhang Xiaobai didn't think so, and opened an issue on the official llama.cpp repository to confirm: https://github.com/ggerganov/llama.cpp/issues/3463


The official answer is here: https://github.com/ggerganov/llama.cpp/discussions/3464


Zhang Xiaobai decided to stick with w64devkit after all, and with the latest version. Turn off 360 Antivirus during compilation!!! (In fact, 360 Safe Guard has to be turned off as well.)

Reopen https://github.com/skeeto/w64devkit/releases

Download w64devkit-fortran-1.20.0.zip

Extract it to the D: drive.


Double-click w64devkit.exe to run it.


cd d:/

cd llama.cpp

make


Wait patiently for the build to finish.


The build succeeded.


The .exe files are the generated Windows executables.
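With this kind of plain make build the binaries land in the repository root; the ones used later in this walkthrough are main.exe and quantize.exe. The list below is presumably what this llama.cpp revision builds (exact names may differ between revisions):

main.exe       (text generation / inference)
quantize.exe   (model quantization)
perplexity.exe (perplexity evaluation)
server.exe     (simple local HTTP server)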

Exit the w64devkit build environment.

4. Installing Dependencies

Create a conda environment:

conda create -n llama python=3.10

conda activate llama


cd llama.cpp

pip install -r requirements.txt


5. Deployment Verification

Following the corresponding section of the llama.cpp README (convert the model, quantize it, then run it):


Convert the 7B model (about 14 GB) into a ggml FP16 model:

python convert.py models/7B/


The converted model is written to models\7B\ggml-model-f16.gguf, which is also about 14 GB.
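If you want to be explicit about the output type and path, convert.py in this llama.cpp revision also takes output options; the flags below are assumed from its usual interface, so check python convert.py --help before relying on them:

python convert.py models/7B/ --outtype f16 --outfile models/7B/ggml-model-f16.gguf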


Apply 4-bit quantization to the FP16 model just converted:

./quantize ./models/7B/ggml-model-f16.gguf ./models/7B/ggml-model-q4_0.gguf q4_0


The quantized file is ./models/7B/ggml-model-q4_0.gguf


It is now only about 3.8 GB.
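q4_0 is only one of the quantization presets; quantize also supports types such as q4_1, q5_0, q5_1 and q8_0, which trade a larger file for less quality loss. The command shape stays the same, for example:

./quantize ./models/7B/ggml-model-f16.gguf ./models/7B/ggml-model-q5_1.gguf q5_1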

Run inference:

./main -m ./models/7B/ggml-model-q4_0.gguf -n 128


The output is as follows:

Refresh your summer look with our stylish new range of women's swimwear. Shop the latest styles in bikinis, tankinis and one pieces online at Simply Beach today! Our collection offers a wide selection of flattering designs from classic cuts to eye-catching prints that will turn heads on your next day by the pool. [end of text]

This appears to be a randomly generated passage; since no prompt was given, the model simply free-runs from an empty context.
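Besides -m (model file) and -n (number of tokens to generate), main accepts further options such as -p/--prompt for a prompt string, -t for the number of CPU threads, --temp for the sampling temperature, and --color to highlight the generated text; ./main --help lists the authoritative set. A hypothetical invocation combining a few of them:

./main -m ./models/7B/ggml-model-q4_0.gguf -p "Building llama.cpp on Windows" -n 256 -t 8 --temp 0.8 --color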

Try a different prompt: ./main -m ./models/7B/ggml-model-q4_0.gguf --prompt "Once upon a time"


The completed text is as follows:

Once upon a time, there was no such thing as a "social network". The idea of connecting with someone else on the Internet simply by clicking on their name and seeing who they were connected to and what we might have in common is a relatively new concept. But this kind of connection has become so much a part of our lives that we don't even think twice about it, right?
But once upon a time there was only one way to connect with someone: you either knew them or you didn't. And if you met somebody and became friends, the way you maintained your relationship was to stay in touch by phone, letter, or in person. It wasn't that easy before e-mail, cell phones, Facebook, Twitter, texting, and all the other ways we keep in touch today.
So I say once upon a time because social networking is not quite as new as it seems to be. In fact, I think the first true social network was formed back in 1594 when Shakespeare's "Hamlet" premiered at London's Globe Theatre and his performance was greeted by thunderous applause and a standing ovation by the entire audience.
At that time there were no movie theatre chains to advertise, no TV shows, no radio stations or even newspapers with paid reviews to promote "Hamlet" in advance of its opening night. Shakespeare's only way to get the word out about his latest production was through a series of "word-of-mouth" conversations between the people who had gone to see it and all those they encountered afterwards.
This was, by far, the most advanced social network that existed up until that time! And yet this type of social networking is probably still used today in the modern theatre world where actors and producers meet with audience members after their show to get feedback on how well (or not) it went over for them.
What we now call "social networking" is nothing more than the latest iteration of a centuries-old system that's already proven itself to be effective, but only when used by those who choose to engage in it voluntarily and without coercion. And yes, I realize that this particular definition of social networking has changed over time as well: from Shakespeare's "word of mouth" all the way up to the first online bulletin board systems (BBS) with 300-baud modems.
And yet, the latest innovation in social networking, Web 2.0 and its accompanying sites like Facebook, Twitter and LinkedIn still have yet to surpass these earlier methods in the minds of those who prefer not to use them (and they exist by virtue of an ever-growing user base).
So why is it that so many people are afraid of social networking? After all, there's no reason for anyone to feel compelled or coerced into joining these sites. And yet, despite this fact, a growing number of people seem more than willing to give up their personal information and privacy on the Internet. Why is that?
The answer is simple: most people don't have an accurate picture of what social networking really means. What they imagine it looks like bears little resemblance to how these sites actually work, let alone what's actually going on behind the scenes.
In a nutshell, those who believe that Web 2.0 is nothing but another attempt at getting us all "connected" are missing out on something very important: social networking isn't really about connecting with other people (much like Facebook and LinkedIn) or exchanging information (like Twitter). It's actually about the things we do when we connect, exchange information and interact.
So what does this mean? Simply put, all of these sites are ineffective at helping us get to know each other better. They have very little influence on how we choose who to trust or not to trust among our personal networks. What they're actually good for is gathering data (or information) about us as a way to sell us things that we don't really need and might not even want.
This isn't an attack on social networking, it's just the truth. Facebook may have started out as a site where students can connect with each other but it has now evolved into something much more sinister: a database of personal information about every one of its users that can be sold to anyone at any time without your consent (or even knowledge).
In effect, sites like LinkedIn and Facebook are nothing but the modern version of the old fashioned "spammers" who used to send us junk email. In addition to their obvious privacy concerns and their inability to help us connect with each other or exchange information, these social networking sites should also be regarded as a direct threat to our personal safety.
Why? Well, for one thing the information that they collect about us (and sell to others) can also be used by criminals to commit fraud against us and even extort money from us. This is why it's so important that we take control of this information and use it wisely instead of letting these sites control our private lives for their own selfish reasons.
The reality is that these sites cannot be trusted with the kind of personal information that they require about each one of us. Sites like LinkedIn or Facebook are nothing more than a threat to our privacy and should be regarded as such by every single person who uses them. In fact, sites like this (and any others) are in effect "spammers" who use the same tactics that spammers used to use in order to scam us into using their services.
I don't have a Facebook account and I don't plan on ever creating one either. This site is actually nothing more than a direct threat to my privacy because it uses the same old trick of collecting personal information about me (without my permission) in order to spam me with ads that will help them get rich at my expense. They have even resorted to using psychological tricks and sophisticated surveys in order to manipulate our feelings into believing that they are something important to us.
The truth is that sites like Facebook (or LinkedIn) can only be trusted if we're the ones who control them instead of letting others control them so that they can profit from it. In fact, a site like this can never even hope to become our friend because it doesn't respect the privacy rights of its users at all. This is why I am against these sites and their invasive surveys but if you want to know more about how these sites work then check out the link that we have below in order to learn a bit more about these sites. [end of text]

Since the original LLaMA model replies in English (a Chinese-adapted model will be tried later), the output was run through the iCIBA (词霸) translator.


That is it for this round of experiments!
