LLM - Hugging Face Project: BERT base model (uncased) Setup


Follow me on CSDN: https://spike.blog.csdn.net/
Original article: https://blog.csdn.net/caroline_wendy/article/details/131400428


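BERT is a transformer model pretrained on a large corpus of English data in a self-supervised fashion. It was pretrained on raw text only, with no human labeling of any kind (which is why it can use large amounts of publicly available data); an automatic process generates the inputs and labels from the text itself. More precisely, it was pretrained with two objectives: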

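Masked Language Modeling (MLM): given a sentence, the model randomly masks 15% of the words in the input, runs the entire masked sentence through the model, and has to predict the masked words. This differs from traditional recurrent neural networks (RNNs), which usually see the words one after another, and from autoregressive models such as GPT, which internally mask the future tokens. It allows the model to learn a bidirectional representation of the sentence.
Next Sentence Prediction (NSP): during pretraining, the model concatenates two masked sentences as input. Sometimes they correspond to sentences that were adjacent in the original text, sometimes not. The model then has to predict whether the second sentence followed the first.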
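The two objectives above can be sketched in plain Python. This is only an illustration of how inputs and labels could be generated automatically from raw text, not the original pretraining code. The function names `mask_tokens` and `make_nsp_pair`, the whitespace tokenization, and the 50% chance of picking the true next sentence are assumptions made for this example; the sketch also always substitutes `[MASK]` for a selected word, which is a simplification.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Mask about mask_prob of the tokens (hypothetical helper).

    Returns (masked_tokens, labels). labels holds the original token at
    each masked position and None everywhere else.
    """
    rng = random.Random(seed)
    # Select roughly 15% of the positions, but always at least one.
    n_to_mask = max(1, round(len(tokens) * mask_prob))
    positions = set(rng.sample(range(len(tokens)), n_to_mask))
    masked = [MASK_TOKEN if i in positions else tok for i, tok in enumerate(tokens)]
    labels = [tok if i in positions else None for i, tok in enumerate(tokens)]
    return masked, labels


def make_nsp_pair(sentences, index, rng):
    """Build one NSP example (hypothetical helper).

    Returns (sentence_a, sentence_b, is_next). Needs at least three
    sentences so that a non-adjacent alternative always exists.
    """
    sentence_a = sentences[index]
    # Assumption: use the true next sentence about half of the time.
    if index + 1 < len(sentences) and rng.random() < 0.5:
        return sentence_a, sentences[index + 1], True
    # Otherwise pick any sentence that is neither A itself nor its successor.
    candidates = [s for j, s in enumerate(sentences) if j not in (index, index + 1)]
    return sentence_a, rng.choice(candidates), False


if __name__ == "__main__":
    words = "the quick brown fox jumps over the lazy dog near the river".split()
    print(mask_tokens(words))
    doc = ["I went out.", "It was raining.", "I took an umbrella.", "Cats sleep a lot."]
    print(make_nsp_pair(doc, 1, random.Random(0)))
```

In both cases the label comes from the text itself (the hidden word, or whether the pair was adjacent), which is what makes the procedure self-supervised.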

"uncased" means the model is case-insensitive: it makes no distinction between "english" and "English".
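As a rough illustration of what "uncased" means for the input text, the sketch below lowercases a string and strips accent marks, which mirrors the kind of normalization an uncased tokenizer applies before splitting text into tokens. The helper name `normalize_uncased` is hypothetical, and this is a standard-library approximation rather than the tokenizer's actual code.

```python
import unicodedata


def normalize_uncased(text):
    """Lowercase text and drop accent marks (hypothetical helper)."""
    lowered = text.lower()
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", lowered)
    # Category "Mn" (nonspacing mark) covers the combining accents.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


print(normalize_uncased("English") == normalize_uncased("english"))  # True
print(normalize_uncased("Héllo"))  # hello
```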

Hugging Face: bert-base-uncased

After configuring SSH, clone the project with git. The model files in the clone are only placeholders:

git clone git@hf.co:bert-base-uncased
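To confirm which files in the clone are placeholders, one option is to look for Git LFS pointer files. The sketch below assumes the placeholders are LFS pointers, which are small text files beginning with a fixed version line; `is_lfs_pointer` is a hypothetical helper written for this example.

```python
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if path looks like a Git LFS pointer, not real weights."""
    path = Path(path)
    # Pointer files are tiny; anything of 1 KiB or more is real content.
    if path.stat().st_size >= 1024:
        return False
    with path.open("rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


if __name__ == "__main__":
    # Assumes the clone sits in ./bert-base-uncased, as in the command above.
    for file_path in sorted(Path("./bert-base-uncased").glob("*")):
        if file_path.is_file():
            print(file_path.name, "placeholder" if is_lfs_pointer(file_path) else "ok")
```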

Download the 5 large files from the Hugging Face website:

flax_model.msgpack  # 417M
model.safetensors   # 420M
pytorch_model.bin   # 420M
rust_model.ot       # 509M
tf_model.h5         # 511M

Download the files with bypy. See: CSDN - Using a Network Drive to Quickly Download Hugging Face Large Models

bypy info
bypy downdir /bert-base-uncased/ ./bert-base-uncased/

This completes the update of the 5 files.
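Before running the model, it may be worth checking that all five files actually arrived at full size. The following is a minimal sketch: the file names come from the list above, while the helper name `find_incomplete_files` and the 100 MB threshold are assumptions chosen for this example (every listed file is well above that size).

```python
from pathlib import Path

# The five large files listed above.
EXPECTED_FILES = [
    "flax_model.msgpack",
    "model.safetensors",
    "pytorch_model.bin",
    "rust_model.ot",
    "tf_model.h5",
]


def find_incomplete_files(model_dir, min_bytes=100 * 1024 * 1024):
    """Return the expected files that are missing or smaller than min_bytes."""
    model_dir = Path(model_dir)
    incomplete = []
    for name in EXPECTED_FILES:
        path = model_dir / name
        if not path.is_file() or path.stat().st_size < min_bytes:
            incomplete.append(name)
    return incomplete


if __name__ == "__main__":
    # An empty list means every file is present and above the threshold.
    print(find_incomplete_files("./bert-base-uncased"))
```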

Test script:

from transformers import BertTokenizer, BertModel

# Load the tokenizer and the model
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Encode a sample sentence as PyTorch tensors and run it through the model
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors="pt")
output = model(**encoded_input)
print(f"output.last_hidden_state: {output.last_hidden_state.shape}")

Output:

output.last_hidden_state: torch.Size([1, 12, 768])
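The three dimensions are batch size, sequence length, and hidden size: one input sentence, 12 tokens, and a 768-dimensional vector per token. The 12 here is the token count of this particular sentence and would change with a different input. The sketch below approximates the tokenization with a regular expression to show where the 12 comes from. It assumes that no word in this sentence is split into smaller subword pieces and that `[CLS]` and `[SEP]` are added at the ends; `approximate_bert_tokens` is a hypothetical helper, not the real tokenizer.

```python
import re


def approximate_bert_tokens(text):
    """Approximate uncased BERT tokenization without subword splitting."""
    # Lowercase, then split into runs of word characters and single punctuation marks.
    words = re.findall(r"\w+|[^\w\s]", text.lower())
    # The special tokens mark the start and end of the sequence.
    return ["[CLS]"] + words + ["[SEP]"]


tokens = approximate_bert_tokens("Replace me by any text you'd like.")
print(len(tokens))  # 12
print(tokens)
```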
