[LangChain Core Modules] Model I/O -> Language models


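⭐Author: second-year undergraduate majoring in network engineering, continuously learning Java and working to publish quality articles
⭐Author homepage: @逐梦苍穹
⭐Column: Artificial Intelligence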

1. Introduction

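Language models
LangChain provides interfaces and integrations for two types of models:
● LLMs: models that take a text string as input and return a text string
● Chat models: models backed by a language model that take a list of chat messages as input and return a chat message
LLMs vs Chat Models:
LLMs and Chat Models are subtly but importantly different.
In LangChain, LLMs are pure text completion models.
The APIs they wrap take a string prompt as input and output a string completion. OpenAI's GPT-3 is implemented as an LLM. Chat models are often backed by LLMs but are tuned specifically for conversation. Crucially, their provider APIs expose a different interface from pure text completion models. Instead of a single string, they take a list of chat messages as input. These messages are usually labeled with the speaker (usually one of "System", "AI", and "Human"), and the model returns an ("AI") chat message as output. GPT-4 and Anthropic's Claude are both implemented as Chat Models.
To make it possible to swap LLMs and Chat Models, both implement the Base Language Model interface. This exposes the common methods "predict" (takes a string, returns a string) and "predict_messages" (takes messages, returns a message). If you are using a specific model, it is recommended to use the method specific to that model class (i.e., "predict" for LLMs and "predict_messages" for Chat Models), but if you are building an application that should work with different types of models, the shared interface can be helpful.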
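The idea of a shared interface can be illustrated with a small, library-free sketch. The classes below (ToyLLM, ToyChatModel, Message) are hypothetical stand-ins written for this example, not LangChain classes; they only show how a string-based model and a message-based model could both offer "predict" and "predict_messages" so that calling code does not care which one it gets.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    """Toy chat message: a speaker role plus text content."""
    role: str      # "system", "human", or "ai"
    content: str


class ToyLLM:
    """Stand-in for a text-completion model (string in, string out)."""

    def complete(self, prompt: str) -> str:
        return f"completion of: {prompt}"

    def predict(self, text: str) -> str:
        return self.complete(text)

    def predict_messages(self, messages: List[Message]) -> Message:
        # Flatten the chat into one prompt string, then wrap the reply.
        prompt = "\n".join(f"{m.role}: {m.content}" for m in messages)
        return Message(role="ai", content=self.complete(prompt))


class ToyChatModel:
    """Stand-in for a chat model (messages in, message out)."""

    def chat(self, messages: List[Message]) -> Message:
        return Message(role="ai", content=f"reply to: {messages[-1].content}")

    def predict(self, text: str) -> str:
        # Wrap the string in a single human message, then unwrap the reply.
        return self.chat([Message(role="human", content=text)]).content

    def predict_messages(self, messages: List[Message]) -> Message:
        return self.chat(messages)


def ask(model, question: str) -> str:
    """Works with either model type because both expose predict()."""
    return model.predict(question)


answers = [ask(m, "Hi") for m in (ToyLLM(), ToyChatModel())]
```

Here ask() never checks which kind of model it received, which is the benefit the shared interface is meant to provide.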

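2. LLMs

Large language models (LLMs) are a core component of LangChain. LangChain does not serve its own LLMs; instead, it provides a standard interface for interacting with many different LLMs.

2.1 Getting started

There are many LLM providers (OpenAI, Cohere, Hugging Face, etc.); the LLM class is designed to provide a standard interface for all of them.
In this walkthrough we use the OpenAI LLM wrapper, although the functionality highlighted is generic to all LLM types.
Setup:
First, we need to install the OpenAI Python package: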
pip install openai
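Using the API requires an API key, which you can get by creating an account and going to https://platform.openai.com/account/api-keys. Once we have the key, we set it as an environment variable by running: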
export OPENAI_API_KEY="..."
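If you would rather not set an environment variable, you can pass the key directly via the openai_api_key named parameter when initializing the OpenAI LLM class: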

from langchain.llms import OpenAI

llm = OpenAI(openai_api_key="...")

Otherwise, you can initialize it without any parameters:

from langchain.llms import OpenAI

llm = OpenAI()

__call__: string in -> string out
The simplest way to use an LLM is to call it directly: pass in a string, get back a string completion.
llm("Tell me a joke")
'Why did the chicken cross the road?\n\nTo get to the other side.'
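generate: batch calls, richer outputs
generate lets you call the model with a list of strings and get back a more complete response than just the text. This complete response can include things like multiple top responses and other LLM provider-specific information: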

llm_result = llm.generate(["Tell me a joke", "Tell me a poem"]*15)
len(llm_result.generations)
#30

llm_result.generations[0]

[Generation(text='\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'),
     Generation(text='\n\nWhy did the chicken cross the road?\n\nTo get to the other side.')]

llm_result.generations[-1]

[Generation(text="\n\nWhat if love neverspeech\n\nWhat if love never ended\n\nWhat if love was only a feeling\n\nI'll never know this love\n\nIt's not a feeling\n\nBut it's what we have for each other\n\nWe just know that love is something strong\n\nAnd we can't help but be happy\n\nWe just feel what love is for us\n\nAnd we love each other with all our heart\n\nWe just don't know how\n\nHow it will go\n\nBut we know that love is something strong\n\nAnd we'll always have each other\n\nIn our lives."),
     Generation(text='\n\nOnce upon a time\n\nThere was a love so pure and true\n\nIt lasted for centuries\n\nAnd never became stale or dry\n\nIt was moving and alive\n\nAnd the heart of the love-ick\n\nIs still beating strong and true.')]

You can also access the provider-specific information that is returned. This information is not standardized across providers.

llm_result.llm_output

{'token_usage': {'completion_tokens': 3903,
      'total_tokens': 4023,
      'prompt_tokens': 120}}
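The nesting of generations shown earlier in this section (one inner list per prompt, several candidates inside each) can be sketched without any library. ToyGeneration, ToyResult, toy_generate, and the "prompts_seen" key are all made up for this illustration; that each prompt yields two candidates is an assumption chosen to mirror the output above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyGeneration:
    text: str


@dataclass
class ToyResult:
    # generations[i][j] is candidate j for prompt i
    generations: List[List[ToyGeneration]]
    # provider-specific extras; the key used below is invented for the sketch
    llm_output: Dict[str, int] = field(default_factory=dict)


def toy_generate(prompts: List[str], n: int = 2) -> ToyResult:
    """Return n fake candidates per prompt, in prompt order."""
    generations = [
        [ToyGeneration(text=f"{prompt} -> candidate {j}") for j in range(n)]
        for prompt in prompts
    ]
    return ToyResult(generations=generations,
                     llm_output={"prompts_seen": len(prompts)})


result = toy_generate(["Tell me a joke", "Tell me a poem"] * 15)
first_prompt_candidates = [g.text for g in result.generations[0]]
last_prompt_candidates = [g.text for g in result.generations[-1]]
```

Indexing with [0] and [-1] therefore selects the candidates for the first and last prompt, not the first and last candidate overall.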

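2.2 Caching

Caching (llm_caching):
LangChain provides an optional caching layer for LLMs. This is useful for two reasons:
● It can save money by reducing the number of API calls made to the LLM provider when you often request the same completion multiple times.
● It can speed up your application by reducing the number of API calls made to the LLM provider.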

import langchain
from langchain.llms import OpenAI

# To make the caching really obvious, let's use a slower model.
llm = OpenAI(model_name="text-davinci-002", n=2, best_of=2)

In-memory cache

from langchain.cache import InMemoryCache
langchain.llm_cache = InMemoryCache()

# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

CPU times: user 35.9 ms, sys: 28.6 ms, total: 64.6 ms
    Wall time: 4.83 s
    

    "\n\nWhy couldn't the bicycle stand up by itself? It was...two tired!"

# The second time it is, so it goes faster
llm("Tell me a joke")

CPU times: user 238 µs, sys: 143 µs, total: 381 µs
    Wall time: 1.76 ms


    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
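Conceptually, an in-memory cache of this kind behaves like a dictionary that is consulted before the model is called. The sketch below is a toy, not LangChain's InMemoryCache: ToyInMemoryCache, cached_call, and fake_model are hypothetical names, and keying on the prompt plus a string describing the model is an assumption made for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple


class ToyInMemoryCache:
    """Dict-backed cache keyed by (prompt, model description)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], str] = {}

    def lookup(self, prompt: str, model_key: str) -> Optional[str]:
        return self._store.get((prompt, model_key))

    def update(self, prompt: str, model_key: str, response: str) -> None:
        self._store[(prompt, model_key)] = response


def cached_call(cache: ToyInMemoryCache, model_key: str,
                model_fn: Callable[[str], str], prompt: str) -> str:
    """Return the cached response if present, else call the model and store it."""
    hit = cache.lookup(prompt, model_key)
    if hit is not None:
        return hit
    response = model_fn(prompt)
    cache.update(prompt, model_key, response)
    return response


calls: List[str] = []  # records each time the fake model is really invoked


def fake_model(prompt: str) -> str:
    calls.append(prompt)
    return f"answer #{len(calls)} to {prompt!r}"


cache = ToyInMemoryCache()
first = cached_call(cache, "toy-model", fake_model, "Tell me a joke")
second = cached_call(cache, "toy-model", fake_model, "Tell me a joke")
```

The second call returns the stored string without invoking fake_model again, which is why a cache hit is both cheaper and faster. The cache lives only as long as the Python process.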

SQLite cache
rm .langchain.db

# We can do the same thing with a SQLite cache
from langchain.cache import SQLiteCache
langchain.llm_cache = SQLiteCache(database_path=".langchain.db")

# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

CPU times: user 17 ms, sys: 9.76 ms, total: 26.7 ms
    Wall time: 825 ms


    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'

# The second time it is, so it goes faster
llm("Tell me a joke")

CPU times: user 2.46 ms, sys: 1.23 ms, total: 3.7 ms
    Wall time: 2.67 ms


    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
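The same lookup-then-store pattern can be backed by SQLite so that cached responses can outlive the process. The following is a minimal sketch using only the standard library; the ToySQLiteCache class, the toy_llm_cache table, and its columns are invented for illustration and are not the schema LangChain uses.

```python
import sqlite3
from typing import Optional


class ToySQLiteCache:
    """SQLite-backed cache keyed by (prompt, model description)."""

    def __init__(self, database_path: str = ":memory:") -> None:
        # ":memory:" keeps the demo self-contained; a file path would persist.
        self._conn = sqlite3.connect(database_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS toy_llm_cache ("
            "prompt TEXT NOT NULL, model_key TEXT NOT NULL, "
            "response TEXT NOT NULL, PRIMARY KEY (prompt, model_key))"
        )

    def lookup(self, prompt: str, model_key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT response FROM toy_llm_cache "
            "WHERE prompt = ? AND model_key = ?",
            (prompt, model_key),
        ).fetchone()
        return row[0] if row else None

    def update(self, prompt: str, model_key: str, response: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO toy_llm_cache VALUES (?, ?, ?)",
            (prompt, model_key, response),
        )
        self._conn.commit()


sqlite_cache = ToySQLiteCache()
miss = sqlite_cache.lookup("Tell me a joke", "toy-model")
sqlite_cache.update("Tell me a joke", "toy-model", "a cached joke")
hit = sqlite_cache.lookup("Tell me a joke", "toy-model")
```

Passing a file path instead of ":memory:" would write the table to disk, which is the practical difference from a purely in-memory cache.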

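Optional caching in chains
You can also turn off caching for particular nodes in a chain. Note that, because of certain interfaces, it is often easier to construct the chain first and then edit the LLM afterwards.
As an example, we load a summarization map-reduce chain. We cache the results of the map step but leave the combine (reduce) step uncached.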

llm = OpenAI(model_name="text-davinci-002")
no_cache_llm = OpenAI(model_name="text-davinci-002", cache=False)

from langchain.text_splitter import CharacterTextSplitter
from langchain.chains.mapreduce import MapReduceChain

text_splitter = CharacterTextSplitter()

with open('../../../state_of_the_union.txt') as f:
    state_of_the_union = f.read()
texts = text_splitter.split_text(state_of_the_union)

from langchain.docstore.document import Document
docs = [Document(page_content=t) for t in texts[:3]]
from langchain.chains.summarize import load_summarize_chain

chain = load_summarize_chain(llm, chain_type="map_reduce", reduce_llm=no_cache_llm)
chain.run(docs)

CPU times: user 452 ms, sys: 60.3 ms, total: 512 ms
    Wall time: 5.09 s


    '\n\nPresident Biden is discussing the American Rescue Plan and the Bipartisan Infrastructure Law, which will create jobs and help Americans. He also talks about his vision for America, which includes investing in education and infrastructure. In response to Russian aggression in Ukraine, the United States is joining with European allies to impose sanctions and isolate Russia. American forces are being mobilized to protect NATO countries in the event that Putin decides to keep moving west. The Ukrainians are bravely fighting back, but the next few weeks will be hard for them. Putin will pay a high price for his actions in the long run. Americans should not be alarmed, as the United States is taking action to protect its interests and allies.'

When we run it again, it runs substantially faster, but the final answer is different. This is because the map steps are cached while the reduce step is not.
chain.run(docs)

CPU times: user 11.5 ms, sys: 4.33 ms, total: 15.8 ms
    Wall time: 1.04 s


    '\n\nPresident Biden is discussing the American Rescue Plan and the Bipartisan Infrastructure Law, which will create jobs and help Americans. He also talks about his vision for America, which includes investing in education and infrastructure.'
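Why the second run is faster yet gives a different answer can be reproduced with a toy map-reduce pipeline. Everything below (simulated_model, map_step, reduce_step, run_chain) is hypothetical; the simulated model simply returns a different string on every call to mimic a non-deterministic LLM.

```python
from typing import Dict, List

model_calls: List[str] = []      # records every simulated model call
map_cache: Dict[str, str] = {}   # cache used only by the map step


def simulated_model(prompt: str) -> str:
    """Simulated model whose output changes on every call."""
    model_calls.append(prompt)
    return f"output {len(model_calls)}"


def map_step(chunk: str) -> str:
    """Summarize one chunk, reusing a cached summary when available."""
    if chunk not in map_cache:
        map_cache[chunk] = simulated_model(f"summarize: {chunk}")
    return map_cache[chunk]


def reduce_step(summaries: List[str]) -> str:
    """Combine the summaries; deliberately not cached."""
    return simulated_model("combine: " + " | ".join(summaries))


def run_chain(chunks: List[str]) -> str:
    return reduce_step([map_step(c) for c in chunks])


chunks = ["part one", "part two", "part three"]
first_answer = run_chain(chunks)
calls_after_first = len(model_calls)
second_answer = run_chain(chunks)
calls_after_second = len(model_calls)
```

The first run makes four model calls (three map, one reduce); the second makes only one, the uncached reduce call, so it finishes sooner but its output is freshly generated.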

rm .langchain.db sqlite.db

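2.3 Streaming

Streaming:
Some LLMs provide a streaming response. This means that instead of waiting for the entire response to be returned, you can start processing it as soon as it is available. This is useful if you want to display the response to the user as it is being generated, or if you want to process it as it is being generated.
Currently, streaming is supported for the OpenAI, ChatOpenAI, and ChatAnthropic implementations. To use streaming, use a CallbackHandler that implements on_llm_new_token. In this example, we use StreamingStdOutCallbackHandler.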

from langchain.llms import OpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler


llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)
resp = llm("Write me a song about sparkling water.")

If we use generate, we still have access to the final LLMResult. However, token_usage is not currently supported for streaming.

Verse 1
    I'm sippin' on sparkling water,
    It's so refreshing and light,
    It's the perfect way to quench my thirst
    On a hot summer night.
    
    Chorus
    Sparkling water, sparkling water,
    It's the best way to stay hydrated,
    It's so crisp and so clean,
    It's the perfect way to stay refreshed.
    
    Verse 2
    I'm sippin' on sparkling water,
    It's so bubbly and bright,
    It's the perfect way to cool me down
    On a hot summer night.
    
    Chorus
    Sparkling water, sparkling water,
    It's the best way to stay hydrated,
    It's so crisp and so clean,
    It's the perfect way to stay refreshed.
    
    Verse 3
    I'm sippin' on sparkling water,
    It's so light and so clear,
    It's the perfect way to keep me cool
    On a hot summer night.
    
    Chorus
    Sparkling water, sparkling water,
    It's the best way to stay hydrated,
    It's so crisp and so clean,
    It's the perfect way to stay refreshed.

llm.generate(["Tell me a joke."])

Q: What did the fish say when it hit the wall?
    A: Dam!


    LLMResult(generations=[[Generation(text='\n\nQ: What did the fish say when it hit the wall?\nA: Dam!', generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {}, 'model_name': 'text-davinci-003'})
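The callback mechanism behind streaming can be sketched in plain Python. ToyCallbackHandler, CollectingHandler, and toy_stream are hypothetical names for this illustration; only the method name on_llm_new_token is taken from the text above, and the token list is canned rather than produced by a model.

```python
from typing import List


class ToyCallbackHandler:
    """Minimal handler interface: one hook, called once per new token."""

    def on_llm_new_token(self, token: str) -> None:
        raise NotImplementedError


class CollectingHandler(ToyCallbackHandler):
    """Stores tokens as they arrive (a stdout handler would print them)."""

    def __init__(self) -> None:
        self.tokens: List[str] = []

    def on_llm_new_token(self, token: str) -> None:
        self.tokens.append(token)


def toy_stream(prompt: str, handlers: List[ToyCallbackHandler]) -> str:
    """Emit canned tokens one by one, notifying every handler as each arrives."""
    pieces: List[str] = []
    # Canned tokens; a real model would produce these incrementally.
    for token in ["Spark", "ling", " water"]:
        for handler in handlers:
            handler.on_llm_new_token(token)
        pieces.append(token)
    # The full response is still returned once generation has finished.
    return "".join(pieces)


handler = CollectingHandler()
response = toy_stream("Write me a song about sparkling water.", [handler])
```

The handler sees each token before the full response exists, while the caller still receives the complete string at the end.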

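3. Chat models

Chat models:
Although chat models use language models under the hood, the interface they expose is somewhat different. Rather than a "text in, text out" API, they expose an interface where "chat messages" are the inputs and outputs.
Chat model APIs are fairly new, so the right abstractions are still being worked out.
The following sections of documentation are provided:
● How-to guides: walkthroughs of core functionality, such as streaming and creating chat prompts.
● Integrations: how to use different chat model providers (OpenAI, Anthropic, etc.).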

3.1 Getting started

Setup
First, we need to install the OpenAI Python package:
pip install openai
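Accessing the API requires an API key, which you can get by creating an account and going to https://platform.openai.com/account/api-keys. Once we have the key, we set it as an environment variable by running: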
export OPENAI_API_KEY="..."
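If you would rather not set an environment variable, you can pass the key directly via the "openai_api_key" named parameter when initializing the ChatOpenAI class: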

from langchain.chat_models import ChatOpenAI

chat = ChatOpenAI(openai_api_key="...")

Otherwise, you can initialize it without any parameters:

from langchain.chat_models import ChatOpenAI

chat = ChatOpenAI()

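Messages
The chat model interface is based on messages rather than raw text. The message types currently supported in LangChain are "AIMessage", "HumanMessage", "SystemMessage", and "ChatMessage"; "ChatMessage" takes an arbitrary role parameter. Most of the time, you will only deal with "HumanMessage", "AIMessage", and "SystemMessage".
__call__
Messages in -> message out
You can get chat completions by passing one or more messages to the chat model. The response will be a message.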

from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)

chat([HumanMessage(content="Translate this sentence from English to French: I love programming.")])

AIMessage(content="J'aime programmer.", additional_kwargs={})

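OpenAI's chat models support multiple messages as input. See here (https://platform.openai.com/docs/guides/chat/chat-vs-completions) for more information. Here is an example of sending a system message and a user message to the chat model: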

messages = [
    SystemMessage(content="You are a helpful assistant that translates English to French."),
    HumanMessage(content="I love programming.")
]
chat(messages)

AIMessage(content="J'aime programmer.", additional_kwargs={})
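To make the role of the message types more concrete, here is a sketch of how typed messages might be turned into the role-tagged dictionaries that chat APIs commonly accept. The classes SystemMsg, HumanMsg, AIMsg, ChatMsg and the function to_payload are toy stand-ins, and the role strings "system", "user", and "assistant" are assumptions for this example rather than something stated in this article.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class SystemMsg:
    content: str


@dataclass
class HumanMsg:
    content: str


@dataclass
class AIMsg:
    content: str


@dataclass
class ChatMsg:
    content: str
    role: str  # arbitrary speaker label supplied by the caller


AnyMsg = Union[SystemMsg, HumanMsg, AIMsg, ChatMsg]

# Assumed role names for the three fixed message types.
FIXED_ROLES = {SystemMsg: "system", HumanMsg: "user", AIMsg: "assistant"}


def to_payload(messages: List[AnyMsg]) -> List[Dict[str, str]]:
    """Convert typed messages into role/content dictionaries."""
    payload = []
    for message in messages:
        if isinstance(message, ChatMsg):
            role = message.role          # free-form role
        else:
            role = FIXED_ROLES[type(message)]
        payload.append({"role": role, "content": message.content})
    return payload


payload = to_payload([
    SystemMsg("You are a helpful assistant that translates English to French."),
    HumanMsg("I love programming."),
    ChatMsg("Keep it short.", role="reviewer"),
])
```

The three fixed types carry their role implicitly, while the ChatMsg stand-in shows why a message type with an arbitrary role parameter is occasionally useful.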
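generate
Batch calls, richer outputs
You can go one step further and generate completions for multiple sets of messages using generate. This returns an LLMResult with an additional message parameter.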

batch_messages = [
    [
        SystemMessage(content="You are a helpful assistant that translates English to French."),
        HumanMessage(content="I love programming.")
    ],
    [
        SystemMessage(content="You are a helpful assistant that translates English to French."),
        HumanMessage(content="I love artificial intelligence.")
    ],
]
result = chat.generate(batch_messages)
result

LLMResult(generations=[[ChatGeneration(text="J'aime programmer.", generation_info=None, message=AIMessage(content="J'aime programmer.", additional_kwargs={}))], [ChatGeneration(text="J'aime l'intelligence artificielle.", generation_info=None, message=AIMessage(content="J'aime l'intelligence artificielle.", additional_kwargs={}))]], llm_output={'token_usage': {'prompt_tokens': 57, 'completion_tokens': 20, 'total_tokens': 77}})

You can recover things such as token usage from this LLMResult:
result.llm_output

{'token_usage': {'prompt_tokens': 57,
      'completion_tokens': 20,
      'total_tokens': 77}}

3.2 LLMChain

You can use the existing LLMChain in a very similar way: provide a prompt and a model.

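from langchain.chains import LLMChain

# chat_prompt is the ChatPromptTemplate built in section 3.3 below
chain = LLMChain(llm=chat, prompt=chat_prompt)
chain.run(input_language="English", output_language="French", text="I love programming.")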

# "J'adore la programmation."

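3.3 Prompts

Prompts for chat models are built around messages rather than just plain text.
You can make use of templating by using a MessagePromptTemplate. You can build a ChatPromptTemplate from one or more MessagePromptTemplates. You can use ChatPromptTemplate's format_prompt method, which returns a PromptValue that you can convert to a string or to message objects, depending on whether you want to use the formatted value as input to an LLM or a chat model.
For convenience, a from_template method is exposed on the template. If you were to use this template, it would look like this: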

from langchain import PromptTemplate
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    AIMessagePromptTemplate,
    HumanMessagePromptTemplate,
)

template="You are a helpful assistant that translates {input_language} to {output_language}."
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
human_template="{text}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)

chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

# Get a chat completion from the formatted messages.
chat(chat_prompt.format_prompt(input_language="English", output_language="French", text="I love programming.").to_messages())

AIMessage(content="J'adore la programmation.", additional_kwargs={})

If you want to construct the MessagePromptTemplate more directly, you can create a PromptTemplate separately and then pass it in, for example:

prompt=PromptTemplate(
    template="You are a helpful assistant that translates {input_language} to {output_language}.",
    input_variables=["input_language", "output_language"],
)
system_message_prompt = SystemMessagePromptTemplate(prompt=prompt)
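What a chat prompt template does at its core, filling named variables into one template per message, can be sketched with plain string formatting. The function format_messages and the (role, template) tuple layout are hypothetical and are not how LangChain represents templates internally; the template strings are the ones used in this section.

```python
from typing import Dict, List, Tuple


def format_messages(templates: List[Tuple[str, str]],
                    **variables: str) -> List[Dict[str, str]]:
    """Fill every (role, template) pair with the given variables."""
    return [
        # str.format ignores keyword arguments a template does not use.
        {"role": role, "content": template.format(**variables)}
        for role, template in templates
    ]


toy_chat_prompt = [
    ("system", "You are a helpful assistant that translates "
               "{input_language} to {output_language}."),
    ("human", "{text}"),
]

formatted = format_messages(
    toy_chat_prompt,
    input_language="English",
    output_language="French",
    text="I love programming.",
)
```

Each template only picks up the variables it names, so a single call can fill the system and human messages at once, much like the format_prompt call shown earlier in this section.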

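3.4 Streaming

Some chat models provide a streaming response. This means that instead of waiting for the entire response to be returned, you can start processing it as soon as it is available. This is useful if you want to display the response to the user as it is being generated, or if you want to process it as it is being generated.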

from langchain.chat_models import ChatOpenAI
from langchain.schema import (
    HumanMessage,
)


from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
chat = ChatOpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)
resp = chat([HumanMessage(content="Write me a song about sparkling water.")])

Verse 1:
    Bubbles rising to the top
    A refreshing drink that never stops
    Clear and crisp, it's pure delight
    A taste that's sure to excite
    
    Chorus:
    Sparkling water, oh so fine
    A drink that's always on my mind
    With every sip, I feel alive
    Sparkling water, you're my vibe
    
    Verse 2:
    No sugar, no calories, just pure bliss
    A drink that's hard to resist
    It's the perfect way to quench my thirst
    A drink that always comes first
    
    Chorus:
    Sparkling water, oh so fine
    A drink that's always on my mind
    With every sip, I feel alive
    Sparkling water, you're my vibe
    
    Bridge:
    From the mountains to the sea
    Sparkling water, you're the key
    To a healthy life, a happy soul
    A drink that makes me feel whole
    
    Chorus:
    Sparkling water, oh so fine
    A drink that's always on my mind
    With every sip, I feel alive
    Sparkling water, you're my vibe
    
    Outro:
    Sparkling water, you're the one
    A drink that's always so much fun
    I'll never let you go, my friend
    Sparkling