https://geeknote.net/renny
Renny
https://geeknote-storage.oss-cn-hongkong.aliyuncs.com/qg8iz4vrxiuhi2pxerqgsv737wya?x-oss-process=image%2Fresize%2Cm_fill%2Cw_160%2Ch_160
2023-05-30T15:48:36Z
renny
https://geeknote.net/renny
https://geeknote.net/renny/posts/2314
2023-05-05T10:04:26Z
2023-05-30T15:48:36Z
Building a ChatGPT App in Rails with SSE
<p><em>English original: <a href="https://renny.ren/ch/articles/40">https://renny.ren/ch/articles/40</a></em></p>
<hr>
<h2>
<a id="%E5%89%8D%E8%A8%80" href="#%E5%89%8D%E8%A8%80" class="anchor"></a>Preface</h2>
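<p>When you use ChatGPT, you will notice that the reply is not delivered all at once. It is returned piece by piece as it is generated, like someone typing:</p>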
<p><img src="/attachments/XpZERDggnLECQudMnX7q3X5x/stream4.gif" alt="stream4.gif"></p>
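<p>So how is this implemented? This post looks at the technical details behind it.</p>
<p>The effect is called a streaming response, which is an apt name.</p>
<p>And you cannot talk about streaming responses without talking about SSE.</p>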
<h2>
<a id="%E5%85%B3%E4%BA%8E+SSE" href="#%E5%85%B3%E4%BA%8E+SSE" class="anchor"></a>About SSE</h2>
<p>If you look at the <a href="https://platform.openai.com/docs/api-reference/chat/create">OpenAI API documentation</a>, you will find a parameter called <code>stream</code>:</p>
<blockquote>
<p>If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only <a href="https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#event_stream_format">server-sent events</a> as they become available, with the stream terminated by a <code>data: [DONE]</code> message.</p>
</blockquote>
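<p>So what is SSE?</p>
<p>In short, SSE (Server-Sent Events) is a simple way to stream events from a server. The server sends real-time updates to the client over a single HTTP connection. Once the connection is established, the server can push data to the client as soon as it is available, and the client does not need to keep polling for updates.</p>
<p>The steps are:</p>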
<ol>
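<li>The client sends a GET request to the server: <code>https://www.host.com/stream</code>
</li>
<li>A persistent connection is established, and the response headers include <code>Connection: keep-alive</code> (persistent connections have been the default since HTTP/1.1)</li>
<li>The server sets the <code>Content-Type: text/event-stream</code> response header</li>
<li>The server can now start sending events, like this:</li>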
</ol>
<pre class="highlight"><code>event: add
data: This is the first message, it
data: has two lines.
</code></pre>
<p>It is that simple.</p>
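<p>To make the wire format concrete, here is a minimal Ruby sketch that serializes one event by hand. The helper name <code>format_sse_event</code> is made up for this illustration and is not part of Rails or any library. It writes one <code>data:</code> field per line of the payload and ends the event with a blank line, which is how the client knows the event is complete.</p>

```ruby
# Hypothetical helper (not a library API): serialize one server-sent event.
def format_sse_event(data, event: nil)
  lines = []
  lines.push("event: #{event}") if event
  # A multi-line payload becomes one "data:" field per line.
  data.to_s.split("\n", -1).each { |line| lines.push("data: #{line}") }
  # The trailing blank line marks the end of the event.
  lines.join("\n") + "\n\n"
end

puts format_sse_event("This is the first message, it\nhas two lines.", event: "add")
```

<p>On the receiving side, the two <code>data:</code> lines are joined with a newline and delivered as a single message.</p>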
<h2>
<a id="%E6%AF%94%E8%BE%83%E4%B8%80%E4%B8%8B+SSE+%E5%92%8C+WebSocket" href="#%E6%AF%94%E8%BE%83%E4%B8%80%E4%B8%8B+SSE+%E5%92%8C+WebSocket" class="anchor"></a>Comparing SSE and WebSocket</h2>
<p>So are SSE and WebSocket much the same thing? In summary, both can be used for real-time communication between client and server, but they differ in a few ways:</p>
<ol>
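<li>SSE is a one-way channel (server to client), whereas WebSocket is bidirectional: the client can also send messages to the server at any time</li>
<li>SSE runs over plain HTTP: the server keeps a single response open and streams events through it, so no polling is involved. WebSocket starts with an HTTP handshake and then switches to its own protocol, sending and receiving data directly over the TCP connection</li>
<li>When an SSE connection is lost, the client automatically tries to reconnect, and keeps retrying if that fails. In the browser you will see the GET request repeated until the server accepts the connection, so you need error-handling code that closes the connection on the client side. With WebSocket, a lost connection generally has to be re-established by the client itself</li>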
</ol>
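<p>Overall, SSE is simple to use and better suited to small amounts of data, especially when only server-to-client communication is needed. WebSocket is more powerful and fits more complex scenarios such as multi-user real-time chat and multiplayer games.</p>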
<h2>
<a id="Workflow" href="#Workflow" class="anchor"></a>Workflow</h2>
<p>Next, with Rails providing the backend API, let's see how to call the OpenAI API, receive its SSE events, and forward them to our client.</p>
<p>The workflow looks like this:</p>
<p><img src="/attachments/iHGG5SrgtLYpRzqESv5qupUq/wf2.png" alt="wf2.png"></p>
<ol>
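<li>The client uses the <a href="https://developer.mozilla.org/en-US/docs/Web/API/EventSource">EventSource</a> interface to send a request to our server</li>
<li>The server receives the request and sends its own request to the OpenAI API with the <code>stream: true</code> parameter</li>
<li>The server receives events from OpenAI and forwards them to the client</li>
<li>When all events have been sent, OpenAI sends a special message telling us the connection can be closed. Once we receive <code>[DONE]</code>, we can close the connection between our server and OpenAI, and the client then closes its connection to our server</li>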
</ol>
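<p>Steps 3 and 4 boil down to a small relay loop. The sketch below is a plain-Ruby model of that loop and makes no network calls: <code>relay</code>, <code>FakeClient</code> and <code>DONE_MARKER</code> are names invented for this example, and the upstream is assumed to yield already-extracted payload strings.</p>

```ruby
DONE_MARKER = "[DONE]"

# Forward upstream payloads to the client until the terminator arrives.
# `upstream` is anything that yields payload strings from #each;
# `client` is anything that responds to #write and #close.
def relay(upstream, client)
  upstream.each do |payload|
    break if payload == DONE_MARKER  # step 4: upstream has finished
    client.write(payload)            # step 3: forward the event
  end
ensure
  client.close                       # always release the client connection
end

# Stand-in for the client connection, only for this demo.
class FakeClient
  attr_reader :received, :closed

  def initialize
    @received = []
    @closed = false
  end

  def write(payload)
    @received.push(payload)
  end

  def close
    @closed = true
  end
end

client = FakeClient.new
relay(["Hel", "lo", "[DONE]", "ignored"], client)
p client.received
p client.closed
```

<p>Putting <code>close</code> in an <code>ensure</code> clause means the client connection is released even if the upstream raises halfway through.</p>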
<h2>
<a id="%E7%94%A8+Rails+%E6%8F%90%E4%BE%9B%E5%90%8E%E7%AB%AF%E6%8E%A5%E5%8F%A3" href="#%E7%94%A8+Rails+%E6%8F%90%E4%BE%9B%E5%90%8E%E7%AB%AF%E6%8E%A5%E5%8F%A3" class="anchor"></a>Providing the backend API with Rails</h2>
<p>With SSE and the workflow understood, the next step is the code that implements the whole process. It has three parts:</p>
<ul>
<li>client</li>
</ul>
<pre class="highlight"><code class="language-js"><span class="kd">const</span> <span class="nx">fetchResponse</span> <span class="o">=</span> <span class="p">()</span> <span class="o">=></span> <span class="p">{</span>
  <span class="c1">// URL-encode the prompt so special characters survive in the query string</span>
  <span class="kd">const</span> <span class="nx">evtSource</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">EventSource</span><span class="p">(</span><span class="s2">`/v1/completions/live_stream?prompt=</span><span class="p">${</span><span class="nb">encodeURIComponent</span><span class="p">(</span><span class="nx">prompt</span><span class="p">)}</span><span class="s2">`</span><span class="p">)</span>
  <span class="nx">evtSource</span><span class="p">.</span><span class="nx">onmessage</span> <span class="o">=</span> <span class="p">(</span><span class="nx">event</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="nx">event</span><span class="p">)</span> <span class="p">{</span>
      <span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="nx">JSON</span><span class="p">.</span><span class="nx">parse</span><span class="p">(</span><span class="nx">event</span><span class="p">.</span><span class="nx">data</span><span class="p">)</span>
      <span class="nx">setMessage</span><span class="p">(</span><span class="nx">response</span><span class="p">)</span>
    <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
      <span class="nx">evtSource</span><span class="p">.</span><span class="nx">close</span><span class="p">()</span>
    <span class="p">}</span>
  <span class="p">}</span>
  <span class="c1">// Close explicitly on error, otherwise the browser keeps reconnecting</span>
  <span class="nx">evtSource</span><span class="p">.</span><span class="nx">onerror</span> <span class="o">=</span> <span class="p">()</span> <span class="o">=></span> <span class="p">{</span>
    <span class="nx">evtSource</span><span class="p">.</span><span class="nx">close</span><span class="p">()</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre>
<p>This uses the <code>EventSource</code> API mentioned above to open the SSE connection.
The <code>onmessage</code> handler fires whenever a new message arrives.</p>
<ul>
<li>server</li>
</ul>
<pre class="highlight"><code class="language-ruby"><span class="k">class</span> <span class="nc">CompletionsController</span> <span class="o"><</span> <span class="no">ApplicationController</span>
  <span class="kp">include</span> <span class="no">ActionController</span><span class="o">::</span><span class="no">Live</span>
  <span class="k">def</span> <span class="nf">live_stream</span>
    <span class="n">response</span><span class="p">.</span><span class="nf">headers</span><span class="p">[</span><span class="s2">"Content-Type"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"text/event-stream"</span>
    <span class="n">response</span><span class="p">.</span><span class="nf">headers</span><span class="p">[</span><span class="s2">"Last-Modified"</span><span class="p">]</span> <span class="o">=</span> <span class="no">Time</span><span class="p">.</span><span class="nf">now</span><span class="p">.</span><span class="nf">httpdate</span>
    <span class="n">sse</span> <span class="o">=</span> <span class="no">SSE</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="n">response</span><span class="p">.</span><span class="nf">stream</span><span class="p">,</span> <span class="ss">retry: </span><span class="mi">300</span><span class="p">)</span>
    <span class="no">ChatCompletion</span><span class="o">::</span><span class="no">LiveStreamService</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="n">sse</span><span class="p">,</span> <span class="n">live_stream_params</span><span class="p">).</span><span class="nf">call</span>
  <span class="k">ensure</span>
    <span class="n">sse</span><span class="p">.</span><span class="nf">close</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre>
<ul>
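<li>The <code>ActionController::Live</code> module is included here to enable streaming responses</li>
<li>As mentioned above, the <code>text/event-stream</code> response header must be set</li>
<li>Pay particular attention if you are on Rails 7: streaming responses do not work out of the box there. It took me a long time to track this down, and the cause turned out to be Rack</li>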
</ul>
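<p>Rails 7 includes <code>Rack::ETag</code> by default, and this middleware buffers the response, so a real-time streaming response is no longer possible.</p>
<p>The issue was discussed at length without ever being resolved, but there is a workaround; see <a href="https://github.com/rack/rack/issues/1619#issuecomment-1510031078">this comment</a> for details.</p>
<p>In short, if your Rack version is <code>2.2.x</code>, you need to add this line:</p>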
<pre class="highlight"><code class="language-ruby"><span class="n">response</span><span class="p">.</span><span class="nf">headers</span><span class="p">[</span><span class="s2">"Last-Modified"</span><span class="p">]</span> <span class="o">=</span> <span class="no">Time</span><span class="p">.</span><span class="nf">now</span><span class="p">.</span><span class="nf">httpdate</span>
</code></pre>
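<p>Why would a single header help? The sketch below is a simplified model, not Rack's actual code, and the method name is invented. It assumes that <code>Rack::ETag</code> in Rack 2.2 skips digesting (and therefore buffering) the body whenever the response already carries an <code>ETag</code> or <code>Last-Modified</code> header. The real middleware also looks at the status code and the body type, which are left out here.</p>

```ruby
require "time"

# Simplified model (assumption): the middleware leaves the body alone when
# the response already has a validator header, so nothing gets buffered.
def etag_middleware_buffers?(headers)
  names = headers.keys.map { |name| name.to_s.downcase }
  !(names.include?("etag") || names.include?("last-modified"))
end

streaming = { "Content-Type" => "text/event-stream" }
p etag_middleware_buffers?(streaming)  # true
p etag_middleware_buffers?(streaming.merge("Last-Modified" => Time.now.httpdate))  # false
```

<p>Under that assumption, setting <code>Last-Modified</code> in the controller is simply a way of telling the middleware to pass the stream through untouched.</p>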
<ul>
<li>OpenAI API</li>
</ul>
<p>Next is the part that calls the OpenAI API. I wrapped it in <a href="https://github.com/renny-ren/openai_ruby">a simple gem</a> that supports streaming.</p>
<pre class="highlight"><code class="language-ruby"><span class="k">module</span> <span class="nn">ChatCompletion</span>
  <span class="k">class</span> <span class="nc">LiveStreamService</span>
    <span class="kp">attr_reader</span> <span class="ss">:sse</span><span class="p">,</span> <span class="ss">:params</span>

    <span class="k">def</span> <span class="nf">initialize</span><span class="p">(</span><span class="n">sse</span><span class="p">,</span> <span class="n">params</span><span class="p">)</span>
      <span class="vi">@sse</span> <span class="o">=</span> <span class="n">sse</span>
      <span class="vi">@params</span> <span class="o">=</span> <span class="n">params</span>
      <span class="vi">@result</span> <span class="o">=</span> <span class="s2">""</span>
    <span class="k">end</span>

    <span class="k">def</span> <span class="nf">call</span>
      <span class="c1"># request_body (omitted here) must include stream: true</span>
      <span class="n">client</span><span class="p">.</span><span class="nf">create_chat_completion</span><span class="p">(</span><span class="n">request_body</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">chunk</span><span class="p">,</span> <span class="n">overall_received_bytes</span><span class="p">,</span> <span class="n">env</span><span class="o">|</span>
        <span class="n">data</span> <span class="o">=</span> <span class="n">chunk</span><span class="p">[</span><span class="sr">/data: (.*)\n\n$/</span><span class="p">,</span> <span class="mi">1</span><span class="p">]</span>
        <span class="n">send_message</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
      <span class="k">end</span>
    <span class="k">end</span>

    <span class="k">def</span> <span class="nf">send_message</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
      <span class="c1"># Skip chunks without a match and the final "[DONE]" message, which is not JSON</span>
      <span class="k">return</span> <span class="k">if</span> <span class="n">data</span><span class="p">.</span><span class="nf">nil?</span> <span class="o">||</span> <span class="n">data</span> <span class="o">==</span> <span class="s2">"[DONE]"</span>

      <span class="n">response</span> <span class="o">=</span> <span class="no">JSON</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
      <span class="k">if</span> <span class="n">response</span><span class="p">.</span><span class="nf">dig</span><span class="p">(</span><span class="s2">"choices"</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="s2">"delta"</span><span class="p">,</span> <span class="s2">"content"</span><span class="p">)</span>
        <span class="vi">@result</span> <span class="o">=</span> <span class="vi">@result</span> <span class="o">+</span> <span class="n">response</span><span class="p">.</span><span class="nf">dig</span><span class="p">(</span><span class="s2">"choices"</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="s2">"delta"</span><span class="p">,</span> <span class="s2">"content"</span><span class="p">)</span>
      <span class="k">end</span>
      <span class="c1"># Send the full text accumulated so far, not just the latest delta</span>
      <span class="n">sse</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="ss">status: </span><span class="mi">200</span><span class="p">,</span> <span class="ss">content: </span><span class="vi">@result</span><span class="p">)</span>
    <span class="k">end</span>

    <span class="kp">private</span>

    <span class="k">def</span> <span class="nf">client</span>
      <span class="vi">@client</span> <span class="o">||=</span> <span class="no">OpenAI</span><span class="o">::</span><span class="no">Client</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="no">OPENAI_API_KEY</span><span class="p">)</span>
    <span class="k">end</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre>
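<p>The regular expression in the service above works when every chunk holds exactly one complete event. Whether that always holds depends on how the HTTP client splits the stream, so here is a slightly more defensive sketch. <code>DeltaAccumulator</code> is a hypothetical class written for this post, not part of any gem; it buffers partial input, handles several events per chunk, and ignores the <code>[DONE]</code> terminator. The payload shape (<code>choices[0].delta.content</code>) is the one the service already relies on.</p>

```ruby
require "json"

# Hypothetical helper: collects streamed chunks and returns the text so far.
class DeltaAccumulator
  attr_reader :text

  def initialize
    @buffer = ""
    @text = ""
  end

  # `chunk` is a raw piece of the upstream body; it may hold several events
  # or stop halfway through one.
  def feed(chunk)
    @buffer += chunk
    # Events end with a blank line; whatever follows the last one is incomplete.
    parts = @buffer.split("\n\n", -1)
    @buffer = parts.pop || ""
    parts.each do |event|
      payload = event[/^data: (.*)$/, 1]
      next if payload.nil? || payload == "[DONE]"
      delta = JSON.parse(payload).dig("choices", 0, "delta", "content")
      @text += delta if delta
    end
    @text
  end
end

acc = DeltaAccumulator.new
acc.feed(%(data: {"choices":[{"delta":{"content":"Hel"}}]}\n\ndata: {"choi))
acc.feed(%(ces":[{"delta":{"content":"lo"}}]}\n\ndata: [DONE]\n\n))
p acc.text
```

<p>In the service, <code>feed</code> would take the place of the regex match, and its return value is what gets passed to <code>sse.write</code>.</p>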
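<p>One more note: what I described above is a setup with separate frontend and backend. If you are using Hotwire, take a look at <a href="https://gist.github.com/alexrudall/cb5ee1e109353ef358adb4e66631799d">this gist</a>.</p>
<p>Finally, I built a demo where you can try the result: <a href="https://aiichat.cn/chats/new">https://aiichat.cn/chats/new</a> (the login username and password are both rubychina)</p>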