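Installing Deepseek R1 and Implementing It as a Web Service

When I saw that Deepseek R1 had been released as open source out of China, I wanted to try running it on my own machine.

So I installed it and looked into how it could be turned into a web service.

The machine I installed it on is an M1 MacBook with 16 GB of memory.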

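1. Installing Ollama

Ollama is an open-source tool that lets you run LLMs in a local environment.

It is written in Go, exposes an HTTP API, and has client libraries for JavaScript and Python, so it can be reached from localhost and, if you allow it, from external hosts as well.

You can install it by downloading the app from the official Ollama site, or through Homebrew.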

brew install ollama

Installing something with Homebrew is still the most thrilling part.

I use a MacBook for exactly this rush.

Then start the service.

brew services start ollama

2. Basic Ollama Usage

Running ollama help lists the commands below.

ollama help
Large language model runner

Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  stop        Stop a running model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

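In practice you only need a few of these.

If the model is available in the Ollama registry, run downloads it and starts it right away.

If it is not, you can build one yourself from a model file with the create command.

Several versions are currently published, but I wanted to try the original first.

Type the following into the terminal.

If you want a different model, just change the part after the colon.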

ollama run deepseek-r1:8b

# Or, if you prefer another model...
ollama run deepseek-r1:14b
ollama run deepseek-r1:7b
ollama run deepseek-r1:1.5b

3. Trying Out the Deepseek-R1 Model

I started with the 8B model.

Unfortunately, this model does not understand Korean well.

English, on the other hand, it answers quite solidly.

could you recommend some places for travel

<think>
... (omitted)
</think>

Based on your thoughtful considerations and the various aspects you've
explored, here's a curated travel plan that combines historical sites,
natural beauty, unique cultural experiences, and culinary delights:

### Travel Itinerary: Exploring History, Nature, Culture & Cuisine

#### 1. **Historical Exploration**
   - **Amalfi Coast, Italy**: Visit Positano and Amalfi for their charming
architecture and coastal views. Explore ancient Roman ruins nearby.
   - **Tulum, Mexico**: Discover Mayan ruins and vibrant beach settings
with a mix of history and relaxation.

... (omitted)

### Budget Considerations:
   - Opt for multi-destination trips to balance costs. Prioritize
affordable flight and accommodation deals.

This itinerary offers a blend of history, nature, culture, and cuisine,
tailored to your interests and considerations.

Hugging Face has a much wider range of models to choose from.

From those, I downloaded a quantized 14B distilled model and ran it.

The file was 10.27 GB.

ollama create deepseek-r1:14b-distill -f DeepSeek-R1-Distill-Qwen-14B-Q5_K_S
ollama run deepseek-r1:14b-distill

Then I asked it "한국말 할 수 있니?" (Can you speak Korean?).

한국말 할 수 있니?

<think>
Okay, the user is asking "한국말할 수 있니?" which means "Do I speak
Korean?" in Korean.

I need to confirm if they know Korean or not.

Since they're asking this, it's likely that they don't speak Korean well
but want to communicate with someone who does.

So, I'll respond in a friendly manner and offer my help.
</think>

한국말이 nNot very fluent, but I can try to help! What do you need?

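Errors were frequent and the output was odd, so I read the documentation. It says the models published there must have special tokens placed before and after each turn.

In Ollama, that means the model does not behave correctly unless the template and params are set.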

Do not forget about <｜User｜> and <｜Assistant｜> tokens! - Or use a chat template formatter
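To make the note about tokens concrete, here is a minimal sketch of what wrapping a conversation in those two tokens could look like. The helper name formatPrompt and the Turn type are hypothetical, and the sketch leaves out anything else the real chat template may require (system prompt handling, end-of-sentence markers), so treat the model card as the authority.

```typescript
// Hypothetical helper: wraps chat turns in the role tokens quoted above.
// Assumption: only the <｜User｜> and <｜Assistant｜> tokens are handled here;
// the full template on the model card may need more than this.
type Turn = { role: "user" | "assistant"; content: string };

const USER_TOKEN = "<｜User｜>";
const ASSISTANT_TOKEN = "<｜Assistant｜>";

function formatPrompt(turns: Turn[]): string {
  // Prefix every turn with the token that matches its role.
  const body = turns
    .map((t) => (t.role === "user" ? USER_TOKEN : ASSISTANT_TOKEN) + t.content)
    .join("");

  // If the user spoke last, end with the assistant token so the model
  // continues the conversation as the assistant.
  const last = turns[turns.length - 1];
  return last !== undefined && last.role === "user" ? body + ASSISTANT_TOKEN : body;
}

console.log(formatPrompt([{ role: "user", content: "한국말 할 수 있니?" }]));
```

In Ollama the same wrapping would normally live in the model's template rather than in application code, which is why a missing template produces the kind of broken answer shown above.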

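While looking into how to set up the template and the rest, I realized that downloading something another person has already built is much better for my mental health.

I downloaded the most popular of the 8b models.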

ollama run sjo/deepseek-r1-8b-llama-distill-abliterated-q8_0

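With this one, the thinking time got shorter and the answers became quite decent.

In exchange, I could feel the MacBook slowly warming up.

The quantized 14b model used almost all of the memory, but it did run.

When I ran the original, though, memory usage shot up like crazy.

My impression is that making serious use of open-source models takes at least 32 GB.

That was the moment I vowed to earn more money.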
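The memory pressure described above can be roughed out with simple arithmetic: the weights alone take about (parameter count × bits per weight) / 8 bytes. The sketch below assumes 14 billion parameters and round bit widths purely for illustration. Real quantization formats store extra scaling data, and the context cache and runtime overhead come on top, so actual usage is higher than these figures.

```typescript
// Rough lower bound on the memory needed just to hold the model weights.
// Assumption: every weight uses exactly `bitsPerWeight` bits; real formats
// add metadata, so files and memory use come out larger than this.
function estimateWeightGB(parameterCount: number, bitsPerWeight: number): number {
  const bytes = (parameterCount * bitsPerWeight) / 8;
  return bytes / 1e9; // decimal gigabytes
}

// Illustrative inputs only: 14e9 parameters at 5 bits and at 16 bits.
console.log(estimateWeightGB(14e9, 5).toFixed(2)); // 8.75
console.log(estimateWeightGB(14e9, 16).toFixed(2)); // 28.00
```

Under these assumptions a 5-bit 14B model already takes more than half of a 16 GB machine before anything else is loaded, and a 16-bit one does not fit at all, which lines up with what I saw.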

4. Implementing a Deepseek-R1 Service with the Vercel AI SDK and Ollama

Once Ollama is running, it listens on port 11434.

You can send API requests straight to it!
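As a rough illustration of a direct request to that port, the sketch below posts one message to Ollama's /api/chat endpoint with streaming turned off. The function names buildChatRequest and chatOnce are hypothetical, the response is assumed to carry the reply under message.content, and a runtime with a global fetch (a recent Node.js or a browser) is assumed.

```typescript
// Assumption: Ollama is running locally on its default port.
const OLLAMA_CHAT_URL = "http://localhost:11434/api/chat";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequestInit {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Builds the request options separately so they can be inspected
// without a running server.
function buildChatRequest(model: string, messages: ChatMessage[]): ChatRequestInit {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // stream: false asks for one JSON object instead of a stream of chunks.
    body: JSON.stringify({ model, messages, stream: false }),
  };
}

// Sends a single user message and returns the assistant's reply text.
async function chatOnce(model: string, prompt: string): Promise<string> {
  const res = await fetch(
    OLLAMA_CHAT_URL,
    buildChatRequest(model, [{ role: "user", content: prompt }]),
  );
  if (!res.ok) {
    throw new Error(`Ollama returned HTTP ${res.status}`);
  }
  const data = (await res.json()) as { message: { content: string } };
  return data.message.content;
}
```

With the server up, chatOnce("deepseek-r1:8b", "could you recommend some places for travel") would resolve to the reply text. The SDK route that follows does the same job with streaming and message handling taken care of.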

For the web service itself, first install the one library it needs.

yarn add ollama-ai-provider

Then add the URL to .env as follows.

OLLAMA_BASEURL="http://localhost:11434/api"

Next, set up the API endpoint as follows.

import { streamText } from "ai";
import { createOllama } from "ollama-ai-provider";

export const maxDuration = 30;

export async function POST(req: Request) {
  const { messages } = await req.json();
  const ollama = createOllama({
    baseURL: process.env.OLLAMA_BASEURL,
  });

  const result = await streamText({
    model: ollama("deepseek-r1:1.5b-distill"),
    // Use the name of the model built with ollama create.
    // The server is short on RAM, so this uses the 1.5b model.
    messages: messages,
  });
  return result.toDataStreamResponse();
}

Then create the client page as well.

I am reusing the response page with Markdown rendering from the previous article as is.

The only difference is the API address.

"use client";
import { useChat } from "ai/react";
import { PaperAirplaneIcon, StopCircleIcon } from "@heroicons/react/24/outline";
import { useRef, useEffect } from "react";
import ReactMarkdown from "react-markdown";
import remarkGfm from "remark-gfm";
import rehypeHighlight from "rehype-highlight";
import "highlight.js/styles/atom-one-dark.css";

export default function ChatInterface() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } =
    useChat({
      api: "/api/services/deepseek",
    });
  const messagesEndRef = useRef<HTMLDivElement>(null);
  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: "smooth" });
  };

  useEffect(() => {
    scrollToBottom();
  }, [messages.length]);

  return (
    <div className="flex flex-col h-[calc(100svh-60px)] lg:h-[calc(100svh-106px)] max-w-3xl mx-auto border rounded-lg shadow-lg bg-white">
      <div className="flex-1 overflow-y-auto p-4 space-y-4">
        {messages.map((message) => {
          if (message.role !== "system") {
            return (
              <div
                key={message.id}
                className={`p-3 rounded-lg  ${
                  message.role === "user"
                    ? "ml-auto bg-blue-100"
                    : message.role === "assistant"
                    ? "bg-gray-100"
                    : "bg-green-100"
                }`}
              >
                {message.role === "assistant" && (
                  <p className="font-black mb-1">🌏 AI</p>
                )}
                <div className="text-gray-800">
                  <ReactMarkdown
                    className="w-full h-5/6 prose
                       prose-ol:bg-gray-200 prose-ol:rounded-lg prose-ol:pr-1.5 prose-ol:py-3
                     prose-ul:bg-gray-200 prose-ul:rounded-lg prose-ul:pr-1.5 prose-ul:py-3
                     prose-blockquote:bg-gray-200 prose-blockquote:rounded-lg prose-blockquote:border-l-8
                     prose-blockquote:text-gray-600 prose-blockquote:border-gray-700 prose-blockquote:break-all prose-blockquote:pr-1.5
                     prose-a:text-blue-600 prose-a:underline-offset-4 prose-a:underline
                      "
                    remarkPlugins={[remarkGfm]}
                    rehypePlugins={[rehypeHighlight]}
                  >
                    {message.content}
                  </ReactMarkdown>
                </div>
              </div>
            );
          }
        })}
        <div ref={messagesEndRef} />
      </div>

      <div className="fixed bottom-0 left-0 right-0 flex justify-center">
        <div className="w-full max-w-3xl p-1 bg-white border rounded-lg">
          <form
            onSubmit={handleSubmit}
            className="flex items-center bg-gray-50 rounded-lg px-4 py-2"
          >
            <input
              value={input}
              onChange={handleInputChange}
              placeholder="Type a message..."
              className={`flex-1 bg-transparent outline-none resize-none max-h-32`}
              disabled={isLoading}
            />

            {isLoading ? (
              <button className="ml-2 text-blue-500 p-1 rounded-full hover:bg-blue-50">
                <StopCircleIcon className="size-6" />
              </button>
            ) : (
              <button
                type="submit"
                className="ml-2 text-blue-500 p-1 rounded-full hover:bg-blue-50"
              >
                <PaperAirplaneIcon className="size-6" />
              </button>
            )}
          </form>
        </div>
      </div>
    </div>
  );
}

Now let's take a look at the response.

[Image: screenshot of the response]

It is a little disappointing that the <think> block from before is gone, but it works for now.

What is left is to polish it bit by bit.
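One possible first step in that polishing is to handle the reasoning block on purpose. I have not confirmed why it is missing here. If the raw text reaching the client still contains the tags, a helper along these lines could separate the reasoning from the answer so each can be rendered in its own element. The name splitThink is hypothetical, and the sketch assumes at most one <think> block per message.

```typescript
// Hypothetical helper: separates a <think>...</think> block from the answer.
// Assumption: the message contains at most one such block.
interface SplitOutput {
  reasoning: string;
  answer: string;
}

function splitThink(raw: string): SplitOutput {
  // [\s\S] matches across newlines; the lazy quantifier stops at the first close tag.
  const match = raw.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) {
    return { reasoning: "", answer: raw.trim() };
  }
  return {
    reasoning: match[1].trim(),
    // Drop the whole block (tags included) from the visible answer.
    answer: raw.replace(match[0], "").trim(),
  };
}

console.log(splitThink("<think>plan the trip</think>\n\nHere is a plan."));
```

The client page could then show reasoning in a collapsed section and pass only answer to the Markdown renderer.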

5. Thoughts


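When Deepseek R1 was announced, it felt as if a mass-market JARVIS had arrived, but it did not quite get that far.

If you want performance at least on par with ChatGPT, you need hardware to match, and hardware at that level is a fairly high bar for ordinary people.

Even so, I think the very idea of releasing this as open source is remarkable.

I hope a model that runs in low-spec environments and is fine-tuned for Korean appears before long.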
