DeepSeek R1 API: Integration Guide & Pricing (2026)
DeepSeek-R1 scored 79.8% (pass@1) on AIME 2024 and 97.3% on MATH-500, on par with OpenAI o1 on advanced math reasoning. It is open-weight, 60-80% cheaper per token than o1, and available through an OpenAI-compatible API. This guide covers Python and Node.js integration, pricing, and when to choose R1 over DeepSeek-V3.
What Is DeepSeek R1?
DeepSeek-R1 is an open-weight reasoning model released by DeepSeek AI (China). It uses chain-of-thought (CoT) reasoning: it works through a problem step by step before giving a final answer, much as OpenAI's o1 and o3 do.
DeepSeek-R1 key facts:
- AIME 2024: 79.8% pass@1 (vs OpenAI o1's 79.2%), a competition-level math benchmark
- MATH-500: 97.3% (vs o1's 96.4%)
- HumanEval (coding): ~95% (vs o1's ~92%)
- Open-weight: MIT license on Hugging Face, so you can run it locally or fine-tune it
- Context window: 128K tokens
- API format: OpenAI-compatible (chat completions)
- Cost via ChinaModelAPI: ~60-80% cheaper per token vs o1
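To make the 60-80% figure concrete, here is a minimal sketch that turns an existing o1 spend into an estimated R1 range. The function name and the baseline amount are hypothetical, and the estimate assumes both models consume the same number of tokens for the workload, which may not hold because reasoning models vary in how many thinking tokens they emit.

```python
def estimated_r1_cost(o1_cost: float, savings: tuple[float, float] = (0.60, 0.80)) -> tuple[float, float]:
    """Return (lowest, highest) estimated R1 cost for a workload costing `o1_cost` on o1.

    `savings` is the per-token saving range quoted above; equal token usage
    across both models is an assumption of this sketch.
    """
    low_saving, high_saving = savings
    # The largest saving yields the lowest cost, and vice versa.
    return (o1_cost * (1 - high_saving), o1_cost * (1 - low_saving))


# Hypothetical baseline: a workload that costs 100 units on o1.
lowest, highest = estimated_r1_cost(100.0)
print(f"Estimated R1 cost: {lowest:.2f} to {highest:.2f}")
```

Measuring real token counts on a sample of your own prompts will give a better estimate than this per-token comparison.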
DeepSeek R1 vs DeepSeek V3: Which to Use?
| Feature | DeepSeek-R1 | DeepSeek-V3 |
|---|---|---|
| Type | Reasoning (CoT) | General purpose |
| AIME 2024 | 79.8% (pass@1) 🏆 | ~39% |
| Response speed | Slower (thinking tokens) | Fast |
| Token cost | Higher (more tokens) | Lower |
| Best for | Math, reasoning, hard coding | Chat, content, general code |
| Open-weight | Yes ✓ | Yes ✓ |
| Model ID (ChinaModelAPI) | deepseek-r1 | deepseek-v3 |
Rule of thumb: Use R1 for hard problems requiring step-by-step reasoning. Use V3 for everything else.
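The rule of thumb can be sketched as a small routing helper. The task labels and the function name are hypothetical; only the two model IDs come from the comparison table.

```python
# Task categories that benefit from step-by-step reasoning (illustrative labels).
REASONING_TASKS = {"math", "proof", "logic", "hard_coding"}


def choose_model(task_type: str) -> str:
    """Return the model ID to use for a given task label."""
    if task_type.strip().lower() in REASONING_TASKS:
        return "deepseek-r1"   # slower, more tokens, stronger reasoning
    return "deepseek-v3"       # faster and cheaper default


print(choose_model("math"))
print(choose_model("chat"))
```

Defaulting to deepseek-v3 keeps cost and latency low, and only explicitly listed task types pay for reasoning tokens.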
Python Integration: DeepSeek R1 API
Install: pip install openai
from openai import OpenAI

# Point the OpenAI SDK at the ChinaModelAPI endpoint
client = OpenAI(
    base_url="https://api.chinamodelapi.com/v1",
    api_key="your-chinamodelapi-key",
)

response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{
        "role": "user",
        "content": "Prove that there are infinitely many prime numbers.",
    }],
    max_tokens=4096,  # R1 uses more tokens for reasoning steps
)

# R1 may include a reasoning_content field with the chain-of-thought
message = response.choices[0].message
reasoning = getattr(message, "reasoning_content", None)
if reasoning:
    print("Thinking:", reasoning)
print("Answer:", message.content)
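Because R1 spends part of its token budget on reasoning, the final answer can come back empty or cut off when max_tokens is too small. Below is a minimal sketch of a check for this. It assumes the endpoint reports the standard chat completions finish_reason value "length" on truncation and exposes the reasoning under reasoning_content. The helper name summarize_choice and the stand-in response object are hypothetical and exist only so the example runs without a network call.

```python
from types import SimpleNamespace


def summarize_choice(response) -> dict:
    """Extract the answer, the optional reasoning, and a truncation flag."""
    choice = response.choices[0]
    message = choice.message
    return {
        "answer": message.content or "",
        # Not every provider returns this field, so read it defensively.
        "reasoning": getattr(message, "reasoning_content", None),
        # "length" means generation stopped because max_tokens was reached.
        "truncated": choice.finish_reason == "length",
    }


# Stand-in for a real API response (hypothetical data for illustration).
fake_response = SimpleNamespace(choices=[SimpleNamespace(
    finish_reason="length",
    message=SimpleNamespace(content="", reasoning_content="Assume finitely many primes..."),
)])

summary = summarize_choice(fake_response)
if summary["truncated"]:
    print("Answer was cut off; consider raising max_tokens.")
```

With the real client, pass the object returned by client.chat.completions.create to the same helper.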
⚡ Performance tip:
DeepSeek-R1 generates reasoning tokens before the final answer. For simple tasks, use deepseek-v3 instead — faster and cheaper. Use R1 only when you need step-by-step reasoning (math, hard coding, logic).
Node.js Integration: DeepSeek R1 API
Install: npm install openai
import OpenAI from "openai";

// Point the OpenAI SDK at the ChinaModelAPI endpoint
const client = new OpenAI({
  baseURL: "https://api.chinamodelapi.com/v1",
  apiKey: "your-chinamodelapi-key",
});

const response = await client.chat.completions.create({
  model: "deepseek-r1",
  messages: [{
    role: "user",
    content: "Write a merge sort implementation in TypeScript with type annotations.",
  }],
  max_tokens: 4096, // leave room for reasoning tokens
});

console.log(response.choices[0].message.content);
Best Use Cases for DeepSeek R1
Advanced Mathematics
79.8% (pass@1) on AIME 2024 and 97.3% on MATH-500. Ideal for olympiad problems, proofs, calculus, and numerical computation. The chain-of-thought shows all working steps.
Complex Code Problems
~95% HumanEval. Best for algorithm design, dynamic programming, data structure implementation, and debugging complex logic errors.
Legal & Financial Analysis
Structured argument reasoning. Analyzing contracts, identifying clauses, financial model logic — tasks requiring sequential logical steps.
Scientific Reasoning
Physics, chemistry, and biology problem-solving. Strong on GPQA (graduate-level science questions). Useful for research assistance.
Frequently Asked Questions
Can I run DeepSeek R1 locally?
Yes. DeepSeek-R1 weights are on Hugging Face under MIT license. The full model requires ~800GB VRAM for inference. Distilled versions (DeepSeek-R1-Distill-Qwen-7B, etc.) run on consumer hardware from 16GB VRAM. For most developers, the API via ChinaModelAPI is easier than local deployment.
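As a rough sanity check on these hardware figures, the memory needed for the weights alone is approximately the parameter count times the bytes per parameter. The sketch below is a back-of-the-envelope estimate, not a sizing tool: it ignores the KV cache, activations, and runtime overhead, and the 16-bit precision in the example is an assumption.

```python
def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights in GB (1 GB = 1e9 bytes).

    Billions of parameters times bytes per parameter gives GB directly.
    """
    return params_billion * bytes_per_param


# A 7B distilled model stored in 16-bit precision (2 bytes per parameter).
distill_7b = weight_memory_gb(7, 2)
print(f"7B at 16-bit: about {distill_7b} GB of weights")
```

This is consistent with a 7B distilled model fitting on a 16GB card with a small margin left for the cache. Quantizing to fewer bytes per parameter lowers the requirement proportionally.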
Does DeepSeek R1 support streaming?
Yes. Set stream=True in Python or stream: true in Node.js. Note that with R1, you'll receive reasoning tokens before the final answer tokens in the stream.
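A minimal sketch of consuming such a stream follows. It assumes each chunk exposes choices[0].delta as in the OpenAI SDK, and that reasoning text arrives in a delta field named reasoning_content, as in the non-streaming example; that field name may differ by provider. The helper names and the stand-in chunks are hypothetical so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks) -> tuple[str, str]:
    """Split a chat completion stream into (reasoning, answer) text."""
    reasoning_parts, answer_parts = [], []
    for chunk in chunks:
        if not chunk.choices:  # some chunks carry no choices (e.g. usage only)
            continue
        delta = chunk.choices[0].delta
        thought = getattr(delta, "reasoning_content", None)
        if thought:
            reasoning_parts.append(thought)
        text = getattr(delta, "content", None)
        if text:
            answer_parts.append(text)
    return "".join(reasoning_parts), "".join(answer_parts)


def _chunk(**delta_fields):
    """Build a stand-in stream chunk (illustration only)."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(**delta_fields))])


demo_stream = [
    _chunk(reasoning_content="2 + 2 "),
    _chunk(reasoning_content="is 4."),
    _chunk(content="4"),
]
reasoning, answer = collect_stream(demo_stream)
print("Thinking:", reasoning)
print("Answer:", answer)
```

With the real client, pass the iterator returned by client.chat.completions.create(..., stream=True) in place of demo_stream.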
Why is DeepSeek R1 slower than GPT-4?
DeepSeek-R1 generates internal reasoning tokens (chain-of-thought) before producing its answer, similar to OpenAI o1. This "thinking time" increases latency but substantially improves accuracy on hard problems. For tasks where speed matters more than reasoning depth, use DeepSeek-V3 instead.
Is there a DeepSeek R1 distilled version for faster inference?
Yes. DeepSeek released distilled variants: R1-Distill-Qwen-7B, R1-Distill-Qwen-14B, R1-Distill-Qwen-32B, and R1-Distill-Llama-70B. These retain much of R1's reasoning capability in smaller, faster models. Contact us about access to distilled variants via ChinaModelAPI.
Access DeepSeek-R1 and all Chinese AI models via one OpenAI-compatible API.