04-Linux_DeepSeek_Large_Model
This article explains how to use RKLLM to deploy the distilled model DeepSeek-R1-Distill-Qwen-1.5B on a Rockchip platform and run hardware-accelerated inference on the NPU.
Chip platform: RK3576/RK3588
System version: Debian12/Debian11
I. Development Environment Setup
RKLLM SDK Documentation
RKNPU Driver
RKLLM-Toolkit
Model Download
Runtime Download
II. Deployment and Operation
Board-side Deployment Environment
Baidu Netdisk directory: 3-SoftwareData Software Materials / rk35xx-rkllm-deepseek.tar.gz
Test package description:
DeepSeek-R1-Distill-Qwen-1.5B.rkllm is the converted model.
llm_demo is the compiled LLM test program.
Copy the test package to RK3588.
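One possible way to do the copy is over SSH with scp. The following is only a sketch: the user name and IP address are hypothetical placeholders, not values from this article, and the command is printed for review rather than executed.

```shell
# Hypothetical board login details: replace with your own.
BOARD_USER=root
BOARD_IP=192.168.1.100

# Package name as listed in the Baidu Netdisk directory.
PACKAGE=rk35xx-rkllm-deepseek.tar.gz

# Build the copy command; the quoted "~" is left for the board to expand.
COPY_CMD="scp $PACKAGE $BOARD_USER@$BOARD_IP:~/"

# Print the command so it can be checked before running it by hand.
echo "$COPY_CMD"
```

After copying, the archive can be unpacked on the board with `tar -xzf rk35xx-rkllm-deepseek.tar.gz`.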
Set environment variables
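A minimal sketch of this step, assuming the unpacked package keeps the RKLLM runtime library (librkllmrt.so) in a lib/ directory next to llm_demo. That layout is an assumption and is not confirmed by the package description, so adjust RKLLM_LIB_DIR to match the actual contents.

```shell
# Assumption: librkllmrt.so lives in ./lib inside the unpacked package.
RKLLM_LIB_DIR="$PWD/lib"

# Prepend the library directory, keeping any existing search path.
export LD_LIBRARY_PATH="$RKLLM_LIB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```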
Run Tests
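The sketch below shows one way the test program might be started. It assumes llm_demo accepts the model path, a maximum number of new tokens, and a maximum context length, as the llm_demo example in the RKLLM SDK does. The binary in this package may take different arguments, and the two numeric values are illustrative assumptions.

```shell
MODEL=DeepSeek-R1-Distill-Qwen-1.5B.rkllm
MAX_NEW_TOKENS=2048    # assumed value, not taken from this article
MAX_CONTEXT_LEN=4096   # assumed value, not taken from this article

# Only launch the demo when both the binary and the model are present.
if [ -x ./llm_demo ] && [ -f "$MODEL" ]; then
    ./llm_demo "$MODEL" "$MAX_NEW_TOKENS" "$MAX_CONTEXT_LEN"
else
    echo "llm_demo or $MODEL not found in $PWD" >&2
fi
```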
Performance Analysis
For the math problem "Solve the equations x+y=12, 2x+4y=34, find the values of x and y", the RK3588 generates 14.93 tokens per second.
| Stage | Total Time (ms) | Tokens | Time per Token (ms) | Tokens per Second |
|---|---|---|---|---|
| Pre-fill | 429.63 | 81 | 5.30 | 188.53 |
| Generation | 56103.71 | 851 | 66.99 | 14.93 |
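The derived columns follow from simple arithmetic, sketched here with awk. The function names are illustrative and not part of RKLLM. The Pre-fill figures are recomputed from the total time and token count. For the Generation row the sketch only converts the reported per-token time into throughput, because the table does not state exactly which tokens the runtime's timer covers.

```shell
# Average latency in ms per token: total_ms / tokens.
ms_per_token() {
    awk -v ms="$1" -v n="$2" 'BEGIN { printf "%.2f\n", ms / n }'
}

# Throughput in tokens per second: tokens / total_ms * 1000.
tokens_per_second() {
    awk -v ms="$1" -v n="$2" 'BEGIN { printf "%.2f\n", n / ms * 1000 }'
}

# Throughput from a per-token latency: 1000 / ms_per_token.
latency_to_tps() {
    awk -v ms="$1" 'BEGIN { printf "%.2f\n", 1000 / ms }'
}

prefill_ms=$(ms_per_token 429.63 81)
prefill_tps=$(tokens_per_second 429.63 81)
gen_tps=$(latency_to_tps 66.99)

echo "Pre-fill:   $prefill_ms ms/token, $prefill_tps tokens/s"   # 5.30, 188.53
echo "Generation: $gen_tps tokens/s"                             # 14.93
```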
To have the runtime print these performance statistics, set the log level before running llm_demo:
export RKLLM_LOG_LEVEL=1