Qwen3-4B-Instruct Deployment Tutorial: HTTPS Reverse Proxy + Nginx Load Balancing Configuration
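1. Model Overview and Deployment Preparation

Qwen3-4B-Instruct-2507 is the on-device/lightweight flagship model of the Qwen3 series. It natively supports a 256K-token context window (roughly 500,000 Chinese characters), extensible to 1M tokens, which is enough to process entire books, large PDFs, and long code repositories.

1.1 Base Environment Requirements

Before deploying, make sure your server meets the following requirements:

- Operating system: Ubuntu 20.04/22.04 or CentOS 7/8
- GPU: NVIDIA GPU with ≥8 GB of VRAM (RTX 3090/A10G or better recommended)
- CUDA: 11.8 or 12.x
- Python: 3.9 or 3.10
- RAM: ≥32 GB
- Storage: ≥20 GB (the model files take about 8 GB)

1.2 Basic Project Information

| Item | Value |
| --- | --- |
| Model | Qwen3-4B-Instruct-2507 |
| Model path | /root/ai-models/Qwen/Qwen3-4B-Instruct-2507 |
| Access URL | http://localhost:7860 |
| WebUI | Gradio |
| Inference engine | Transformers |
| Conda environment | torch29 |

2. Basic Service Deployment

2.1 Install the Conda Environment

```bash
# Download the Miniconda installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

# Install Miniconda non-interactively into /opt/miniconda3
bash Miniconda3-latest-Linux-x86_64.sh -b -p /opt/miniconda3

# Initialize Conda for the current shell
/opt/miniconda3/bin/conda init
source ~/.bashrc

# Create and activate the torch29 environment
conda create -n torch29 python=3.10 -y
conda activate torch29

# Install the base dependencies
pip install torch==2.9.0+cu121 transformers==5.5.0 gradio accelerate
```

Note: PyTorch builds with a local CUDA tag such as `+cu121` are served from the PyTorch wheel index rather than PyPI, so this command may also need `--index-url https://download.pytorch.org/whl/cu121`. Check that the pinned versions are actually published for your CUDA and Python versions before installing.

2.2 Configure the Supervisor Service

Create the Supervisor configuration file /etc/supervisor/conf.d/qwen3-4b-instruct.conf. This assumes Supervisor is already installed and that the log directory /root/Qwen3-4B-Instruct/logs exists, since Supervisor does not create it.

```ini
[program:qwen3-4b-instruct]
command=/opt/miniconda3/envs/torch29/bin/python /root/Qwen3-4B-Instruct/webui.py
directory=/root/Qwen3-4B-Instruct
user=root
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
stderr_logfile=/root/Qwen3-4B-Instruct/logs/webui.log
stdout_logfile=/root/Qwen3-4B-Instruct/logs/webui.log
environment=PYTHONUNBUFFERED="1"
```

Reload the Supervisor configuration:

```bash
supervisorctl reread
supervisorctl update
```

2.3 Service Management Commands

```bash
# Check the service status
supervisorctl status qwen3-4b-instruct

# Restart the service
supervisorctl restart qwen3-4b-instruct

# Stop the service
supervisorctl stop qwen3-4b-instruct

# Start the service
supervisorctl start qwen3-4b-instruct

# Follow the log in real time
tail -f /root/Qwen3-4B-Instruct/logs/webui.log
```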
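After a restart, the program typically passes through `STARTING` before it reaches `RUNNING`, and a crash during startup shows up as `BACKOFF` or `FATAL`. The following is a minimal sketch, not part of the original setup, that polls `supervisorctl status` until the program settles. The function names `parse_state` and `wait_until_running` and the 120-second default timeout are illustrative assumptions; the script relies only on the state being the second column of the status line.

```shell
#!/usr/bin/env bash
# Sketch: wait for a Supervisor-managed program to reach RUNNING.
# parse_state and wait_until_running are hypothetical helper names,
# not Supervisor commands.

# Print the state column (2nd whitespace-separated field) of one status line,
# e.g. "qwen3-4b-instruct   RUNNING   pid 1234, uptime 0:01:23".
parse_state() {
    printf '%s\n' "$1" | awk '{ print $2 }'
}

# Poll `supervisorctl status <name>` once per second.
# Returns 0 on RUNNING, 1 on FATAL or when the timeout (seconds) expires.
wait_until_running() {
    local name="$1"
    local timeout="${2:-120}"   # assumed default, adjust to your model load time
    local state
    local i=0
    while [ "$i" -lt "$timeout" ]; do
        state="$(parse_state "$(supervisorctl status "$name" 2>/dev/null)")"
        case "$state" in
            RUNNING) return 0 ;;
            FATAL)   echo "$name failed to start" >&2; return 1 ;;
        esac
        sleep 1
        i=$((i + 1))
    done
    echo "timed out waiting for $name" >&2
    return 1
}

# Usage: wait_until_running qwen3-4b-instruct 300
```

Note that `RUNNING` only means the process has stayed alive; the model may still be loading, so it is worth confirming that http://localhost:7860 responds before moving on.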
3. HTTPS Reverse Proxy Configuration

3.1 Install Nginx and the SSL Certificate Tools

```bash
# Ubuntu/Debian
sudo apt update
sudo apt install -y nginx certbot python3-certbot-nginx

# CentOS/RHEL
sudo yum install -y epel-release
sudo yum install -y nginx certbot python3-certbot-nginx
```

3.2 Obtain an SSL Certificate

```bash
sudo certbot --nginx -d your-domain.com
```

3.3 Configure the Nginx Reverse Proxy

Edit /etc/nginx/sites-available/qwen3-4b-instruct. The sites-available/sites-enabled layout is the Debian/Ubuntu convention; on CentOS/RHEL, place the file in /etc/nginx/conf.d/ with a .conf extension instead and skip the symlink step.

```nginx
server {
    listen 443 ssl;
    server_name your-domain.com;

    ssl_certificate     /etc/letsencrypt/live/your-domain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/your-domain.com/privkey.pem;

    location / {
        proxy_pass http://localhost:7860;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```

Enable the configuration, validate it, and restart Nginx:

```bash
sudo ln -s /etc/nginx/sites-available/qwen3-4b-instruct /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl restart nginx
```
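Once Nginx has restarted, a quick smoke test can confirm that the proxy answers over HTTPS and that the backend is reachable through it. The sketch below is an illustration rather than part of the original tutorial: `http_status`, `is_healthy_status`, and `check_proxy` are hypothetical helper names, and treating 2xx/3xx as healthy is an assumption. A 502 usually means Nginx is up but nothing is listening on port 7860.

```shell
#!/usr/bin/env bash
# Sketch: smoke-test the HTTPS reverse proxy with curl.
# All function names here are illustrative, not part of Nginx or curl.

# Return 0 for 2xx/3xx status codes, 1 for anything else (including curl's "000").
is_healthy_status() {
    case "$1" in
        2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Print only the HTTP status code of a URL; the second argument is the timeout in seconds.
http_status() {
    curl -sS -o /dev/null --max-time "${2:-10}" -w '%{http_code}' "$1"
}

# Probe a URL and report whether the proxy and its backend look healthy.
check_proxy() {
    local url="$1"
    local code
    code="$(http_status "$url" 2>/dev/null)" || code="000"
    if is_healthy_status "$code"; then
        echo "OK   $url ($code)"
    else
        echo "FAIL $url ($code)" >&2
        return 1
    fi
}

# Usage: check_proxy https://your-domain.com/
```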
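4. Load Balancing Configuration

4.1 Multi-Instance Deployment

Repeat the deployment steps above on each server and make sure every instance runs independently. Each instance must listen on an address the load balancer can reach; Gradio binds to 127.0.0.1 by default.

4.2 Nginx Load Balancing Configuration

Edit /etc/nginx/nginx.conf and add the following inside the http block. This server block uses the same server_name and port as the single-backend one from section 3.3, so it should replace that block rather than coexist with it.

```nginx
upstream qwen3_backend {
    server server1_ip:7860;
    server server2_ip:7860;
    server server3_ip:7860;

    # Load balancing strategy
    least_conn;  # route to the backend with the fewest active connections
}

server {
    listen 443 ssl;
    server_name your-domain.com;

    ssl_certificate     /etc/letsencrypt/live/your-domain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/your-domain.com/privkey.pem;

    location / {
        proxy_pass http://qwen3_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```

Gradio keeps per-session state in the backend process. If sessions break when consecutive requests land on different backends, consider replacing least_conn with ip_hash so that each client stays on one backend.

4.3 Health Check Configuration

The max_fails and fail_timeout parameters provide passive health checking and work in stock Nginx. The check, check_http_send, and check_http_expect_alive directives perform active checks and are not part of stock Nginx; they require the third-party nginx_upstream_check_module (bundled with Tengine). Without that module, nginx -t rejects them.

```nginx
upstream qwen3_backend {
    server server1_ip:7860 max_fails=3 fail_timeout=30s;
    server server2_ip:7860 max_fails=3 fail_timeout=30s;
    server server3_ip:7860 max_fails=3 fail_timeout=30s;

    # Active health check (requires nginx_upstream_check_module)
    check interval=5000 rise=2 fall=3 timeout=3000 type=http;
    check_http_send "HEAD / HTTP/1.0\r\n\r\n";
    check_http_expect_alive http_2xx http_3xx;
}
```

5. Troubleshooting Common Problems

5.1 Service Fails to Start

Check the log:

```bash
cat /root/Qwen3-4B-Instruct/logs/webui.log
```

Common errors and their fixes:

- Insufficient GPU memory: check current usage, then stop other processes that occupy the GPU or move to a GPU with more VRAM.

```bash
nvidia-smi --query-gpu=memory.used --format=csv
```

- Port conflict: find the process bound to the port, then either change the port in webui.py or stop the conflicting process.

```bash
ss -tlnp | grep 7860
```

5.2 Firewall Configuration

```bash
# CentOS/RHEL
firewall-cmd --add-port=7860/tcp --permanent
firewall-cmd --reload

# Ubuntu/Debian
ufw allow 7860/tcp
```

Port 7860 only needs to be reachable from the load balancer. On the host that runs Nginx, the port to open for clients is 443, plus 80 if Certbot validates the domain over HTTP.

5.3 GPU Monitoring

```bash
# Monitor GPU usage in real time
watch -n 1 nvidia-smi

# Show GPU memory in use
nvidia-smi --query-gpu=memory.used --format=csv
```

6. Summary

This tutorial covered a complete deployment of the Qwen3-4B-Instruct model:

- Base environment setup and model service deployment
- HTTPS reverse proxy configuration to secure traffic
- Nginx load balancing to improve availability and throughput
- Troubleshooting steps to keep the service running reliably

This architecture spreads requests across multiple instances and, through load balancing, routes around failed backends automatically, which makes it a reasonable starting point for production deployments.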
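As a companion to the GPU checks in sections 5.1 and 5.3, the following sketch turns the `nvidia-smi` query into a pass/fail check against the 8 GB VRAM requirement from section 1.1. The helper names `gpu_free_mib` and `has_enough_gpu_memory` are hypothetical, the 8192 MiB default is derived from that requirement, and only the first GPU is inspected.

```shell
#!/usr/bin/env bash
# Sketch: check free GPU memory before starting the service.
# gpu_free_mib and has_enough_gpu_memory are illustrative helper names.

# Given one CSV line "used, total" (both in MiB), print the free memory in MiB.
gpu_free_mib() {
    printf '%s\n' "$1" | awk -F', *' '{ print $2 - $1 }'
}

# Return 0 if the first GPU has at least the required free MiB (default 8192).
has_enough_gpu_memory() {
    local required="${1:-8192}"   # assumption: 8 GB, from the stated VRAM requirement
    local line
    line="$(nvidia-smi --query-gpu=memory.used,memory.total \
                       --format=csv,noheader,nounits | head -n 1)" || return 1
    [ -n "$line" ] || return 1
    [ "$(gpu_free_mib "$line")" -ge "$required" ]
}

# Usage:
#   has_enough_gpu_memory 8192 || echo "not enough free GPU memory" >&2
```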