Visualizing YOLOE Detection Results: An Easy Way to Inspect Segmentation Boundaries

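YOLOE is not just another "faster YOLO"; it is a new paradigm aimed at letting a model "see anything". The first time you run predict_text_prompt.py and watch the model not only box every "person", "dog" and "cat" in the image but also trace each one with a colored mask that closely follows the outline of a dog's coat or the edge of a pedestrian's clothing, that direct visual feedback is more convincing than any AP figure: open-vocabulary detection and segmentation really do work in practice.

But questions follow immediately. The raw output is just an image and a JSON file saved under runs/predict/. The boundaries are drawn, but how do you quickly verify segmentation accuracy? How do you compare the effect of different prompts on edge detail? How do you feed the results into your own workflow for further analysis? None of this is solved by the --save flag.

This guide skips the theory and the parameter tables and focuses on one thing: turning YOLOE's segmentation results, with as little effort as possible, into a visualization you can read at a glance, tune and reuse. It goes from a quick command-line preview, to interactive debugging with Gradio, to custom drawing logic. All methods run in the official YOLOE image's own environment; only the vector export in Section 4 may need two extra packages.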

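1. Understanding YOLOE's output structure: first see what it gives you

YOLOE's segmentation output is not a black box but a set of clearly structured arrays. Understanding how they are organized is the prerequisite for any custom visualization.

1.1 The core output files

After running the following command: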

    python predict_text_prompt.py \
        --source ultralytics/assets/bus.jpg \
        --checkpoint pretrain/yoloe-v8l-seg.pt \
        --names person dog cat \
        --device cuda:0

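you will find the following in the runs/predict/ directory:

bus.jpg: the visualization with detection boxes and segmentation masks overlaid (saved by default)
bus.json: the structured prediction results (this is the important one)

Open bus.json and you will see something like this: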

    {
      "boxes": [[245.3, 112.7, 398.1, 286.4], [12.5, 45.2, 87.6, 192.3]],
      "labels": ["person", "dog"],
      "scores": [0.92, 0.87],
      "masks": [
        [[1, 1, 1, ..., 0], [1, 1, 0, ..., 0], ...],
        [[0, 0, 1, ..., 1], [0, 1, 1, ..., 1], ...]
      ]
    }

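In this file:

boxes holds standard [x1, y1, x2, y2] coordinates (in pixels)
labels and scores give the class and confidence of each detected instance
masks is the key segmentation data: each element is a two-dimensional binary array (H×W) of 0s and 1s, where 1 means the pixel belongs to the corresponding object

Note: do not assume the size of a mask. Depending on how the script exports it, a mask may be stored at the model's inference resolution rather than at the original image's size. YOLOE-v8l-seg processes 640×640 inputs by default, so the masks are often 640×640 binary matrices. Check the shape before overlaying a mask on the original image.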
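
A quick way to check these points on your own files is to summarize the JSON before drawing anything. The sketch below assumes the four keys shown above (boxes, labels, scores, masks); the helper name summarize_prediction and the tiny inline example are illustrative and not part of YOLOE.

```python
import json

import numpy as np


def summarize_prediction(data):
    """Return one summary row per instance and check that all fields line up."""
    n = len(data["boxes"])
    if not (len(data["labels"]) == len(data["scores"]) == len(data["masks"]) == n):
        raise ValueError("boxes, labels, scores and masks must have the same length")
    rows = []
    for box, label, score, mask in zip(
        data["boxes"], data["labels"], data["scores"], data["masks"]
    ):
        binary = np.asarray(mask) > 0  # treat any non-zero value as foreground
        rows.append({
            "label": label,
            "score": float(score),
            "box": [float(v) for v in box],
            "mask_shape": binary.shape,        # compare with the image's (H, W)
            "mask_pixels": int(binary.sum()),  # number of foreground pixels
        })
    return rows


# Tiny synthetic stand-in for bus.json; a real file would be read with json.load
example = json.loads(
    '{"boxes": [[0, 0, 2, 2]], "labels": ["person"], '
    '"scores": [0.92], "masks": [[[1, 1, 0], [1, 0, 0]]]}'
)
summary = summarize_prediction(example)
print(summary)
```

If mask_shape differs from the original image's height and width, the mask has to be mapped back before it is overlaid.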

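1.2 Mask coordinates and alignment with the image

YOLOE's masks are not produced directly in the original image's coordinate system; they have to be mapped back to it. Roughly, the flow is:

The input image is scaled with its aspect ratio preserved and padded to the model's input size, and the model generates masks in that coordinate system;
the scale factor and padding offsets are computed from the original image's width and height;
the masks are mapped back to the original image's pixel coordinates by removing the padding and undoing the scaling.

This means that passing masks[0] straight to cv2.imshow() is not useful: a 0/1 mask renders almost black, and it may not line up with the original image. It first has to be resampled to the image size and overlaid on it. That is also why the official scripts always carry a seemingly redundant piece of mask_to_image logic.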
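
The mapping back to image coordinates can be sketched in plain NumPy. This is a minimal version that assumes centered letterbox padding (the image is scaled to fit the model input and padded equally on both sides); unletterbox_mask is a hypothetical helper, not a function shipped with YOLOE, and your script's padding convention may differ.

```python
import numpy as np


def unletterbox_mask(mask, orig_h, orig_w):
    """Crop the padding from a letterboxed mask and resample it to (orig_h, orig_w)."""
    mask = np.asarray(mask)
    in_h, in_w = mask.shape
    # Scale factor that fitted the original image into the model input
    gain = min(in_h / orig_h, in_w / orig_w)
    new_h = max(1, round(orig_h * gain))
    new_w = max(1, round(orig_w * gain))
    # Assumed convention: padding is split evenly between the two sides
    top = (in_h - new_h) // 2
    left = (in_w - new_w) // 2
    content = mask[top:top + new_h, left:left + new_w]
    # Nearest-neighbor resampling keeps the mask strictly binary
    rows = np.minimum((np.arange(orig_h) * new_h / orig_h).astype(int), new_h - 1)
    cols = np.minimum((np.arange(orig_w) * new_w / orig_w).astype(int), new_w - 1)
    return content[np.ix_(rows, cols)]


# A 100x200 image letterboxed into 640x640 occupies rows 160..479
letterboxed = np.zeros((640, 640), dtype=np.uint8)
letterboxed[160:480, :] = 1
restored = unletterbox_mask(letterboxed, orig_h=100, orig_w=200)
print(restored.shape)
```

When the mask already has the image's aspect ratio there is no padding to remove, and the function reduces to a plain nearest-neighbor resize.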

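2. Method 1: quick command-line preview with a short snippet

If you only want to confirm that the segmentation of a given image looks reasonable and do not need a full GUI, this is the most efficient method. It reuses YOLOE's own code and adds only about twenty lines of drawing logic. The script is launched from the terminal and shows the result in an OpenCV window, so it needs a machine with a display.

2.1 Modify `predict_text_prompt.py` (back up the original file first)

Open /root/yoloe/predict_text_prompt.py, go to the end of the main() function, and insert the following code after the save_results() call, indented to match the function body: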

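    # === Added: live preview of the segmentation masks ===
    # Assumes `args` and `results` are the names already used in main().
    import zlib

    import cv2
    import numpy as np

    # Load the original image; masks are resized to its size below
    img = cv2.imread(args.source)
    if img is None:
        raise FileNotFoundError(f"Cannot read image: {args.source}")
    h, w = img.shape[:2]

    # Walk through every detection
    for mask, label, score, box in zip(results['masks'], results['labels'],
                                       results['scores'], results['boxes']):
        if score < 0.5:  # skip low-confidence detections
            continue

        # Resize the (H, W) mask to the original image size. Nearest-neighbor
        # keeps it strictly binary. A plain resize is only exact when the mask
        # has the image's aspect ratio (no letterbox padding).
        mask_resized = cv2.resize(np.asarray(mask, dtype=np.uint8), (w, h),
                                  interpolation=cv2.INTER_NEAREST)

        # Derive the color from the label text. crc32 is stable across runs,
        # unlike Python's salted hash(); two labels can still collide.
        rng = np.random.default_rng(zlib.crc32(label.encode("utf-8")))
        color = tuple(int(c) for c in rng.integers(0, 256, size=3))

        # Blend a semi-transparent mask onto the image
        overlay = img.copy()
        overlay[mask_resized > 0] = color
        cv2.addWeighted(overlay, 0.4, img, 0.6, 0, img)

        # Draw the box and its label, keeping the text inside the image
        x1, y1, x2, y2 = map(int, box)
        cv2.rectangle(img, (x1, y1), (x2, y2), color, 2)
        cv2.putText(img, f"{label} {score:.2f}", (x1, max(y1 - 10, 15)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)

    # Show the result (press any key to continue)
    cv2.imshow("YOLOE Segmentation Preview", img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    # === End of added code ===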

2.2 Running it and what you get

Activate the environment and run the script directly:

    conda activate yoloe
    cd /root/yoloe
    python predict_text_prompt.py \
        --source ultralytics/assets/bus.jpg \
        --checkpoint pretrain/yoloe-v8l-seg.pt \
        --names person dog cat \
        --device cuda:0

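A window pops up in which:

each object is outlined by a colored box;
the object's region is covered by a semi-transparent patch whose color is derived from its label, so all instances of one class share a color;
the label and confidence are shown above the box;
pressing any key closes the window and lets the script finish (the results were already saved by save_results() before the preview).

Advantages: no extra dependencies beyond OpenCV, which the script already uses, a near-instant response, and a convenient way to step through many images while debugging;
Typical uses: quickly checking segmentation quality during fine-tuning, collecting bad cases, and verifying the effect of a new prompt.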
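
To make the overlay step concrete, here is the same blend written out in NumPy: inside the mask each pixel becomes alpha × color + (1 − alpha) × pixel, and outside it nothing changes. This is only a sketch of the arithmetic; label_color and blend_mask are illustrative names, and deriving the color from a CRC32 of the label is an assumption made here so that colors repeat across runs.

```python
import zlib

import numpy as np


def label_color(label):
    """Derive a repeatable 3-channel color from the label text."""
    rng = np.random.default_rng(zlib.crc32(label.encode("utf-8")))
    return tuple(int(c) for c in rng.integers(0, 256, size=3))


def blend_mask(img, mask, color, alpha=0.4):
    """Alpha-blend `color` into `img` wherever `mask` is non-zero."""
    out = img.astype(np.float32)
    region = np.asarray(mask) > 0
    out[region] = alpha * np.array(color, dtype=np.float32) + (1 - alpha) * out[region]
    return np.round(out).astype(np.uint8)


# A 2x2 gray image with one masked pixel, blended with a fixed color
img = np.full((2, 2, 3), 100, dtype=np.uint8)
mask = np.array([[1, 0], [0, 0]])
blended = blend_mask(img, mask, color=(200, 0, 0), alpha=0.4)
print(blended[0, 0], blended[1, 1])
```

Raising alpha makes the class color dominate, which helps when checking thin structures; lowering it keeps more of the underlying image visible for judging boundary accuracy.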

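3. Method 2: interactive debugging with Gradio (drag-and-drop upload, live prompt editing)

The official YOLOE image ships with gradio preinstalled, so you do not need to deploy a separate web server: one command starts a working visualization interface. It supports:

dragging and dropping any local image;
entering text prompts on the fly (Chinese text can be typed as well; how well it is recognized depends on the model's text encoder);
switching the inference device (CPU/GPU) per run;
downloading the rendered result.

3.1 Create `gradio_demo.py`

Create a new file in the /root/yoloe/ directory: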

    # gradio_demo.py
    import json
    import os
    import subprocess
    import sys
    import zlib

    import cv2
    import gradio as gr
    import numpy as np

    TEMP_PATH = "/tmp/gradio_input.jpg"
    JSON_PATH = "runs/predict/gradio_input.json"


    def visualize_segmentation(image, text_prompt, device="cuda:0"):
        if image is None:
            return None

        # Gradio delivers RGB arrays; OpenCV writes BGR, so convert before saving
        cv2.imwrite(TEMP_PATH, cv2.cvtColor(image, cv2.COLOR_RGB2BGR))

        # One argument per class name, matching `--names person dog cat`.
        # Full-width commas are accepted; spaces inside a name are kept.
        names = [n.strip() for n in text_prompt.replace("，", ",").split(",") if n.strip()]
        if not names:
            return image

        # Build the same command line as predict_text_prompt.py. The prediction
        # runs in a subprocess, so the model is reloaded on every click.
        cmd = [
            sys.executable, "predict_text_prompt.py",
            "--source", TEMP_PATH,
            "--checkpoint", "pretrain/yoloe-v8l-seg.pt",
            "--names", *names,
            "--device", device,
        ]

        try:
            # Remove any stale result so a failed run cannot show old masks
            if os.path.exists(JSON_PATH):
                os.remove(JSON_PATH)
            # Run the prediction (30-second timeout)
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
            if result.returncode != 0 or not os.path.exists(JSON_PATH):
                print(f"Prediction failed: {result.stderr}")
                return image  # fall back to the input image
            with open(JSON_PATH, "r") as f:
                data = json.load(f)
        except Exception as e:
            print(f"Error: {e}")
            return image

        # Draw the results (same logic as Method 1; colors are RGB here)
        img_out = image.copy()
        h, w = img_out.shape[:2]
        for mask, label, score, box in zip(data['masks'], data['labels'],
                                           data['scores'], data['boxes']):
            if score < 0.3:
                continue
            mask_resized = cv2.resize(np.array(mask, dtype=np.uint8), (w, h),
                                      interpolation=cv2.INTER_NEAREST)
            rng = np.random.default_rng(zlib.crc32(label.encode("utf-8")))
            color = tuple(int(c) for c in rng.integers(0, 256, size=3))
            overlay = img_out.copy()
            overlay[mask_resized > 0] = color
            cv2.addWeighted(overlay, 0.4, img_out, 0.6, 0, img_out)
            x1, y1, x2, y2 = map(int, box)
            cv2.rectangle(img_out, (x1, y1), (x2, y2), color, 2)
            cv2.putText(img_out, f"{label} {score:.2f}", (x1, max(y1 - 10, 15)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
        return img_out


    # Gradio interface
    with gr.Blocks(title="YOLOE Segmentation Visualizer") as demo:
        gr.Markdown("# YOLOE Segmentation Visualizer")
        gr.Markdown("Upload an image and enter text prompts (e.g. person, dog, car) "
                    "to inspect the segmentation boundaries")
        with gr.Row():
            with gr.Column():
                input_img = gr.Image(type="numpy", label="Upload image")
                text_prompt = gr.Textbox(value="person, dog, cat",
                                         label="Text prompt (comma-separated)")
                device = gr.Radio(["cuda:0", "cpu"], value="cuda:0",
                                  label="Inference device")
                run_btn = gr.Button("Visualize", variant="primary")
            with gr.Column():
                output_img = gr.Image(label="Segmentation result", interactive=False)
        run_btn.click(
            fn=visualize_segmentation,
            inputs=[input_img, text_prompt, device],
            outputs=output_img,
        )

    if __name__ == "__main__":
        demo.launch(server_name="0.0.0.0", server_port=7860, share=False)

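3.2 Start the service

    conda activate yoloe
    cd /root/yoloe
    python gradio_demo.py

The terminal prints something like:

    Running on local URL: http://0.0.0.0:7860

Open http://localhost:7860 in a browser (or replace localhost with the server's IP address when connecting from another machine) to use the interface. Upload a street scene, enter bicycle, traffic light, bus, and click the button. After a short wait the detected objects appear segmented and labeled. Each run launches a fresh prediction process that reloads the model, so latency depends on your hardware.

Advantages: no code to write when using it, prompts can be changed freely, and the page can be shared with non-technical colleagues;
Typical uses: product demos, customer conversations, and cross-team reviews.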
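
Because the prompt box takes free text, it helps to normalize it before it reaches the command line. The sketch below is one way to do that, under the assumption that class names are passed as separate arguments after --names, as in the command from Section 1.1. parse_prompt is a hypothetical helper, and the accepted separator set is a choice made here.

```python
import re


def parse_prompt(text):
    """Split a prompt into class names.

    Accepts ASCII and full-width commas and semicolons plus the ideographic
    comma as separators, collapses repeated whitespace inside a name, and
    drops empty entries and duplicates while preserving order.
    """
    names = []
    for part in re.split(r"[,，、;；]", text):
        name = " ".join(part.split())
        if name and name not in names:
            names.append(name)
    return names


names = parse_prompt("person， traffic  light, dog,,person ")
print(names)
```

Multi-word classes such as traffic light stay intact as a single argument, which would not be the case if all spaces were stripped from the prompt.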

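4. Method 3: custom drawing logic (export vector boundaries for downstream systems)

When your needs go beyond taking a look, for example importing segmentation results into a GIS, generating CAD outlines or running a physical simulation, you need precise vector boundaries (contours) rather than pixel masks.

4.1 Extracting OpenCV contours from a mask

YOLOE's masks are binary matrices, so OpenCV's findContours can convert them directly into polygon point sets. The mask should already be at the original image's resolution if the contour is to be drawn on that image: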

    import cv2
    import numpy as np


    def mask_to_contours(mask, epsilon=2.0):
        """
        Convert a binary mask into polygon contours.
        :param mask: (H, W) bool or uint8 array, non-zero means foreground
        :param epsilon: approximation tolerance in pixels (smaller is finer)
        :return: list of np.ndarray, each of shape (N, 1, 2)
        """
        # Normalize to a 0/255 uint8 image regardless of the input encoding
        mask_uint8 = (np.asarray(mask) > 0).astype(np.uint8) * 255
        # Find outer contours only (holes are ignored); OpenCV 4.x return signature
        contours, _ = cv2.findContours(
            mask_uint8, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_TC89_L1
        )
        # Simplify each contour (optional)
        return [cv2.approxPolyDP(cnt, epsilon, True) for cnt in contours]


    # Example; `results` and `img` are the prediction dict and original image from earlier
    mask = np.array(results['masks'][0])  # mask of the first object
    contours = mask_to_contours(mask)

    # Draw onto a copy of the original image to verify
    img_with_contour = img.copy()
    cv2.drawContours(img_with_contour, contours, -1, (0, 255, 0), 2)
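
One way to choose epsilon is to check how much area the simplified polygon loses relative to the original contour. The sketch below computes polygon area with the shoelace formula in NumPy, so it accepts OpenCV's (N, 1, 2) contour arrays as well as plain point lists. polygon_area and area_retained are illustrative helpers, and any acceptance threshold you apply to the ratio is your own choice rather than something YOLOE prescribes.

```python
import numpy as np


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    pts = np.asarray(points, dtype=np.float64).reshape(-1, 2)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))


def area_retained(original, simplified):
    """Ratio of simplified area to original area (1.0 means nothing changed)."""
    base = polygon_area(original)
    return polygon_area(simplified) / base if base > 0 else 0.0


# A 4x3 rectangle with a redundant midpoint on its bottom edge, and its simplification
detailed = [(0, 0), (2, 0), (4, 0), (4, 3), (0, 3)]
simple = [(0, 0), (4, 0), (4, 3), (0, 3)]
print(polygon_area(detailed), area_retained(detailed, simple))
```

A ratio that drifts far from 1.0 suggests that epsilon is cutting into the object's shape rather than just removing redundant points.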

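4.2 Exporting to standard formats: GeoJSON and SVG

YOLOE itself has no export function, but with the shapely and svgwrite libraries (install them with pip if your image does not already include them) you can generate widely used formats in a few lines: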

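    # Install once if the packages are missing:
    #   pip install shapely svgwrite
    import json

    import svgwrite
    from shapely.geometry import Polygon, mapping


    def contours_to_geojson(contours, crs=None):
        """Export contours as GeoJSON (for GIS software).

        Coordinates are image pixels (origin top-left, y pointing down) unless
        you georeference them first, so no CRS is written by default.
        """
        features = []
        for i, cnt in enumerate(contours):
            # OpenCV contours are (N, 1, 2); flatten to (N, 2)
            points = cnt.reshape(-1, 2)
            if len(points) < 3:
                continue
            poly = Polygon(points.tolist())
            if not poly.is_valid:
                poly = poly.buffer(0)  # repair invalid geometry
            if poly.is_empty:
                continue
            features.append({
                "type": "Feature",
                "properties": {"id": i, "class": "object"},
                # mapping() emits closed rings and keeps the repaired geometry
                "geometry": mapping(poly),
            })
        collection = {"type": "FeatureCollection", "features": features}
        if crs is not None:
            # Legacy "crs" member; set it only for coordinates truly in that CRS
            collection["crs"] = {"type": "name", "properties": {"name": crs}}
        return collection


    def contours_to_svg(contours, output_path, width=800, height=600):
        """Export contours as SVG (for design software).

        Pass the source image's width and height so the canvas matches the
        pixel coordinates.
        """
        dwg = svgwrite.Drawing(output_path, profile='tiny',
                               size=(f'{width}px', f'{height}px'))
        for cnt in contours:
            points = cnt.reshape(-1, 2).tolist()
            if len(points) < 3:
                continue
            # SVG and image coordinates both have y pointing down: no flip needed
            svg_points = [(int(x), int(y)) for x, y in points]
            dwg.add(dwg.polygon(svg_points, fill='none', stroke='red', stroke_width=2))
        dwg.save()


    # Save; `img` is the original image the contours were extracted for
    geojson_data = contours_to_geojson(contours)
    with open("segmentation.geojson", "w") as f:
        json.dump(geojson_data, f, indent=2)
    h, w = img.shape[:2]
    contours_to_svg(contours, "segmentation.svg", width=w, height=h)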

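The resulting segmentation.geojson can be dragged straight into QGIS, and segmentation.svg can be edited in Illustrator. Keep in mind that the coordinates are image pixels: to place the shapes at real-world locations in a GIS, the image has to be georeferenced first. With that in place, your YOLOE results become part of an engineering production chain.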
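
If the source image is georeferenced, for example an orthophoto with a known geotransform, pixel coordinates can be converted to world coordinates before export. The sketch below applies a six-term affine transform in the GDAL ordering. The geotransform values in the example are made up for illustration, and pixel_to_world is a hypothetical helper rather than part of YOLOE or shapely.

```python
def pixel_to_world(points, geotransform):
    """Map (col, row) pixel coordinates to world (x, y) with an affine transform.

    geotransform follows the GDAL ordering:
    (x_origin, pixel_width, row_rotation, y_origin, column_rotation, pixel_height),
    where pixel_height is negative for north-up images.
    """
    x0, dx_col, dx_row, y0, dy_col, dy_row = geotransform
    return [
        (x0 + col * dx_col + row * dx_row, y0 + col * dy_col + row * dy_row)
        for col, row in points
    ]


# Hypothetical georeference: top-left corner at (100.0, 200.0), 0.5 units per pixel
gt = (100.0, 0.5, 0.0, 200.0, 0.0, -0.5)
world = pixel_to_world([(0, 0), (2, 4)], gt)
print(world)
```

The negative pixel height is what turns the image's downward-pointing y-axis into a northward-pointing map axis, so no separate flip is needed.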

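5. Advanced tips: three ways to make the visualization more professional

The methods above cover most everyday needs. If you want more polished results, the following three points noticeably improve both appearance and usefulness.

5.1 Boundary smoothing

The edges of YOLOE's raw masks often show staircase-like jaggies. A Gaussian blur followed by a threshold smooths the boundary shape (the result is still a binary mask):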

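    def smooth_mask(mask, sigma=1.0):
        # Normalize to 0.0/1.0 so the 0.5 threshold is valid for any input encoding
        mask_float = (np.asarray(mask) > 0).astype(np.float32)
        # Kernel size (0, 0) lets OpenCV derive it from sigma
        blurred = cv2.GaussianBlur(mask_float, (0, 0), sigma)
        return (blurred > 0.5).astype(np.uint8)


    # Usage
    smoothed_mask = smooth_mask(mask)
    contours = mask_to_contours(smoothed_mask, epsilon=1.0)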

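5.2 Layered rendering of multiple objects (avoiding occlusion)

When several objects overlap, small objects are easily covered by the drawing of a larger one. Drawing in descending order of area, so that the smallest objects are drawn last and end up on top, solves this: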

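    # Sort contours by area: draw the large ones first, the small ones last
    areas = [cv2.contourArea(c) for c in contours]
    sorted_idx = np.argsort(areas)[::-1]

    # One BGR color per contour (a simple seeded palette; substitute your own)
    colors = [tuple(int(v) for v in np.random.default_rng(i).integers(0, 256, size=3))
              for i in range(len(contours))]

    for i in sorted_idx:
        cv2.drawContours(img, [contours[i]], -1, colors[i], 2)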
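
The same ordering can be applied to the filled masks themselves, where occlusion matters more than it does for outlines. The sketch below ranks masks by foreground pixel count and returns the indices in drawing order, largest first. draw_order is an illustrative helper, and it assumes each mask is a 2D array whose non-zero pixels mark the object.

```python
import numpy as np


def draw_order(masks):
    """Indices of `masks` sorted by foreground area, largest first."""
    areas = [int((np.asarray(m) > 0).sum()) for m in masks]
    return sorted(range(len(areas)), key=lambda i: areas[i], reverse=True)


# Three toy masks with 1, 4 and 2 foreground pixels
small = [[1, 0], [0, 0]]
big = [[1, 1], [1, 1]]
medium = [[1, 1], [0, 0]]
order = draw_order([small, big, medium])
print(order)
```

Iterating over the returned indices when blending masks means the smallest object is painted last and stays visible.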

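5.3 Performance monitoring: recording FPS and GPU memory usage

Add performance statistics to the Gradio or command-line script (pynvml must be available in the environment):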

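    import time

    import pynvml


    def get_gpu_memory(index=0):
        """Memory currently used on the whole GPU (all processes), in MB."""
        pynvml.nvmlInit()
        try:
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            info = pynvml.nvmlDeviceGetMemoryInfo(handle)
            return info.used / 1024**2  # MB
        finally:
            pynvml.nvmlShutdown()


    # perf_counter is a monotonic, high-resolution clock suited to timing
    start_time = time.perf_counter()
    # ... run the prediction ...
    # (GPU work is asynchronous: synchronize the device before reading the clock)
    end_time = time.perf_counter()

    gpu_mem = get_gpu_memory()
    print(f"FPS: {1/(end_time-start_time):.1f}, GPU Mem: {gpu_mem:.0f}MB")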
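
A single timed run is noisy, and the first call usually includes one-off costs such as model loading. A minimal sketch of a steadier measurement is to discard a few warm-up calls and average over several runs. measure_fps and its default counts are assumptions chosen for illustration, and fn stands in for whatever callable wraps your prediction.

```python
import time


def measure_fps(fn, warmup=2, runs=10):
    """Average calls per second of `fn`, ignoring `warmup` initial calls."""
    for _ in range(warmup):
        fn()  # warm-up calls are executed but not timed
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    elapsed = time.perf_counter() - start
    return runs / max(elapsed, 1e-9)  # guard against a zero-length interval


# Stand-in workload; replace with a function that runs one prediction
def fake_predict():
    return sum(range(1000))


fps = measure_fps(fake_predict, warmup=2, runs=5)
print(f"{fps:.1f} calls/s")
```

For GPU inference the callable should block until the result is ready, otherwise the loop measures only how fast work is queued.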

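6. Summary: putting what YOLOE sees to work

YOLOE's strength lies less in raw speed or accuracy than in folding open-vocabulary segmentation, a task that used to require a complex pipeline, into one model, one inference pass and one set of masks. But the value of a technique is only realized when you can interact with its results quickly, visually and under your own control.

The three methods in this article cover the path from development and debugging to engineering deployment:

Command-line preview: your first quality gate, so that every training run can be checked by eye;
Gradio interface: your shared language for collaboration, letting product managers, designers and customers understand the model's capabilities on the same page;
Vector export: your production interface, feeding segmentation results into GIS, CAD and simulation systems.

None of them depends on an external service, and all of them run in the official YOLOE image's environment, so code you copy today can go to work tomorrow.

Real AI engineering is never about stacking up the flashiest models; it is about building the most comfortable toolchain. YOLOE visualization is the first link of that chain that you hold in your hands.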
