⚙️The YOLOv8 architecture is divided into three parts: the backbone, the neck, and the head.
🔍The backbone is the feature extractor: its convolutional layers produce feature maps at several resolutions.
📐The neck combines features from the backbone, using upsample layers to match their resolutions and concat layers to merge them.
🎯The head predicts classes and bounding box regions, with specialized detect blocks for different object sizes.
🔢The blocks in the architecture are numbered according to the YOLO configuration file.
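
As a rough sketch of how the neck's upsample and concat steps fit together, the Python snippet below traces feature-map shapes only; no real network is built. The 640-pixel input, the strides of 8, 16 and 32, the channel counts, and the short layer list are illustrative assumptions rather than values read from an actual configuration file, and the helper names (`upsample`, `concat`, `trace`) are hypothetical, not part of any YOLO library.

```python
# Toy shape tracer. Shapes are (channels, height, width).
# All sizes and the layer list are illustrative assumptions.


def upsample(shape, scale=2):
    """Upsampling grows the spatial size and leaves the channel count alone."""
    c, h, w = shape
    return (c, h * scale, w * scale)


def concat(*shapes):
    """Channel-wise concat: spatial sizes must match, channel counts add up."""
    spatial = {s[1:] for s in shapes}
    if len(spatial) != 1:
        raise ValueError(f"spatial sizes differ: {sorted(spatial)}")
    _, h, w = shapes[0]
    return (sum(s[0] for s in shapes), h, w)


def trace(layers, outputs):
    """Apply each (source, op) entry in order.

    An entry's layer number is its position in the combined list.
    A source of -1 means the previous layer; other numbers refer to
    earlier layers by their layer number.
    """
    outputs = list(outputs)
    for source, op in layers:
        refs = source if isinstance(source, list) else [source]
        outputs.append(op(*[outputs[r] for r in refs]))
    return outputs


# Assumed backbone outputs for a 640x640 image at strides 8, 16 and 32.
image_size = 640
backbone = [
    (256, image_size // 8, image_size // 8),     # layer 0: high resolution
    (512, image_size // 16, image_size // 16),   # layer 1: medium resolution
    (1024, image_size // 32, image_size // 32),  # layer 2: low resolution
]

# Assumed neck: upsample the coarse map, then merge it with a finer one.
neck = [
    (-1, upsample),      # layer 3
    ([-1, 1], concat),   # layer 4: merge with layer 1
    (-1, upsample),      # layer 5
    ([-1, 0], concat),   # layer 6: merge with layer 0
]

shapes = trace(neck, backbone)
for number, shape in enumerate(shapes):
    print(number, shape)
```

In this sketch the upsample step is what makes the concat possible: the coarse map has to reach the same height and width as the finer one before their channels can be stacked. The real model also places convolutional blocks between these steps, which the sketch leaves out.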