- Corpus ID: 269983404
@inproceedings{Wang2024YOLOv10RE, title={YOLOv10: Real-Time End-to-End Object Detection}, author={Ao Wang and Hui Chen and Lihao Liu and Kai Chen and Zijia Lin and Jungong Han and Guiguang Ding}, year={2024}, url={https://api.semanticscholar.org/CorpusID:269983404}}
- Ao Wang, Hui Chen, Guiguang Ding
- Published 23 May 2024
- Computer Science, Engineering
A new generation of the YOLO series for real-time end-to-end object detection, dubbed YOLOv10, is presented, and a holistic efficiency-accuracy driven model design strategy for YOLOs is introduced, which greatly reduces computational overhead and enhances model capability.
1 Citation
Figures and Tables from this paper
- table 1
- figure 2
- figure 3
- table 4
- figure 4
- table 5
- table 7
- table 13
- table 14
- table 15
One Citation
- Yiming Cui, Cheng Han, Dongfang Liu
- 2024
Computer Science, Engineering
This work introduces a stepwise spatial global-local aggregation network that combines the local information from the neighboring frames and global semantics from the current frame to eliminate the feature degradation in video object detection.
74 References
- Wenyu Lv, Shangliang Xu, Yi Liu
- 2023
Computer Science
ArXiv
This paper proposes the Real-Time DEtection TRansformer (RT-DETR), to the authors' best knowledge the first real-time end-to-end object detector, and designs an efficient hybrid encoder that processes multi-scale features quickly by decoupling intra-scale interaction and cross-scale fusion.
- 93
- Highly Influential[PDF]
- Xianzhe Xu, Yiqi Jiang, Weihua Chen, Yi-Li Huang, Yuanhang Zhang, Xiuyu Sun
- 2022
Computer Science
ArXiv
DAMO-YOLO is extended from YOLO with some new technologies, including Neural Architecture Search (NAS), efficient Reparameterized Generalized-FPN (RepGFPN), a lightweight head with AlignedOTA label assignment, and distillation enhancement to improve performance to a higher level.
- 58 [PDF]
- Chengcheng Wang, Wei He, Yunhe Wang
- 2023
Computer Science
NeurIPS
This study provides an advanced Gather-and-Distribute (GD) mechanism, which is realized with convolution and self-attention operations, and implements MAE-style pretraining in the YOLO series for the first time, allowing YOLO-series models to benefit from unsupervised pretraining.
- 20
- Highly Influential[PDF]
- Chengqi Lyu, Wenwei Zhang, Kai Chen
- 2022
Computer Science
ArXiv
An efficient real-time object detector is designed that exceeds the YOLO series and is easily extensible to many object recognition tasks, such as instance segmentation and rotated object detection. It introduces soft labels when calculating matching costs in dynamic label assignment to improve accuracy.
- 114 [PDF]
- Yuming Chen, Xinbin Yuan, Ruiqi Wu, Jiabao Wang, Qibin Hou, Ming-Ming Cheng
- 2023
Computer Science
ArXiv
The core design is based on a series of investigations on how convolutions with different kernel sizes affect the detection performance of objects at different scales, resulting in a new strategy that can strongly enhance multi-scale feature representations of real-time object detectors.
- 8
- Highly Influential[PDF]
- Jianfeng Wang, Lin Song, Zeming Li, Hongbin Sun, Jian Sun, N. Zheng
- 2021
Computer Science
2021 IEEE/CVF Conference on Computer Vision and…
A Prediction-aware One-To-One (POTO) label assignment for classification is introduced to enable end-to-end detection, which obtains comparable performance with NMS and a simple 3D Max Filtering is proposed to utilize the multi-scale features and improve the discriminability of convolutions in the local region.
- 159 [PDF]
- Pei Sun, Yi Jiang, Ping Luo
- 2021
Computer Science
ICML
It is pointed out that one-to-one positive sample assignment is the key factor, while one-to-many assignment in previous detectors causes redundant predictions in inference, and the concept of score gap is introduced to explore the effect of matching cost.
- 65 [PDF]
- Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko
- 2020
Computer Science
ECCV
This work presents a new method that views object detection as a direct set prediction problem, and demonstrates accuracy and run-time performance on par with the well-established and highly-optimized Faster RCNN baseline on the challenging COCO object detection dataset.
- 8,870 [PDF]
- Fangao Zeng, Bin Dong, Tiancai Wang, Cheng Chen, X. Zhang, Yichen Wei
- 2022
Computer Science
ECCV
MOTR is proposed, which extends DETR and introduces track query to model the tracked instances in the entire video to enhance temporal relation modeling and serve as a stronger baseline for future research on temporal modeling and Transformer-based trackers.
- 311 [PDF]
- Joseph Redmon, S. Divvala, Ross B. Girshick, Ali Farhadi
- 2016
Computer Science
2016 IEEE Conference on Computer Vision and…
Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
- 29,480
- Highly Influential[PDF]
...