
Transformers in Computer Vision | Free Udemy Course


Rating: 4.62 (26 ratings)
Students: 1270
Created by: Coursat.ai, Dr. Ahmad ElSallab

Last updated: 10/2022
Course language: Arabic
Course length: 07:06:20 (25580 seconds)
Number of lectures: 35

This course includes:

7 hours of on-demand video
1 article
Full lifetime access
Access on mobile and TV
Certificate of completion
1 additional resource

Transformer networks are the current trend in deep learning. Transformer models have taken the world of NLP by storm since 2017 and have since become the mainstream model in almost all NLP tasks. Transformers in CV lagged behind at first, but they have started to take over since 2020.

We will start by introducing attention and transformer networks. Since transformers were first introduced in NLP, they are easier to describe with an NLP example first. From there, we will understand the pros and cons of this architecture. We will also discuss the importance of unsupervised or semi-supervised pre-training for transformer architectures, briefly covering large-scale language models (LLMs) such as BERT and GPT.

This will pave the way to introduce transformers in CV. Here we will try to extend the attention idea into the 2D spatial domain of the image. We will discuss how convolution can be generalized using self-attention within the encoder-decoder meta-architecture. We will see how this generic architecture is almost the same for images as for text, which makes transformers a generic function approximator. We will discuss channel and spatial attention and local vs. global attention, among other topics.

In the next three modules, we will discuss the specific networks that solve the big problems in CV: classification, object detection, and segmentation. We will discuss the Vision Transformer (ViT) from Google, the Shifted Window Transformer (Swin) from Microsoft, the Detection Transformer (DETR) from Facebook Research, the Segmentation Transformer (SETR), and many others.
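
To make the idea of extending attention to the 2D image domain more concrete, here is a minimal sketch in plain Python. It splits a tiny grayscale image into patches (the tokenization step used by ViT-style models) and applies single-head scaled dot-product self-attention across those patches. The function names, the 4x4 toy image, and the use of identity query/key/value projections are assumptions made for illustration; real models use learned projections, positional embeddings, and multiple heads, and this is not taken from the course material.

```python
import math

def image_to_patches(image, patch):
    """Split an H x W image (list of rows) into flattened patch x patch tokens."""
    height, width = len(image), len(image[0])
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([image[top + i][left + j]
                           for i in range(patch) for j in range(patch)])
    return tokens

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V (a simplification)."""
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this patch to every patch, scaled by sqrt(dim).
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average of all patch vectors (global attention).
        outputs.append([sum(w * value[c] for w, value in zip(row, tokens))
                        for c in range(dim)])
    return outputs, weights

# Hypothetical 4x4 image: a bright 2x2 region in the top-right corner.
image = [[0.0, 0.1, 0.9, 1.0],
         [0.1, 0.0, 1.0, 0.9],
         [0.0, 0.1, 0.0, 0.1],
         [0.1, 0.0, 0.1, 0.0]]

tokens = image_to_patches(image, patch=2)
attended, weights = self_attention(tokens)
```

Because every patch attends to every other patch, the receptive field is global from the first layer, whereas a convolution only mixes a fixed local neighborhood. Restricting the `for key in tokens` loop to nearby patches would give a local attention variant.
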
Then we will discuss the application of transformers in video processing through spatio-temporal transformers, with application to moving object detection, along with a multi-task learning setup.

Finally, we will show how those pre-trained architectures can be easily applied in practice using the Pipeline interface of the popular Hugging Face library.

Who this course is for:
Intermediate to advanced CV engineers
Intermediate to advanced CV researchers
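
As a rough illustration of the Hugging Face Pipeline interface mentioned in the description, the sketch below wraps an image-classification pipeline in a small helper. The helper name `classify_image`, the choice of the `google/vit-base-patch16-224` checkpoint, and the example file name are assumptions for illustration, not details from the course. Running it requires the `transformers` package, a backend such as PyTorch, and Pillow, and the first call downloads the model weights.

```python
# Assumed checkpoint for illustration; any image-classification model id could be used.
DEFAULT_MODEL = "google/vit-base-patch16-224"

def classify_image(image_path, model_name=DEFAULT_MODEL, top_k=3):
    """Return (label, score) pairs for an image using a pre-trained vision transformer."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_name)
    predictions = classifier(image_path, top_k=top_k)
    # Each prediction is a dict with "label" and "score" keys.
    return [(p["label"], p["score"]) for p in predictions]
```

A call such as `classify_image("cat.jpg")`, where `cat.jpg` is a hypothetical local file, would return the top three predicted labels with their scores.
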

Course Content:


1 Lecture | 06:45
11 Lectures | 02:48:23
8 Lectures | 01:24:42
2 Lectures | 24:44
3 Lectures | 01:10:13
3 Lectures | 26:41
1 Lecture | 15:24
4 Lectures | 23:50
1 Lecture | 05:38
1 Lecture | 00:00
