Panoptic Segmentation: A Basic to Advanced Guide (2024)
Viso.ai
by Gaudenz Boesch
6d ago
Image segmentation is a fundamental computer vision task that aims to partition a digital image into multiple segments, or sets of pixels. These segments correspond to different objects, materials, or semantic parts of the scene. The goal of image segmentation is to simplify or change the representation of an image into something more meaningful and easier to analyze. There are three main types of image segmentation: semantic segmentation, instance segmentation, and panoptic segmentation. We have put together a detailed guide on semantic and instance segmentation that you can che ..read more
Convolution Operations: an In-Depth 2024 Guide
Viso.ai
by Nico Klingler
6d ago
Convolution acts as a feature extractor in image processing: it pulls key characteristics and attributes out of images and outputs useful image representations. CNNs learn features directly from the training data. These features can include edges, corners, textures, or other relevant attributes that help distinguish an image and understand its contents. Object detection and image classification models later use these extracted features. Deep Learning extensively utilizes Convolutional Neural Networks (CNNs) in which convolution operations play a central role in automatic feature ext ..read more
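As a rough illustration of convolution as an edge extractor, here is a minimal pure-Python sketch. The `convolve2d` helper, the toy image, and the choice of a Sobel-style kernel are assumptions made for this example and are not taken from the linked article; in a CNN the kernel values would be learned rather than hand-picked.

```python
def convolve2d(image, kernel):
    """Valid-mode 2D convolution of two lists of lists (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Flip the kernel on both axes; this is what separates true convolution
    # from the cross-correlation many deep learning libraries compute.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for r in range(ih - kh + 1):
        out_row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * flipped[i][j]
            out_row.append(acc)
        out.append(out_row)
    return out


# Hypothetical 4x6 grayscale image: dark left half (0), bright right half (9).
image = [[0, 0, 0, 9, 9, 9] for _ in range(4)]

# Hand-picked Sobel-style kernel that responds to horizontal intensity changes.
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

edges = convolve2d(image, sobel_x)
for row in edges:
    print(row)
```

The output is zero over the flat regions and large in magnitude only where the window straddles the dark-to-bright boundary, which is the sense in which the kernel "extracts" a vertical edge.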
Complete 2024 Guide to Feature Extraction in Python
Viso.ai
by Gaudenz Boesch
6d ago
Feature Extraction is the process of transforming raw data, often unorganized, into meaningful features, which are used to train machine learning models. In today’s digital world, machine learning algorithms are used widely for credit risk prediction, stock market forecasting, early disease detection, etc. The accuracy and performance of these models rely on the quality of the input features. In this blog, we will introduce you to feature engineering, why we need it and the different machine learning techniques available to execute it. What is Feature Extraction in Machine Learning? We provide ..read more
What is a Decision Tree?
Viso.ai
by Nico Klingler
1w ago
Machine learning is a key domain of Artificial Intelligence that creates algorithms and training models. Two important problems that machine learning tries to solve are regression and classification. Many machine learning algorithms perform these two tasks. However, algorithms like linear regression make assumptions about the dataset and may not work properly if the dataset fails to satisfy them. The Decision Tree algorithm is independent of such assumptions and works well for both regression and classification tasks. In this article, we will discuss the Decision T ..read more
Modality: The Multi-Dimensional Language of Computer Vision
Viso.ai
by Gaudenz Boesch
1w ago
Modality is defined as “a particular mode in which something exists or is experienced or expressed.” In artificial intelligence, we use this term to talk about the type(s) of input and output data an AI system can interpret. In human terms, modality refers to the senses of touch, taste, smell, sight, and hearing. However, AI systems can integrate with a variety of sensors and output mechanisms to interact through an additional array of data types. Pattern recognition performed with a variety of cameras and sensors enables systems to identify and interpret meanin ..read more
Understanding Visual Question Answering – VQA
Viso.ai
by Nico Klingler
2w ago
With the advancement of Deep Learning (DL), the invention of Visual Question Answering (VQA) has become possible. VQA has recently become popular among the computer vision research community as researchers are heading towards multi-modal problems. VQA is a challenging yet promising multidisciplinary Artificial Intelligence (AI) task that enables several applications. In this blog we’ll cover: Overview of Visual Question Answering; The fundamental principles of VQA; Working on a VQA system; VQA datasets; Applications of VQA across various industries; Recent developments and future challenges; What ..read more
Vision Language Models: Exploring Multimodal AI
Viso.ai
by Gaudenz Boesch
2w ago
Vision Language Models (VLMs) bridge the gap between visual and linguistic understanding in AI. They consist of a multimodal architecture that learns to associate information from image and text modalities. In simple terms, a VLM can understand images and text jointly and relate them to each other. By using advances in Transformers and pretraining strategies, VLMs unlock the potential of vision language applications ranging from image search to image captioning, generative tasks, and more! This article will explore vision language models alongside code examples to help with implementation. Topics c ..read more
Concept Drift vs Data Drift: How AI Can Beat the Change
Viso.ai
by Nico Klingler
2w ago
Model drift is an umbrella term encompassing a spectrum of changes that impact machine learning model performance. Two of the most important concepts in this area of study are concept drift and data drift. These phenomena manifest when certain factors alter the statistical properties of model inputs or outputs. In most cases, the model must then be updated to account for this “model drift” and preserve accuracy. A deep learning model built with TensorFlow, or a facial recognition system, might experience data drift due to poor lighting or demographic changes. These changes in the input data may de ..read more
Gradient Descent in Computer Vision
Viso.ai
by Nico Klingler
2w ago
Computer Vision (CV) models use training data to learn the relationship between input and output data. The training is an optimization process. Gradient descent (GD) is an optimization method based on a cost function, which measures the difference between the predicted and actual values of the data. CV models try to minimize this loss function, that is, to narrow the gap between predictions and actual output data. To train a deep learning model, we provide annotated images. In each iteration, GD tries to lower the error and improve the model’s accuracy. Then it goes through a process of trials to achieve the desired ..read more
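To make the iteration concrete, here is a minimal sketch of gradient descent on a mean squared error cost. It fits a line to four toy points rather than training a vision model; the function name, data, learning rate, and step count are all assumptions chosen for illustration, not details from the linked article.

```python
def gradient_descent(xs, ys, lr=0.05, steps=500):
    """Fit y = w * x + b by minimizing mean squared error with plain GD."""
    w, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(steps):
        # Error between prediction and actual value for every sample.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Cost function: mean squared error over the dataset.
        losses.append(sum(e * e for e in errors) / n)
        # Partial derivatives of the cost with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient to lower the cost.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses


# Hypothetical toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b, losses = gradient_descent(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, first loss={losses[0]:.3f}, last loss={losses[-1]:.6f}")
```

Each pass computes the cost, takes its gradient, and nudges the parameters in the opposite direction, so the recorded loss shrinks toward zero as `w` and `b` approach 2 and 1. Deep learning frameworks apply the same update to millions of parameters, usually on mini-batches of images.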
Multispectral Imaging: Looking Beyond the Visible Light
Viso.ai
by Gaudenz Boesch
2w ago
Multispectral imaging is a technique that captures light across a broad range of spectral bands, extending beyond what the human eye can see, including infrared and ultraviolet light. This approach significantly surpasses traditional color imaging by revealing details invisible to the naked eye. Using this method to gather rich information benefits various applications, including analyzing crop health and detecting skin diseases. At the heart of multispectral imaging is the principle that different materials possess distinctive spectral signatures (unique patterns of light absorption, reflecti ..read more
