Infosys experts share their views on how digital is significantly impacting enterprises and consumers by redefining experiences, simplifying processes and pushing collaborative innovation to new levels

Designing data for deep learning in discovering images

Deep Learning and its amazing applications:

Deep learning is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised. In the unsupervised case, the model learns without human supervision, drawing from data that is both unstructured and unlabeled.

There are several deep learning algorithms, such as Convolutional Neural Networks (CNNs), Long Short-Term Memory Networks (LSTMs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs) and Autoencoders, which help derive valuable insights from data and predict events, occurrences, user habits, etc.

Businesses can predict customer behavior and shopping patterns, create synthetic image representations, generate social media analytics to understand user behavior and usage patterns, and use facial recognition at airports or unmanned shopping complexes. Deep learning can also be used in medical imaging, for example to find cancer cell patterns in microscopic images and to generate synthetic images of cells and cell components.

Lack of data: A major challenge in Deep Learning

Deep learning requires a vast amount of training data to produce a model that predicts accurately and generates realistic outputs. Without enough data, a deep learning algorithm may not see all the variations in the input and therefore may not predict the output accurately.

Complexities in obtaining large volume data for solving business problems:

There are several reasons why businesses are either unable to or reluctant to share their data:

  • Data privacy laws: Businesses are forbidden by law from sharing sensitive data such as customers' personal data, financial details and sensitive government data. Sharing such data can cause privacy breaches, and businesses can be held legally liable for it.
  • Less actual data available: Sometimes the data is available but not at a large scale, or the required data is completely unavailable.
  • Expensive data: Machine learning algorithms need training data, but in some cases, such as self-driving cars, that data is expensive to collect in the real world.

So how do we solve them? Create realistic synthetic data!

One of the best ways to address limited data availability is to generate new synthetic data that is realistic and closely resembles the actual data.
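For simple numeric data, the idea can be sketched in a few lines. The example below is only an illustration under strong assumptions: it treats the values as roughly normally distributed, fits a mean and standard deviation to a small sample, and draws new values from that distribution. The function name and the sample values are hypothetical.

```python
import random
import statistics


def synthesize_numeric(sample, n, seed=0):
    """Draw n synthetic values from a normal distribution fitted to `sample`."""
    mu = statistics.mean(sample)
    sigma = statistics.stdev(sample)  # sample standard deviation
    rng = random.Random(seed)         # seeded so the run is reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical sample: a handful of observed numeric values.
observed = [102.5, 98.0, 110.2, 95.4, 105.1, 99.8]
synthetic = synthesize_numeric(observed, n=1000)

print(f"observed mean:  {statistics.mean(observed):.2f}")
print(f"synthetic mean: {statistics.mean(synthetic):.2f}")
```

Real data rarely follows a single normal distribution, so a production approach would need to model the actual distribution and any correlations between columns.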

While it's easy to generate synthetic numeric and descriptive data, synthetic image generation for Deep Learning applications like computer vision is a more complex and difficult process. The following is a brief description of the standard steps involved in the generation of synthetic image data:

Steps to create synthetic datasets for computer vision:

  • Data collection and harmonization: The sample ground truth images are collected from multiple sources and harmonized.
  • Image annotations: Each image is annotated and labeled. The annotation is then paired with its ground truth image and fed to the model as input.
  • Image classification: The images are categorized by type, for example, pharmacy, machinery, industrial tools, maps, roads, cities, facades, etc.
  • Data cleaning: Unnecessary junk data is filtered out with ML techniques so that the model focuses on the relevant data.
  • Image preprocessing: Each image is preprocessed in multiple steps, for example by resizing, flipping, rotating and otherwise transforming it.
  • Train the model: The model is trained on large volumes of data using deep learning algorithms and computer vision techniques.
  • Validate the model: The model is evaluated on the validation dataset to check its accuracy and performance and to decide whether any parameters need to be tuned.
  • Test the model: The model is finally tested with the test dataset, which is independent of the training dataset, to check how accurately it performs on unseen data.
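
The preprocessing and dataset-splitting steps above can be sketched in plain Python. This is a minimal illustration, not the pipeline used by the authors: images are represented as nested lists of pixel values, and the function names and the 70/15/15 split are assumptions chosen for the example.

```python
import random


def resize_nearest(image, new_h, new_w):
    """Resize an image (list of pixel rows) with nearest-neighbor sampling."""
    h, w = len(image), len(image[0])
    return [[image[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]


def flip_horizontal(image):
    """Mirror an image left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate an image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def split_dataset(items, train=0.7, val=0.15, seed=0):
    """Shuffle, then split into disjoint train, validation and test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


# A tiny 2x3 grayscale "image" stands in for real data.
image = [[1, 2, 3],
         [4, 5, 6]]
print(flip_horizontal(image))      # [[3, 2, 1], [6, 5, 4]]
print(rotate_90_clockwise(image))  # [[4, 1], [5, 2], [6, 3]]

# Split 100 hypothetical image IDs; the test set never overlaps the others.
train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))  # 70 15 15
```

Keeping the three subsets disjoint is what lets the test step measure performance on data the model has never seen.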

Best practices of synthetic dataset generation for computer vision:

    • Generative Adversarial Network (GAN): A GAN is an unsupervised deep learning method that learns the regularities in image data and generates new, realistic examples that could plausibly have come from the original dataset. It trains the model by framing the problem as a supervised learning problem with two sub-models: the generator, which is trained to produce new examples, and the discriminator, which tries to classify examples as either real or fake. For image generation, each image is labeled and annotated according to its objects and features, and the annotation is then used as input to generate synthetic images that are realistic and closely resemble the ground truth images.
    • Deep Convolutional Generative Adversarial Networks (DCGAN): DCGAN is a GAN architecture that generates outputs resembling the training images. It replaces the fully connected layers of the GAN model with convolutional layers and is thus better suited to image and video datasets.
    • Recurrent Neural Networks (RNNs): Recurrent neural networks can also be used to generate realistic synthetic image data, one common example being DRAW (Deep Recurrent Attentive Writer). DRAW combines a sequential variational autoencoder (VAE) with a spatial attention mechanism that mimics the foveation of the human eye, building up a realistic image step by step.
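
The generator/discriminator setup described above can be made concrete by looking at the two losses. The sketch below is an assumption-laden illustration rather than a full training loop: it uses binary cross-entropy and the commonly used "non-saturating" generator loss, and the discriminator scores are hypothetical numbers. It also shows how a DCGAN-style generator can grow a 1x1 noise input into a 64x64 image with transposed convolutions; the kernel, stride and padding values are one typical choice, not something the methods above prescribe.

```python
import math


def bce(prediction, target, eps=1e-12):
    """Binary cross-entropy for one probability and a 0/1 target."""
    p = min(max(prediction, eps), 1 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))


def discriminator_loss(d_real, d_fake):
    """The discriminator should score real images as 1 and generated ones as 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """The generator is rewarded when its images are scored as real."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)


def transposed_conv_size(size, kernel, stride, padding):
    """Output height/width of a transposed convolution (no dilation or output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


# Hypothetical discriminator outputs: probability that each image is real.
d_real = [0.9, 0.8, 0.95]
d_fake = [0.1, 0.3, 0.2]
print(f"discriminator loss: {discriminator_loss(d_real, d_fake):.3f}")
print(f"generator loss:     {generator_loss(d_fake):.3f}")

# Upsampling path of a DCGAN-style generator, starting from a 1x1 noise input.
size = transposed_conv_size(1, kernel=4, stride=1, padding=0)
sizes = [size]
for _ in range(4):
    size = transposed_conv_size(size, kernel=4, stride=2, padding=1)
    sizes.append(size)
print(sizes)  # [4, 8, 16, 32, 64]
```

During training the two losses pull in opposite directions: lowering the discriminator loss makes fakes easier to spot, while lowering the generator loss makes them harder to spot.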

Some real-world examples of synthetic datasets for computer vision:

  • Facade dataset: The facade dataset was manually annotated and labeled by the Center for Machine Perception at the Czech Technical University in Prague. It consists of images of facades from different cities around the world with diverse architectural styles. The input image is the labeled and annotated input, the ground truth is the actual picture of the facade and the predicted image is the image generated by the GAN.

  • Map dataset: The maps dataset consists of snippets of Google Maps for satellite and normal view. The GAN takes the normal view as input and generates the output identical to the ground truth. The reverse can also be easily produced by the GAN.

  • Cityscapes dataset: The cityscapes dataset was created by the cityscapes team comprising members from different German organizations. It contains various images and recorded videos taken on the streets from 50 different cities.

  • Edge to handbags dataset: Images of different handbags are passed through the Holistically Nested Edge Detection algorithm created by Saining Xie at UC San Diego. The resulting edge maps are used as input, and the GAN produces handbag images that resemble the ground truth, with different colors.

  • Night to day dataset: The night to day dataset was created by a group of researchers at Brown University, Rhode Island. The dataset consists of images of various places taken by webcams and security cameras during both night and daytime. The GAN takes the daytime image as input and the nighttime image as ground truth, and generates a realistic nighttime view of the scene.
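
Each of these datasets pairs an input with a ground truth image, so a generated image can be compared against the ground truth pixel by pixel. The sketch below uses the mean absolute (L1) difference as one simple way to do that. The metric, the function name and the tiny 2x2 images are assumptions for illustration; none of the datasets above mandates a particular metric.

```python
def mean_absolute_error(predicted, ground_truth):
    """Average per-pixel absolute difference between two equally sized images."""
    if len(predicted) != len(ground_truth) or any(
        len(p_row) != len(g_row)
        for p_row, g_row in zip(predicted, ground_truth)
    ):
        raise ValueError("images must have the same shape")
    total = 0.0
    count = 0
    for p_row, g_row in zip(predicted, ground_truth):
        for p, g in zip(p_row, g_row):
            total += abs(p - g)
            count += 1
    return total / count


# Hypothetical 2x2 grayscale images with pixel values in [0, 255].
ground_truth = [[10, 200], [30, 40]]
predicted = [[12, 190], [30, 44]]
print(mean_absolute_error(predicted, ground_truth))  # (2 + 10 + 0 + 4) / 4 = 4.0
```

A lower value means the generated image is closer to the ground truth, although per-pixel error alone does not capture how realistic an image looks.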

Ideal System configuration for GANs:

    • GPU: Nvidia Titan / Nvidia P100 PCIe / Nvidia RTX 2060
    • CPU: 12-core Intel Core i7 processor
    • RAM: Minimum 16 GB

How can we do this using iEDPS?

The synthetic image generation feature of iEDPS's data augmentation helps generate synthetic image datasets for computer vision problems. Data scientists can upload their sample image data to iEDPS and generate new synthetic images aligned to the sample input images. This increases their dataset size and addresses the non-availability of image data, which in turn leads to better trained models and predictions.


About the Authors

    • Suvojit Hore is a Systems Engineer - Python developer from iEDPS Data Discovery team.
    • Gayathri Nadella is a Technology lead - AI Professional from iEDPS Data Discovery team.
