Machine Learning with AWS Rekognition

What is AWS Rekognition?

Amazon Rekognition is a service that makes it easy to add powerful image- and video-based visual analysis to your applications.

Rekognition Image lets you easily build powerful applications to search, verify, and organize millions of images.

Rekognition Video lets you extract motion-based context from stored or live-stream videos and helps you analyze them.

You just provide an image or video to the Rekognition API, and the service can identify objects, people, text, scenes, and activities. It can also detect inappropriate content.
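As a minimal sketch of this flow, the snippet below shows a DetectLabels call via boto3 (commented out; the bucket and file names are hypothetical) and a small helper that filters a response shaped like the API's output by confidence:

```python
# import boto3
# rekognition = boto3.client("rekognition")
# response = rekognition.detect_labels(
#     Image={"S3Object": {"Bucket": "my-bucket", "Name": "photo.jpg"}},
#     MaxLabels=10,
#     MinConfidence=70,
# )

def labels_above(response, min_confidence):
    """Return (name, confidence) pairs at or above a confidence threshold."""
    return [
        (label["Name"], label["Confidence"])
        for label in response["Labels"]
        if label["Confidence"] >= min_confidence
    ]

# Hypothetical response for illustration only.
sample_response = {
    "Labels": [
        {"Name": "Person", "Confidence": 99.1},
        {"Name": "Bicycle", "Confidence": 87.5},
        {"Name": "Tree", "Confidence": 64.2},
    ]
}

print(labels_above(sample_response, 80))
```

Filtering on confidence client-side, as shown here, lets you keep a low `MinConfidence` in the request while applying stricter thresholds per use case.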

Amazon Rekognition also provides highly accurate facial analysis and facial recognition. You can detect, analyze, and compare faces for a wide variety of use cases, including user verification, cataloging, people counting, and public safety.

Amazon Rekognition is a HIPAA-eligible service.

You need to ensure that the Amazon S3 bucket you want to use is in the same region as your Amazon Rekognition API endpoint.

How does it work?

Amazon Rekognition provides two API sets: Amazon Rekognition Image, for analyzing images, and Amazon Rekognition Video, for analyzing videos.

Both API sets perform detection and recognition analysis of images and videos.
Amazon Rekognition Video can be used to track the path of people in a stored video.
Amazon Rekognition Video can also search a streaming video for persons whose facial descriptions match those already stored by Amazon Rekognition.

The RecognizeCelebrities API returns information for up to 100 celebrities detected in an image.
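Reading a RecognizeCelebrities response could look like the sketch below; the sample payload is illustrative only (the real response also carries bounding boxes, URLs, and unrecognized faces), and the 90% threshold is an arbitrary choice:

```python
def celebrity_names(response, min_confidence=90.0):
    """Names of detected celebrities whose match confidence clears a threshold."""
    return [
        c["Name"]
        for c in response["CelebrityFaces"]
        if c["MatchConfidence"] >= min_confidence
    ]

# Illustrative response shape, not real API output.
sample_response = {
    "CelebrityFaces": [
        {"Name": "Jane Star", "MatchConfidence": 98.0},
        {"Name": "John Doe", "MatchConfidence": 55.0},
    ]
}

print(celebrity_names(sample_response))
```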

Use cases for AWS Rekognition

Searchable image and video libraries

Amazon Rekognition makes images and stored videos searchable.

Face-based user verification

It can be used in building-access and similar applications, comparing a live image to a reference image.
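A verification check on a CompareFaces-style response might look like the sketch below; the sample payload and the 90% similarity threshold are illustrative assumptions, not values from the service:

```python
def is_verified(response, min_similarity=90.0):
    """True if any face match meets the similarity threshold."""
    return any(
        match["Similarity"] >= min_similarity
        for match in response.get("FaceMatches", [])
    )

# Illustrative response: one strong match between the live image
# and the reference image.
sample_response = {"FaceMatches": [{"Similarity": 97.3}]}

print(is_verified(sample_response))
```

For access control, the threshold is a policy decision: a higher value reduces false accepts at the cost of more false rejects.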

Sentiment and demographic analysis

Amazon Recognition detects emotions such as happiness, sadness, or surprise, and demographic information such as gender from facial images.

Rekognition can analyze images and send the emotion and demographic attributes to Amazon Redshift for periodic reporting on trends, such as across in-store locations and similar scenarios.

Facial recognition

Images, stored videos, and streaming videos can be searched for faces that match those in a face collection. A face collection is an index of faces that you own and manage.
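As a sketch of the collection workflow (the collection, bucket, and image names are hypothetical), you index known faces and then search the collection with a probe image; the helper below picks the strongest match from a SearchFacesByImage-shaped response:

```python
# import boto3
# rekognition = boto3.client("rekognition")
# rekognition.create_collection(CollectionId="employees")
# rekognition.index_faces(
#     CollectionId="employees",
#     Image={"S3Object": {"Bucket": "my-bucket", "Name": "badge-photo.jpg"}},
#     ExternalImageId="employee-42",
# )
# response = rekognition.search_faces_by_image(
#     CollectionId="employees",
#     Image={"S3Object": {"Bucket": "my-bucket", "Name": "door-camera.jpg"}},
# )

def best_match(response):
    """ExternalImageId of the highest-similarity match, or None."""
    matches = response.get("FaceMatches", [])
    if not matches:
        return None
    top = max(matches, key=lambda m: m["Similarity"])
    return top["Face"]["ExternalImageId"]

# Illustrative response shape.
sample_response = {
    "FaceMatches": [
        {"Similarity": 96.8, "Face": {"ExternalImageId": "employee-42"}},
        {"Similarity": 81.2, "Face": {"ExternalImageId": "employee-7"}},
    ]
}

print(best_match(sample_response))
```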

Unsafe Content Detection

Amazon Rekognition can detect explicit and suggestive adult content in images and videos.

This is useful, for example, for social and dating sites, photo-sharing platforms, blogs and forums, apps for children, e-commerce sites, entertainment, and online advertising services.

Celebrity recognition

Amazon Rekognition can recognize thousands of celebrities (politicians, sports, business, entertainment, and media figures) in supplied images and videos.

Text detection

Detecting text in an image allows textual content to be extracted from images.
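Text detections come back at both WORD and LINE granularity; the sketch below (with an illustrative payload, not real API output) keeps only the LINE-level text:

```python
def detected_lines(response):
    """Text of each LINE-level detection, in order."""
    return [
        d["DetectedText"]
        for d in response["TextDetections"]
        if d["Type"] == "LINE"
    ]

# Illustrative response; real detections also carry confidence and geometry.
sample_response = {
    "TextDetections": [
        {"DetectedText": "STOP", "Type": "LINE"},
        {"DetectedText": "STOP", "Type": "WORD"},
        {"DetectedText": "Main St", "Type": "LINE"},
    ]
}

print(detected_lines(sample_response))
```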


Integrate powerful image and video recognition into your apps

Amazon Rekognition removes the complexity of building image recognition capabilities into applications by making powerful and accurate analysis available with a simple API.

Deep learning-based image and video analysis

Rekognition uses deep learning technology to accurately analyze images, find and compare faces in images, and detect objects and scenes within images and videos.

Scalable image analysis

Amazon Rekognition enables the analysis of millions of images.
This allows for curating and organizing massive amounts of visual data.

Low cost

Clients pay for the images and videos they analyze and the face metadata stored. There are no minimum fees or upfront commitments.

Machine Learning development with AWS SageMaker

Make your Machine Learning team's work easier: focus more on the business and deploy quickly with the AWS managed service SageMaker.

Today, Machine Learning (ML) is resolving complex problems that create business value for customers, and many companies apply ML to solve hard business problems. ML brings many benefits, but also many challenges in building models with high accuracy. I currently work on the AI team, helping the company deliver AI/ML projects quickly and helping the Data Science (DS) team develop the data pipeline and machine learning pipeline, which helps projects grow and deliver quickly with high quality.

Overview of Machine Learning development

Figure 1. Machine learning process

Here is the basic machine learning process, reflecting the practice of big companies. It includes multiple phases (business analysis, data processing, model training, and deployment), multiple steps in each phase, and a fleet of tools used to carry out the dedicated steps.

Business problems: the problems that challenge the business, for which ML can be a better solution.
ML Problem Framing: a phase that helps the DS and Engineering teams define the ML problems and propose the ML solutions, the data pipeline, and the plan.
Data processing (collection, integration, preparation and cleaning, visualization, and analysis): this phase includes multiple steps that prepare data for visualization and ML training.
Model Training (feature engineering, model training and parameter tuning, model evaluation): DS and developers work in this phase to engineer features, prepare data for a specific model, and train the model using frameworks such as TensorFlow and PyTorch.

When we don’t use a platform such as AWS SageMaker or Azure ML Studio, we spend more time developing a complex stack of skills. We need skills in compute, networking, storage, ML frameworks, programming languages, feature engineering, and more.

Figure 2. Machine Learning stack

Developing an ML model with a lack of skills and complex components takes more time for tasks around programming and compute, and creates challenges for the engineering team that uses and deploys the model. In Figure 2, we have multiple levels of cloud computing (Infrastructure as a Service, Platform as a Service, Application as a Service) that provide resource types according to business needs, at different levels of control. We can choose a specific layer, or combine layers, to meet the business objective. For Machine Learning projects and research environments, I would highly recommend that any DS or developer use PaaS and AaaS first to meet the business requirements, deliver quickly, and reduce cost and effort. That is the main reason I want to describe the AWS SageMaker service, which can serve as a standard platform for an ML team to quickly develop and deploy ML models, focus on solving ML business problems, and improve model quality.

AWS SageMaker offers the base of an end-to-end machine learning development environment

Figure 3. AWS SageMaker benefits

When we develop an ML model, we need to take care of multiple parts, and sometimes we want to try a new model and get accuracy feedback as quickly as possible. That depends on questions such as “Is there enough data for processing and training?” and “How much time will training a new model take?”. AWS SageMaker is used by thousands of AI companies and was developed by experts following ML best practices, which helps improve the ML process and working environment. In my opinion, when I want to focus on building a model and solving a challenging problem, I want to make everything else easy at a basic level and spend most of my time on the main problem. SageMaker provides a notebook solution, and the notebook is a great space for coding and analyzing data for training. Working together with the SageMaker SDK, I can easily connect to and use other AWS resources such as S3 buckets and training jobs. Everything helps me quickly develop and deliver a new model. Below I highlight the main benefits we get from this service, as well as its disadvantages.

* 💰 Cost-effective:

– SageMaker trains ML models with training jobs that are distributed, elastic, and high-performance, using spot instances to save up to 90% of the cost; you pay only for training time, in seconds. (document)
– Elastic Inference: this feature helps save cost for computations that need a GPU to process a deep learning model, such as at prediction time. (document)
* 🎯 Reduce the skills gap and focus on solving business problems. We can easily set up a training environment from a notebook with a few clicks, with elastic CPUs/GPUs.
* 🌐 Connectivity and easy deployment
– AWS SageMaker is an AWS managed service, so it is easy to integrate with other AWS services inside a private network. This also benefits big data solutions: ETL-processed data can be handled inside a private network, reducing transfer costs.
– The Endpoint function helps DS/developers easily deploy a trained model with a few clicks or from the SDK. (document)
* 🏢 Easy to manage: when multiple teams work on AWS, more resources appear every day, and the IT team is challenged to manage resources and roles, which impacts cost and security. An AWS managed service helps reduce the resources we need to create.

* AWS SageMaker is a managed service that implements best practices and focuses on popular frameworks; sometimes it does not match your requirements, and you should consider this before choosing it.
* 🎿 Learning new skills and basic knowledge of AWS Cloud: when working on the AWS cloud, basic knowledge of cloud infrastructure is necessary, as well as knowledge of each AWS managed service you want to use.
* 👮 It is also more expensive than a plain EC2 instance because it provides dedicated ML support, so we need to choose the right resources for development to save cost.
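The spot-training setup above can be sketched as a CreateTrainingJob request for the SageMaker API; every name, ARN, image URI, and instance size below is a hypothetical placeholder, and the request would be passed to the boto3 SageMaker client's `create_training_job`:

```python
def spot_training_request(job_name, image_uri, role_arn, s3_output):
    """Build a CreateTrainingJob request that enables managed spot training."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "OutputDataConfig": {"S3OutputPath": s3_output},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,
        },
        # Managed spot training: the wait-time cap must be at least
        # the runtime cap, since it includes time spent waiting for capacity.
        "EnableManagedSpotTraining": True,
        "StoppingCondition": {
            "MaxRuntimeInSeconds": 3600,
            "MaxWaitTimeInSeconds": 7200,
        },
    }

# Placeholder values for illustration only.
request = spot_training_request(
    "demo-job",
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/demo:latest",
    "arn:aws:iam::123456789012:role/DemoRole",
    "s3://my-bucket/output/",
)
# boto3.client("sagemaker").create_training_job(**request)
print(request["EnableManagedSpotTraining"])
```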

AWS SageMaker is a well-suited service for the production environment. It helps to build a quality model and a standard environment, which reduces risk in product development. We accept the trade-offs to get most of the benefits and quickly achieve the team's goals. Thank you so much for reading; please let me know if you have any concerns.




Practice Design for Try/Fail Fast

At the moment, AI/ML/DL are hot keywords in software development. The world has many successful projects based on AI technologies, such as Google Translate and AWS Alexa; AI makes machines smarter. Still, the path from idea to success has many challenges if you want to build great solutions. I have spent some time working with AI projects and start-ups to build solutions based on algorithms and ML, and I aim to propose and implement solutions that help the development team work smoothly. Today, I would like to describe the development process, architecture, CI/CD, and programming practices for quickly implementing multiple AI approaches with the Agile software development methodology.

Continuous Integration and Continuous Deployment

– Batch Processing, Parallel Processing
– Data-Driven Development and Test-Driven Development (to be continued)


An AI project includes multiple services across domains, AI/ML/DL and engineering, which are developed independently and integrated and verified automatically. Typically, the ML services differ greatly from the engineering services, solving challenging problems linked to technologies such as machine learning, deep learning, big data, and distributed computing. A microservices architecture, in this case, is a first choice: it helps separate business problems into specific services that can be resolved with the specific domain knowledge of the Data Science and Engineering teams. Microservices also pair well with Agile development (more information here). In an AI project, the focus is on “How do we solve the business problem with AI technology?”.

Microservices may not be the best choice, but they help with quick development and delivery using the Agile methodology.

Continuous Integration and Continuous Deployment

When a project includes multiple teams and multiple services, there are challenges at integration and deployment. CI/CD is very popular in software development, but I get more specific requests from the Data Science (DS) team. The big question from DS is: “We have several solutions for this problem; could you propose a way to quickly evaluate and integrate them?”

For the Engineering team, the CI/CD pipeline is quite general. With an AI solution, you will meet some challenges linked to:
– How to run on distributed computing? We choose batch jobs.
– How to save money on long-running jobs? We choose AWS spot instances.
– How do parallel jobs improve performance? We run parallel jobs and parallelize in the structural design (Python code).
– How to control data versions and model versions? We choose Data Version Control (DVC) and AWS S3 to version training/evaluation data and models.
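The parallel-jobs point above can be sketched with Python's standard multiprocessing pool; `evaluate_batch` is a hypothetical stand-in for a real evaluation step, shown here summing a batch only so the fan-out is easy to see:

```python
from multiprocessing import Pool

def evaluate_batch(batch):
    # Stand-in for a real evaluation step (e.g., scoring a model on the batch).
    return sum(batch)

def parallel_evaluate(batches, workers=2):
    """Fan evaluation batches out across worker processes."""
    with Pool(workers) as pool:
        return pool.map(evaluate_batch, batches)

if __name__ == "__main__":
    print(parallel_evaluate([[1, 2], [3, 4], [5]]))
```

In a CI/CD job the same shape applies: each batch becomes an independent task, so adding workers (or machines) scales the evaluation step without changing the code structure.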

All the solutions applied to my project aim to resolve the challenges of AI technology, and it is interesting work. A good abstraction of the structure helps to quickly integrate and deliver multiple approaches.

This pipeline can be implemented with any CI/CD framework, such as GitLab CI, Jenkins, or AWS CodeBuild. Each framework should provide a way to customize distributed and parallel jobs, because the jobs in the pipeline need specific resources, and those resources should auto-scale. For example, training jobs need more GPUs and system evaluation needs more CPUs for parallelism; scalable resources are most important to save cost.

A CI/CD pipeline that includes training and system evaluation enables fast trials and fast results; the implementation is easy to integrate quickly, trustworthy, and able to control quality.