
AI

Microsoft wields ML to catch child predators, city drops 7-year facial-recognition experiment after no arrests…


Roundup Welcome to the first AI roundup of this year. AI continues to spread like wildfire and everyone wants a slice of the pie – even Hollywood. Read on for the latest flop in facial recognition, too.

Hollywood is cosying up to AI algos: Warner Bros, the massive American film studio and entertainment conglomerate, is employing algorithmic tools to help it decide if a film will become a blockbuster, or go bust at the cinema.

Studios like Warner Bros have a limited budget to splash on new projects every year. Directors bid fiercely to fund the films they believe will make everyone the most profit. But there are a vast number of factors to consider, and all that deliberation is time consuming and ultimately wasted if a film flops. So, why not employ a machine to help you decide?

Warner Bros have signed a deal with Cinelytic, an AI analytics startup based in Los Angeles, to do just that, according to The Hollywood Reporter. Cinelytic’s software will help predict a particular film’s profits, helping studios decide issues like what to release and where.

“The platform reduces executives’ time spent on low-value, repetitive tasks and instead focuses on generating actionable insights for packaging, green-lighting, marketing and distribution decisions in real time,” according to a statement from Cinelytic.

The ultimate decision whether to fund a film or not, however, is still up to humans. Hopefully that’ll stop more mistakes like Cats.

Over Christmas… Nvidia improved its StyleGAN software – capable of generating realistic photos of faces, buildings, and so on, from scratch – to version two, ironing out artifacts that give away the fact the images were imagined by a computer.

Microsoft is licensing software that catches child groomers on Xbox: Redmond has deployed a tool it has been developing with academics to prevent online child abuse.

Codenamed Project Artemis, the software analyzes text conversations and rates how inappropriate the interactions are and if the messages should be flagged for human moderators to review. Those humans then report suspected sexual exploitation to law enforcement.

Microsoft’s chief digital safety officer Courtney Gregoire did not reveal how Project Artemis works in a blog post, this week, so we spoke to the boffins behind it directly.

The tool was developed internally with the help of academics, who participated in a hackathon in 2018. Hany Farid, a professor working at the University of California Berkeley’s department of electrical engineering and computer science and the school of information, told The Register that no fancy deep learning was used, instead the system is based on some “fairly standard non-linear regression to learn a numeric risk score based on the text-based conversation between two people.”
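Farid's description suggests the general shape of such a system: hand-crafted conversation features fed through a learned scoring function that outputs a risk score. Here is a minimal illustrative sketch in Python; every feature name, phrase list, and weight below is invented for illustration and is emphatically not Microsoft's actual model:

```python
import math

# All features, phrases, and weights here are invented stand-ins;
# a real system would learn coefficients from labeled conversations.
def extract_features(messages):
    text = " ".join(messages).lower()
    return {
        "asks_age": float("how old" in text),
        "asks_photo": float("send a photo" in text or "send a pic" in text),
        "secrecy": float("our secret" in text or "don't tell" in text),
        "volume": min(len(messages) / 50.0, 1.0),  # longer chats weigh slightly more
    }

WEIGHTS = {"asks_age": 1.2, "asks_photo": 2.0, "secrecy": 2.5, "volume": 0.5}
BIAS = -3.0

def risk_score(messages):
    """Squash a weighted feature sum through a logistic curve into [0, 1]."""
    z = BIAS + sum(WEIGHTS[k] * v for k, v in extract_features(messages).items())
    return 1.0 / (1.0 + math.exp(-z))
```

Conversations scoring above some threshold would then be surfaced to the human moderators mentioned above.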

Companies interested in licensing the technology should contact Thorn, a tech company building software applications aimed at protecting children against sexual abuse.

Uh oh! Contractors have snooped in on thousands of Skype calls: Stop us if you’ve heard this one before, but contractors working on behalf of tech companies have been listening to sensitive audio clips gleaned from users in the hope of improving their services.

This time it’s Skype owner Microsoft. A former contractor working in Beijing revealed that he had listened to thousands of sensitive and disturbing recordings over Skype and Cortana. There was little security and workers in China could access the clips via a web app on Google Chrome, as reported by The Guardian.

The leaker was also encouraged to use the same password for all his Microsoft accounts, apparently. Contractors were not given any security training either, a risky move considering the data could be stolen by miscreants. He said he heard “all kinds of unusual conversations, including what could have been domestic violence.”

Microsoft has since said that it has updated its privacy statement to make it clear that humans are sometimes listening in on Skype calls or interactions with its voice-enabled assistant Cortana. And it said that recorded audio clips flagged for review are only ten seconds long, so that contractors don’t have access to longer conversations.

Here’s how the White House wants America’s companies to develop AI tech: The Trump Administration is working to expand its national AI strategy to broach the topic of regulation.

At the moment there are few rules and little oversight on how AI technology should be used by the private sector. So the US government wants to take a stab at changing that by, erm, “proposing a first-of-its-kind set of regulatory principles”.

These principles probably won’t do much; they’re not real policies unless backed up by law. But nevertheless, the Trump Administration wants to make some sort of attempt at guiding regulation.

“Must we decide between embracing this emerging technology and following our moral compass?” the chief technology officer of the US, Michael Kratsios, wrote in an op-ed published in Bloomberg this week.

“That’s a false choice. We can advance emerging technology in a way that reflects our values of freedom, human rights and respect for human dignity,” he continued. Kratsios proposes that federal agencies should make it easier for the public, academics, companies, and non-profits to make comments and leave feedback on any AI policies made.

Agencies like the National Institute of Standards and Technology (NIST) should assess a product’s risk and cost before regulating a particular technology. They should also take into account issues like transparency, safety, security, and fairness, in line with American values.

“Americans have long embraced technology as a tool to improve people’s lives. With artificial intelligence, we are ready to do it again,” Kratsios concluded.

San Diego has ended its seven-year experiment with facial recognition: Finally, here’s a long read on how San Diego’s law enforcement used facial recognition over seven years to hunt for criminals prowling the American city’s streets.

A network of 1,300 cameras, embedded in smartphones and tablets operated by staff, recorded over 65,000 faces from 2012 to 2019. These images were then run against a database of mugshots to look for any potential matches.

And over those seven years, no arrests ever resulted from the technology, according to Fast Company. Bizarrely, police didn’t track the results of the experiment, so there is no solid evaluation of the system’s performance.

As of 2020, San Diego has shut down the experiment. You can read more about that here. ®


Source: https://go.theregister.co.uk/feed/www.theregister.co.uk/2020/01/13/ai_roundup_100120/


Amazon Textract now available in Asia Pacific (Mumbai) and EU (Frankfurt) Regions 


You can now use Amazon Textract, a machine learning (ML) service that quickly and easily extracts text and data from forms and tables in scanned documents, for workloads in the AWS Asia Pacific (Mumbai) and EU (Frankfurt) Regions.

Amazon Textract goes beyond simple optical character recognition (OCR) to identify the contents of fields in forms, information stored in tables, and the context in which the information is presented. The Amazon Textract API supports multiple image formats like scans, PDFs, and photos, and you can use it with other AWS ML services like Amazon Comprehend, Amazon Comprehend Medical, Amazon Augmented AI, and Amazon Translate to derive deeper meaning from the extracted text and data. You can also use this text and data to build smart searches on large archives of documents, or load it into a database for use by applications, such as accounting, auditing, and compliance software.
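As a rough sketch of what consuming the service looks like, the snippet below pulls detected lines of text out of an AnalyzeDocument-style response. The sample response is a hand-made, heavily trimmed stand-in (real responses carry many more fields, such as Geometry, Ids, and Relationships); in practice the response would come from the Textract client in boto3, e.g. `analyze_document` with `FeatureTypes=["FORMS", "TABLES"]`:

```python
# Hand-made, heavily trimmed stand-in for an AnalyzeDocument response.
SAMPLE_RESPONSE = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Invoice #42", "Confidence": 99.1},
        {"BlockType": "LINE", "Text": "Total: $18.50", "Confidence": 97.8},
        {"BlockType": "WORD", "Text": "Invoice", "Confidence": 99.2},
    ]
}

def extract_lines(response, min_confidence=90.0):
    """Collect detected LINE text above a confidence floor."""
    return [
        b["Text"]
        for b in response.get("Blocks", [])
        if b.get("BlockType") == "LINE"
        and b.get("Confidence", 0.0) >= min_confidence
    ]
```

The extracted lines are what you would feed onward to services like Amazon Comprehend, or load into a database for search.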

An in-country infrastructure is critical for customers with data residency requirements and regulations, such as those operating in government, insurance, healthcare, and financial services. With this launch, customers in the AWS Asia Pacific (Mumbai) and EU (Frankfurt) Regions can benefit from Amazon Textract while complying with data residency requirements and integrating with other services and applications available in these Regions.

Perfios is a leading product technology company enabling businesses to aggregate, curate, and analyze structured and unstructured data to help in decision-making. “We have been testing Amazon Textract since its early days and are very excited to see it launch in India to help us address data sovereignty requirements for the region, which now unblocks us to use it at scale,” says Ramgopal Cillanki, Vice President, Head of Engineering at Perfios Software. “We believe that the service will help to transform the banking, financial services, and insurance (BFSI) industry from operations-heavy, human-in-the-loop processes to machine learning-powered API automation with minimal manual operations. Textract will not only help us reduce lenders’ decision-making turnaround time but also create business impact for our end-users in the long run.”

For more information about Amazon Textract and its Region availability, see Amazon Textract FAQs. To get started with Amazon Textract, see the Amazon Textract Developer Guide.


About the Author

Raj Copparapu is a Product Manager focused on putting machine learning in the hands of every developer.

Source: https://aws.amazon.com/blogs/machine-learning/amazon-textract-now-available-in-asia-pacific-mumbai-and-eu-frankfurt-regions/


Accessing data sources from Amazon SageMaker R kernels


Amazon SageMaker notebooks now support R out-of-the-box, without requiring you to manually install R kernels on the instances. Also, the notebooks come pre-installed with the reticulate library, which offers an R interface for the Amazon SageMaker Python SDK and enables you to invoke Python modules from within an R script. You can easily run machine learning (ML) models in R using the Amazon SageMaker R kernel to access the data from multiple data sources. The R kernel is available by default in all Regions that Amazon SageMaker is available in.

R is a programming language built for statistical analysis and is very popular in data science communities. In this post, we show you how to connect to the following data sources from the Amazon SageMaker R kernel using Java Database Connectivity (JDBC):

  • Hive and PrestoDB on an Amazon EMR cluster
  • Amazon Athena
  • Amazon Redshift
  • An Amazon Aurora MySQL-compatible cluster

For more information about using Amazon SageMaker features using R, see R User Guide to Amazon SageMaker.

Solution overview

To build this solution, we first need to create a VPC with public and private subnets. This will allow us to securely communicate with different resources and data sources inside an isolated network. Next, we create the data sources in the custom VPC and the notebook instance with all necessary configuration and access to connect various data sources using R.

To make sure that the data sources are not reachable from the Internet, we create them inside a private subnet of the VPC. For this post, we create the following inside the custom VPC:

  • An Amazon EMR cluster with Hive and Presto installed
  • An Amazon Redshift cluster
  • An Amazon Aurora MySQL-compatible cluster

We connect to the Amazon EMR cluster inside the private subnet using AWS Systems Manager Session Manager to create Hive tables.

To run the code using the R kernel in Amazon SageMaker, create an Amazon SageMaker notebook. Download the JDBC drivers for the data sources. Create a lifecycle configuration for the notebook containing the setup script for R packages, and attach the lifecycle configuration to the notebook on create and on start to make sure the setup is complete.

Finally, we can use the AWS Management Console to navigate to the notebook to run code using the R kernel and access the data from various sources. The entire solution is also available in the GitHub repository.

Solution architecture

The following architecture diagram shows how you can use Amazon SageMaker to run code using the R kernel by establishing connectivity to various sources. You can also use the Amazon Redshift query editor or Amazon Athena query editor to create data resources. You need to use the Session Manager in AWS Systems Manager to SSH to the Amazon EMR cluster to create Hive resources.

Launching the AWS CloudFormation template

To automate resource creation, you run an AWS CloudFormation template. The template gives you the option to create an Amazon EMR cluster, Amazon Redshift cluster, or Amazon Aurora MySQL-compatible cluster automatically, as opposed to executing each step manually. It will take a few minutes to create all the resources.

  1. Choose the following link to launch the CloudFormation stack, which creates the required AWS resources to implement this solution:
  2. On the Create stack page, choose Next.
  3. Enter a stack name.
  4. You can change the default values for the following stack details:

  • Choose Second Octet for Class B VPC Address (10.xxx.0.0/16) – default 0
  • SageMaker Jupyter Notebook Instance Type – default ml.t2.medium
  • Create EMR Cluster Automatically? – default Yes
  • Create Redshift Cluster Automatically? – default Yes
  • Create Aurora MySQL DB Cluster Automatically? – default Yes
  5. Choose Next.
  6. On the Configure stack options page, choose Next.
  7. Select I acknowledge that AWS CloudFormation might create IAM resources.
  8. Choose Create stack.

You can now see the stack being created, as in the following screenshot.

When stack creation is complete, the status shows as CREATE_COMPLETE.

  1. On the Outputs tab, record the keys and their corresponding values.

You use the following keys later in this post:

  • AuroraClusterDBName – Aurora cluster database name
  • AuroraClusterEndpointWithPort – Aurora cluster endpoint address with port number
  • AuroraClusterSecret – Aurora cluster credentials secret ARN
  • EMRClusterDNSAddress – EMR cluster DNS name
  • EMRMasterInstanceId – EMR cluster primary instance ID
  • PrivateSubnets – Private subnets
  • PublicSubnets – Public subnets
  • RedshiftClusterDBName – Amazon Redshift cluster database name
  • RedshiftClusterEndpointWithPort – Amazon Redshift cluster endpoint address with port number
  • RedshiftClusterSecret – Amazon Redshift cluster credentials secret ARN
  • SageMakerNotebookName – Amazon SageMaker notebook instance name
  • SageMakerRS3BucketName – Amazon SageMaker S3 data bucket
  • VPCandCIDR – VPC ID and CIDR block

Creating your notebook with necessary R packages and JAR files

JDBC is an application programming interface (API) for the programming language Java, which defines how you can access a database. RJDBC is a package in R that allows you to connect to various data sources using the JDBC interface. The notebook instance that the CloudFormation template created ensures that the necessary JAR files for Hive, Presto, Amazon Athena, Amazon Redshift and MySQL are present in order to establish a JDBC connection.
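The JDBC URLs those drivers expect follow well-known shapes. As a hedged illustration (the ports shown, 10000 for HiveServer2, 8889 for Presto on EMR, and 443 for Athena, are common defaults and may differ in your deployment; the Redshift and Aurora endpoints from the CloudFormation outputs already include their ports), here is a small Python helper that assembles them:

```python
# Typical JDBC URL templates for the drivers named above; verify the
# ports and paths against your own cluster configuration.
JDBC_TEMPLATES = {
    "hive": "jdbc:hive2://{host}:10000/default",
    "presto": "jdbc:presto://{host}:8889/hive/default",
    "athena": "jdbc:awsathena://athena.{region}.amazonaws.com:443",
    "redshift": "jdbc:redshift://{endpoint}/{db}",
    "mysql": "jdbc:mysql://{endpoint}/{db}",
}

def jdbc_url(source, **kwargs):
    """Fill in a JDBC connection URL for one of the data sources above."""
    return JDBC_TEMPLATES[source].format(**kwargs)
```

A URL like `jdbc_url("hive", host=emr_dns)` is the kind of string the R notebooks pass to RJDBC's `dbConnect` along with the matching JAR file.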

  1. In the Amazon SageMaker Console, under Notebook, choose Notebook instances.
  2. Search for the notebook that matches the SageMakerNotebookName key you recorded earlier.
  3. Select the notebook instance.
  4. Under Actions, choose Open Jupyter to locate the jdbc directory.

The CloudFormation template downloads the JAR files for Hive, Presto, Athena, Amazon Redshift, and Amazon Aurora MySQL-compatible inside the “jdbc” directory.

  1. Locate the lifecycle configuration attached.

A lifecycle configuration allows you to install packages or sample notebooks on your notebook instance, configure networking and security for it, or otherwise use a shell script for customization. A lifecycle configuration provides shell scripts that run when you create the notebook instance or when you start the notebook.

  1. Inside the Lifecycle configuration section, choose View script to see the lifecycle configuration script that sets up the R kernel in Amazon SageMaker to make JDBC connections to data sources using R.

It installs the RJDBC package and dependencies in the Anaconda environment of the Amazon SageMaker notebook.

Connecting to Hive and Presto

Amazon EMR is the industry-leading cloud big data platform for processing vast amounts of data using open source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto.

You can create a test table in Hive by logging in to the EMR master node from the AWS console using the Session Manager capability in Systems Manager. Systems Manager gives you visibility and control of your infrastructure on AWS. Systems Manager also provides a unified user interface so you can view operational data from multiple AWS services and allows you to automate operational tasks across your AWS resources. Session Manager is a fully managed Systems Manager capability that lets you manage your Amazon Elastic Compute Cloud (Amazon EC2) instances, on-premises instances, and virtual machines (VMs) through an interactive, one-click browser-based shell or through the AWS Command Line Interface (AWS CLI).

You use the following values from the AWS CloudFormation Outputs tab in this step:

  • EMRClusterDNSAddress – EMR cluster DNS name
  • EMRMasterInstanceId – EMR cluster primary instance ID
  • SageMakerNotebookName – Amazon SageMaker notebook instance name
  1. On the Systems Manager Console, under Instances & Nodes, choose Session Manager.
  2. Choose Start Session.
  3. Start an SSH session with the EMR primary node by locating the instance ID as specified by the value of the key EMRMasterInstanceId.

This starts the browser-based shell.

  1. Run the following SSH commands:
    # confirm the current user, then switch to the hadoop user
    whoami
    sudo su - hadoop

  2. Create a test table in Hive from the EMR master node as you have already logged in using SSH:
    # on the EMR master node, create a table called students in Hive
    hive -e "CREATE TABLE students (name VARCHAR(64), age INT, gpa DECIMAL(3, 2));"
    # insert data into the students table created above
    hive -e "INSERT INTO TABLE students VALUES ('fred flintstone', 35, 1.28), ('barney rubble', 32, 2.32);"
    # verify the rows
    hive -e "SELECT * from students;"
    exit
    exit

The following screenshot shows the view in the browser-based shell.

  3. Close the browser after exiting the shell.

To query the data from Amazon EMR using the Amazon SageMaker R kernel, you open the notebook the CloudFormation template created.

  1. On the Amazon SageMaker Console, under Notebook, choose Notebook instances.
  2. Find the notebook as specified by the value of the key SageMakerNotebookName.
  3. Choose Open Jupyter.
  4. To demonstrate connectivity from the Amazon SageMaker R kernel, choose Upload and upload the ipynb notebook.
    1. Alternatively, from the New drop-down menu, choose R to open a new notebook.
    2. Enter the code as mentioned in “hive_connect.ipynb”, replacing the emr_dns value with the value from key EMRClusterDNSAddress:
  5. Run all the cells in the notebook to connect to Hive on Amazon EMR using the Amazon SageMaker R console.

You follow similar steps to connect Presto:

  1. On the Amazon SageMaker Console, open the notebook you created.
  2. Choose Open Jupyter.
  3. Choose Upload to upload the ipynb notebook.
    1. Alternatively, from the New drop-down menu, choose R to open a new notebook.
    2. Enter the code as mentioned in “presto_connect.ipynb”, replacing the emr_dns value with the value from key EMRClusterDNSAddress:
  4. Run all the cells in the notebook to connect to PrestoDB on Amazon EMR using the Amazon SageMaker R console.

Connecting to Amazon Athena

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon Simple Storage Service (Amazon S3) using standard SQL. Amazon Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. To connect to Amazon Athena from the Amazon SageMaker R kernel using RJDBC, we use the Amazon Athena JDBC driver, which is already downloaded to the notebook instance via the lifecycle configuration script.

You also need to set the query result location in Amazon S3. For more information, see Working with Query Results, Output Files, and Query History.

  1. On the Amazon Athena Console, choose Get Started.
  2. Choose Set up a query result location in Amazon S3.
  3. For Query result location, enter the Amazon S3 location as specified by the value of the key SageMakerRS3BucketName.
  4. Optionally, add a prefix, such as results.
  5. Choose Save.
  6. Create a database or schema and table in Athena with the example Amazon S3 data.
  7. Similar to connecting to Hive and Presto, to establish a connection from Athena to Amazon SageMaker using the R kernel, you can upload the ipynb notebook.
    1. Alternatively, open a new notebook and enter the code in “athena_connect.ipynb”, replacing the s3_bucket value with the value from key SageMakerRS3BucketName:
  8. Run all the cells in the notebook to connect to Amazon Athena from the Amazon SageMaker R console.

Connecting to Amazon Redshift

Amazon Redshift is a fast, fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing business intelligence (BI) tools. It allows you to run complex analytic queries against terabytes to petabytes of structured data, using sophisticated query optimization, columnar storage on high-performance storage, and massively parallel query execution. To connect to Amazon Redshift from the Amazon SageMaker R kernel using RJDBC, we use the Amazon Redshift JDBC driver, which is already downloaded to the notebook instance via the lifecycle configuration script.

You need the following keys and their values from the AWS CloudFormation Outputs tab:

  • RedshiftClusterDBName – Amazon Redshift cluster database name
  • RedshiftClusterEndpointWithPort – Amazon Redshift cluster endpoint address with port number
  • RedshiftClusterSecret – Amazon Redshift cluster credentials secret ARN

The CloudFormation template creates a secret for the Amazon Redshift cluster in AWS Secrets Manager, which is a service that helps you protect secrets needed to access your applications, services, and IT resources. Secrets Manager lets you easily rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle.
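As a sketch of how a notebook might consume those credentials, the helper below parses the user name and password out of a SecretString payload. The `username`/`password` key names assume the standard database-secret layout; the commented-out boto3 call (which requires AWS credentials) shows where the payload would come from:

```python
import json

def parse_db_secret(secret_string):
    """Pull user name and password out of a SecretString JSON payload.

    Assumes the standard database-secret key names used by
    Secrets Manager for RDS/Redshift-style secrets.
    """
    secret = json.loads(secret_string)
    return secret["username"], secret["password"]

# In the notebook, the payload itself would come from Secrets Manager,
# roughly (requires AWS credentials and the secret ARN):
#   import boto3
#   payload = boto3.client("secretsmanager").get_secret_value(
#       SecretId=redshift_cluster_secret_arn)["SecretString"]
```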

  1. On the AWS Secrets Manager Console, choose Secrets.
  2. Choose the secret denoted by the RedshiftClusterSecret key value.
  3. In the Secret value section, choose Retrieve secret value to get the user name and password for the Amazon Redshift cluster.
  4. On the Amazon Redshift Console, choose Editor (the Amazon Redshift query editor).
  5. For Database name, enter redshiftdb.
  6. For Database password, enter your password.
  7. Choose Connect to database.
  8. Run the following SQL statements to create a table and insert a couple of records:
    CREATE TABLE public.students (name VARCHAR(64), age INT, gpa DECIMAL(3, 2));
    INSERT INTO public.students VALUES ('fred flintstone', 35, 1.28), ('barney rubble', 32, 2.32);
    

  9. On the Amazon SageMaker Console, open your notebook.
  10. Choose Open Jupyter.
  11. Upload the ipynb notebook.
    1. Alternatively, open a new notebook and enter the code as mentioned in “redshift_connect.ipynb”, replacing the values for RedshiftClusterEndpointWithPort, RedshiftClusterDBName, and RedshiftClusterSecret:
  12. Run all the cells in the notebook to connect to Amazon Redshift on the Amazon SageMaker R console.

Connecting to Amazon Aurora MySQL-compatible

Amazon Aurora is a MySQL-compatible relational database built for the cloud, which combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open-source databases. To connect to Amazon Aurora from the Amazon SageMaker R kernel using RJDBC, we use the MariaDB JDBC driver, which is already downloaded to the notebook instance via the lifecycle configuration script.

You need the following keys and their values from the AWS CloudFormation Outputs tab:

  • AuroraClusterDBName – Aurora cluster database name
  • AuroraClusterEndpointWithPort – Aurora cluster endpoint address with port number
  • AuroraClusterSecret – Aurora cluster credentials secret ARN

The CloudFormation template creates a secret for the Aurora cluster in Secrets Manager.

  1. On the AWS Secrets Manager Console, locate the secret as denoted by the AuroraClusterSecret key value.
  2. In the Secret value section, choose Retrieve secret value to get the user name and password for the Aurora cluster.

To connect to the cluster, you follow similar steps as with other services.

  1. On the Amazon SageMaker Console, open your notebook.
  2. Choose Open Jupyter.
  3. Upload the ipynb notebook.
    1. Alternatively, open a new notebook and enter the code as mentioned in “aurora_connect.ipynb”, replacing the values for AuroraClusterEndpointWithPort, AuroraClusterDBName, and AuroraClusterSecret:
  4. Run all the cells in the notebook to connect Amazon Aurora on the Amazon SageMaker R console.

Conclusion

In this post, we demonstrated how to connect to various data sources, such as Hive and PrestoDB on Amazon EMR, Amazon Athena, Amazon Redshift, and an Amazon Aurora MySQL-compatible cluster, in your environment to analyze, profile, and run statistical computations using R from Amazon SageMaker. You can extend this method to other data sources via JDBC.


Author Bio

Kunal Ghosh is a Solutions Architect at AWS. His passion is building efficient and effective solutions on the cloud, especially involving analytics, AI, data science, and machine learning. Besides family time, he likes reading, swimming, biking, and watching movies, and he is a foodie.

Gagan Brahmi is a Specialist Solutions Architect focused on Big Data & Analytics at Amazon Web Services. Gagan has over 15 years of experience in information technology. He helps customers architect and build highly scalable, performant, and secure cloud-based solutions on AWS.

Source: https://aws.amazon.com/blogs/machine-learning/accessing-data-sources-from-amazon-sagemaker-r-kernels/


Training a custom single class object detection model with Amazon Rekognition Custom Labels


Customers often need to identify single objects in images; for example, to identify their company’s logo, find a specific industrial or agricultural defect, or locate a specific event, like hurricanes, in satellite scans. In this post, we showcase how to train a custom model to detect a single object using Amazon Rekognition Custom Labels.

Amazon Rekognition is a fully managed service that provides computer vision (CV) capabilities for analyzing images and video at scale, using deep learning technology without requiring machine learning (ML) expertise. Amazon Rekognition Custom Labels lets you extend the detection and classification capabilities of the Amazon Rekognition pre-trained APIs by using data to train a custom CV model specific to your business needs. With the latest update to support single object training, Amazon Rekognition Custom Labels now lets you create a custom object detection model with single object classes.

Solution overview

To show you how the single class object detection feature works, we create a custom model to detect pizza in images. Because we only care about finding pizza in our images, we don’t want to create labels for other food types or create a “not pizza” label.

To create our custom model, we follow these steps:

  1. Create a project in Amazon Rekognition Custom Labels.
  2. Create a dataset with images containing one or more pizzas.
  3. Label the images by applying bounding boxes on all pizzas in the images using the user interface provided by Amazon Rekognition Custom Labels.
  4. Train the model and evaluate the performance.
  5. Test the new custom model using the automatically generated API endpoint.

Amazon Rekognition Custom Labels lets you manage the ML model training process on the Amazon Rekognition console, which simplifies the end-to-end process.

Creating your project

To create your pizza-detection project, complete the following steps:

  1. On the Amazon Rekognition console, choose Custom Labels.
  2. Choose Get Started.
  3. For Project name, enter PizzaDetection.
  4. Choose Create project.

You can also create a project on the Projects page. You can access the Projects page via the left navigation pane.

Creating your dataset

To create your pizza model, you first need to create a dataset to train the model with. For this post, our dataset is composed of 39 images that contain pizza. We sourced our images from pexels.com.

To create your dataset:

  1. Choose Create dataset.
  2. Select Upload images from your computer.

  1. Choose Add Images.
  2. Upload your images. You can always add more images later.

Labeling the images with bounding boxes

You’re now ready to label the images by applying bounding boxes on all images with pizza.

  1. Add Pizza as a label to your dataset via the labels list on the left side of the gallery.
  2. Apply the label to the pizzas in the images by selecting all the images with pizza and choosing Draw Bounding Box.

You can use the Shift key to automatically select multiple images between the first and last selected images.

Make sure to draw a bounding box that covers the pizza as tightly as possible.

Training your model

After you label your images, you’re ready to train your model.

  1. Choose Train Model.
  2. For Choose project, choose your PizzaDetection project.
  3. For Choose training dataset, choose your PizzaImages dataset.

As part of the training, Amazon Rekognition Custom Labels requires a labeled test dataset. You use the test dataset to verify how well the trained model predicts the correct labels and to generate evaluation metrics. You don’t use the images in the test dataset to train your model; they should represent the types of images you want your model to analyze.

  1. For Create test set, choose how you want to provide your test dataset.

Amazon Rekognition Custom Labels provides three options:

  • Choose an existing test dataset
  • Create a new test dataset
  • Split training dataset

For this post, we select Split training dataset and let Amazon Rekognition hold back 20% of the images for testing and use the remaining 80% of the images to train the model.
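The 80/20 holdback can be pictured as a deterministic shuffle-and-slice over the image list. This sketch only illustrates the idea; it is not how Amazon Rekognition actually partitions the dataset internally:

```python
import random

def split_dataset(image_ids, test_fraction=0.2, seed=42):
    """Deterministically shuffle the images, then hold back a test slice."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)          # fixed seed keeps the split reproducible
    n_test = round(len(ids) * test_fraction)  # e.g. 39 images -> 8 held back
    return ids[n_test:], ids[:n_test]         # (train, test)
```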

Our model took approximately 1 hour to train. The training time required for your model depends on many factors, including the number of images provided in the dataset and the complexity of the model.

When training is complete, Amazon Rekognition Custom Labels outputs key metrics with every training, including F1 score, precision, recall, and the assumed threshold for each label. For more information about metrics, see Metrics for Evaluating Your Model.

Looking at our evaluation results, our model has a precision of 1.0, which means that no objects were mistakenly identified as pizza (false positives) in our test set. Our model did miss some pizzas in our test set (false negatives), which is reflected in our recall score of 0.81. You can often use the F1 score as an overall quality score because it takes both precision and recall into account. Finally, we see that our assumed threshold to generate the F1 score, precision, and recall metrics for Pizza is 0.61. By default, our model returns predictions above this assumed threshold. We can increase the recall for this model if we lower the confidence threshold. However, this would most likely cause a drop in precision.
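The F1 score referenced above is the harmonic mean of precision and recall, which you can check against our model's numbers:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With precision 1.0 and recall 0.81, this gives an F1 of about 0.895, which is why F1 works as an overall quality score: it penalizes the missed pizzas even though precision is perfect.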

We can also choose View Test Results to see each test image and how our model performed. The following screenshot shows an example of a correctly identified image of pizza during the model testing (true positive).

Testing your model

Your custom pizza detection model is now ready for use. Amazon Rekognition Custom Labels provides the API calls for starting and using the model; you don’t need to deploy, provision, or manage any infrastructure. The following screenshot shows the API calls for using the model.
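As a rough sketch of what those calls look like with boto3 (the helper names `detect_pizza` and `labels_above` are illustrative, not part of the service; the live call requires AWS credentials and a project version that has been started with `start_project_version`):

```python
def detect_pizza(image_bytes, project_version_arn, min_confidence=61):
    """Call the running Custom Labels model. The model must first be
    started with start_project_version (and stopped with
    stop_project_version when done, since a running model incurs cost).
    boto3 is imported lazily so the sketch reads standalone."""
    import boto3
    client = boto3.client("rekognition")
    return client.detect_custom_labels(
        ProjectVersionArn=project_version_arn,
        Image={"Bytes": image_bytes},
        MinConfidence=min_confidence,
    )

def labels_above(response, threshold):
    """Pure helper: keep only labels at or above a confidence threshold."""
    return [label for label in response["CustomLabels"]
            if label["Confidence"] >= threshold]
```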

Using the API, we tried our model on a new set of images from pexels.com.

For example, the following image shows a pizza on a table with other objects.

The model detects the pizza with a confidence of 91.72% and a correct bounding box. The following code is the JSON response returned by the API call:

{
    "CustomLabels": [
        {
            "Name": "Pizza",
            "Confidence": 91.7249984741211,
            "Geometry": {
                "BoundingBox": {
                    "Width": 0.7824199795722961,
                    "Height": 0.3644999861717224,
                    "Left": 0.11868999898433685,
                    "Top": 0.37672001123428345
                }
            }
        }
    ]
}
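Note that the bounding box values are ratios of the image dimensions, not pixels. A small sketch of converting them to pixel coordinates (the 1000x800 image size is a hypothetical example, not taken from the post):

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert Rekognition's normalized BoundingBox (ratios of the
    image dimensions) into pixel coordinates."""
    return {
        "Left": round(bounding_box["Left"] * image_width),
        "Top": round(bounding_box["Top"] * image_height),
        "Width": round(bounding_box["Width"] * image_width),
        "Height": round(bounding_box["Height"] * image_height),
    }

# Bounding box from the JSON response above, on a hypothetical 1000x800 photo
box = to_pixel_box(
    {"Width": 0.7824199795722961, "Height": 0.3644999861717224,
     "Left": 0.11868999898433685, "Top": 0.37672001123428345},
    image_width=1000, image_height=800,
)
```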

The following image has a confidence score of 98.40%.

The following image has a confidence score of 96.51%.

The following image has an empty JSON result, as expected, because the image doesn’t contain pizza.

The following image also has an empty JSON result.

In addition to using the API, you can also use the Custom Labels Demonstration. This AWS CloudFormation template enables you to set up a custom, password-protected UI where you can start and stop your models and run demonstration inferences.

Conclusion

In this post, we showed you how to create a single class object detection model with Amazon Rekognition Custom Labels. This feature makes it easy to train a custom model that can detect an object class without needing to specify other objects or losing accuracy in its results.

For more information about using custom labels, see What Is Amazon Rekognition Custom Labels?


About the Authors

Woody Borraccino is a Senior AI Solutions Architect at AWS.

Anushri Mainthia is the Senior Product Manager on the Amazon Rekognition team and product lead for Amazon Rekognition Custom Labels. Outside of work, Anushri loves to cook, explore Seattle and video-chat with her nephew.

Source: https://aws.amazon.com/blogs/machine-learning/training-a-custom-single-class-object-detection-model-with-amazon-rekognition-custom-labels/
