AI

Three AI companies join a business development group built by the London Stock Exchange

The cohort will get help refining their investment pitches as well as access to the group’s network of institutional investors.

Three companies building artificial intelligence products and services have a fast track to funding as the newest members of the ELITE Group. The private business development program, created by the London Stock Exchange and Global Accelerated Ventures, provides entrepreneurs with business coaching and access to institutional investors.

ELITE announced Friday three new companies that will be in the organization’s latest cohort:

  • ModuleQ: An artificial intelligence (AI) platform that analyzes data from calendars, email, and Microsoft Teams to make business recommendations for individuals.
  • Covex 2020: An AI company that combines and analyzes multiple data sets to support decision making.
  • vElement: A service provider that specializes in robotic process automation, data science, and artificial intelligence. 

Thomas Tyler, global head of ELITE Americas and global head of business development, said that companies in the new cohort first will refine their core strategy and their pitch to investors. The next step is connecting the companies with ELITE’s network of institutional investors.  

“The goal is to get bigger, get bought, or get listed,” he said.

SEE: Robotic process automation: A cheat sheet (free PDF) (TechRepublic)

ELITE has worked with companies in 45 countries and across all sectors; what they share is an ambition to grow and a willingness to learn. The focus for this cohort was fintech and healthtech. Companies applied for a spot and had to meet these criteria:

  • Turnover greater than $5 million 
  • Operating earnings greater than 5% of turnover
  • Positive net profit
  • Demonstrated historic growth and future potential
  • Convincing projections
  • Credible management
  • Motivation to deal with the cultural, organizational, and managerial change required to access long-term financing opportunities

Tyler said the ELITE program fills a gap for businesses that are beyond the accelerator phase and just starting to scale and grow.  
  
ELITE works with more than 200 partners including lawyers, brokers, and sales and marketing experts to support cohort companies.

Tyler said that even in this climate of uncertainty there are opportunities to engage with investors, although entrepreneurs might have to accept slightly lower valuations.

“Clearly COVID is going to change some portfolio strategies but our view is that the fundamentals of that process don’t change,” he said.

Companies that are invited to join the organization can work with the coaches and use the funding network for as long as they need to. Tyler said members have been contacting ELITE for advice more frequently since the pandemic started.
 
“COVID has highlighted where they were weaker than they thought and emphasized the need to have the right people on board and the right processes in place,” he said. 
 
Polishing the investment pitch

Tyler offered this advice for entrepreneurs looking for support from investors:

  • Know how to tell your story
  • Know your numbers
  • Explain your governance processes
  • Be able to explain clearly what your ambition is

He said one common challenge for entrepreneurs is recognizing the moment when it’s time to stop working on day-to-day operations and start working on the strategic plan. Having a comprehensive plan helps companies stand out to investors who hear hundreds of pitches a year.

“A company that is unprepared may not necessarily have a tested investment story or a growth story that will withstand the scrutiny of an investment market,” he said.

Tyler said that ELITE helps companies refine their corporate governance and risk management plans because investors are looking for resilient business models.

“Investors want to understand that there are good processes in place to protect the investment,” he said. “Companies need a solid five-year strategic growth plan which can sometimes supersede whatever the business opportunity might be.”

“The process of building a business is one of constant learning and engagement and you have to have that mindset as you move through those processes,” he said.

Source: https://www.techrepublic.com/article/four-ai-companies-join-a-business-development-group-built-by-the-london-stock-exchange/#ftag=RSS56d97e7

AI

Increasing the relevance of your Amazon Personalize recommendations by leveraging contextual information

Getting relevant recommendations in front of your users at the right time is a crucial step for the success of your personalization strategy. However, your customer’s decision-making process shifts depending on the context at the time when they’re interacting with your recommendations. In this post, I show you how to set up and query a context-aware Amazon Personalize deployment.

Amazon Personalize allows you to easily add sophisticated personalization capabilities to your applications by using the same machine learning (ML) technology used on Amazon.com for over 20 years. No ML experience is required. Amazon Personalize supports the automatic adjustment of recommendations based on contextual information about your user, such as device type, location, time of day, or other information you provide.

The Harvard study How Context Affects Choice defines context as factors that can influence the choice outcome by altering the process by which a decision is made. As a business owner, you can identify this context by analyzing how your customers shop differently when accessing your catalog from a phone vs. a computer, or seeing the shift in your customer’s content consumption on rainy vs. sunny days.

Leveraging your user’s context allows you to provide a more personalized experience for existing users and helps decrease the cold-start phase for new or unidentified users. The cold-start phase refers to the period when your recommendation engine provides non-personalized recommendations due to the lack of historical information regarding that user.

Adding context to Amazon Personalize

You can set up and use context in Amazon Personalize in four simple steps:

  1. Include your user’s context in the historical user-item interactions dataset.
  2. Train a context-aware solution with a User Personalization or Personalized Ranking recipe. A recipe refers to the algorithm your recommender is trained on, using the behavioral data specified in your interactions dataset plus any user or item metadata.
  3. Specify the user’s context when querying for real-time recommendations using the GetRecommendations or GetPersonalizedRanking API calls.
  4. Include your user’s context when recording events using the event tracker.

The following diagram illustrates the architecture of these steps.
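To make these steps concrete, here is a minimal Python (Boto3) sketch of step 3, querying a deployed campaign for real-time recommendations with context. The campaign ARN and user ID are placeholder assumptions for illustration; the context key matches the CABIN_TYPE field used later in this post.

import boto3

# Runtime client for querying a deployed Amazon Personalize campaign
personalize_runtime = boto3.client("personalize-runtime")

# Hypothetical campaign ARN and user ID; replace with your own resources
response = personalize_runtime.get_recommendations(
    campaignArn="arn:aws:personalize:us-east-1:123456789012:campaign/airlines-blog-example",
    userId="JDowns",
    numResults=10,
    # Context keys must match the categorical fields declared in the interactions schema
    context={"CABIN_TYPE": "First Class"},
)

for item in response["itemList"]:
    print(item["itemId"], item.get("score"))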

You want to be explicit about the context to consider when constructing datasets. A common example of context customers actively use is device type, such as a phone, tablet, or desktop. The study The Effect of Device Type on Buying Behavior in Ecommerce: An Exploratory Study from the University of Twente in the Netherlands has proven that device type has an influence on buying behavior and people might postpone a buying decision if they’re online with the wrong device type. Embedding device type context in your datasets allows Amazon Personalize to learn this pattern and, at inference time, recommend the most appropriate content with awareness of the user’s context.

Recommendations use case

For this use case, a travel enthusiast is our potential customer. They look at a few things when deciding which airline to travel with to their given destination. For example, is it a short or a long flight? Will the trip be booked with cash or with miles? Are they traveling alone? Where will they be departing from and returning to? After they answer these initial questions, the next big decision is picking the cabin type to fly in. If our travel enthusiast is flying in a high-end cabin type, we can assume they’re looking at which airline provides the best experience possible. Now that we have a good idea of what our user is looking for, it’s shopping time!

Consider some of the variables that go into the decision-making process of this use case. We can’t control many of these factors, but we can use some to tailor our recommendations. First, identify common denominators that might affect a user’s behavior. In this case, flight duration and cabin type are good candidates to use as context, and traveler type and traveler residence are good candidates for user metadata when building our recommendation datasets. Metadata is information you know about your users and items that stays somewhat constant over a period of time, whereas context is environmental information that can shift rapidly across time, influencing your customer’s perception and behavior.

Selecting the most relevant metadata fields in your training datasets and enriching your interactions datasets with context is important for generating relevant user recommendations. In this post, we build an Amazon Personalize deployment that returns a list of airline recommendations for a customer. We add cabin type as the context and traveler residence as the metadata field and observe how recommendations shift based on context and metadata.

Prerequisites

We first need to set up the following Amazon Personalize resources. For full instructions, see Getting Started (Console). Complete the following steps:

  1. Create a dataset group. In this post, we name it airlines-blog-example.
  2. Create an Interactions dataset using the following schema and import data using the interactions_dataset.csv file:
    { "type": "record", "name": "Interactions", "namespace": "com.amazonaws.personalize.schema", "fields": [
    { "name": "ITEM_ID", "type": "string" }, { "name": "USER_ID", "type": "string" }, { "name": "TIMESTAMP", "type": "long" }, { "name":"CABIN_TYPE", "type": "string", "categorical": true }, { "name": "EVENT_TYPE", "type": "string" }, { "name": "EVENT_VALUE", "type": "float" } ], "version": "1.0"
    }

  3. Create a Users dataset using the following schema and import data using the users_dataset.csv file:
    { "type": "record", "name": "Users", "namespace": "com.amazonaws.personalize.schema", "fields": [ { "name": "USER_ID", "type": "string" }, { "name": "USER_RESIDENCE", "type": "string", "categorical": true } ], "version": "1.0"
    }

  4. Create a solution. In this post, we use the default solution configurations, except for the following:
    1. Recipe – aws-hrnn-metadata
    2. Event type – RATING
    3. Perform HPO – True

Hyperparameter optimization (HPO) is recommended if you want Amazon Personalize to run parallel trainings and experiments to identify the most performant hyperparameters. For more information, see Hyperparameters and HPO.

  5. Create a campaign.

You can set up the preceding resources on the Amazon Personalize console or by following the Jupyter notebook personalize_hrnn_metadata_contextual_example.ipynb example on the GitHub repo.
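If you prefer scripting the setup over the console or the notebook, the following Boto3 sketch outlines the same resource creation flow under stated assumptions: the resource names, the local schema file, the S3 location of the interactions CSV, and the IAM role ARN are all illustrative placeholders, and waiting for each resource to become ACTIVE between calls is omitted for brevity.

import json

import boto3

personalize = boto3.client("personalize")

# 1. Dataset group
dsg = personalize.create_dataset_group(name="airlines-blog-example")

# 2. Interactions dataset (schema JSON as shown above, assumed saved locally)
schema = personalize.create_schema(
    name="airlines-interactions-schema",
    schema=json.dumps(json.load(open("interactions_schema.json"))),
)
interactions = personalize.create_dataset(
    name="airlines-interactions",
    datasetGroupArn=dsg["datasetGroupArn"],
    datasetType="Interactions",
    schemaArn=schema["schemaArn"],
)
personalize.create_dataset_import_job(
    jobName="interactions-import",
    datasetArn=interactions["datasetArn"],
    dataSource={"dataLocation": "s3://your-bucket/interactions_dataset.csv"},  # assumed bucket
    roleArn="arn:aws:iam::123456789012:role/PersonalizeS3Role",  # assumed role
)
# 3. The Users dataset is created the same way with datasetType="Users"

# 4. Solution trained on the aws-hrnn-metadata recipe with HPO enabled
solution = personalize.create_solution(
    name="airlines-hrnn-metadata",
    datasetGroupArn=dsg["datasetGroupArn"],
    recipeArn="arn:aws:personalize:::recipe/aws-hrnn-metadata",
    eventType="RATING",
    performHPO=True,
)
version = personalize.create_solution_version(solutionArn=solution["solutionArn"])

# 5. Campaign serving real-time recommendations
personalize.create_campaign(
    name="airlines-campaign",
    solutionVersionArn=version["solutionVersionArn"],
    minProvisionedTPS=1,
)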

Exploring your Amazon Personalize resources

We have now created several Amazon Personalize resources, including a dataset group called airlines-blog-example. The dataset group contains two datasets: interactions and users, which contain the data used to train your Amazon Personalize model (also known as a solution). We also created a campaign to provide real-time recommendations.

We can now explore how the interactions and users dataset schemas help our model learn from the context and metadata embedded in the datasets.

Interactions dataset

We provide Amazon Personalize an interactions dataset with a numeric rating (combination of EVENT_TYPE + EVENT_VALUE) that a user (USER_ID) has given an airline (ITEM_ID) when flying in a certain cabin type (CABIN_TYPE) at a given time (TIMESTAMP). By providing this information to Amazon Personalize in the dataset and schema, we can add CABIN_TYPE as the context when querying the recommendations for a user and recording new interactions through the event tracker. At training time, the model automatically identifies important features from this data (for our use case, the highest rated airlines across cabin types).

The following screenshot showcases a small portion of the interactions_dataset.csv file.
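Step 4, recording new interactions with the same context through the event tracker, might look like the following minimal sketch; the tracking ID, session ID, and item value are placeholder assumptions, and the tracking ID would come from a previously created event tracker.

import json
import time

import boto3

personalize_events = boto3.client("personalize-events")

personalize_events.put_events(
    trackingId="your-tracking-id",  # assumed event tracker ID
    userId="JDowns",
    sessionId="session-1",
    eventList=[
        {
            "eventType": "RATING",
            "eventValue": 9.0,
            "itemId": "Alaska Airlines",
            "sentAt": time.time(),
            # Contextual metadata travels in the event properties,
            # matching the CABIN_TYPE field in the interactions schema
            "properties": json.dumps({"CABIN_TYPE": "First Class"}),
        }
    ],
)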

Users dataset

We also provide Amazon Personalize a users dataset with the users (USER_ID) who provided the ratings in the interactions dataset, assuming that they gave the rating from their country of residence (USER_RESIDENCE). In this use case, USER_RESIDENCE is the metadata we picked for these users. By providing USER_RESIDENCE as user metadata, the model can learn which airlines are interacted with the most by users across countries and regions, so when we query for recommendations, it takes USER_RESIDENCE into consideration. For example, users in Asia see different airline options compared to users in South America or Europe.

The following screenshot shows a small portion of the users_dataset.csv file.

The raw dataset of user airline ratings from Skytrax contains 20 columns with over 40,000 records. In this post, we use a modified version of this dataset and split the most relevant columns of the raw dataset into two datasets (users and interactions). For more information about splitting the data in a Jupyter notebook, see personalize_hrnn_metadata_contextual_example.ipynb on the GitHub repo.

The next section shows how context and metadata influence the real-time recommendations provided by your Amazon Personalize campaign.

Applying context to your Amazon Personalize real-time recommendations queries

During this test, we observe the effect that context has on the recommendations provided to users. In our use case, we have an interactions dataset of numerical airline ratings from multiple users. In our schemas, the cabin type is included as a categorical value for the interactions dataset and the user residence as a metadata field in the users dataset. Our theory is that by adding the cabin type as context, the airline recommendations will shift to account for it.

  1. On your Amazon Personalize dataset group dashboard, choose View campaigns.
  2. Choose your newly created campaign.
  3. For User ID, enter JDowns.
  4. Choose Get recommendations.

You should see a Test campaign results page similar to the following screenshot.

We initially queried a list of airlines for our user without any context. We now focus on the top 10 recommendations and verify that they shift based on the context. We can add the context via the console by providing a key and value pair. In our use case, the key is CABIN_TYPE and the value can be one of the following:

  • Economy
  • Premium Economy
  • Business Class
  • First Class

The following two screenshots show our results for querying recommendations for the same user with Economy and First Class as values for the CABIN_TYPE context. The economy context doesn’t shift the top 10 list, but the first class context does have an effect—bumping Alaska Airlines to first place on the list.

You can explore your users_dataset.csv file for additional users to test your recommendations API and observe a similar shift of recommendations based on the context you include in the API call. You will also find that the airline list shifts based on the User Residence metadata field. For example, the following screenshots show the top 10 recommendations for our JDowns user, who has United States as the value for User Residence, compared to the PhillipHarris user, who has France as the value for User Residence.

Conclusion

As shown in this post, adding context to your recommendation strategy is a powerful and easy-to-implement exercise when using Amazon Personalize. Enriching your recommendations with context can increase user engagement, which in turn increases the revenue influenced by your recommendations.

This post showed you how to create an Amazon Personalize context-aware deployment and an end-to-end test of getting real-time recommendations applying context via the Amazon Personalize console. For instructions on using a Jupyter environment to set up the Amazon Personalize infrastructure and get recommendations using the Boto3 Python SDK, see personalize_hrnn_metadata_contextual_example.ipynb on the GitHub repo.

There’s even more that you can do with Amazon Personalize. For more information about core use cases and automation examples, see the GitHub repo.

If this post helps you or inspires you to solve a problem, share your thoughts and questions in the comments.


About the Author

Luis Lopez Soria is an AI/ML specialist solutions architect working with the AWS machine learning team. He works with AWS customers to help them adopt machine learning on a large scale. He enjoys playing sports, traveling around the world, and exploring new foods and cultures.

Source: https://aws.amazon.com/blogs/machine-learning/increasing-the-relevance-of-your-amazon-personalize-recommendations-by-leveraging-contextual-information/

AI

Amazon Forecast can now use Convolutional Neural Networks (CNNs) to train forecasting models up to 2X faster with up to 30% higher accuracy

We’re excited to announce that Amazon Forecast can now use Convolutional Neural Networks (CNNs) to train forecasting models up to 2X faster with up to 30% higher accuracy. CNN algorithms are a class of neural network-based machine learning (ML) algorithms that play a vital role in Amazon.com’s demand forecasting system and enable Amazon.com to predict demand for over 400 million products every day. For more information about Amazon.com’s journey building demand forecasting technology using CNN models, watch the re:MARS 2019 keynote video. Forecast brings the same technology used at Amazon.com into the hands of everyday developers as a fully managed service. Anyone can start using Forecast, without any prior ML experience, by using the Forecast console or the API.

Forecasting is the science of predicting the future. By examining historical trends, businesses can make a call on what might happen and when, and build that into their future plans for everything from product demand to inventory to staffing. Given the consequences of forecasting, accuracy matters. If a forecast is too high, businesses over-invest in products and staff, which ends up as wasted investment. If the forecast is too low, they under-invest, which leads to a shortfall in inventory and a poor customer experience. Today, businesses try to use everything from simple spreadsheets to complex financial planning software to generate forecasts, but high accuracy remains elusive for two reasons:

  • Traditional forecasts struggle to incorporate very large volumes of historical data, missing out on important signals from the past that are lost in the noise.
  • Traditional forecasts rarely incorporate related but independent data, which can offer important context (such as sales, holidays, locations, and marketing promotions). Without the full history and the broader context, most forecasts fail to predict the future accurately.

At Amazon, we have learned over the years that no one algorithm delivers the most accurate forecast for all types of data. Traditional statistical models have been useful in predicting demand for products that have regular demand patterns, such as sunscreen lotions in the summer and woolen clothes in the winter. However, statistical models can’t deliver accurate forecasts for more complex scenarios, such as frequent price changes, differences between regional versus national demand, products with different selling velocities, and the addition of new products. Sophisticated deep learning models can provide higher accuracy in these use cases. Forecast automatically examines your data and selects the best algorithm across a set of statistical and deep learning algorithms to train the most accurate forecasting model for your data. With the addition of the CNN-based deep learning algorithm, Forecast can now further improve accuracy by up to 30% and train models up to 2X faster compared to the currently supported algorithms. This new algorithm can more accurately detect leading indicators of demand, such as pre-order information, product page visits, price changes, and promotional spikes, to build more accurate forecasts.

More Retail, a market leader in the fresh food and grocery category in India, participated in a beta test of the new CNN algorithm, with the help of Ganit, an analytics partner. Supratim Banerjee, Chief Transformation Officer at More Retail Limited, says, “At More, we rapidly innovate to sustain our business and beat competition. We have been looking for opportunities to reduce wastage due to overstocking, while continuing to meet customer demand. In our experiments for the fresh produce category, we found the new CNN algorithm in Amazon Forecast to be 1.7X more accurate compared to our existing forecasting system. This translates into massive cost savings for our business.”

Training a CNN predictor and creating forecasts

You can start using CNNs in Forecast through the CreatePredictor API or on the Forecast console. In this section, we walk through a series of steps required to train a CNN predictor and create forecasts within Forecast.

  1. On the Forecast console, create a dataset group.

  2. Upload your dataset.

  3. Choose Predictors from the navigation pane.
  4. Choose Train predictor.

  5. For Algorithm selection, select Manual.
  6. For Algorithm, choose CNN-QR.

To manually select CNN-QR through the CreatePredictor API, use arn:aws:forecast:::algorithm/CNN-QR for the AlgorithmArn.
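For reference, a minimal Boto3 sketch of that API call might look like the following; the predictor name, dataset group ARN, forecast horizon, and forecast frequency are illustrative assumptions.

import boto3

forecast = boto3.client("forecast")

response = forecast.create_predictor(
    PredictorName="demand-cnn-qr",                        # assumed name
    AlgorithmArn="arn:aws:forecast:::algorithm/CNN-QR",   # manual CNN-QR selection
    ForecastHorizon=30,                                   # e.g., predict 30 future time points
    PerformAutoML=False,
    PerformHPO=True,                                      # let Forecast tune hyperparameters
    InputDataConfig={
        "DatasetGroupArn": "arn:aws:forecast:us-east-2:123456789012:dataset-group/retail-demo"
    },
    FeaturizationConfig={"ForecastFrequency": "D"},       # daily forecasts
)
print(response["PredictorArn"])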

When you choose CNN-QR from the drop-down menu, the Advanced Configuration section auto-expands.

  7. To let Forecast train the most optimized and accurate CNN model for your data, select Perform hyperparameter optimization (HPO).
  8. After you enter all your details on the Predictors page, choose Train predictor.

After your predictor is trained, you can view its details by choosing your predictor on the Predictors page. On the predictor’s details page, you can view the accuracy metrics and optimized hyperparameters for your model.

  9. Now that your model is trained, choose Forecasts from the navigation pane.
  10. Choose Create a forecast.
  11. Create a forecast using your trained predictor.

You can generate forecasts at any quantile to balance your under-forecasting and over-forecasting costs.
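A corresponding Boto3 sketch for generating a forecast at specific quantiles is shown below; the forecast name and predictor ARN are assumptions.

import boto3

forecast = boto3.client("forecast")

response = forecast.create_forecast(
    ForecastName="demand-cnn-qr-forecast",  # assumed name
    PredictorArn="arn:aws:forecast:us-east-2:123456789012:predictor/demand-cnn-qr",
    # Pick quantiles that balance under-forecasting and over-forecasting costs
    ForecastTypes=["0.10", "0.50", "0.90"],
)
print(response["ForecastArn"])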

Choosing the most accurate model with Forecast

With this launch, Forecast now supports one proprietary CNN model, one proprietary RNN model, and four other statistical models: Prophet, NPTS (Amazon proprietary), ARIMA, and ETS. The new CNN model is part of AutoML. We recommend always starting your experimentation with AutoML, in which Forecast finds the most optimized and accurate model for your dataset.

  1. On the Train predictor page, for Algorithm selection, select Automatic (AutoML).

  2. After your predictor is trained using AutoML, choose the predictor to see more details on the chosen algorithm.
  3. On the predictor’s details page, in the Algorithm metrics section, choose different algorithms from the drop-down menu to view their accuracy for comparison.
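The equivalent API call with AutoML simply omits the algorithm ARN and sets PerformAutoML to true, as in this sketch (same assumed names and dataset group as before):

import boto3

forecast = boto3.client("forecast")

response = forecast.create_predictor(
    PredictorName="demand-automl",   # assumed name
    ForecastHorizon=30,
    PerformAutoML=True,              # Forecast picks the most accurate algorithm, including CNN-QR
    InputDataConfig={
        "DatasetGroupArn": "arn:aws:forecast:us-east-2:123456789012:dataset-group/retail-demo"
    },
    FeaturizationConfig={"ForecastFrequency": "D"},
)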

Tips and best practices

As you begin to experiment with CNNs and build your demand planning solutions on top of Forecast, consider the following tips and best practices:

  • For experimentation, start by identifying the item IDs that are most important to your business and for which you are looking to improve forecasting accuracy. Measure the accuracy of your existing forecasting methodology as a baseline.
  • Use Forecast with only your target time series and assess the wQuantileLoss accuracy metric. We recommend selecting AutoML in Forecast to find the most optimized and accurate model for your data. For more information, see Evaluating Predictor Accuracy.
  • AutoML optimizes for accuracy and not training time, so AutoML may take longer to optimize your model. If training time is a concern for you, we recommend manually selecting CNN-QR and assessing its accuracy and training time. A slight degradation in accuracy may be an acceptable trade-off for considerable gains in training time.
  • After you see an increase in accuracy over your baseline, we recommend experimenting to find the right forecasting quantile that balances your under-forecasting and over-forecasting costs to your business.
  • We recommend deploying your model as a continuous workload within your systems to start reaping the benefits of more accurate forecasts. You can continue to experiment by adding related time series and item metadata to further improve the accuracy.
  • Incrementally add related time series or item metadata to train your model to assess whether additional information improves accuracy. Different combinations of related time series and item metadata can give you different results.

Conclusion

The new CNN algorithm is available in all Regions where Forecast is publicly available. For more information about Region availability, see Region Table. For more information about the CNN algorithm, see CNN-QR algorithm documentation.


About the authors

Namita Das is a Sr. Product Manager for Amazon Forecast. Her current focus is to democratize machine learning by building no-code/low-code ML services. She frequently advises startups and has started dabbling in baking.

Danielle Robinson is an Applied Scientist on the Amazon Forecast team. Her research is in time series forecasting and in particular how we can apply new neural network-based algorithms within Amazon Forecast. Her thesis research was focused on developing new, robust, and physically accurate numerical models for computational fluid dynamics. Her hobbies include cooking, swimming, and hiking.

Aaron Spieler is a working student in the Amazon Forecast team. He is starting his masters degree at the University of Tuebingen, and studied Data Engineering at Hasso Plattner Institute after obtaining a BS in Computer Science from University of Potsdam. His research interests span time series forecasting (especially using neural network models), machine learning, and computational neuroscience.

Gunjan Garg: Gunjan Garg is a Sr. Software Development Engineer in the AWS Vertical AI team. In her current role at Amazon Forecast, she focuses on engineering problems and enjoys building scalable systems that provide the most value to end-users. In her free time, she enjoys playing Sudoku and Minesweeper.

Chinmay Bapat is a Software Development Engineer in the Amazon Forecast team. His interests lie in the applications of machine learning and building scalable distributed systems. Outside of work, he enjoys playing board games and cooking.

Source: https://aws.amazon.com/blogs/machine-learning/amazon-forecast-can-now-use-convolutional-neural-networks-cnns-to-train-forecasting-models-up-to-2x-faster-with-up-to-30-higher-accuracy/

AI

Securing Amazon Comprehend API calls with AWS PrivateLink

Amazon Comprehend now supports Amazon Virtual Private Cloud (Amazon VPC) endpoints via AWS PrivateLink so you can securely initiate API calls to Amazon Comprehend from within your VPC and avoid using the public internet.

Amazon Comprehend is a fully managed natural language processing (NLP) service that uses machine learning (ML) to find meaning and insights in text. You can use Amazon Comprehend to analyze text documents and identify insights such as sentiment, people, brands, places, and topics in text. No ML expertise required.

Using AWS PrivateLink, you can access Amazon Comprehend easily and securely by keeping your network traffic within the AWS network, while significantly simplifying your internal network architecture. It enables you to privately access Amazon Comprehend APIs from your VPC in a scalable manner by using interface VPC endpoints. A VPC endpoint is an elastic network interface in your subnet with a private IP address that serves as the entry point for all Amazon Comprehend API calls.

In this post, we show you how to set up a VPC endpoint and enforce the use of this private connectivity for all requests to Amazon Comprehend using AWS Identity and Access Management (IAM) policies.

Prerequisites

For this example, you should have an AWS account and sufficient access to create resources in the services used in this walkthrough, including Amazon VPC, AWS Identity and Access Management (IAM), AWS Lambda, Amazon S3, AWS CloudFormation, and Amazon Comprehend.

Solution overview

The walkthrough includes the following high-level steps:

  1. Deploy your resources.
  2. Create VPC endpoints.
  3. Enforce private connectivity with IAM.
  4. Use Amazon Comprehend via AWS PrivateLink.

Deploying your resources

For your convenience, we have supplied an AWS CloudFormation template to automate the creation of all prerequisite AWS resources. We use the us-east-2 Region in this post, so the console and URLs may differ depending on the Region you select. To use this template, complete the following steps:

  1. Choose Launch Stack:
  2. Confirm the following parameters, which you can leave at the default values:
    1. SubnetCidrBlock1 – The primary IPv4 CIDR block assigned to the first subnet. The default value is 10.0.1.0/24.
    2. SubnetCidrBlock2 – The primary IPv4 CIDR block assigned to the second subnet. The default value is 10.0.2.0/24.
  3. Acknowledge that AWS CloudFormation may create additional IAM resources.
  4. Choose Create stack.

The creation process should take roughly 10 minutes to complete.

The CloudFormation template creates the following resources on your behalf:

  • A VPC with two private subnets in separate Availability Zones
  • VPC endpoints for private Amazon S3 and Amazon Comprehend API access
  • IAM roles for use by Lambda and Amazon Comprehend
  • An IAM policy to enforce the use of VPC endpoints to interact with Amazon Comprehend
  • An IAM policy for Amazon Comprehend to access data in Amazon S3
  • An S3 bucket for storing open-source data

The next two sections detail how to manually create a VPC endpoint for Amazon Comprehend and enforce usage with an IAM policy. If you deployed the CloudFormation template and prefer to skip to testing the API calls, you can advance to the Using Amazon Comprehend via AWS PrivateLink section.

Creating VPC endpoints

To create a VPC endpoint, complete the following steps:

  1. On the Amazon VPC console, choose Endpoints.
  2. Choose Create Endpoint.
  3. For Service category, select AWS services.
  4. For Service Name, choose com.amazonaws.us-east-2.comprehend.
  5. For VPC, enter the VPC you want to use.
  6. For Availability Zone, select your preferred Availability Zones.
  7. For Enable DNS name, select Enable for this endpoint.

This creates a private hosted zone that enables you to access the resources in your VPC using custom DNS domain names, such as example.com, instead of using private IPv4 addresses or private DNS hostnames provided by AWS. The Amazon Comprehend DNS hostname that the AWS Command Line Interface (CLI) and Amazon Comprehend SDKs use by default (https://comprehend.Region.amazonaws.com) resolves to your VPC endpoint.

  8. For Security group, choose the security group to associate with the endpoint network interface.

If you don’t specify a security group, the default security group for your VPC is associated.

  9. Choose Create Endpoint.

When the Status changes to available, your VPC endpoint is ready for use.

  10. Choose the Policy tab to apply more restrictive access control to the VPC endpoint.

The following example policy limits VPC endpoint access to an IAM role used by a Lambda function in our deployment. You should apply the principle of least privilege when defining your own policy. For more information, see Controlling access to services with VPC endpoints.

{ "Version": "2012-10-17", "Statement": [ { "Action": [ "comprehend:DetectEntities", "comprehend:CreateDocumentClassifier" ], "Resource": [ "*" ], "Effect": "Allow", "Principal": { "AWS": [ "arn:aws:iam::#########:role/ComprehendPrivateLink-LambdaExecutionRole" ] } } ] }

Enforcing private connectivity with IAM

To allow or deny access to Amazon Comprehend based on the use of a VPC endpoint, we include an aws:sourceVpce condition in the IAM policy. The following example policy provides access specifically to the DetectEntities and CreateDocumentClassifier APIs only when the request utilizes your VPC endpoint. You can include additional Amazon Comprehend APIs in the “Action” section of the policy or use “comprehend:*” to include them all. You can attach this policy to an IAM role to enable compute resources hosted within your VPC to interact with Amazon Comprehend.

{ "Version": "2012-10-17", "Statement": [ { "Sid": "ComprehendEnforceVpce", "Effect": "Allow", "Action": [ "comprehend:CreateDocumentClassifier", "comprehend:DetectEntities" ], "Resource": "*", "Condition": { "StringEquals": { "aws:SourceVpce": "vpce-xxxxxxxx" } } }, { "Sid": "PassRole", "Effect": "Allow", "Action": "iam:PassRole", "Resource": "arn:aws:iam::#########:role/ComprehendDataAccessRole" } ]
}

You should replace the VPC endpoint ID with the endpoint ID you created earlier. Permission to invoke the PassRole API is required for asynchronous operations in Amazon Comprehend like CreateDocumentClassifier and should be scoped to your specific data access role.

Using Amazon Comprehend via AWS PrivateLink

To start using Amazon Comprehend with AWS PrivateLink, you perform the following high-level steps:

  1. Review the Lambda function for API testing.
  2. Create the DetectEntities test event.
  3. Train a custom classifier.

Reviewing the Lambda function

To review your Lambda function, on the Lambda console, choose the Lambda function that contains ComprehendPrivateLink in its name.

The VPC section of the Lambda console provides links to the various networking components automatically created for you during the CloudFormation deployment.

The function code includes a sample program that takes user input to invoke the specific Amazon Comprehend APIs supported by our example IAM policy.
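The exact code ships with the CloudFormation stack, but a minimal sketch of such a handler, dispatching on the comprehend_api field of the test event, might look like the following; the environment variable names for the data access role and training bucket are illustrative assumptions.

import os

import boto3

# Because private DNS is enabled on the VPC endpoint, the default
# Comprehend hostname already resolves to the PrivateLink interface.
comprehend = boto3.client("comprehend")


def lambda_handler(event, context):
    api = event["comprehend_api"]

    if api == "DetectEntities":
        return comprehend.detect_entities(
            Text=event["text"],
            LanguageCode=event["language_code"],
        )

    if api == "CreateDocumentClassifier":
        return comprehend.create_document_classifier(
            DocumentClassifierName=event["custom_classifier_name"],
            LanguageCode=event["language_code"],
            DataAccessRoleArn=os.environ["DATA_ACCESS_ROLE_ARN"],  # assumed env var
            InputDataConfig={
                "S3Uri": f"s3://{os.environ['TRAINING_BUCKET']}/{event['training_data_s3_key']}"
            },
        )

    raise ValueError(f"Unsupported comprehend_api: {api}")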

Creating a test event

In this section, we create an event to detect entities within sample text using a pretrained model.

  1. From the Test drop-down menu, choose Create new test event.
  2. For Event name, enter a name (for example, DetectEntities).
  3. Replace the event JSON with the following code:
    { "comprehend_api": "DetectEntities", "language_code": "en", "text": "Amazon.com, Inc. is located in Seattle, WA and was founded July 5th, 1994 by Jeff Bezos, allowing customers to buy everything from books to blenders."
    }

  4. Choose Save to store the test event.
  5. Choose Save to update the Lambda function.
  6. Choose Test to invoke the DetectEntities API.

The response should include results similar to the following code:

{ "Entities": [ { "Score": 0.9266431927680969, "Type": "ORGANIZATION", "Text": "Amazon.com, Inc.", "BeginOffset": 0, "EndOffset": 16 }, { "Score": 0.9952651262283325, "Type": "LOCATION", "Text": "Seattle, WA", "BeginOffset": 31, "EndOffset": 42 }, { "Score": 0.9998188018798828, "Type": "DATE", "Text": "July 5th, 1994", "BeginOffset": 59, "EndOffset": 73 }, { "Score": 0.9999810457229614, "Type": "PERSON", "Text": "Jeff Bezos", "BeginOffset": 77, "EndOffset": 87 } ]
}

You can update the test event to identify entities from your own text.

Training a custom classifier

We now demonstrate how to build a custom classifier. For training data, we use a version of the Yahoo answers corpus that is preprocessed into the format expected by Amazon Comprehend. This corpus, available on the AWS Open Data Registry, is cited in the paper Text Understanding from Scratch by Xiang Zhang and Yann LeCun. It is also used in the post Building a custom classifier using Amazon Comprehend.

  1. Retrieve the training data from Amazon S3.
  2. On the Amazon S3 console, choose the example S3 bucket created for you.
  3. Choose Upload and add the file you retrieved.
  4. Choose the uploaded object and note the Key.
  5. Return to the test function on the Lambda console.
  6. From the Test drop-down menu, choose Create new test event.
  7. For Event name, enter a name (for example, TrainCustomClassifier).
  8. Replace the event input with the following code:
    { "comprehend_api": "CreateDocumentClassifier", "custom_classifier_name": "custom-classifier-example", "language_code": "en", "training_data_s3_key": "comprehend-train.csv"
    }

  9. If you changed the default file name, update the training_data_s3_key to match.
  10. Choose Save to store the test event.
  11. Choose Save to update the Lambda function.
  12. Choose Test to invoke the CreateDocumentClassifier API.

The response should include results similar to the following code:

{ "DocumentClassifierArn": "arn:aws:comprehend:us-east-2:0123456789:document-classifier/custom-classifier-example"
}

  13. On the Amazon Comprehend console, choose Custom classification to check the status of the document classifier training.

After approximately 20 minutes, the document classifier is trained and available for use.

Cleaning up

To avoid incurring future charges, delete the resources you created during this walkthrough after concluding your testing.

  1. On the Amazon Comprehend console, delete the custom classifier.
  2. On the Amazon S3 console, empty the bucket created for you.
  3. If you launched the automated deployment, on the AWS CloudFormation console, delete the appropriate stack.

The deletion process takes approximately 10 minutes.

Conclusion

You have now successfully invoked Amazon Comprehend APIs using AWS PrivateLink. The use of IAM policies prevents requests from leaving your VPC and further improves your security posture. You can extend this solution to securely test additional features like Amazon Comprehend custom entity recognition real-time endpoints.

All Amazon Comprehend API calls are now supported via AWS PrivateLink. This feature exists in all commercial Regions where AWS PrivateLink and Amazon Comprehend are available. To learn more about securing Amazon Comprehend, see Security in Amazon Comprehend.


About the Authors

Dave Williams is a Cloud Consultant for AWS Professional Services. He works with public sector customers to securely adopt AI/ML services. In his free time, he enjoys spending time with his family, traveling, and watching college football.

Adarsha Subick is a Cloud Consultant for AWS Professional Services based out of Virginia. He works with public sector customers to help solve their AI/ML-focused business problems. In his free time, he enjoys archery and hobby electronics.

Saman Zarandioon is a Sr. Software Development Engineer for Amazon Comprehend. He earned a PhD in Computer Science from Rutgers University.

Source: https://aws.amazon.com/blogs/machine-learning/securing-amazon-comprehend-api-calls-with-aws-privatelink/
