Power of AI With Cloud Computing is “Stunning” to Microsoft’s Nadella 




Microsoft CEO Satya Nadella said at MIT’s AI and the Work of the Future Congress 2020 that the ability of cloud computing to harness massive computing power is ‘transformative.’ (Photo by Mohammad Rezaie on Unsplash.)

By AI Trends Staff  

Asked what in the march of technology he is most impressed with, Microsoft CEO Satya Nadella said at MIT’s AI and the Work of the Future Congress 2020, held virtually last week, that he is struck by the ability of cloud computing to provision massive computing power.

Satya Nadella, CEO, Microsoft

“The computing available to do AI is transformative,” Nadella said to David Autor, the Ford Professor of Economics at MIT, who conducted the Fireside Chat session.   

Nadella mentioned the GPT-3 general purpose language model from OpenAI, an AI lab searching for a commercial business model. GPT-3 is an autoregressive language model with 175 billion parameters. OpenAI agreed to license GPT-3 to Microsoft for its own products and services, while continuing to offer OpenAI’s API to the market. Today the API is in a limited beta as OpenAI and academic partners test and assess its capabilities.

The Microsoft license is exclusive, however, meaning Microsoft’s cloud computing competitors cannot access it in the same way. The agreement was seen as important in helping OpenAI cover the expense of getting GPT-3 up and running and maintaining it, according to an account in TechTalks. Those expenses include an estimated $10 million to research GPT-3 and train the model, tens of thousands of dollars in monthly cloud computing and electricity costs to run the models, an estimated one million dollars annually to retrain the model to prevent decay, and additional costs of customer support, marketing, IT, legal and other requirements to put a software product on the market.

Earlier this year at its Build developers conference, Microsoft announced it worked with OpenAI to assemble what Microsoft said was “one of the top five publicly disclosed supercomputers in the world,” according to an account on the Microsoft AI blog. The infrastructure will be available in Azure, Microsoft’s cloud computing offering, to train “extremely large” AI models.   

The partnership between Microsoft and OpenAI aims to “jointly create new supercomputing technologies in Azure,” the blog post stated.  

“And it’s not just happening in the cloud, it’s happening on the edge,” Nadella said.

Applications for cloud and edge computing working together, such as natural language generation, image completion, or virtual simulations from wearable sensors that see the work, are very compute-intensive. “It’s stunning to see the capability” of the GPT-3 model applied to this work, Nadella said. “Something in the model architecture gives me confidence we will have more breakthroughs at an accelerating pace,” he said.

Potential Strategic Advantage in Search, Voice Assistants from GPT-3 Models  

Strategically, the GPT-3 models could give Microsoft a real advantage, the article in TechTalks suggested. For example, in the search engine market, Microsoft’s Bing has just over a 6% market share, behind Google’s 87%. Whether GPT-3 will enable Microsoft to roll out new features that redefine how search is used remains to be seen.

Microsoft is also likely to explore potential advantages GPT-3 could bring to the voice assistant market, where Microsoft’s Cortana sees a 22% share, behind Apple’s Siri, which has 35%.  

Nadella does have concerns related to the power of AI and automation. “We need a set of design principles, from ethics to actual engineering and design and a process to allow us to be accountable, so the models are fair and not biased. We need to ‘de-bias’ the models and that is hard engineering work,” he said. “Unintended consequences” and “bad use cases” are also challenges, he said, without elaborating. [Ed. Note: A “misuse case” or bad use case describes a function the system should not allow, according to Wikipedia.]

Moderator Autor asked Nadella how Microsoft makes decisions on what problems to work on using AI. Nadella mentioned “real world small AI” and the company’s Power Platform tools, which enable several products to work well together as part of a business application platform. This foundation is built on what had been called the Common Data Service for apps and, as of this month (November), is called “Dataverse.” Data is stored in tables, which can reside in the cloud.

Using the tools, “People can take their domain expertise and turn it into automation using AI capabilities,” Nadella said. 

Asked what new job opportunities he anticipates from the use of AI, Nadella compared the transition going on today to the onset of computer spreadsheets and word processors. “The same thing is happening today,” he said, as computing is getting embedded in manufacturing plants, retail settings, hospitals, and farms. “This will shape new jobs and change existing jobs,” he said.

‘Democratization of AI’ Seen as Having Potential to Lower Barriers  

The two discussed whether the opportunities from AI extend to workers without abstract skills like programming. Discussion ensued on the “democratization of AI,” which lowers barriers for individuals and organizations to gain experience with AI, allowing them, for example, to leverage publicly available data and algorithms to build AI models on a cloud infrastructure.

Relating it to education, Autor wondered if access to education could be “democratized” more. Nadella said, “STEM is important, but we don’t need everyone to get a master’s in computer science. If you can democratize the expertise to help the productivity of the front line worker, that is the problem to solve.” 

Autor asked if technology has anything to do with the growing gap between low-wage and high-wage workers, and what could be done about it. Nadella said Microsoft is committed to making education that leads to credentials available. “We need a real-time feedback loop between the jobs of the future and the skills required,” Nadella said. “To credential those skills, we are seeing more companies invest in corporate training as part of their daily workflow. Microsoft is super focused on that.” 


A tax credit for corporations that invest in training would be a good idea, Nadella suggested. “We need an incentive mechanism,” he said, adding that a feedback loop would help training programs to be successful.  

Will “telepresence” remain after the pandemic is over? Autor asked. Nadella outlined four thoughts: first, the collaboration between front line workers and knowledge workers will continue, since the collaboration has proved to be more productive in some ways; second, meetings will change but collaboration will continue before, during, and after meetings; third, learning and the delivery of training will be better assisted with virtual tools; and fourth, “video fatigue” will be recognized as a real thing.

“We need to get people out of their square boxes and into a shared sense of presence, to reduce cognitive load,” Nadella said. “One of my worries is that we are burning the social capital that got built up. We need to learn new techniques for building social capital back.”  

Learn more about AI and the Work of the Future Congress 2020, GPT-3 in TechTalks and on the Microsoft AI blog, the Power Platform and Dataverse.



This Week’s Awesome Tech Stories From Around the Web (Through January 23)





This Chinese Lab Is Aiming for Big AI Breakthroughs
Will Knight | Wired
“China produces as many artificial intelligence researchers as the US, but it lags in key fields like machine learning. The government hopes to make up ground. …It set AI researchers the goal of making ‘fundamental breakthroughs by 2025’ and called for the country to be ‘the world’s primary innovation center by 2030.’ BAAI opened a year later, in Zhongguancun, a neighborhood of Beijing designed to replicate US innovation hubs such as Boston and Silicon Valley.”


What Elon Musk’s $100 Million Carbon Capture Prize Could Mean
James Temple | MIT Technology Review
“[Elon Musk] announced on Twitter that he plans to give away $100 million of [his $180 billion net worth] as a prize for the ‘best carbon capture technology.’ …Another $100 million could certainly help whatever venture, or ventures, clinch Musk’s prize. But it’s a tiny fraction of his wealth and will also only go so far. …Money aside, however, one thing Musk has a particular knack for is generating attention. And this is a space in need of it.”


Synthetic Cornea Helped a Legally Blind Man Regain His Sight
Steve Dent | Engadget
“While the implant doesn’t contain any electronics, it could help more people than any robotic eye. ‘After years of hard work, seeing a colleague implant the CorNeat KPro with ease and witnessing a fellow human being regain his sight the following day was electrifying and emotionally moving, there were a lot of tears in the room,’ said CorNeat Vision co-founder Dr. Gilad Litvin.”


MIT Develops Method for Lab-Grown Plants That May Eventually Lead to Alternatives to Forestry and Farming
Darrell Etherington | TechCrunch
“If the work of these researchers can eventually be used to create a way to produce lab-grown wood for use in construction and fabrication in a way that’s scalable and efficient, then there’s tremendous potential in terms of reducing the impact on forestry globally. Eventually, the team even theorizes you could coax the growth of plant-based materials into specific target shapes, so you could also do some of the manufacturing in the lab, by growing a wood table directly for instance.”


FAA Approves First Fully Automated Commercial Drone Flights
Andy Pasztor and Katy Stech Ferek | The Wall Street Journal
“US aviation regulators have approved the first fully automated commercial drone flights, granting a small Massachusetts-based company permission to operate drones without hands-on piloting or direct observation by human controllers or observers. …The company’s Scout drones operate under predetermined flight programs and use acoustic technology to detect and avoid drones, birds, and other obstacles.”


China’s Surging Private Space Industry Is Out to Challenge the US
Neel V. Patel | MIT Technology Review
“[The Ceres-1] was a commercial rocket—only the second from a Chinese company ever to go into space. And the launch happened less than three years after the company was founded. The achievement is a milestone for China’s fledgling—but rapidly growing—private space industry, an increasingly critical part of the country’s quest to dethrone the US as the world’s preeminent space power.”


Janet Yellen Will Consider Limiting Use of Cryptocurrency
Timothy B. Lee | Ars Technica
“Cryptocurrencies could come under renewed regulatory scrutiny over the next four years if Janet Yellen, Joe Biden’s pick to lead the Treasury Department, gets her way. During Yellen’s Tuesday confirmation hearing before the Senate Finance Committee, Sen. Maggie Hassan (D-N.H.) asked Yellen about the use of cryptocurrency by terrorists and other criminals. ‘Cryptocurrencies are a particular concern,’ Yellen responded. ‘I think many are used—at least in a transactions sense—mainly for illicit financing.’”


Secret Ingredient Found to Power Supernovas
Thomas Lewton | Quanta
“…Only in the last few years, with the growth of supercomputers, have theorists had enough computing power to model massive stars with the complexity needed to achieve explosions. …These new simulations are giving researchers a better understanding of exactly how supernovas have shaped the universe we see today.”

Image Credit: Ricardo Gomez Angel / Unsplash


Multi-account model deployment with Amazon SageMaker Pipelines




Amazon SageMaker Pipelines is the first purpose-built CI/CD service for machine learning (ML). It helps you build, automate, manage, and scale end-to-end ML workflows and apply DevOps best practices of CI/CD to ML (also known as MLOps).

Creating multiple accounts to organize all the resources of your organization is a good DevOps practice. A multi-account strategy is important not only to improve governance but also to increase security and control of the resources that support your organization’s business. This strategy allows many different teams inside your organization to experiment, innovate, and integrate faster, while keeping the production environment safe and available for your customers.

Pipelines makes it easy to apply the same strategy to deploying ML models. Imagine a use case in which you have three different AWS accounts, one for each environment: data science, staging, and production. The data scientist has the freedom to run experiments and train and optimize different models any time in their own account. When a model is good enough to be deployed in production, the data scientist just needs to flip the model approval status to Approved. After that, an automated process deploys the model on the staging account. Here you can automate testing of the model with unit tests or integration tests or test the model manually. After a manual or automated approval, the model is deployed to the production account, which is a more tightly controlled environment used to serve inferences on real-world data. With Pipelines, you can implement a ready-to-use multi-account environment.

In this post, you learn how to use Pipelines to implement your own multi-account ML pipeline. First, you learn how to configure your environment and prepare it to use a predefined template as a SageMaker project for training and deploying a model in two different accounts: staging and production. Then, you see in detail how this custom template was created and how to create and customize templates for your own SageMaker projects.

Preparing the environment

In this section, you configure three different AWS accounts and use SageMaker Studio to create a project that integrates a CI/CD pipeline with the ML pipeline created by a data scientist. The following diagram shows the reference architecture of the environment that is created by the SageMaker custom project and how AWS Organizations integrates the different accounts.

The diagram contains three different accounts, managed by Organizations. Also, three different user roles (which may be the same person) operate this environment:

  • ML engineer – Responsible for provisioning the SageMaker Studio project that creates the CI/CD pipeline, model registry, and other resources
  • Data scientist – Responsible for creating the ML pipeline that ends with a trained model registered to the model group (also referred to as model package group)
  • Approver – Responsible for testing the model deployed to the staging account and approving the production deployment

It’s possible to run a similar solution without Organizations, if you prefer (although not recommended). But you need to prepare the permissions and the trust relationship between your accounts manually and modify the template to remove the Organizations dependency. Also, if you’re an enterprise with multiple AWS accounts and teams, it’s highly recommended that you use AWS Control Tower for provisioning the accounts and Organizations. AWS Control Tower provides the easiest way to set up and govern a new and secure multi-account AWS environment. For this post, we only discuss implementing the solution with Organizations.

But before you move on, you need to complete the following steps, which are detailed in the next sections:

  1. Create an AWS account to be used by the data scientists (data science account).
  2. Create and configure a SageMaker Studio domain in the data science account.
  3. Create two additional accounts for production and staging.
  4. Create an organizational structure using Organizations, then invite and integrate the additional accounts.
  5. Configure the permissions required to run the pipelines and deploy models on external accounts.
  6. Import the SageMaker project template for deploying models in multiple accounts and make it available for SageMaker Studio.

Configuring SageMaker Studio in your account

Pipelines provides built-in support for MLOps templates to make it easy for you to use CI/CD for your ML projects. These MLOps templates are defined as AWS CloudFormation templates and published via AWS Service Catalog. They are made available to data scientists via SageMaker Studio, an IDE for ML. To configure Studio in your account, complete the following steps:

  1. Prepare your SageMaker Studio domain.
  2. Enable SageMaker project templates and SageMaker JumpStart for this account and Studio users.

If you have an existing domain, you can simply edit the settings for the domain or individual users to enable this option. Enabling this option creates two different AWS Identity and Access Management (IAM) roles in your AWS account:

  • AmazonSageMakerServiceCatalogProductsLaunchRole – Used by SageMaker to run the project templates and create the required infrastructure resources
  • AmazonSageMakerServiceCatalogProductsUseRole – Used by the CI/CD pipeline to run a job and deploy the models on the target accounts

If you created your SageMaker Studio domain before re:Invent 2020, it’s recommended that you refresh your environment by saving all the work in progress. On the File menu, choose Shutdown, and confirm your choice.

  1. Create and prepare two other AWS accounts for staging and production, if you don’t have them yet.

Configuring Organizations

You need to add the data science account and the two additional accounts to a structure in Organizations. Organizations helps you to centrally manage and govern your environment as you grow and scale your AWS resources. It’s free and benefits your governance strategy.

Each account must be added to a different organizational unit (OU).

  1. On the Organizations console, create a structure of OUs like the following:
  • Root
    • multi-account-deployment (OU)
      • 111111111111 (data science account—SageMaker Studio)
      • production (OU)
        • 222222222222 (AWS account)
      • staging (OU)
        • 333333333333 (AWS account)
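The OU structure above can also be created programmatically. The following is a minimal sketch, assuming a boto3 Organizations client is passed in as `org_client`; the helper name `create_ou_structure` is hypothetical and not part of this post's resources. It only creates the OUs; moving each account into its OU (for example with the Organizations `move_account` call) is left out.

```python
def create_ou_structure(org_client,
                        parent_name="multi-account-deployment",
                        child_names=("production", "staging")):
    """Create the parent OU under the root, then one child OU per environment.

    org_client is assumed to behave like boto3.client("organizations").
    Returns a dict mapping each OU name to its ID.
    """
    # An organization has a single root; its ID is the parent of the top-level OU.
    root_id = org_client.list_roots()["Roots"][0]["Id"]

    parent = org_client.create_organizational_unit(ParentId=root_id, Name=parent_name)
    parent_id = parent["OrganizationalUnit"]["Id"]

    ou_ids = {parent_name: parent_id}
    for name in child_names:
        child = org_client.create_organizational_unit(ParentId=parent_id, Name=name)
        ou_ids[name] = child["OrganizationalUnit"]["Id"]
    return ou_ids
```

With real credentials for the management account, you would call it as `create_ou_structure(boto3.client("organizations"))`.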

After configuring the organization, each account owner receives an invite. The owners need to accept the invites, otherwise the accounts aren’t included in the organization.

  1. Now you need to enable trusted access with AWS Organizations (“Enable all features” in Organizations and “Enable trusted access” in StackSets).

This process allows your data science account to provision resources in the target accounts. If you skip it, the deployment process fails. Also, this feature set is the preferred way to work with Organizations, and it includes consolidated billing features.

  1. Next, on the Organizations console, choose Organize accounts.
  2. Choose staging.
  3. Note down the OU ID.
  4. Repeat this process for the production OU.
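If you prefer to look up the OU IDs from code rather than the console, the sketch below walks the structure described earlier. It assumes a boto3 Organizations client is passed in as `org_client` and that the OUs are named exactly `multi-account-deployment`, `staging`, and `production`; the function names are hypothetical.

```python
def list_child_ous(org_client, parent_id):
    """Return {name: id} for the OUs directly under parent_id, following pagination."""
    ous = {}
    kwargs = {"ParentId": parent_id}
    while True:
        page = org_client.list_organizational_units_for_parent(**kwargs)
        for ou in page["OrganizationalUnits"]:
            ous[ou["Name"]] = ou["Id"]
        token = page.get("NextToken")
        if not token:
            return ous
        kwargs["NextToken"] = token


def find_deployment_ou_ids(org_client, parent_ou_name="multi-account-deployment"):
    """Return the IDs of the staging and production OUs under the parent OU."""
    root_id = org_client.list_roots()["Roots"][0]["Id"]
    parent_id = list_child_ous(org_client, root_id)[parent_ou_name]
    children = list_child_ous(org_client, parent_id)
    return {"staging": children["staging"], "production": children["production"]}
```

The two IDs it returns are the values you need later for the project template parameters.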

Configuring the permissions

You need to create a SageMaker execution role in each additional account. These roles are assumed by AmazonSageMakerServiceCatalogProductsUseRole in the data science account to deploy the endpoints in the target accounts and test them.

  1. Sign in to the AWS Management Console with the staging account.
  2. Run the following CloudFormation template.

This template creates a new SageMaker role for you.

  1. Provide the following parameters:
    1. SageMakerRoleSuffix – A short string (maximum 10 lowercase characters, with no spaces or non-alphanumeric characters) that is added to the role name after the following prefix: sagemaker-role-. The final role name is sagemaker-role-<<sagemaker_role_suffix>>.
    2. PipelineExecutionRoleArn – The ARN of the role from the data science account that assumes the SageMaker role you’re creating. To find the ARN, sign in to the console with the data science account. On the IAM console, choose Roles and search for AmazonSageMakerServiceCatalogProductsUseRole. Choose this role and copy the ARN (arn:aws:iam::<<data_science_account_id>>:role/service-role/AmazonSageMakerServiceCatalogProductsUseRole).
  2. After creating this role in the staging account, repeat this process for the production account.
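To make the relationship between the two parameters concrete, here is a rough sketch of the role name and a cross-account trust policy of the kind such a role needs. The CloudFormation template itself is not reproduced in this post, so the policy below is an assumption about its general shape, not a copy of it, and the helper names are hypothetical.

```python
ROLE_PREFIX = "sagemaker-role-"


def role_name(suffix):
    """Build the role name from the suffix, checking the constraints stated above."""
    if not 1 <= len(suffix) <= 10:
        raise ValueError("suffix must be 1 to 10 characters long")
    if suffix != suffix.lower() or any(ch.isspace() for ch in suffix):
        raise ValueError("suffix must be lowercase and contain no spaces")
    return ROLE_PREFIX + suffix


def trust_policy(pipeline_execution_role_arn):
    """Trust policy letting the data science account's role assume this role.

    Assumed shape only; the real template may add other principals or conditions.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": pipeline_execution_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }
```

The name returned by `role_name` is what you later enter as the execution role name for the corresponding target account.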

In the data science account, you now configure the policy of the Amazon Simple Storage Service (Amazon S3) bucket used to store the trained model. For this post, we use the default SageMaker bucket of the current Region. It has the following name format: sagemaker-<<region>>-<<aws_account_id>>.

  1. On the Amazon S3 console, search for this bucket, providing the Region you’re using and the ID of the data science account.

If you don’t find it, create a new bucket following this name format.

  1. On the Permissions tab, add the following policy:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": [
              "arn:aws:iam::<<staging_account_id>>:root",
              "arn:aws:iam::<<production_account_id>>:root"
            ]
          },
          "Action": [ "s3:GetObject", "s3:ListBucket" ],
          "Resource": [
            "arn:aws:s3:::sagemaker-<<region>>-<<aws_account_id>>",
            "arn:aws:s3:::sagemaker-<<region>>-<<aws_account_id>>/*"
          ]
        }
      ]
    }

  1. Save your settings.

The target accounts now have permission to read the trained model during deployment.
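The same bucket policy can be generated and applied from code, which avoids hand-editing the account IDs. This is a minimal sketch, assuming `s3_client` behaves like `boto3.client("s3")`; the function names are hypothetical.

```python
import json


def default_bucket_name(region, account_id):
    """Default SageMaker bucket name for a Region and account, per the format above."""
    return f"sagemaker-{region}-{account_id}"


def cross_account_read_policy(bucket, reader_account_ids):
    """Bucket policy granting read and list access to the given accounts."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": [f"arn:aws:iam::{acct}:root" for acct in reader_account_ids]
                },
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            }
        ],
    }


def apply_bucket_policy(s3_client, bucket, policy):
    """Attach the policy; put_bucket_policy expects the document as a JSON string."""
    s3_client.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```

Note that `put_bucket_policy` replaces any existing bucket policy, so merge statements first if the bucket already has one.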

The next step is to add new permissions to the roles AmazonSageMakerServiceCatalogProductsUseRole and AmazonSageMakerServiceCatalogProductsLaunchRole.

  1. In the data science account, on the IAM console, choose Roles.
  2. Find the AmazonSageMakerServiceCatalogProductsUseRole role and choose it.
  3. Add a new policy and enter the following JSON code.
  4. Save your changes.
  5. Now, find the AmazonSageMakerServiceCatalogProductsLaunchRole role, choose it and add a new policy with the following content:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "VisualEditor0",
          "Effect": "Allow",
          "Action": "s3:GetObject",
          "Resource": "arn:aws:s3:::aws-ml-blog/artifacts/sagemaker-pipeline-blog-resources/*"
        }
      ]
    }

  1. Save your changes.

That’s it! Your environment is almost ready. You only need one more step and you can start training and deploying models in different accounts.

Importing the custom SageMaker Studio project template

In this step, you import your custom project template.

  1. Sign in to the console with the data science account.
  2. On the AWS Service Catalog console, under Administration, choose Portfolios.
  3. Choose Create a new portfolio.
  4. Name the portfolio SageMaker Organization Templates.
  5. Download the following template to your computer.
  6. Choose the new portfolio.
  7. Choose Upload a new product.
  8. For Product name, enter Multi Account Deployment.
  9. For Description, enter Multi account deployment project.
  10. For Owner, enter your name.
  11. Under Version details, for Method, choose Use a template file.
  12. Choose Upload a template.
  13. Upload the template you downloaded.
  14. For Version title, enter 1.0.

The remaining parameters are optional.

  1. Choose Review.
  2. Review your settings and choose Create product.
  3. Choose Refresh to list the new product.
  4. Choose the product you just created.
  5. On the Tags tab, add the following tag to the product:
    1. Key – sagemaker:studio-visibility
    2. Value – True

Back in the portfolio details, you see something similar to the following screenshot (with different IDs).

  1. On the Constraints tab, choose Create constraint.
  2. For Product, choose Multi Account Deployment (the product you just created).
  3. For Constraint type, choose Launch.
  4. Under Launch Constraint, for Method, choose Select IAM role.
  5. Choose AmazonSageMakerServiceCatalogProductsLaunchRole.
  6. Choose Create.
  7. On the Groups, roles, and users tab, choose Add groups, roles, users.
  8. On the Roles tab, select the role you used when configuring your SageMaker Studio domain.
  9. Choose Add access.
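The tagging, launch constraint, and access steps above map onto three AWS Service Catalog API calls. The sketch below assumes `sc_client` behaves like `boto3.client("servicecatalog")` and that you already know the portfolio ID, product ID, and the two role ARNs; the function name `publish_to_studio` is hypothetical.

```python
import json


def publish_to_studio(sc_client, portfolio_id, product_id,
                      launch_role_arn, studio_role_arn):
    """Make an existing Service Catalog product usable as a Studio project template."""
    # Tag the product so SageMaker Studio lists it under Organization templates.
    sc_client.update_product(
        Id=product_id,
        AddTags=[{"Key": "sagemaker:studio-visibility", "Value": "True"}],
    )
    # Launch constraint: the product is provisioned with the launch role.
    sc_client.create_constraint(
        PortfolioId=portfolio_id,
        ProductId=product_id,
        Type="LAUNCH",
        Parameters=json.dumps({"RoleArn": launch_role_arn}),
    )
    # Give the Studio execution role access to the portfolio.
    sc_client.associate_principal_with_portfolio(
        PortfolioId=portfolio_id,
        PrincipalARN=studio_role_arn,
        PrincipalType="IAM",
    )
```

Here `launch_role_arn` would be the ARN of AmazonSageMakerServiceCatalogProductsLaunchRole and `studio_role_arn` the execution role of your Studio domain.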

If you don’t remember which role you selected, in your data science account, go to the SageMaker console and choose Amazon SageMaker Studio. In the Studio Summary section, locate the attribute Execution role. Search for the name of this role in the previous step.

You’re done! Now it’s time to create a project using this template.

Creating your project

In the previous sections, you prepared the multi-account environment. The next step is to create a project using your new template.

  1. Sign in to the console with the data science account.
  2. On the SageMaker console, open SageMaker Studio with your user.
  3. Choose the Components and registries icon.
  4. On the drop-down menu, choose Projects.
  5. Choose Create project.

On the Create project page, SageMaker templates is chosen by default. This option lists the built-in templates. However, you want to use the template you prepared for the multi-account deployment.

  1. Choose Organization templates.
  2. Choose Multi Account Deployment.
  3. Choose Select project template.

If you can’t see the template, make sure you completed all the steps correctly in the previous section.

  1. In the Project details section, for Name, enter iris-multi-01.

The project name must have 15 characters or fewer.

  1. In the Project template parameters, use the names of the roles you created in each target account (staging and production) and provide the following properties:
    1. SageMakerExecutionRoleStagingName
    2. SageMakerExecutionRoleProdName
  2. Retrieve the OU IDs you created earlier for the staging and production OUs and provide the following properties:
    1. OrganizationalUnitStagingId
    2. OrganizationalUnitProdId
  3. Choose Create project.
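For reference, the same project could be created through the SageMaker API instead of the Studio UI. This is a sketch under assumptions: `sm_client` behaves like `boto3.client("sagemaker")`, and `product_id` and `artifact_id` are the Service Catalog product and version IDs of the template you imported. The helper names are hypothetical; the parameter keys are the four listed above.

```python
MAX_PROJECT_NAME_LENGTH = 15  # limit stated for this template


def provisioning_parameters(staging_role, prod_role, staging_ou_id, prod_ou_id):
    """Build the Key/Value list expected by the project template."""
    values = {
        "SageMakerExecutionRoleStagingName": staging_role,
        "SageMakerExecutionRoleProdName": prod_role,
        "OrganizationalUnitStagingId": staging_ou_id,
        "OrganizationalUnitProdId": prod_ou_id,
    }
    return [{"Key": key, "Value": value} for key, value in values.items()]


def create_project(sm_client, name, product_id, artifact_id, parameters):
    """Create the SageMaker project from the Service Catalog product."""
    if len(name) > MAX_PROJECT_NAME_LENGTH:
        raise ValueError("project name must have 15 characters or fewer")
    return sm_client.create_project(
        ProjectName=name,
        ServiceCatalogProvisioningDetails={
            "ProductId": product_id,
            "ProvisioningArtifactId": artifact_id,
            "ProvisioningParameters": parameters,
        },
    )
```

Checking the name length before the call surfaces the constraint early rather than during provisioning.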

Provisioning all the resources takes a few minutes, after which the project is listed in the Projects section. When you choose the project, a tab opens with the project’s metadata. The Model groups tab shows a model group with the same name as your project. It was also created during the project provisioning.

The environment is now ready for the data scientist to start training the model.

Training a model

Now that your project is ready, it’s time to train a model.

  1. Download the example notebook to use for this walkthrough.
  2. Choose the Folder icon to change the work area to file management.
  3. Choose the Create folder icon.
  4. Enter a name for the folder.
  5. Choose the folder name.
  6. Choose the Upload file icon.
  7. Choose the Jupyter notebook you downloaded and upload it to the new directory.
  8. Choose the notebook to open a new tab.

You’re prompted to choose a kernel.

  1. Choose Python3 (Data Science).
  2. Choose Select.

  1. In the second cell of the notebook, replace the project_name variable with the name you gave your project (for this post, iris-multi-01).

You can now run the Jupyter notebook. This notebook creates a very simple pipeline with only two steps: train and register model. It uses the iris dataset and the XGBoost built-in container as the algorithm.

  1. Run the whole notebook.

The process takes some time after you run the cell containing the following code:

start_response = pipeline.start(parameters={
    "TrainingInstanceCount": "1"
})

This starts the training job, which takes approximately 3 minutes to complete. After the training is finished, the next cell of the Jupyter notebook gets the latest version of the model in the model registry and marks it as Approved. Alternatively, you can approve a model from the SageMaker Studio UI. On the Model groups tab, choose the model group and desired version. Choose Update status and Approve before saving.
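The approval step can be pictured roughly as follows. This is not the notebook's actual cell, only a sketch of one way to do it, assuming `sm_client` behaves like `boto3.client("sagemaker")` and that the model group has the same name as the project; the function name is hypothetical.

```python
def approve_latest_model(sm_client, model_package_group_name):
    """Mark the most recently created model version in the group as Approved."""
    response = sm_client.list_model_packages(
        ModelPackageGroupName=model_package_group_name,
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    summaries = response["ModelPackageSummaryList"]
    if not summaries:
        raise LookupError(f"no model versions in group {model_package_group_name!r}")

    model_package_arn = summaries[0]["ModelPackageArn"]
    sm_client.update_model_package(
        ModelPackageArn=model_package_arn,
        ModelApprovalStatus="Approved",
    )
    return model_package_arn
```

For this walkthrough the call would be `approve_latest_model(boto3.client("sagemaker"), "iris-multi-01")`.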

This is the end of the data scientist’s job but the beginning of running the CI/CD pipeline.

Amazon EventBridge monitors the model registry. The listener starts a new deployment job with the provisioned AWS CodePipeline workflow (created when you launched the SageMaker Studio project).
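To illustrate what such a listener reacts to, the sketch below builds an EventBridge event pattern for model registry state changes and checks a sample event against it. The exact rule provisioned by the project template is not shown in this post, so the pattern is an assumption about its general shape, and `matches` is a deliberately simplified stand-in for EventBridge's own matching logic.

```python
def model_approval_event_pattern(model_package_group_name):
    """Event pattern for approved model versions in one model group (assumed shape)."""
    return {
        "source": ["aws.sagemaker"],
        "detail-type": ["SageMaker Model Package State Change"],
        "detail": {
            "ModelPackageGroupName": [model_package_group_name],
            "ModelApprovalStatus": ["Approved"],
        },
    }


def matches(pattern, event):
    """Simplified matcher: every pattern key must exist in the event, and the
    event value must be one of the listed values (nested dicts are recursed)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True
```

An event for an approved version of `iris-multi-01` matches the pattern, while one whose status is still pending does not, which is why the deployment only starts after approval.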

  1. On the CodePipeline console, choose the pipeline starting with the prefix sagemaker-, followed by the name of your project.

Shortly after you approve your model, the deployment pipeline starts running. Wait for the pipeline to reach the state DeployStaging. That stage can take approximately 10 minutes to complete. After deploying the first endpoint in the staging account, the pipeline is tested, and then moves to the next step, ApproveDeployment. In this step, it waits for manual approval.

  1. Choose Review.
  2. Enter an approval reason in the text box.
  3. Choose Approve.

The model is now deployed in the production account.
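The manual approval can also be submitted through the CodePipeline API, which is useful if an automated test in staging should approve the promotion. This is a minimal sketch, assuming `cp_client` behaves like `boto3.client("codepipeline")`. Rather than hard-coding stage or action names, it approves the first action that currently holds an approval token; the function name is hypothetical.

```python
def approve_pending_action(cp_client, pipeline_name, reason):
    """Approve the first action in the pipeline that is waiting for manual approval.

    Returns (stage_name, action_name) if something was approved, otherwise None.
    """
    state = cp_client.get_pipeline_state(name=pipeline_name)
    for stage in state["stageStates"]:
        for action in stage.get("actionStates", []):
            # Only a pending manual approval action carries a token.
            token = action.get("latestExecution", {}).get("token")
            if token:
                cp_client.put_approval_result(
                    pipelineName=pipeline_name,
                    stageName=stage["stageName"],
                    actionName=action["actionName"],
                    result={"summary": reason, "status": "Approved"},
                    token=token,
                )
                return stage["stageName"], action["actionName"]
    return None
```

Passing `"Rejected"` as the status instead would stop the promotion to production.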

You can also monitor the pipeline on the AWS CloudFormation console, to see the stacks and stack sets the pipeline creates to deploy endpoints in the target accounts. To see the deployed endpoints for each account, sign in to the SageMaker console as either the staging account or production account and choose Endpoints on the navigation pane.

Cleaning up

To clean up all the resources you provisioned in this example, complete the following steps:

  1. Sign in to the console with your main account.
  2. On the AWS CloudFormation console, click on StackSets and delete the following items (endpoints):
    1. Prod – sagemaker-<<sagemaker-project-name>>-<<project-id>>-deploy-prod
    2. Staging – sagemaker-<<sagemaker-project-name>>-<<project-id>>-deploy-staging
  3. In your laptop or workstation terminal, use the AWS Command Line Interface (AWS CLI) and enter the following code to delete your project:
    aws sagemaker delete-project --project-name iris-multi-01

Make sure you’re using the latest version of the AWS CLI.
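If you script the cleanup, the stack set names follow directly from the project name and ID. The sketch below derives them from the naming pattern above and wraps the project deletion; it assumes `sm_client` behaves like `boto3.client("sagemaker")`, and the function names are hypothetical. Deleting the stack sets themselves is left to the console steps above.

```python
def stack_set_names(project_name, project_id):
    """Names of the two deployment stack sets, following the pattern listed above."""
    prefix = f"sagemaker-{project_name}-{project_id}-deploy-"
    return {"prod": prefix + "prod", "staging": prefix + "staging"}


def delete_project(sm_client, project_name):
    """API equivalent of `aws sagemaker delete-project --project-name ...`."""
    sm_client.delete_project(ProjectName=project_name)
```

The project ID is shown in the project's metadata tab in SageMaker Studio.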

Building and customizing a template for your own SageMaker project

SageMaker projects and SageMaker MLOps project templates are powerful features that you can use to automatically create and configure the whole infrastructure required to train, optimize, evaluate, and deploy ML models. A SageMaker project is an AWS Service Catalog provisioned product that enables you to easily create an end-to-end ML solution. For more information, see the AWS Service Catalog Administrator Guide.

A product is a CloudFormation template managed by AWS Service Catalog. For more information about templates and their requirements, see AWS CloudFormation template formats.

ML engineers can design multiple environments and express all the details of this setup as a CloudFormation template, using the concept of infrastructure as code (IaC). You can also integrate these different environments and tasks using a CI/CD pipeline. SageMaker projects provide an easy, secure, and straightforward way of wrapping this infrastructure complexity in a simple project, which other ML engineers and data scientists can launch many times.

The following diagram illustrates the main steps you need to complete in order to create and publish your custom SageMaker project template.

We described these steps in more detail in the sections Importing the custom SageMaker Studio Project template and Creating your project.

As an ML engineer, you can design and create a new CloudFormation template for the project, prepare an AWS Service Catalog portfolio, and add a new product to it.

Both data scientists and ML engineers can use SageMaker Studio to create a new project with the custom template. SageMaker invokes AWS Service Catalog and starts provisioning the infrastructure described in the CloudFormation template.

As a data scientist, you can now start training the model. After you register it in the model registry, the CI/CD pipeline runs automatically and deploys the model on the target accounts.

If you look at the CloudFormation template from this post in a text editor, you can see that it implements the architecture we outline in this post.

The following code is a snippet of the template:

Description: Toolchain template which provides the resources needed to represent infrastructure as code. This template specifically creates a CI/CD pipeline to deploy a given inference image and pretrained Model to two stages in CD -- staging and production.
Parameters:
  SageMakerProjectName:
    Type: String
  SageMakerProjectId:
    Type: String
  <<other parameters>>
Resources:
  MlOpsArtifactsBucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Retain
    Properties:
      BucketName: …
  …
  ModelDeployCodeCommitRepository:
    Type: AWS::CodeCommit::Repository
    Properties:
      RepositoryName: …
      RepositoryDescription: …
      Code:
        S3:
          Bucket: …
          Key: …
  …
  ModelDeployBuildProject:
    Type: AWS::CodeBuild::Project
  …
  ModelDeployPipeline:
    Type: AWS::CodePipeline::Pipeline

The template has two key sections: Parameters (input parameters of the template) and Resources. SageMaker project templates require that you add two input parameters to your template: SageMakerProjectName and SageMakerProjectId. These parameters are used internally by SageMaker Studio. You can add other parameters if needed.
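
Before publishing a custom template, it can help to check that both required parameters are declared. The following is a minimal Python sketch, assuming the template has already been loaded into a dictionary (for example, from JSON). The function name `missing_project_parameters` is hypothetical and is not part of SageMaker or CloudFormation.

```python
from typing import List

# The two input parameters that SageMaker project templates must declare.
REQUIRED_PARAMETERS = ("SageMakerProjectName", "SageMakerProjectId")


def missing_project_parameters(template: dict) -> List[str]:
    """Return the required SageMaker project parameters absent from a template."""
    declared = template.get("Parameters") or {}
    return [name for name in REQUIRED_PARAMETERS if name not in declared]


# Toy template mirroring the Parameters section of the snippet above.
template = {
    "Parameters": {
        "SageMakerProjectName": {"Type": "String"},
        "SageMakerProjectId": {"Type": "String"},
    },
    "Resources": {},
}

print(missing_project_parameters(template))
print(missing_project_parameters({"Resources": {}}))
```

An empty list means both parameters are present; any names returned would need to be added to the Parameters section.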

In the Resources section of the snippet, you can see that it creates the following:

  • A new S3 bucket used by the CI/CD pipeline to store the intermediary artifacts passed from one stage to another.
  • An AWS CodeCommit repository to store the artifacts used during the deployment and testing stages.
  • An AWS CodeBuild project to get the artifacts, and validate and configure them for the project. In the multi-account template, this project also creates a new model registry, used by the CI/CD pipeline to deploy new models.
  • A CodePipeline workflow that orchestrates all the steps of the CI/CD pipelines.

Each time you register a new model to the model registry or push a new artifact to the CodeCommit repo, this CodePipeline workflow starts. These events are captured by an EventBridge rule, provisioned by the same template. The CI/CD pipeline contains the following stages:

  • Source – Reads the artifacts from the CodeCommit repository and shares them with the other steps.
  • Build – Runs the CodeBuild project to do the following:
    • Verify if a model registry is already created, and create one if needed.
    • Prepare a new CloudFormation template that is used by the next two deployment stages.
  • DeployStaging – Contains the following components:
    • DeployResourcesStaging – Gets the CloudFormation template prepared in the Build step and deploys a new stack. This stack deploys a new SageMaker endpoint in the target account.
    • TestStaging – Invokes a second CodeBuild project that runs a custom Python script that tests the deployed endpoint.
    • ApproveDeployment – A manual approval step. If approved, it moves to the next stage to deploy an endpoint in production, or ends the workflow if not approved.
  • DeployProd – Similar to DeployStaging, it uses the same CloudFormation template but with different input parameters. It deploys a new SageMaker endpoint in the production account. 
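
The stage order and the manual approval gate can be sketched as a toy model in Python. This is only an illustration of the control flow described in the list above, not the CodePipeline API; the function `run_pipeline` and the `approve` callback are hypothetical.

```python
from typing import Callable, List

# Action names inside DeployStaging, taken from the stage list above.
STAGING_ACTIONS = ["DeployResourcesStaging", "TestStaging", "ApproveDeployment"]


def run_pipeline(approve: Callable[[], bool]) -> List[str]:
    """Return the ordered steps that run for a given approval decision."""
    steps = ["Source", "Build"]
    for action in STAGING_ACTIONS:
        steps.append(f"DeployStaging/{action}")
        if action == "ApproveDeployment" and not approve():
            return steps  # rejected: the workflow ends before production
    steps.append("DeployProd")
    return steps


approved = run_pipeline(lambda: True)
rejected = run_pipeline(lambda: False)
print(approved)
print(rejected)
```

With approval, the run ends at DeployProd; without it, the last recorded step is the ApproveDeployment action and no production deployment happens.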

You can start a new training process and register your model to the model registry associated with the SageMaker project. Use the Jupyter notebook provided in this post and customize your own ML pipeline to prepare your dataset and train, optimize, and test your models before deploying them. For more information about these features, see Automate MLOps with SageMaker Projects. For more Pipelines examples, see the GitHub repo.

Conclusions and next steps

In this post, you saw how to prepare your own environment to train and deploy ML models in multiple AWS accounts by using SageMaker Pipelines.

With SageMaker projects, the governance and security of your environment can be significantly improved if you start managing your ML projects as a library of SageMaker project templates.

As a next step, try to modify the SageMaker project template and customize it to address your organization’s needs. Add as many steps as you want and keep in mind that you can capture the CI/CD events and notify users or call other services to build comprehensive solutions.

About the Author

Samir Araújo is an AI/ML Solutions Architect at AWS. He helps customers create AI/ML solutions that solve their business challenges using the AWS platform. He has worked on several AI/ML projects related to computer vision, natural language processing, forecasting, ML at the edge, and more. He likes playing with hardware and automation projects in his free time, and he has a particular interest in robotics.


Continue Reading

Artificial Intelligence

Extra Crunch roundup: Digital health VC survey, edtech M&A, deep tech marketing, more




I had my first telehealth consultation last year, and there’s a high probability that you did, too. Since the pandemic began, consumer adoption of remote healthcare has increased 300%.

Speaking as an unvaccinated urban dweller: I’d rather speak to a nurse or doctor via my laptop than try to remain physically distanced on a bus or hailed ride traveling to/from their office.

Even after things return to (rolls eyes) normal, if I thought there was a reliable way to receive high-quality healthcare in my living room, I’d choose it.

Clearly, I’m not alone: a May 2020 McKinsey study pegged yearly domestic telehealth revenue at $3 billion before the coronavirus, but estimated that “up to $250 billion of current U.S. healthcare spend could potentially be virtualized” after the pandemic abates.

That’s a staggering number, but in a category that includes startups focused on sexual health, women’s health, pediatrics, mental health, data management and testing, it’s easy to see why digital-health funding topped $10 billion in the first three quarters of 2020.

Drawing from The TechCrunch List, reporter Sarah Buhr interviewed eight active health tech VCs to learn more about the companies and industry verticals that have captured their interest in 2021:

  • Bryan Roberts and Bob Kocher, partners, Venrock
  • Nan Li, managing director, Obvious Ventures
  • Elizabeth Yin, general partner, Hustle Fund
  • Christina Farr, principal investor and health tech lead, OMERS Ventures
  • Ursheet Parikh, partner, Mayfield Ventures
  • Nnamdi Okike, co-founder and managing partner, 645 Ventures
  • Emily Melton, founder and managing partner, Threshold Ventures

Since COVID-19 has renewed Washington’s focus on healthcare, many investors said they expect a friendly regulatory environment for telehealth in 2021. Additionally, healthcare providers are looking for ways to reduce costs and lower barriers for patients seeking behavioral support.

“Remote really does work,” said Elizabeth Yin, general partner at Hustle Fund.

We’ll cover digital health in more depth this year through additional surveys, vertical reporting, founder interviews and much more.

Thanks very much for reading Extra Crunch this week; I hope you have a relaxing weekend.

Walter Thompson
Senior Editor, TechCrunch

8 VCs agree: Behavioral support and remote visits make digital health a strong bet for 2021

Image Credits: Luis Alvarez / Getty Images

Lessons from Top Hat’s acquisition spree

Image Credits: Bryce Durbin

In the last year, edtech startup Top Hat acquired three publishing companies: Fountainhead Press, Bludoor and Nelson HigherEd.

Natasha Mascarenhas interviewed CEO and founder Mike Silagadze to learn more about his content acquisition strategy, but her story also discussed “some rumblings of consolidation and exits in edtech land.”

How VCs invested in Asia and Europe in 2020

Last year, U.S.-based VCs invested an average of $428 million each day in domestic startups, with much of the benefits flowing to fintech companies.

This morning, Alex Wilhelm examined Q4 VC totals for Europe, which had its lowest deal count since Q1 2019, despite a record $14.3 billion in investments.

Asia’s VC industry, which saw $25.2 billion invested across 1,398 deals, is seeing “a muted recovery,” says Alex.

“Falling seed volume, lots of big rounds. That’s 2020 VC around the world in a nutshell.”

Decrypted: With more SolarWinds fallout, Biden picks his cybersecurity team

Image Credits: Treedeo / Getty Images

In this week’s Decrypted, security reporter Zack Whittaker covered the latest news in the unfolding SolarWinds espionage campaign, now revealed to have impacted the U.S. Bureau of Labor Statistics and Malwarebytes.

In other news, the controversy regarding WhatsApp’s privacy policy change appears to be driving users to encrypted messaging app Signal, Zack reported. Facebook has put changes at WhatsApp on hold “until it could figure out how to explain the change without losing millions of users,” apparently.

Hot IPOs hang onto gains as investors keep betting on tech

A big IPO debut is a juicy topic for a few news cycles, but because there’s always another unicorn ready to break free from its corral and leap into the public markets, it doesn’t leave a lot of time to reflect.

Alex studied companies like Lemonade, Airbnb and Affirm to see how well these IPO pop stars have retained their value. Not only have most held steady, “many have actually run up the score in the ensuing weeks,” he found.

Dear Sophie: What are Biden’s immigration changes?

Image Credits: Bryce Durbin / TechCrunch

Dear Sophie:

I work in HR for a tech firm. I understand that Biden is rolling out a new immigration plan today.

What is your sense as to how the new administration will change business, corporate and startup founder immigration to the U.S.?

—Free in Fremont

Hello, Extra Crunch community!

Image Credits: atakan / Getty Images

I began my career as an avid TechCrunch reader and remained one even when I joined as a writer, when I left to work on other things and now that I’ve returned to focus on better serving our community.

I’ve been chatting with some of the folks in our community and I’d love to talk to you, too. Nothing fancy, just 5-10 minutes of your time to hear more about what you want to see from us and get some feedback on what we’ve been doing so far.

If you would be so kind as to take a minute or two to fill out this form, I’ll drop you a note and hopefully we can have a chat about the future of the Extra Crunch community before we formally roll out some of the ideas we’re cooking up.

Drew Olanoff

In 2020, VCs invested $428m into US-based startups every day

Last year was a disaster across the board thanks to a global pandemic, economic uncertainty and widespread social and political upheaval.

But if you were involved in the private markets, 2020 had some very clear upside: VCs flowed $156.2 billion into U.S.-based startups, “or around $428 million for each day,” reports Alex Wilhelm.

“The huge sum of money, however, was itself dwarfed by the amount of liquidity that American startups generated, some $290.1 billion.”

Using data sourced from the National Venture Capital Association and PitchBook, Alex used Monday’s column to recap last year’s seed, early-stage and late-stage rounds.

How and when to build marketing teams at deep tech companies

Image Credits: Andy Roberts / Getty Images

Building a marketing team is one of the most opaque parts of spinning up a startup, but for a deep tech company, the stakes couldn’t be higher.

How can technical founders working on bleeding-edge technology find the right people to tell their story?

If you work at a post-revenue, early-stage deep tech startup (or know someone who does), this post explains when to hire a team, whether they’ll need prior industry experience, and how to source and evaluate talent.

Bustle CEO Bryan Goldberg explains his plans for taking the company public

Bustle Digital Group CEO Bryan Goldberg. Image Credits: Bustle Digital Group

Senior Writer Anthony Ha interviewed Bustle Digital Group CEO Bryan Goldberg to get his thoughts on the state of digital media.

Their conversation covered a lot of ground, but the biggest news it contained focuses on Goldberg’s short-term plans.

“Where do I want to see the company in three years? I want to see three things: I want to be public, I want to see us driving a lot of profits and I want it to be a lot bigger, because we’ve consolidated a lot of other publications,” he said.

It may not be as glamorous as D2C, but beauty tech is big money

Image Credits: Laia Divols Escude/EyeEm / Getty Images

The U.S. Federal Trade Commission is not a huge fan of personal-care D2C brands merging with traditional consumer product companies.

This month, razor startup Billie and Procter & Gamble announced they were calling off their planned merger after the FTC filed suit.

For similar reasons, Edgewell Personal Care dropped its plans last year to buy Harry’s for $1.37 billion.

In a harsher regulatory environment, “the path to profitability has become a more important part of the startup story versus growth at all costs,” it seems.

Twilio CEO says wisdom lies with your developers

SAN FRANCISCO, CA – SEPTEMBER 12: Founder and CEO of Twilio Jeff Lawson speaks onstage during TechCrunch Disrupt SF 2016 at Pier 48 on September 12, 2016 in San Francisco, California. Image Credits: Steve Jennings/Getty Images for TechCrunch

Companies that build their own tools “tend to win the hearts, minds and wallets of their customers,” according to Twilio CEO Jeff Lawson.

In an interview with enterprise reporter Ron Miller for his new book, “Ask Your Developer,” Lawson says founders should use developer teams as a sounding board when making build-versus-buy decisions.

“Lawson’s basic philosophy in the book is that if you can build it, you should,” says Ron.




Expert: Manpower is a huge cybersecurity issue in 2021




Changing threats, volume of threats, and ransomware plague organizations. Having some autonomous AI tools to help pros do their jobs can help.

TechRepublic’s Karen Roby spoke with Marcus Fowler, director of strategic threat for Darktrace and former CIA officer, about ways to help cybersecurity professionals get their jobs done more easily. The following is an edited transcript of their conversation.

Karen Roby: Marcus, talk a little bit about conversations that you’ve been having with clients, regardless of the size of their businesses. How have those conversations about cybersecurity changed since the start of the pandemic?

Marcus Fowler: I think it is a little unique to industry in terms of whether they’re still in survival mode and trying to make it through because their industry was specifically hit very hard by that, or they’re in leaning in and transformation mode. I think, certainly, a really new sense of dependency on cyber and digital, and with that dependency, a recognized vulnerability. I think there is more invitation for security being in the room at the most senior levels, more appreciation for understanding, “Could we have prevented that? What does this mean for us?” So, a broader discussion there. I hope that continues. It was a trend even before the pandemic, but I think it’s accelerated, too. Out of the gate, you had changes in business operations, such that the security team had to catch up. Now there’s dependency on the security team for ensuring that what’s in place is sound and secure. Business resilience, not just business success, is really tied together. I think there’s a very visceral understanding of that.

Karen Roby: As we roll into 2021 now, what are, in your opinion, the biggest threats we’re facing in cybersecurity?

Marcus Fowler: You put a good point out there in threats because I think there are two areas where people are concerned. One is the changing threat space and one is how changes or security requirements change. I think from the threat space the greatest threat I often hear about is ransomware and what they’re really worried about. You really can sense that nobody wants to be kept at ransom, so you can get that. For me, personally, a more concerning threat is an insider threat, which can come through something like a supply chain, but it is somebody already behind the walls and kind of taking advantage of intellectual property, or espionage, or doing damage to a company. Those two, to me, are the leading candidates of concern.

In terms of what I’m hearing, not from a threat actor space, but in terms of my digital environment and where I’m worried, it’s probably two areas. One, what changes in visibility and understanding have occurred as I’ve moved to things like SaaS [Software-as-a-Service] or in the cloud, and how has that changed from a security team? I mean, it is one of the areas certainly where Darktrace, as a company, has leaned in, in terms of being able to be on the end point, being able to bring in those VPN logs, and all of those different data ingestion points because, again, without visibility and understanding, you’re really not going to do very well in that security fight.

I think the other is how to augment that human security team. I think across the industry, we often hear about the skills gap, which I laughingly call, really, a unicorn gap because companies are looking for these amazing potential employees that have every certificate under the sun, to include four years of experience on something that was released two years ago. The reality is when I talk to companies, it’s actually a cycle shortage. There’s more work than the human team can do. We’re using AI [artificial intelligence] to augment the human team by doing autonomous investigation, by doing autonomous triage, by arming the human team, those human experts to be 20 to 30 minutes into every investigation the second they start the investigation, because that commodity and heavy lifting has already occurred because the AI is doing that behind the background. That’s an area we continue to expand.

You mentioned my time at the CIA, I did a decade of counterterrorism. My greatest stress every day was as a manager and as a leader, am I using my critical human resources in the most efficient way against the most credible and imminent threats? It was really hard for me day in, day out to say with confidence that I was, but something like an AI analyst or this autonomous (tool) helping you with threat prioritization, helping your people get further in their day, in terms of all the different investigations they’re going is really an enabler and powerful.

Image: Mackenzie Burke

