“AWS DeepComposer is a 32-key, 2-octave keyboard designed for developers to get hands on with Generative AI, with either pretrained models or your own,” AWS’ Julien Simon wrote in a blog post introducing the company’s latest machine learning hardware.
The keyboard is supposed to help developers learn about machine learning in a fun way, and maybe create some music along the way. The area of artificial intelligence concerned with generating creative works is called “generative AI.” In other words, the keyboard helps you teach machines to generate something creative using “generative adversarial networks.”
“Developers, regardless of their background in ML or music, can get started with Generative Adversarial Networks (GANs). This Generative AI technique pits two different neural networks against each other to produce new and original digital works based on sample inputs. With AWS DeepComposer, you can train and optimize GAN models to create original music,” according to Amazon.
AWS DeepComposer keyboard
Developers can train their own machine learning models or use ones supplied by Amazon to get started. Either way, you create the music based on the model, tweak it in the DeepComposer console on the AWS cloud, then generate your music. If you wish, you can share your machine-generated composition on SoundCloud when you’re done.
This is the third machine learning teaching device from Amazon, joining the DeepLens camera introduced in 2017 and the DeepRacer racing cars introduced last year. It’s worth noting that this is just an announcement. The device isn’t quite ready yet, but Amazon is allowing account holders to sign up for a preview when it is.
Despite some fears that the current crisis would kill off legal tech incubators and accelerators, most have still gone ahead ‘as normal’, albeit virtually.
Artificial Lawyer has always been a big supporter of such projects as they enable startups to really learn what their clients want, and in turn their engagement with lawyers helps increase legal market understanding of what a wide variety of technology can do. It’s a win/win.
Here are three legal tech programmes that have announced new cohorts recently, from London, UK; to Vienna in Austria; and North Carolina in the US. And there are many more, from Australia, to Singapore, and beyond.
First up, elite firm Slaughter and May’s Collaborate incubator announced its second cohort in late June. The new group of companies are:
Della AI accelerates contract review. Users ask their own questions in their own words to analyse the points that matter to them. Della is used in due diligence and internal audits as well as day-to-day analysis of contracts, leases and other legal documents.
thedocyard is a deal management and transactional workflow platform which digitises and automates deals by standardising repeatable processes, providing real time status updates, providing virtual data rooms and allowing collaboration between parties.
Immediation is a confidential online dispute resolution platform, providing advanced hearing and mediation technology to courts, and an alternative fixed-fee, easy, secure and highly efficient method of resolving disputes outside court.
Juralio enables lawyers and clients to map legal work collaboratively so as to plan, execute, control and report more effectively. This helps to deliver better value for clients, better margins for lawyers and less pain for everyone.
The Lexical Labs system is designed to review documents, identify problems or negotiation points and provide solutions, by combining advanced technology and embedded expertise.
Novastone is a secure instant messaging platform integrated with public IM such as WhatsApp and WeChat. It is designed for firms to deliver a personalised client experience through relationship teams.
Office & Dragons is a document automation startup on a mission to make documenting transactions simple, reliable and fast. It empowers lawyers to transform documents from a mess of static text into dynamic representations of data.
There is no minimum size/shape/age/financial position for acceptance into Slaughter and May Collaborate. The decision is primarily on the strength, uniqueness and promise of the concept or product, and on the team involved, they said.
The programme is primarily aimed at early and mid-stage ventures rather than established businesses, but applications are accepted from all businesses developing exciting new legal tech. Maybe have a think about joining in 2021…?
In Vienna we have the LTHV accelerator programme, backed by a number of Austrian law firms and other businesses. Their new cohort was announced in May.
One of the cohort, Della AI – which is also a member of the new Collaborate group above – spoke to Artificial Lawyer about the experience.
Founder Christophe Frèrebeau said: ‘Vienna has been incredible! They offer courses [on startup development] and we have got six pilots with large firms very quickly through LTHV.’
He added that LTHV sent the cohort VR headsets so that they could work together virtually, which Frèrebeau confirmed was a lot of fun, and most importantly: effective.
And in the US, Duke Law School’s ‘Tech Lab’ has announced its fourth cohort. These had a strong Access to Justice focus:
Don’t Get Mad Get Paid – Helps women get paid their back child support and collect what’s rightfully theirs by tracking down child support evaders and generating customised legal documents.
JusticeText – Strengthens the ability of public defenders to serve low-income criminal defendants through video evidence management software that leverages AI to process body-worn camera footage, interrogation videos, and more.
People Clerk – Guides California litigants throughout the small claims process giving them the tools to prepare, settle, and litigate their dispute.
Yo Tengo Bot – Automates the interaction between immigration law firms and potential clients through a white label chatbot powered by artificial intelligence and machine learning (available in both English and Spanish).
The Lab began on June 24 and runs for three months. It’s supported by LexisNexis, Travelers, and global law firm Latham & Watkins.
LeeAnn Black, Chief Operating Officer at Latham commented: ‘[We are] committed to fostering this innovative programme that provides crucial support to entrepreneurs and gives law students exposure to emerging and impactful technologies.’
So, there you go. Incubators are alive and well, all across the world, and this is just a sample of what is out there.
P.S. Mishcon de Reya’s now famous MDR LAB will also be eventually returning, the firm tells Artificial Lawyer, but will operate with a quite different pace and structure. Watch this space.
Several chipmakers are making some major changes in the characterization/metrology lab, adding more fab-like processes in this group to help speed up chip development times.
The characterization/metrology lab, which is generally under the radar, is a group that works with the R&D organization and the fab. The characterization lab is involved in the early analytical work for next-generation devices, packages and materials. Using advanced metrology equipment, the lab’s goal is to characterize or gain a better understanding of the make-up of new technologies at the atomic scale. The lab pinpoints defects and other problems in devices, which eventually could boost product yields.
Traditionally, there has been a delineation between the lab and fab, and the two organizations often work in silos. The lab provides analytical capabilities with advanced but slower equipment. The fab, which manufactures the chips, also has advanced metrology tools and other equipment, which tend to have faster throughputs.
For some companies, though, the role of the metrology lab is changing. The devices, packages and materials are becoming more complex, but chip and/or packaging vendors are under pressure to maintain their production schedules or even accelerate them. Otherwise, they may miss the market window.
The lab continues to handle the traditional analytical work, but the fab wants the results much faster. So several chipmakers are implementing more fab-like processes in the lab. For example, Intel is automating some tools, conducting more fab-like measurements, and deploying machine learning techniques. More importantly, Intel’s R&D, lab and fab teams are working more closely together to speed up the characterization process. Intel refers to this as a “holistic measurement strategy.”
“One critical purpose of the lab is to provide fundamental learning to drive data-driven decisions early in the process development cycle, especially as device and process interactions are becoming more complex,” said Markus Kuhn, a technical director and engineering manager at Intel. “The technology challenges are driving the fab to reach for lab capabilities to meet the metrology needs. To meet fab demand, lab capabilities need to improve in regards to automation and productivity.”
Samsung, TSMC and others are moving in similar directions, according to analysts. The goal is to speed up the cycles of learning and beat the competition to the punch. “It’s all about information turns,” said Dan Hutcheson, CEO of VLSI Research. “The faster you can learn, the faster you can get to the next node. It’s whoever gets there first.”
As a byproduct of this shift, chipmakers are seeing a related trend. With devices becoming more complex, they are moving some of the advanced metrology lab tools into the fab. That trend isn’t new and is well documented.
Nonetheless, chipmakers face some challenges. In the lab, they want more automation and hardware improvements with the existing platforms. Tool costs are also an issue in both the lab and fab.
Technology challenges
Basically, a chip consists of three parts: transistor, contacts and interconnects. The transistor serves as a switch for the device. A leading-edge chip incorporates billions of tiny transistors.
The interconnects, which are on top of the transistor, consist of tiny copper wiring schemes that transfer electrical signals from one transistor to another. Then, a layer called the middle-of-line connects the transistor and interconnect pieces using tiny contact structures.
Starting in 2011, chipmakers migrated from traditional planar transistors to finFETs at 22nm. Foundries moved to finFETs at 16nm/14nm. In finFETs, the control of the current is accomplished by implementing a gate on each of the three sides of a fin.
FinFETs are faster with lower power than planar transistors, but they are harder and more expensive to make. As a result, process R&D and design costs have skyrocketed. Now, the cadence for a fully scaled node has extended from 18 to 30 months.
Fig. 2: FinFET vs. planar. Source: Lam Research
Still, fueled by AI, 5G, data centers and mobile apps, chipmakers are migrating to the next nodes. They are ramping up 5nm with 3nm in R&D.
At these nodes, the manufacturing challenges escalate. Other issues are also cropping up. “At bleeding-edge nodes, some of these chips are huge. The reticle field, in some cases, can maybe only sustain a handful of these chips. In some cases, the yields are not very good,” said Walter Ng, vice president of business development at UMC.
Nonetheless, starting at 3nm and/or 2nm, chipmakers plan to migrate from finFETs to a next-generation transistor called nanosheet FETs. A nanosheet FET is a finFET on its side with a gate wrapped around it.
“There is a lot more complexity in a nanowire or nanosheet than in a finFET. There are new processes, and those are very challenging,” said Rick Gottscho, CTO of Lam Research.
The challenges aren’t limited to logic. For example, vendors are shipping various next-generation memories, such as phase-change memory and STT-MRAM. These memories are fast with unlimited endurance, but they require some new innovations to get a bigger foothold in the market.
“Complexity, among other things, will come with the introduction of new materials, particularly for something like the MRAM stack, which is not only complicated, but also sensitive to process conditions, and therefore difficult to etch vertically,” Gottscho said. “That’s why, to date, you don’t see any high-density standalone MRAM. You see it all being embedded into logic, which is a consequence of the materials.”
Then, in R&D, chipmakers are working on other technologies, including 2D materials, carbon nanotubes and complementary FETs. Next-generation packages are also in R&D.
Not all technologies in R&D will make it into production. The eventual winners and losers are determined by cost, functionality and manufacturability.
Inside the metrology lab
For current and future technologies, the characterization lab plays a big role. The lab is on the front lines. Next-generation chips, packages and materials tend to go to the analytical lab first for characterization and early integration work. At times, the fab may run into problems with a device. So the fab will call on the lab to handle the failure analysis tasks.
The lab uses various metrology and failure analysis equipment. Some equipment is exclusively used in the lab, while others are found in both the lab and fab.
Using these tools, the goal is to characterize the devices and image them at the atomic scale in three dimensions. Take a new transistor, for example. “We’re looking at various functions, interfaces, materials and electrical components of that transistor. We’re trying to engineer gate stacks and the source/drain. We’re looking at alternative channel materials. We’re looking at patterning capabilities and fidelity at the nanoscale. And that’s just the transistor,” Intel’s Kuhn said. “We also want to know where every atom is and what it is. It’s not just where the atom is and what it is, but how is it bonded or what are the implications to the electronic state within that interface or that collection of atoms.”
Traditionally, the lab provides the data and then hands off the results to the fab. Once the devices are in the fab, the lab is less involved.
That’s starting to change at some companies. “The technology challenges with all these new materials and architectures are pushing us. It’s pushing us into being much more fab-like in terms of how we can generate data,” Kuhn said.
In a presentation at the recent Symposia on VLSI Technology and Circuits, Kuhn outlined how the characterization lab is becoming more fab-like. The changes include:
More tool automation.
More fab-like measurements.
Hybrid metrology, where different metrology tools are combined to provide measurements.
Implementing machine learning techniques.
More importantly, the lab is no longer a silo, at least for some. At Intel and others, the R&D teams, as well as the lab and fab, are working more closely together to help speed up the process.
Still, the lab and fab are separate and each group has different charters. But to characterize a given device, both the lab and fab generally will segment a device into various categories, such as the dimensions, composition, dopants and strain.
No one metrology tool can handle all requirements. So the lab and fab may require one or more tools for a given category. In the lab some systems are slow, but they still do the job. Others are being automated. Some are still not up to speed.
All labs are equipped with tools for dimensional metrology. For this, a given tool provides the critical dimensional (CD) measurements for devices, such as height, length and width. For CDs, a lab might use the critical-dimension scanning electron microscope (CD-SEM), which takes top-down measurements of structures. They also use optical CD equipment, which utilizes polarized light to characterize devices.
The lab also uses various X-ray metrology systems. Perhaps the most promising and frustrating technology is called critical-dimension small-angle X-ray scattering (CD-SAXS). Using X-rays with wavelengths less than 0.1nm, CD-SAXS makes use of variable-angle transmission scattering techniques from a small beam size to provide measurements.
“CD-SAXS can solve the CD, disorder in the CD, and differences in electron density between layers,” said Joseph Kline, a materials engineer at NIST. “CD-SAXS can also measure buried structures and optically opaque layers.”
Several companies sell CD-SAXS tools, mostly for R&D. Intel, Samsung, TSMC and others have CD-SAXS tools in the lab.
CD-SAXS provides advanced measurements, but it’s too slow and expensive for the fab. The X-ray source is the big issue. It’s not bright or powerful enough, which impacts the throughput. Some but not all of the other X-ray metrology tools face similar issues.
Still, CD-SAXS is making inroads in some applications, such as high-aspect ratio structures in memory. “For memory, the structures are deep. The scattering is good, so there is a clear roadmap to about 1 minute or less per site,” said Paul Ryan, director of product management at Bruker. “For logic, the technique is still in the concept phase. There are expected to be challenges for the X-ray intensity.”
Other lab tools, such as APT and SIMS, are making more progress. These systems are widely used in the lab for traditional characterization work.
Generally, in the lab, atom probe tomography (APT) and secondary ion mass spectrometry (SIMS) are used to examine the dopants in chips. Dopants are elements that modify the conductivity in chips. They include boron and phosphorus.
APT also is used for metals, strain and other applications. “APT is the only material analysis technique offering extensive capabilities for both 3D imaging and chemical composition measurements at the near atomic scale, around 0.1 to 0.3nm resolution in depth and 0.3 to 0.5nm laterally,” said David Larson, director of scientific marketing at Cameca.
In APT, a sample is prepared in the form of a sharp needle-shaped specimen. The tip is biased at a high DC voltage on a cryogenic specimen stage in a chamber. “The very small radius of the tip and the voltage together induce a very high electrostatic field at the tip surface, just below the threshold for atom removal by what is called field evaporation,” Larson said. “When the specimen is subjected to laser or, for some materials, voltage pulsing, one or more atoms are evaporated by the high electric field from the surface and projected onto a position-sensitive detector, with up to 80% of the atoms being detected and reconstructed in 3D with sub-nanometer spatial resolution.”
APT isn’t new and has been used for years. The next step for APT is a push toward automation and operator-independent data output.
SIMS, meanwhile, is used for dopants and other apps. “When a solid sample is sputtered by primary ions of few keV energy, a fraction of the particles emitted from the target are ionized,” Larson said. “SIMS consists of analyzing these secondary ions with a mass spectrometer. Secondary ion emission by a solid surface under ion bombardment supplies information about the elemental, isotopic and molecular composition of its uppermost atomic layers. The secondary ion yields will vary greatly according to the chemical environment and the sputtering conditions (ion, energy, angle). This can add complexity to the quantitative aspect of the technique. SIMS is nevertheless recognized as the most sensitive elemental and isotopic surface analysis technique.”
SIMS is attempting to move out of the lab. Cameca recently developed a SIMS platform, which can provide process monitoring at line, if not directly in-line, in an automatic mode.
More measurements
Indeed, automation is a big shift in the lab. Generally, in the past, lab tools were manual. Today, some but not all lab tools are being automated to help speed up the process.
The transmission electron microscope (TEM) is one tool that’s being automated in the lab. A TEM is used for dimensional and strain metrology. Strain involves the channel materials in chips.
In operation, a TEM generates electrons and sends them through a sample. The electrons interact with the sample, which provides information about the structure at the nanoscale. However, a TEM is also a destructive technique. A sample is created by cutting part of a device. Chipmakers would prefer not to cut a device in production. That’s why TEMs are found in the lab, but they also are used in the fab to generate reference data.
In the lab, device makers sometimes combine a TEM with a technique called electron energy loss spectroscopy (EELS). In EELS, the system measures the energy lost by the incident electrons as they pass through the sample, according to EAG Laboratories.
This in turn provides a 3D tomography image. “It provides us with elemental composition and the dimensions,” Intel’s Kuhn said. “What it lacks is bonding. The next step would be to look at EELS, and look at the near edge fine features and near edge structures, and then isolate the bonding. That activity is in progress.”
Besides automation, other fab-like processes are moving into the lab. For example, in the fab, chipmakers use metrology tools with model-based approaches. For this, the tools don’t measure the actual device. Instead, they measure test structures that mimic the device.
In the lab, Intel is implementing this fab-like approach. “There are automation efforts and full wafer capability being developed across lab tools, as well as the emerging use of these techniques like on-die patterned structures,” Kuhn said.
For strain measurements, a device maker might use test structures. For this, the lab may use high-resolution X-ray diffraction (HRXRD) and Raman spectroscopy. HRXRD characterizes single-crystal thin-film materials. Raman spectroscopy identifies chemical structures and compounds. The TEM is also involved here.
“Raman is an optical method. We can back correlate that to what we see in the TEM. The TEM provides a very local picture of the XRD,” Kuhn said. “Raman provides a more comprehensive means of looking at statistically an array in an ensemble of these nanoscale devices and how they are behaving. This allows for process targeting.”
Meanwhile, if that’s not enough, machine learning also is moving to the lab. Machine learning uses advanced algorithms in systems to recognize patterns in data as well as to learn and make predictions about the information.
Chipmakers are combining various tools with machine learning to find and classify defects in chips. “Machine learning can automate choosing the parameters of the model to make it much faster for the human to explore various model forms,” said Aki Fujimura, chief executive of D2S. “Fabs and mask shops also use classical machine learning in the big data analysis of all the operation data available to look for ways to improve yield and prevent downtime.”
So how can the lab become more engaged with the fab? That’s where hybrid metrology fits in. In hybrid metrology, you take the measurement data from several metrology tools and combine them.
“In hybrid metrology, you can take any tool. It can be in-line, near-fab or in the lab. You take their output, typically using machine learning, to incorporate it into in-line metrology. And what that enables you to do is have that in-line metrology tool extend its capabilities,” Intel’s Kuhn said.
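One simple way to picture the data-combination step is inverse-variance weighting, in which each tool's reading counts in proportion to its precision. The sketch below is only an illustration under that assumption. The article describes machine learning being used for this step, and the tool labels, readings and uncertainties here are hypothetical values, not data from Intel or any vendor.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    tool: str        # hypothetical tool label, e.g. "CD-SEM"
    value_nm: float  # measured critical dimension in nanometers
    sigma_nm: float  # one-sigma uncertainty of that tool in nanometers


def combine(measurements):
    """Return the inverse-variance weighted mean and its one-sigma uncertainty."""
    weights = [1.0 / (m.sigma_nm ** 2) for m in measurements]
    total = sum(weights)
    mean = sum(w * m.value_nm for w, m in zip(weights, measurements)) / total
    sigma = (1.0 / total) ** 0.5
    return mean, sigma


# Made-up readings of the same feature from three tools.
readings = [
    Measurement("CD-SEM", 7.2, 0.4),
    Measurement("OCD", 7.0, 0.2),
    Measurement("TEM reference", 6.9, 0.1),
]
mean_nm, sigma_nm = combine(readings)
print(f"combined CD: {mean_nm:.2f} nm +/- {sigma_nm:.2f} nm")
```

The combined uncertainty is smaller than that of any single tool, which is the basic appeal of pooling lab and in-line data.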
In the fab
Finally, once the devices are characterized and qualified, they move from the lab to the fab. Fabs are automated facilities that process wafers using various equipment in a cleanroom.
It’s a complex process. To make an advanced logic device, the wafer undergoes anywhere from 600 to 1,000 steps, or more, in the fab.
That’s not the only challenge. Logic and memory devices are more complex. The equipment must process smaller and more exact features at each node. And defects might surface during the flow.
So during the flow, a wafer undergoes several inspection and metrology steps in the fab. Take next-generation nanosheet transistors for example. “Nanosheets present big inspection and metrology challenges,” said Mark Shirey, vice president of marketing and applications at KLA, in a presentation at the recent Symposia on VLSI Technology and Circuits. “These 3D structures introduce new buried defects and noise sources. It looks like a combination of optical and e-beam will be needed to inspect these. And in metrology, there is a lot of local variability that needs to be measured with many new measurements.”
Optical and e-beam tools are wafer inspection systems, which find tiny defects in chips. For metrology, chipmakers will use more than a dozen systems for the latest devices in the fab.
For CD measurements, chipmakers use CD-SEMs, OCD, TEMs and other tools. For nanosheets, chipmakers will also use various X-ray metrology tools.
For example, chipmakers use X-ray photoelectron spectroscopy (XPS) in both the lab and fab. “XPS is a surface-sensitive quantitative spectroscopic technique that measures the elemental composition of thin films to determine the composition of materials in devices,” said Kavita Shah, senior director of strategic marketing at Nova.
Conclusion
Clearly, characterizing structures is a challenging process in the lab. It requires a slew of complex measurements.
That’s only half the battle. Now, the lab must learn how to speed up the process. This will take more time and money, if not a new mindset.
This is becoming a requirement, though. Otherwise, chipmakers may end up falling behind in a competitive landscape.
Text classification is a technique for putting text into different categories, and has a wide range of applications: email providers use text classification to detect spam emails, marketing agencies use it for sentiment analysis of customer reviews, and discussion forum moderators use it to detect inappropriate comments.
In the past, data scientists used methods such as tf-idf, word2vec, or bag-of-words (BOW) to generate features for training classification models. Although these techniques have been very successful in many natural language processing (NLP) tasks, they don’t always capture the meanings of words accurately when they appear in different contexts. Recently, we see increasing interest in using Bidirectional Encoder Representations from Transformers (BERT) to achieve better results in text classification tasks, due to its ability to encode the meaning of words in different contexts more accurately.
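The context problem is easy to see in a small bag-of-words sketch. The two sentences below are made-up examples, and the tokenizer is a deliberately naive assumption (lowercase, drop periods, split on whitespace). The point is only that the word “bank” gets the same feature in both sentences even though its meaning differs.

```python
from collections import Counter


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens, ignoring word order."""
    return Counter(text.lower().replace(".", "").split())


river = bag_of_words("She sat on the bank of the river.")
money = bag_of_words("She opened an account at the bank.")

# The feature for "bank" is identical in both sentences, so a model
# that sees only these counts cannot tell the two senses apart.
print(river["bank"], money["bank"])
```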
Amazon SageMaker is a fully managed service that provides developers and data scientists the ability to build, train, and deploy machine learning (ML) models quickly. Amazon SageMaker removes the heavy lifting from each step of the ML process to make it easier to develop high-quality models. The Amazon SageMaker Python SDK provides open-source APIs and containers that make it easy to train and deploy models in Amazon SageMaker with several different ML and deep learning frameworks.
Our customers often ask for quick fine-tuning and easy deployment of their NLP models. Furthermore, customers prefer low inference latency and low model inference cost. Amazon Elastic Inference enables attaching GPU-powered inference acceleration to endpoints, which reduces the cost of deep learning inference without sacrificing performance.
This post demonstrates how to use Amazon SageMaker to fine-tune a PyTorch BERT model and deploy it with Elastic Inference. The code from this post is available in the GitHub repo. For more information about BERT fine-tuning, see BERT Fine-Tuning Tutorial with PyTorch.
What is BERT?
First published in November 2018, BERT is a revolutionary model. During pretraining, one or more words in each sentence are intentionally masked. BERT takes these masked sentences as input and trains itself to predict the masked words. In addition, BERT uses a next sentence prediction task that pretrains text-pair representations.
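The masking step can be sketched in a few lines. This is a simplified illustration, not BERT's actual preprocessing: it assumes whitespace tokens rather than WordPiece, always substitutes the [MASK] token, and uses a hypothetical mask_tokens helper with a fixed random seed.

```python
import random


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Return (masked_tokens, targets); targets maps position -> original token."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append("[MASK]")
            targets[i] = tok  # the model is trained to predict this token
        else:
            masked.append(tok)
    return masked, targets


tokens = "the professor talked to us about grammar".split()
masked, targets = mask_tokens(tokens, mask_prob=0.3)
print(masked)
print(targets)
```

During pretraining, the loss is computed only at the positions recorded in targets, so the model has to use the surrounding words on both sides to recover each hidden token.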
One of the biggest challenges data scientists face for NLP projects is lack of training data; you often have only a few thousand pieces of human-labeled text data for your model training. However, modern deep learning NLP tasks require a large amount of labeled data. One way to solve this problem is to use transfer learning.
Transfer learning is an ML method where a pretrained model, such as a pretrained ResNet model for image classification, is reused as the starting point for a different but related problem. By reusing parameters from pretrained models, you can save significant amounts of training time and cost.
BERT was trained on BookCorpus and English Wikipedia data, which contain 800 million words and 2,500 million words, respectively. Training BERT from scratch would be prohibitively expensive. By taking advantage of transfer learning, you can quickly fine-tune BERT for another use case with a relatively small amount of training data to achieve state-of-the-art results for common NLP tasks, such as text classification and question answering.
In this post, we walk through our dataset, the training process, and finally model deployment.
For this post, we use Corpus of Linguistic Acceptability (CoLA), a dataset of 10,657 English sentences labeled as grammatical or ungrammatical from published linguistics literature. In our notebook, we download and unzip the data using the following code:
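import os

# Download the archive only if it is not already present.
if not os.path.exists("./cola_public_1.1.zip"):
    !curl -o ./cola_public_1.1.zip https://nyu-mll.github.io/CoLA/cola_public_1.1.zip
# Unzip only if the extracted directory does not exist yet.
if not os.path.exists("./cola_public/"):
    !unzip cola_public_1.1.zip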
In the training data, the only two columns we need are the sentence itself and its label:
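A minimal sketch of that step follows. It assumes each row is tab-separated with no header, in the order source, label, original annotation, sentence. The load_cola helper is hypothetical and uses only the standard library, and the two sample rows use a placeholder source code, so check the column order against your copy of the data.

```python
import csv
import io


def load_cola(handle):
    """Read CoLA-style TSV rows and keep only the sentence and label columns."""
    sentences, labels = [], []
    # QUOTE_NONE: sentences may contain quote characters that are not CSV quoting.
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 4:
            continue  # skip malformed or empty rows
        labels.append(int(row[1]))  # 1 = acceptable, 0 = unacceptable
        sentences.append(row[3])
    return sentences, labels


# In-memory stand-in for the real file; "src" is a placeholder source code.
sample = io.StringIO(
    "src\t0\t*\tThe professor talked us.\n"
    "src\t1\t\tWe yelled ourselves hoarse.\n"
)
sentences, labels = load_cola(sample)
print(sentences)
print(labels)
```

With the real data, pass an open file handle instead of the in-memory sample, pointing at the training TSV inside the unzipped cola_public directory.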
If we print out a few sentences, we can see how sentences are labeled based on their grammatical completeness. See the following code:
print(sentences[20:25])
print(labels[20:25])

["The professor talked us." "We yelled ourselves hoarse." "We yelled ourselves." "We yelled Harry hoarse." "Harry coughed himself into a fit."]
[0 1 0 0 1]
We then split the dataset for training and testing before uploading both to Amazon S3 for use later. The SageMaker Python SDK provides a helpful function for uploading to Amazon S3:
from sagemaker.session import Session
from sklearn.model_selection import train_test_split

train, test = train_test_split(df)

# Write both splits to CSV so that the uploaded paths exist.
train.to_csv("./cola_public/train.csv", index=False)
test.to_csv("./cola_public/test.csv", index=False)

session = Session()
inputs_train = session.upload_data("./cola_public/train.csv", key_prefix="sagemaker-bert/training/data")
inputs_test = session.upload_data("./cola_public/test.csv", key_prefix="sagemaker-bert/testing/data")
For this post, we use the PyTorch-Transformers library, which contains PyTorch implementations and pretrained model weights for many NLP models, including BERT. See the following code:
# PyTorch-Transformers is now published as the transformers package.
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",  # Use the 12-layer BERT model, with an uncased vocab.
    num_labels=2,  # The number of output labels--2 for binary classification.
    output_attentions=False,  # Whether the model returns attention weights.
    output_hidden_states=False,  # Whether the model returns all hidden-states.
)
Our training script should save model artifacts learned during training to a file path called model_dir, as stipulated by the Amazon SageMaker PyTorch image. Upon completion of training, Amazon SageMaker uploads model artifacts saved in model_dir to Amazon S3 so they are available for deployment. The following code is used in the script to save trained model artifacts:
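# Unwrap the model if training wrapped it in DataParallel or DistributedDataParallel.
model_2_save = model.module if hasattr(model, "module") else model
# args.model_dir is assumed to hold the parsed model_dir path.
model_2_save.save_pretrained(save_directory=args.model_dir)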
We save this script in a file named train_deploy.py, and put the file in a directory named code/, where the full training script is viewable.
Because PyTorch-Transformers isn’t included natively in Amazon SageMaker PyTorch images, we have to provide a requirements.txt file so that Amazon SageMaker installs this library for training and inference. A requirements.txt file is a text file that contains a list of items that are installed by using pip install. You can also specify the version of an item to install. To install PyTorch-Transformers, we add the following line to the requirements.txt file:
You can view the entire file in the GitHub repo, and it also goes into the code/ directory. For more information about the format of a requirements.txt file, see Requirements Files.
Training on Amazon SageMaker
We use Amazon SageMaker to train and deploy a model using our custom PyTorch code. The Amazon SageMaker Python SDK makes it easier to run a PyTorch script in Amazon SageMaker using its PyTorch estimator. After that, we can use the SageMaker Python SDK to deploy the trained model and run predictions. For more information about using this SDK with PyTorch, see Using PyTorch with the SageMaker Python SDK.
To start, we use the PyTorch estimator class to train our model. When creating the estimator, we make sure to specify the following:
entry_point – The name of the PyTorch script
source_dir – The location of the training script and requirements.txt file
framework_version – The PyTorch version we want to use
The PyTorch estimator supports multi-machine, distributed PyTorch training. To use this, we just set train_instance_count to be greater than 1. Our training script supports distributed training only on GPU instances.
After creating the estimator, we call fit(), which launches a training job. We pass the Amazon S3 URIs of the data we uploaded earlier. See the following code:
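A minimal sketch of these two steps, using the SageMaker Python SDK v1 parameter names that this post uses (v2 renames them to instance_count and instance_type). The PyTorch version, instance type, hyperparameters, and channel names are assumptions for illustration, and the helper names are hypothetical:

```python
def build_estimator_kwargs(role, instance_count=2):
    """Collect the PyTorch estimator arguments described above."""
    return {
        "entry_point": "train_deploy.py",  # the PyTorch script
        "source_dir": "code",  # holds the script and requirements.txt
        "role": role,  # IAM role ARN supplied by the caller
        "framework_version": "1.3.1",  # assumed PyTorch version
        "py_version": "py3",
        "train_instance_count": instance_count,  # > 1 enables distributed training
        "train_instance_type": "ml.p3.2xlarge",  # assumed GPU instance type
        "hyperparameters": {"epochs": 1, "num_labels": 2},  # hypothetical script arguments
    }


def launch_training(role, inputs_train, inputs_test):
    """Create the estimator and start the training job."""
    # Imported here so the configuration can be inspected without the SDK installed.
    from sagemaker.pytorch import PyTorch

    estimator = PyTorch(**build_estimator_kwargs(role))
    # Channel names are assumptions; the training script must read the same names.
    estimator.fit({"training": inputs_train, "testing": inputs_test})
    return estimator
```

Calling launch_training(role, inputs_train, inputs_test) with the S3 URIs returned by upload_data would start the job and block until it finishes.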
After training starts, Amazon SageMaker displays training progress, as shown in the following output. It reports epochs, training loss, and accuracy on test data:
2020-06-10 01:00:41 Starting - Starting the training job...
2020-06-10 01:00:44 Starting - Launching requested ML instances......
2020-06-10 01:02:04 Starting - Preparing the instances for training............
2020-06-10 01:03:48 Downloading - Downloading input data...
2020-06-10 01:04:15 Training - Downloading the training image..
2020-06-10 01:05:03 Training - Training image download completed. Training in progress.
Train Epoch: 1 [0/3207 (0%)] Loss: 0.626472
Train Epoch: 1 [350/3207 (98%)] Loss: 0.241283
Average training loss: 0.5248292144022736
Test set: Accuracy: 0.782608695652174
We can monitor the training progress and make sure it succeeds before proceeding with the rest of the notebook.
After training our model, we host it on an Amazon SageMaker endpoint by calling deploy on the PyTorch estimator. The endpoint runs an Amazon SageMaker PyTorch model server. We need to configure two components of the server: model loading and model serving. We implement these two components in our inference script train_deploy.py. The complete file is available in the GitHub repo.
model_fn() is the function defined to load the saved model and return a model object that can be used for model serving. The SageMaker PyTorch model server loads our model by invoking model_fn:
def model_fn(model_dir):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = BertForSequenceClassification.from_pretrained(model_dir)
    return model.to(device)
input_fn() deserializes and prepares the prediction input. In this use case, our request body is first serialized to JSON and then sent to the model serving endpoint. Therefore, in input_fn(), we first deserialize the JSON-formatted request body and return the input as a torch.tensor, as required for BERT:
def input_fn(request_body, request_content_type):
    # tokenizer and MAX_LEN are defined at module level in the script.
    if request_content_type == "application/json":
        sentence = json.loads(request_body)

        input_ids = []
        encoded_sent = tokenizer.encode(sentence, add_special_tokens=True)
        input_ids.append(encoded_sent)

        # pad shorter sentences
        input_ids_padded = []
        for i in input_ids:
            while len(i) < MAX_LEN:
                i.append(0)
            input_ids_padded.append(i)
        input_ids = input_ids_padded

        # mask; 0: added, 1: otherwise
        attention_masks = [[int(token_id > 0) for token_id in sent] for sent in input_ids]

        # convert to PyTorch data types.
        train_inputs = torch.tensor(input_ids)
        train_masks = torch.tensor(attention_masks)

        # train_data = TensorDataset(train_inputs, train_masks)
        return train_inputs, train_masks
    raise ValueError("Unsupported content type: {}".format(request_content_type))
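To make the padding and mask steps concrete, here is a small standalone sketch that works on plain Python lists and leaves out the tokenizer and the tensor conversion. The helper names, the MAX_LEN value, and the token IDs are made up for illustration. Truncating inputs longer than MAX_LEN is an addition that the function above does not perform:

```python
import json

MAX_LEN = 8  # small value for illustration; the real script defines its own MAX_LEN


def parse_request(request_body, content_type):
    """Deserialize a JSON request body into the raw sentence."""
    if content_type != "application/json":
        raise ValueError("Unsupported content type: {}".format(content_type))
    return json.loads(request_body)


def pad_and_mask(token_ids, max_len=MAX_LEN):
    """Pad (or truncate) one encoded sentence and build its attention mask."""
    padded = list(token_ids[:max_len]) + [0] * max(0, max_len - len(token_ids))
    # 1 marks a real token, 0 marks padding.
    mask = [int(token_id > 0) for token_id in padded]
    return padded, mask


sentence = parse_request(json.dumps("Somebody just left."), "application/json")
ids, mask = pad_and_mask([101, 2619, 2074, 2187, 1012, 102])  # made-up token IDs
print(sentence)  # Somebody just left.
print(ids)       # [101, 2619, 2074, 2187, 1012, 102, 0, 0]
print(mask)      # [1, 1, 1, 1, 1, 1, 0, 0]
```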
predict_fn() performs the prediction and returns the result. See the following code:
Finally, we use the returned predictor object to call the endpoint:
result = predictor.predict("Somebody just left - guess who.")
print(np.argmax(result, axis=1)) 
The predicted class is 1, which is expected because the test sentence is grammatically correct.
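As a rough illustration of that last step, the endpoint returns one row of logits per sentence, and the argmax over each row is the class index. The sketch below reproduces this in plain Python. The logit values are made up, and the label names follow the CoLA convention of 0 for unacceptable and 1 for acceptable:

```python
LABELS = {0: "unacceptable", 1: "acceptable"}


def predicted_classes(logits):
    """Return the index of the largest logit in each row (argmax over axis 1)."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


example_logits = [[-1.2, 2.3]]  # made-up values for a single sentence
classes = predicted_classes(example_logits)
print(classes)                       # [1]
print([LABELS[c] for c in classes])  # ['acceptable']
```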
Deploying the endpoint with Elastic Inference
Selecting the right instance type for inference requires deciding between different amounts of GPU, CPU, and memory resources. Optimizing for one of these resources on a standalone GPU instance usually leads to underutilization of other resources. Elastic Inference solves this problem by enabling you to attach the right amount of GPU-powered inference acceleration to your endpoint. In March 2020, Elastic Inference support for PyTorch became available for both Amazon SageMaker and Amazon EC2.
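In the SageMaker Python SDK, attaching an accelerator comes down to passing accelerator_type to deploy. The following is a minimal sketch, assuming a SageMaker estimator or model object that already wraps a model compatible with Elastic Inference. The helper name is hypothetical, and the instance and accelerator types are example values, not recommendations:

```python
def deploy_with_accelerator(model, instance_type="ml.c5.large",
                            accelerator_type="ml.eia2.xlarge", instance_count=1):
    """Deploy `model` on a CPU instance with an Elastic Inference accelerator attached."""
    return model.deploy(
        initial_instance_count=instance_count,
        instance_type=instance_type,  # host instance that provides CPU and memory
        accelerator_type=accelerator_type,  # GPU-powered acceleration attached to the host
    )
```

Sizing the host instance and the accelerator independently is what avoids paying for a full GPU instance when only part of it would be used.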
Loading the TorchScript model and using it for prediction requires small changes in our model loading and prediction functions. We create a new script, deploy_ei.py, that differs slightly from the train_deploy.py script.
For model loading, we use torch.jit.load instead of the BertForSequenceClassification.from_pretrained call from before:
 Yukun Zhu, Ryan Kiros, Rich Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. 2015. Aligning books and movies: Towards story-like visual explanations by watching movies and reading books. In Proceedings of the IEEE international conference on computer vision, pages 19–27.
About the Authors
Qingwei Li is a Machine Learning Specialist at Amazon Web Services. He received his Ph.D. in Operations Research after he broke his advisor’s research grant account and failed to deliver the Nobel Prize he promised. Currently he helps customers in the financial services and insurance industries build machine learning solutions on AWS. In his spare time, he likes reading and teaching.
David Ping is a Principal Solutions Architect with the AWS Solutions Architecture organization. He works with our customers to build cloud and machine learning solutions using AWS. He lives in the NY metro area and enjoys learning the latest machine learning technologies.
Lauren Yu is a Software Development Engineer at Amazon SageMaker. She works primarily on the SageMaker Python SDK, as well as toolkits for integrating PyTorch, TensorFlow, and MXNet with Amazon SageMaker. In her spare time, she enjoys playing viola in the Amazon Symphony Orchestra and Doppler Quartet.