How To Build and Deploy an NLP Model with FastAPI: Part 1

Davis David (@davisdavid)

Data Scientist | AI Practitioner | Software Developer. Giving talks, teaching, writing.

Model deployment is one of the most important skills you should have if you’re going to work with NLP models.

Model deployment is the process of integrating your model into an existing production environment. The model will receive input and predict an output for decision-making for a specific use case.

“Only when a model is fully integrated with the business systems, we can extract real value from its predictions”. – Christopher Samiullah

There are different ways to deploy your NLP model into production: you can use Flask, Django, Bottle, and so on. In today’s article, you will learn how to build and deploy your NLP model with FastAPI.

In this series of articles, you will learn:

  • How to build an NLP model that classifies IMDB movie reviews into different sentiments.
  • What is FastAPI and how to install it.
  • How to deploy your model with FastAPI.
  • How to use your deployed NLP model in any Python application.

In part 1, we will focus on building an NLP model that can classify movie reviews into different sentiments. So let’s get started!

How to Build the NLP Model

First, we need to build our NLP model. We are going to use the IMDB Movie dataset to build a simple model that can classify whether a movie review is positive or negative. Here are the steps you should follow to do that.

Import Important Packages

First, we import the Python packages we need to load the data, clean the data, create a machine learning model (classifier), and save the model for deployment.

# import important modules
import numpy as np
import pandas as pd

# sklearn modules
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB  # classifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.feature_extraction.text import TfidfVectorizer

# text preprocessing modules
from string import punctuation
from nltk.tokenize import word_tokenize
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
import re  # regular expression

# Download dependencies ("stopwords" is needed by stopwords.words() later on)
for dependency in (
    "brown", "names", "wordnet", "averaged_perceptron_tagger",
    "universal_tagset", "stopwords",
):
    nltk.download(dependency)

import warnings
warnings.filterwarnings("ignore")

# seeding
np.random.seed(123)

Load the dataset from the data folder.

# load data
data = pd.read_csv("../data/labeledTrainData.tsv", sep="\t")

Show a sample of the dataset.

# show top five rows of data
data.head() 

Our dataset has 3 columns.

  • id – the ID of the review
  • sentiment – either positive (1) or negative (0)
  • review – the text of the review about the movie

Check the shape of the dataset.

# check the shape of the data
data.shape

(25000, 3)

The dataset has 25,000 reviews.

We need to check if the dataset has any missing values.

# check missing values in data
data.isnull().sum()

id           0
sentiment    0
review       0
dtype: int64

The output shows that our dataset does not have any missing values.

How to Evaluate Class Distribution

We can use the value_counts() method from the pandas package to evaluate the class distribution from our dataset.

# evaluate review sentiment distribution
data.sentiment.value_counts()

1    12500
0    12500
Name: sentiment, dtype: int64

In this dataset, we have an equal number of positive and negative reviews.
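To see proportions rather than raw counts, value_counts() also accepts normalize=True. The following is a minimal sketch on a hypothetical toy DataFrame (the labels are made up, not taken from the IMDB data):

```python
import pandas as pd

# Hypothetical toy labels standing in for data.sentiment
toy = pd.DataFrame({"sentiment": [1, 0, 1, 0, 1, 0]})

# normalize=True returns each class's share of the rows instead of its count
proportions = toy["sentiment"].value_counts(normalize=True)
print(proportions)
```

A perfectly balanced dataset such as ours would show a share of 0.5 for each class.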

How to Process the Data

After analyzing the dataset, the next step is to preprocess the dataset into the right format before creating our machine learning model.

The reviews in this dataset contain a lot of unnecessary words and characters that we don’t need when creating a machine learning model.

We will clean the reviews by removing stopwords, numbers, and punctuation. Then we will convert each word into its base form by using the lemmatization process in the NLTK package.

The text_cleaning() function will handle all necessary steps to clean our dataset.

stop_words = stopwords.words('english')

def text_cleaning(text, remove_stop_words=True, lemmatize_words=True):
    # Clean the text, with the option to remove stop words and to lemmatize words

    # Clean the text
    text = re.sub(r"[^A-Za-z0-9]", " ", text)
    text = re.sub(r"\'s", " ", text)
    text = re.sub(r'http\S+', ' link ', text)
    text = re.sub(r'\b\d+(?:\.\d+)?\s+', '', text)  # remove numbers

    # Remove punctuation from text
    text = ''.join([c for c in text if c not in punctuation])

    # Optionally, remove stop words
    if remove_stop_words:
        text = text.split()
        text = [w for w in text if w not in stop_words]
        text = " ".join(text)

    # Optionally, reduce words to their base form (lemmas)
    if lemmatize_words:
        text = text.split()
        lemmatizer = WordNetLemmatizer()
        lemmatized_words = [lemmatizer.lemmatize(word) for word in text]
        text = " ".join(lemmatized_words)

    # Return the cleaned text as a single string
    return text
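To get a feel for what the regex and stop-word steps do, here is a minimal sketch that applies similar steps to one made-up review. It deliberately leaves out NLTK so it runs without downloading any corpora: simple_clean() and SAMPLE_STOP_WORDS are hypothetical names introduced only for this illustration, and the tiny stop-word set is an assumption standing in for NLTK's English list.

```python
import re
from string import punctuation

# Hypothetical stand-in for NLTK's English stop-word list (assumption),
# kept tiny so the example runs offline.
SAMPLE_STOP_WORDS = {"the", "was", "a", "and", "it", "is"}


def simple_clean(text):
    """Apply regex and stop-word steps similar to text_cleaning(), without NLTK."""
    text = re.sub(r"[^A-Za-z0-9]", " ", text)        # non-alphanumerics become spaces
    text = re.sub(r"\b\d+(?:\.\d+)?\s+", "", text)   # drop standalone numbers
    text = "".join(c for c in text if c not in punctuation)
    words = [w for w in text.split() if w not in SAMPLE_STOP_WORDS]
    return " ".join(words)


sample = "The movie was great!!! I watched it 2 times, and the ending is perfect."
cleaned = simple_clean(sample)
print(cleaned)  # The movie great I watched times ending perfect
```

Note that the capitalized "The" survives because the stop-word comparison is case-sensitive and the text is never lowercased. The same applies to text_cleaning(), since NLTK's English stop words are lowercase.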

Now we can clean our dataset by using the text_cleaning() function.

#clean the review
data["cleaned_review"] = data["review"].apply(text_cleaning)

Then split data into feature and target variables.

#split features and target from data 
X = data["cleaned_review"]
y = data.sentiment.values

Our feature for training is the cleaned_review variable and the target is the sentiment variable.

The TfidfVectorizer class from scikit-learn will help us transform our cleaned reviews into numerical values. It converts a collection of text documents to a matrix of TF-IDF features.

# Fit the vectorizer on the cleaned reviews
tfidf_transformer = TfidfVectorizer(lowercase=False)
tfidf_transformer.fit(X)

# transform data
X_transformed = tfidf_transformer.transform(X)
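As a rough illustration of what the vectorizer produces, the sketch below runs TfidfVectorizer on three made-up documents. The documents are assumptions for demonstration only. The real reviews yield a much larger, sparser matrix.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Three hypothetical "cleaned reviews"
docs = ["good movie", "bad movie", "good good plot"]

vectorizer = TfidfVectorizer(lowercase=False)
matrix = vectorizer.fit_transform(docs)

# One row per document, one column per distinct word in the vocabulary
print(matrix.shape)                     # (3, 4)
print(sorted(vectorizer.vocabulary_))   # ['bad', 'good', 'movie', 'plot']
```

Each cell holds the TF-IDF weight of one word in one document, so words that appear in many documents (here "movie" and "good") get a lower inverse-document-frequency weight than rarer ones.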

We then split our dataset into training and validation sets. The validation set is 15% of the entire dataset.

# split data into train and validate
X_train, X_valid, y_train, y_valid = train_test_split(
    X_transformed, y,
    test_size=0.15, random_state=42, shuffle=True,
    stratify=y,
)
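The code above fits the vectorizer on all reviews before splitting. A common alternative, sketched here on hypothetical toy data, is to split the raw text first and fit the vectorizer on the training portion only, so that no validation text influences the vocabulary or the IDF weights. This is not what the article's code does. It is shown only as a variation you may want to consider.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Toy stand-ins for the cleaned reviews and their labels (hypothetical data)
texts = ["great film", "loved it", "wonderful acting", "superb plot",
         "awful film", "hated it", "terrible acting", "boring plot"]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Split the raw text first; stratify keeps the class balance in both parts
text_train, text_valid, y_train, y_valid = train_test_split(
    texts, labels, test_size=0.25, random_state=42, shuffle=True, stratify=labels
)

vectorizer = TfidfVectorizer(lowercase=False)
X_train = vectorizer.fit_transform(text_train)  # vocabulary learned from training text only
X_valid = vectorizer.transform(text_valid)      # validation text reuses that vocabulary

print(X_train.shape[0], X_valid.shape[0])  # 6 2
print(sorted(y_valid))                     # [0, 1]
```

Because of stratify, the two validation reviews contain one example of each class, mirroring the 50/50 balance of the full toy set.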

How to Actually Create Our NLP Model

We will train the Multinomial Naive Bayes algorithm to classify if a review is positive or negative. This is one of the most common algorithms used for text classification.

# Create a classifier
sentiment_classifier = MultinomialNB()

Then we train our classifier.

# train the sentiment classifier
sentiment_classifier.fit(X_train, y_train)

We then generate predictions for the validation set.

# test model performance on valid data 
y_preds = sentiment_classifier.predict(X_valid)
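To make the fit and predict flow concrete, here is a small sketch that trains MultinomialNB on four made-up reviews and then scores unseen text. The toy reviews and labels are assumptions, not part of the IMDB dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy training data: 1 = positive, 0 = negative
train_texts = ["great great film", "great plot", "awful film", "awful awful plot"]
train_labels = [1, 1, 0, 0]

vectorizer = TfidfVectorizer(lowercase=False)
classifier = MultinomialNB()
classifier.fit(vectorizer.fit_transform(train_texts), train_labels)

# New text must go through the same fitted vectorizer before prediction
new_reviews = ["great", "awful"]
features = vectorizer.transform(new_reviews)

predictions = classifier.predict(features)          # hard class labels
probabilities = classifier.predict_proba(features)  # columns follow classifier.classes_
print(predictions)
print(probabilities.round(3))
```

predict() returns the most likely class for each review, while predict_proba() exposes the per-class probabilities behind that decision, which can be useful when you want a confidence score alongside the label.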

The model’s performance will be evaluated by using the accuracy_score evaluation metric. Accuracy is a suitable metric here because the two sentiment classes are balanced.

accuracy_score(y_valid,y_preds)

0.8629333333333333

The accuracy of our model is about 86.29%, which is good performance for such a simple model.
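Accuracy is simply the fraction of predictions that match the true labels. The sketch below computes it by hand, together with the four confusion-matrix counts, on hypothetical labels. The helper names accuracy() and confusion_counts() are introduced only for this illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_counts(y_true, y_pred):
    """Return (true_pos, false_pos, true_neg, false_neg) for binary 0/1 labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, tn, fn


# Hypothetical true labels and predictions
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(accuracy(y_true, y_pred))          # 0.75
print(confusion_counts(y_true, y_pred))  # (3, 1, 3, 1)
```

On a balanced dataset the two kinds of error carry similar weight, so a single accuracy figure summarizes performance reasonably well. With imbalanced classes, the individual counts become more informative.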

Save Model

The model will be saved in the models directory by using the joblib Python package.

#save model 
import joblib

joblib.dump(sentiment_classifier, '../models/sentiment-model.pkl')

Our fitted TfidfVectorizer will also be saved in the preprocessing directory.

# save the fitted vectorizer (the tfidf_transformer instance, not the TfidfVectorizer class)
joblib.dump(tfidf_transformer, '../preprocessing/tfidf_vectorizer.pkl')
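A save-and-load round trip can be sketched as follows. It uses hypothetical toy data and a temporary directory instead of the article's ../models and ../preprocessing folders, and it saves the fitted vectorizer object (an instance, not the class) so that the loaded copy can transform new text.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy training data: 1 = positive, 0 = negative
texts = ["great film", "great plot", "awful film", "awful plot"]
labels = [1, 1, 0, 0]

vectorizer = TfidfVectorizer(lowercase=False)
model = MultinomialNB().fit(vectorizer.fit_transform(texts), labels)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "sentiment-model.pkl"
    vectorizer_path = Path(tmp_dir) / "tfidf_vectorizer.pkl"

    # Save the fitted objects
    joblib.dump(model, model_path)
    joblib.dump(vectorizer, vectorizer_path)

    # Load them back, as a separate application would
    loaded_model = joblib.load(model_path)
    loaded_vectorizer = joblib.load(vectorizer_path)

# The loaded pair behaves like the original pair
prediction = loaded_model.predict(loaded_vectorizer.transform(["great"]))
print(prediction)  # [1]
```

Both files are needed at prediction time: the vectorizer turns raw text into the same feature columns the model was trained on, and the model turns those features into a sentiment label.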

Congratulations 👏👏, you have made it to the end of part 1. I hope you have learned something new about how to build an NLP model. In part 2, we will learn how to deploy our NLP model with FastAPI and run it in Python applications.

If you learned something new or enjoyed reading this article, please share it so that others can see it. Until then, see you in part 2!

You can also find me on Twitter @Davis_McDavid.

Source: https://hackernoon.com/how-to-build-and-deploy-an-nlp-model-with-fastapi-part-1-n5w35cj?source=rss