

Machine Learning Tutorial: The Multinomial Logistic Regression (Softmax Regression)




In the previous two machine learning tutorials, we examined the Naive Bayes and the Max Entropy classifiers. In this tutorial we will discuss the Multinomial Logistic Regression, also known as Softmax Regression. Implementing Multinomial Logistic Regression in a conventional programming language such as C++, PHP or Java is fairly straightforward, even though an iterative algorithm is required to estimate the parameters of the model.

Update: The Datumbox Machine Learning Framework is now open-source and free to download. Check out the package com.datumbox.framework.machinelearning.classification to see the implementation of SoftMax Regression Classifier in Java.

What is the Multinomial Logistic Regression?

The Multinomial Logistic Regression, also known as Softmax Regression due to the hypothesis function that it uses, is a supervised learning algorithm which can be used in several problems, including text classification. It is a regression model which generalizes logistic regression to classification problems where the output can take more than two possible values. We should note that Multinomial Logistic Regression is closely related to the MaxEnt algorithm, since both use the same activation function. Nevertheless, in this article we will present the method in a different context than we did in the Max Entropy tutorial.

When to use Multinomial Logistic Regression?

Multinomial Logistic Regression requires significantly more time to train compared to Naive Bayes, because it uses an iterative algorithm to estimate the parameters of the model. Once these parameters are computed, Softmax Regression is competitive in terms of CPU and memory consumption. Softmax Regression is preferred when we have features of different types (continuous, discrete, dummy variables, etc.); nevertheless, since it is a regression model, it is more vulnerable to multicollinearity problems and should therefore be avoided when our features are highly correlated.

Theoretical Background of Softmax Regression

As with Max Entropy, we will present the algorithm in the context of document classification. Thus we will use the contextual information of the document in order to categorize it to a certain class. Let our training dataset consist of m pairs (x_i, y_i) and let k be the number of all possible classes. Also, using the bag-of-words framework, let {w_1, …, w_n} be the set of n words that can appear within our texts.

The Softmax Regression model requires the estimation of a coefficient θ for every word-and-category combination. The sign and magnitude of this coefficient indicate whether the presence of the particular word within a document has a positive or negative effect on its classification to the category. In order to build our model we need to estimate these parameters. (Note that the θ_i vector stores the coefficients of the ith category for each of the n words, plus one coefficient for the intercept term.)

In accordance with what we did previously for Max Entropy, all the documents within our training dataset will be represented as vectors with 0s and 1s that indicate whether each word of our vocabulary exists within the document. In addition, all vectors will include an additional “1” element for the intercept term.
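As a sketch of this representation (the vocabulary and document below are made up for illustration; they do not come from the tutorial), the encoding might look like:

```python
# Build a binary bag-of-words vector with a leading "1" for the intercept term.
# Vocabulary and document are illustrative examples, not from the tutorial.

def vectorize(document, vocabulary):
    """Return [1, b_1, ..., b_n]: intercept plus one 0/1 flag per vocabulary word."""
    words = set(document.lower().split())
    return [1] + [1 if w in words else 0 for w in vocabulary]

vocabulary = ["ball", "game", "election", "vote"]
doc = "The game ended after the last ball"
x = vectorize(doc, vocabulary)
print(x)  # -> [1, 1, 1, 0, 0]
```

The first element is always 1 (the intercept), and the remaining elements simply flag the presence or absence of each vocabulary word.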

In Softmax Regression, the probability that a given document x is classified to category y = j is equal to:

P(y = j | x; θ) = exp(θ_j^T x) / Σ_{l=1}^{k} exp(θ_l^T x)
Thus, given that we have estimated the aforementioned θ parameters, for a new document x our hypothesis function will need to estimate the above probability for each of the k possible classes. That is, the hypothesis function returns a k-dimensional vector with the estimated probabilities:

h_θ(x) = [ P(y = 1 | x; θ), P(y = 2 | x; θ), …, P(y = k | x; θ) ]^T
By using the “maximum a posteriori” decision rule, when we classify a new document we will select the category with the highest probability.
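The hypothesis function and the "maximum a posteriori" rule can be sketched in Python as follows (the θ matrix and document vector hold made-up illustrative values; subtracting the maximum score before exponentiating is a standard numerical-stability trick, not part of the formula above):

```python
import numpy as np

def hypothesis(theta, x):
    """Softmax probabilities: theta has shape (k, n+1), x has shape (n+1,)."""
    scores = theta @ x
    scores -= scores.max()            # numerical stability; does not change the result
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()

theta = np.array([[0.5, 1.0, -1.0],   # coefficients of category 0 (intercept first)
                  [0.1, -0.5, 2.0]])  # coefficients of category 1
x = np.array([1.0, 1.0, 0.0])         # document vector with leading intercept term
p = hypothesis(theta, x)
print(p, p.argmax())  # probabilities sum to 1; argmax gives the MAP category
```

The returned vector always sums to 1, and the MAP decision rule simply picks its largest element.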

In our Multinomial Logistic Regression model we will use the following cost function and we will try to find the θ parameters that minimize it:

J(θ) = -(1/m) [ Σ_{i=1}^{m} Σ_{j=1}^{k} 1{y_i = j} log( exp(θ_j^T x_i) / Σ_{l=1}^{k} exp(θ_l^T x_i) ) ]

where 1{·} is the indicator function, equal to 1 when its argument is true and 0 otherwise.
Unfortunately, there is no known closed-form way to estimate the parameters that minimize the cost function, and thus we need to use an iterative algorithm such as gradient descent. The iterative algorithm requires us to estimate the partial derivative of the cost function, which is equal to:

∇_{θ_j} J(θ) = -(1/m) Σ_{i=1}^{m} [ x_i ( 1{y_i = j} - P(y_i = j | x_i; θ) ) ]
By using the batch gradient descent algorithm we estimate the θ parameters as follows:

1. Initialize every vector θj with 0 in all elements
2. Repeat until convergence {
            θj <- θj - α ∇_{θj} J(θ)   (for every j)
   }

After estimating the θ parameters, we can use our model to classify new documents. Finally, we should note that, as we discussed in a previous article, using Gradient Descent “as is” is never a good idea. Adapting the learning rate and normalizing the feature values before performing the iterations are ways to improve convergence and speed up the execution.
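Putting the pieces together, a minimal batch gradient descent sketch for this model might look like the following (the dataset, learning rate, and iteration count are illustrative; no adaptive learning rate or convergence check is included):

```python
import numpy as np

def train_softmax(X, y, k, alpha=0.5, iters=500):
    """Batch gradient descent on the softmax cost.
    X: (m, n+1) design matrix with a leading column of 1s; y: (m,) labels in 0..k-1."""
    m, d = X.shape
    theta = np.zeros((k, d))                      # step 1: all coefficients start at 0
    Y = np.eye(k)[y]                              # one-hot labels, shape (m, k)
    for _ in range(iters):                        # step 2: repeat (fixed iteration budget)
        scores = X @ theta.T
        scores -= scores.max(axis=1, keepdims=True)   # numerical stability
        P = np.exp(scores)
        P /= P.sum(axis=1, keepdims=True)         # P[i, j] = P(y_i = j | x_i; theta)
        grad = -(Y - P).T @ X / m                 # partial derivative of the cost
        theta -= alpha * grad                     # theta_j <- theta_j - alpha * grad_j
    return theta

# Tiny synthetic dataset: the class is determined by the second feature.
X = np.array([[1., 0., 0.], [1., 0., 1.], [1., 1., 0.], [1., 1., 1.]])
y = np.array([0, 0, 1, 1])
theta = train_softmax(X, y, k=2)
pred = (X @ theta.T).argmax(axis=1)
print(pred)  # should recover the training labels
```

The gradient line is a vectorized form of the partial derivative above: for each class j it averages x_i weighted by the error term (1{y_i = j} - P(y_i = j | x_i; θ)).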

Did you like the article? Please take a minute to share it on Twitter. 🙂

About Vasilis Vryniotis

My name is Vasilis Vryniotis. I’m a Data Scientist, a Software Engineer, author of Datumbox Machine Learning Framework and a proud geek. Learn more



FYI: You do all know that America’s tech giants, even Google, supply IT to the US military, right?




Despite all those protests, internal and external, by tech workers against their employers’ selling AI to the US military, the Pentagon’s Joint Artificial Intelligence Center (JAIC) this week said the biggest names in IT are lining up to supply Uncle Sam.

Founded in 2018, the JAIC focuses on deploying machine-learning systems to support America’s armed forces. It invests in all sorts of applications and platforms, from internal private clouds to drones.

Nand Mulchandani, who took over as acting director of the organization after Lieutenant General Jack Shanahan retired in June, told reporters in a briefing on Wednesday that the center has “major contracts with all the major tech companies – including Google.”

The acting director singled out the search giant presumably to highlight what he saw as a bout of hypocrisy by Google, which made a big play about not helping Uncle Sam develop instruments of death yet still provides tech services on the down-low.

In 2018, CEO Sundar Pichai pulled Google out of Project Maven – a contract to provide the US military with object-tracking AI to analyze drone surveillance footage – after a revolt by Googlers. Soon after, the chief exec publicly declared Google wouldn’t, among other things, “design or deploy … technologies that cause or are likely to cause overall harm” nor “weapons or other technologies whose principal purpose or implementation is to cause or directly facilitate injury to people.”

Later that year, Pichai also withdrew the web corp’s bid for the Pentagon’s $10bn winner-takes-all Joint Enterprise Defense Infrastructure (JEDI) cloud contract because it did not align with his company’s values. An inability to gain the certification needed to offer cloud services to the military was also a little tiny roadblock, as was its fear of publicly losing out to Amazon or Microsoft. But in any case Google spun itself as a lofty pacifist and definitely not a digital arms manufacturer, and so it wasn’t going to get involved in this sort of stuff again.

Mulchandani, however, painted a different picture: the JAIC’s bonds with massive tech corps were “only getting stronger” no matter the protests and promises from Silicon Valley techies and executives, he claimed. Some of these technology suppliers may be working directly with the government, and some through a network of subcontractors. Some giants, particularly Microsoft, are outwardly proud to serve the US military, technology-wise.


Microsoft to staff: We remain locked and loaded with US military – and will keep adding voice to AI ethics debate


A study by Tech Inquiry, an investigative nonprofit led by an ex-Googler, revealed Google as well as Microsoft, Amazon, Facebook, Nvidia, Dell, HP, IBM, Twitter, Palantir, and others, supply the Department of Defense and the Feds with technology one way or another. For example, Google’s G Suite is used by the Navy and FBI, we see.

Tech Inquiry’s Jack Poulson noted: “On balance, Google’s position became supporting the DoD’s cloud and cybersecurity while avoiding direct contributions to weapons systems.”

As for the JAIC, it takes an interest in things like the role of machine learning in warfighting operations, warfighter health, logistics, and information warfare. Mulchandani said that most of the organization’s budget is funneled into warfighting operations. “It is true that many of our products we work on now will go into weapon systems,” he said.

But he was quick to insist none of those systems were autonomous: humans still give the final commands, we understand. In other words, the center isn’t building killer robots or weapons that decide solely whether someone lives or dies. “It’s such an outer edge case,” he said. “We are nowhere near that in a platforms, hardware, software, or algorithms perspective to even get near that.”

Another controversial area that the JAIC said it wasn’t involved in is facial recognition. Mulchandani acknowledged that other countries, such as China and Russia, are more ready to deploy this sort of tech compared to his operation. “[We’re] not behind, it’s just that we don’t build [those systems]. We don’t build surveillance and censorship technologies,” he insisted. ®




Microsoft announces new Azure AI capabilities for apps, healthcare, and more




The latest announcements will help companies enhance their voice-enabled application experiences and provide critical data management across healthcare industries.


In the era of digital transformation, more organizations across industries are looking to leverage artificial intelligence (AI) to enhance day-to-day operations. In recent weeks, a number of organizations have tapped AI to help mitigate the spread of the coronavirus. These applications range from using AI systems to monitor social distancing and contact tracing to identifying potential treatments for COVID-19. Earlier today, Microsoft announced a series of updates to the Azure AI system to help with everything from enhanced healthcare data management to leveraging the latest voice-enabled technologies for enhanced customer engagement experiences.


Text Analytics

In partnership with the Allen Institute for AI and other research groups, Microsoft developed the COVID-19 Open Research Dataset. Utilizing nearly 50,000 scholarly articles, the team created a COVID-19 search engine. This search engine uses Microsoft Cognitive Search and Text Analytics for health to allow researchers to produce new medical insights to combat the spread of the coronavirus.


As part of the Microsoft announcement, the company unveiled a new Text Analytics for health feature. This new Text Analytics feature will allow healthcare organizations, providers, and researchers to gain insights and correlations from unstructured medical information. This new feature has been trained on a wide spectrum of medical information and is capable of “processing a broad range of data types and tasks, without the need for time-intensive, manual development of custom models to extract insights from the data,” per Microsoft.

Form Recognizer

Currently, unstructured medical data is stored in forms composed of objects, tables, and other ordering components. To gain insights from this unstructured data effectively, people have historically had to manually label or code each of these document types. To assist with this arduous process, Microsoft also announced the general availability of its Form Recognizer tool, which enables individuals to extract this data more quickly, accurately, and efficiently.

“Our Cognitive Document Processing (CDP) offer enables clients to process and classify unstructured documents and extract data with high accuracy resulting in reduced operating costs and processing time. CDP leverages the powerful cognitive and tagging capabilities of the Form Recognizer to extract effortlessly, keyless paired data and other relevant information from scanned/digital unstructured documents, further reducing the overall process time,” said Mark Oost, chief technology officer at Sogeti.


Custom Commands

Microsoft also announced a Custom Commands feature designed to assist with voice-enabled applications and integration. Overall, the feature merges Azure’s Speech to Text, Text to Speech, and Language Understanding services, allowing customers to quickly add voice capabilities to their apps “with a low-code authoring experience.” Custom Commands uses the Speech capabilities in Cognitive Services and is now generally available. Microsoft also announced that its Neural Text to Speech would be adding language support with “15 new natural-sounding voices based on state-of-the-art neural speech synthesis models.”



Giving your content a voice with the Newscaster speaking style from Amazon Polly




Audio content consumption has grown exponentially in the past few years. Statista reports that podcast ad revenue will exceed a billion dollars in 2021. For the publishing industry and content providers, providing audio as an alternative option to reading could improve engagement with users and be an incremental revenue stream. Given the shift in customer trends to audio consumption, Amazon Polly launched a new speaking style focusing on the publishing industry: the Newscaster speaking style. This post discusses how the Newscaster voice was built and how you can use the Newscaster voice with your content in a few simple steps.

Building the Newscaster style voice

Until recently, Amazon Polly voices were built such that the speaking style of the voice remained the same, no matter the use case. In the real world, however, speakers change their speaking style based on the situation at hand, from using a conversational style around friends to using upbeat and engaging speech when telling stories. To make voices as lifelike as possible, Amazon Polly has built two speaking style voices: Conversational and Newscaster. Newscaster style, available in US English for Matthew and Joanna, and US Spanish for Lupe, gives content a voice with the persona of a news anchor. Have a listen to the following samples:

With the successful implementation of Neural Text-to-Speech (NTTS), text synthesis no longer relies on a concatenative approach, which mainly consisted of finding the best chunks of recordings to generate synthesized speech. The concatenative approach played audio that was an exact copy of the recordings stored for that voice. NTTS, on the other hand, relies on two end-to-end models that predict waveforms, which results in smoother speech with no joins. NTTS outputs waveforms by learning from training data, which enables seamless transitions between all the sounds and allows us to focus on the rhythm and intonation of the voice to match the existing voice timbre and quality for Newscaster speaking style.

Remixd, a leading audio technology partner for premium publishers, helps publishers and media owners give their editorial content a voice using Amazon Polly. Christopher Rooke, CEO of Remixd, says, “Consumer demand for audio has exploded, and content owners recognize that the delivery of journalism must adapt to meet this moment. Using Amazon Polly’s Newscaster voice, Remixd is helping news providers innovate and keep up with demand to serve the growing customer appetite for audio. Remixd and Amazon Polly make it easy for publishers to remain relevant as content consumption preferences shift.”

Remixd uses Amazon Polly to provide audio content production efficiencies at scale, which makes it easy for publishers to instantly enable audio for new and existing editorial content in real time without needing to invest in costly human voice talent, narration, and pre- and post-production overhead. Rooke adds, “When working with news content, where information is time-sensitive and perishable, the voice quality, and the ability to process large volumes of content and publish the audio version in just a few seconds, is critical to service our customer base.” The following screenshot shows Remixd’s audio player live on the website of one of its customers, the Daily Caller.

“At the Daily Caller, it’s a priority that our content is accessible and convenient for visitors to consume in whichever format they prefer,” says Chad Brady, Director of Operations of the Daily Caller. “This includes audio, which can be time-consuming and costly to produce. Using Remixd, coupled with Amazon Polly’s high-quality newscaster voice, Daily Caller editorial articles are made instantly listenable, enabling us to scale production and distribution, and delight our audience with a best-in-class audio experience both on and off-site.”

The new NTTS technology enables newscaster voices to be more expressive. However, although the expressiveness vastly increases how natural the voice sounds, it also makes the model more susceptible to discrepancies. NTTS technology learns to model intonation patterns for a given punctuation mark based on data it was provided. Because the intonation patterns are much more extreme for style voices, good annotation of the training data is essential. The Amazon Polly team trained the model with an initial small set of newscaster recordings in addition to the existing recordings from the speakers. Having more data leads to more robust models, but to build a model in a cost- and time-efficient manner, the Amazon Polly team worked on concepts such as multi-speaker models, which allow you to use existing resources instead of needing more recordings from the same speaker.

Evaluations have shown that our newscaster voice is preferred over the neutral speaking style for voicing news content. The following histogram shows results for the Joanna Newscaster voice when compared to other voices for the news use case.

Using Newscaster style to voice your audio content

To use the Newscaster style with Python, complete the following steps (this solution requires Python 3):

  1. Set up and activate your virtual environment with the following code:
    $ python3 -m virtualenv ./venv
    $ . ./venv/bin/activate

  2. Install the requirements with the following code:
    $ pip install boto3 click

  3. In your preferred text editor, create a file (called newscaster.py in this example) with the following code:
    import sys

    import boto3
    import click

    polly_c = boto3.client('polly')

    @click.command()
    @click.argument('voice')
    @click.argument('text')
    def main(voice, text):
        if voice not in ['Joanna', 'Matthew', 'Lupe']:
            print('Only Joanna, Matthew and Lupe support the newscaster style')
            sys.exit(1)
        response = polly_c.synthesize_speech(
            VoiceId=voice,
            Engine='neural',
            OutputFormat='mp3',
            TextType='ssml',
            Text=f'<speak><amazon:domain name="news">{text}</amazon:domain></speak>')
        with open('newscaster.mp3', 'wb') as f:
            f.write(response['AudioStream'].read())

    if __name__ == '__main__':
        main()

  4. Run the script, passing the voice name and the text you want spoken:
    $ python ./newscaster.py Joanna "Synthesizing the newsperson style is innovative and unprecedented. And it brings great excitement in the media world and beyond."

This generates newscaster.mp3, which you can play in your favorite media player.


This post walked you through the Newscaster style and how to use it in Amazon Polly. The Matthew, Joanna, and Lupe Newscaster voices are used by customers such as The Globe and Mail, Gannett’s USA Today, the Daily Caller, and many others.

To learn more about using the Newscaster style in Amazon Polly, see Using the Newscaster Style. For the full list of voices that Amazon Polly offers, see Voices in Amazon Polly.

About the Authors

Joppe Pelzer is a Language Engineer working on text-to-speech for English and building style voices. With bachelor’s degrees in linguistics and Scandinavian languages, she graduated from Edinburgh University with an MSc in Speech and Language Processing in 2018. During her master’s she focused on the text-to-speech front end, building and expanding upon multilingual G2P models, and gained experience with NLP, speech recognition, and deep learning. Outside of work, she likes to draw, play games, and spend time in nature.

Ariadna Sanchez is a Research Scientist investigating the application of DL/ML technologies in the area of text-to-speech. After completing a bachelor’s in Audiovisual Systems Engineering, she received her MSc in Speech and Language Processing from University of Edinburgh in 2018. She has previously worked as an intern in NLP and TTS. During her time at University, she focused on TTS and signal processing, especially in the dysarthria field. She has experience in Signal Processing, Deep Learning, NLP, Speech and Image Processing. In her free time, Ariadna likes playing the violin, reading books and playing games.

