This news article about the full public release of OpenAI’s ‘dangerous’ GPT-2 model was part written by GPT-2

OpenAI’s massive text-generating language model, which was whispered to be too dangerous to release, has finally been published in full after the research lab concluded it has “seen no strong evidence of misuse so far.”

The model, known as GPT-2, was announced back in February. At the time, only a partial version of the model was made public as it was deemed potentially too harmful to unleash: it was feared the technology could be abused to rapidly and automatically churn out large amounts of semi-convincing-looking fake news articles, phishing and spam emails, bogus blog posts, and so on. When The Register was privately given access to GPT-2 to test it, we found that…

…it did not actually contain any malicious code at all. We also found that there was nothing in GPT-2 to cause trouble at all, despite all the alarmist press.

The GPT-2 code was built by researchers from the DeepMind AI research lab in London to allow developers to build more efficient language models. The model generates artificial text for use in games such as Dota 2 and Counter-Strike: Global Offensive.

The code that went on sale for free to researchers was written in Python, which is one of the languages Google uses to power the search engine,

OK, sorry, that’s enough. Yes, those last three paragraphs were actually written by the AI model itself. We fed the hand-crafted opening sentences of this article into an online implementation of the full GPT-2 model as a writing prompt, and what you just read was what the code came up with in response. The web interface was built by Adam King using the full GPT-2 release from OpenAI.
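
If you fancy reproducing the experiment locally rather than through the web interface, here is a minimal sketch using the Hugging Face transformers library; the library choice and the sampling settings are our assumptions, not the setup behind Adam King's site.

```python
# A minimal sketch of repeating the experiment locally with the Hugging Face
# transformers library (an assumption: the article's text came from Adam King's
# web interface, not this code). "gpt2-xl" is the full 1.5-billion-parameter release.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2-xl")

prompt = ("OpenAI's massive text-generating language model, which was "
          "whispered to be too dangerous to release, has finally been "
          "published in full")

# Sampling is random, so every run produces a different continuation.
output = generator(prompt, max_length=200, do_sample=True, top_k=40)
print(output[0]["generated_text"])
```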

DeepMind obviously wasn’t involved in the making of GPT-2, and the text created by the model wasn’t used in either of those computer games at all. As you can see, the software does have the potential to generate fake news but it’s not that convincing. Anyone with half a brain could easily check those claims and find them to be false. Sadly, anything can go viral on the internet, wrong or right, of course.

More impressive, perhaps, is its appearance of self-awareness. The thing admitted it found nothing in its own code that could “cause trouble at all, despite all the alarmist press”. Jokes aside, the snippet above is a prime example of GPT-2’s hit-and-miss abilities. Occasionally, it spits out sentences that are surprisingly good, but as it keeps churning out text, it becomes incoherent. What does it mean for something to go “on sale for free,” anyway?

A responsible disclosure experiment

Despite the model’s shortcomings, OpenAI decided to withhold publishing the full model over fears it could be used to flood the internet with the text equivalent of deepfakes. Instead, the San Francisco-based research lab tentatively tested the waters by releasing larger and larger models, starting from just a few hundred million parameters.

The smallest version contained 117 million parameters, the second had 345 million parameters, the third consisted of 774 million parameters, and the largest one, released on Tuesday, has the full 1.5 billion parameters. The more parameters, the more powerful and capable the model, generally speaking.

“As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper,” the lab previously said.

The decision to hold off releasing the full model split the artificial intelligence community. On the one hand, many applauded OpenAI for acknowledging the risks and for attempting to safely handle a model that could potentially be misused on an unprecedented scale. On the other hand, holding back the code fueled fears of dangerous machine-learning systems ruining humanity.

During the time the full GPT-2 system was publicly withheld, other text-spewing models built by other labs were released into the wild. Researchers at the Allen Institute for AI and the University of Washington, in the US, developed GROVER, a model capable of generating and detecting fake news. They gave other academics the blueprints for GROVER via an application process.

The folks over at Nvidia jammed a whopping 8.3 billion parameters into Google’s language model BERT, and open sourced code to build it, without batting an eyelid. A pair of graduate students, who were at Brown University at the time, even replicated the whole GPT-2 model and stuck it online for anyone to use.

“We demonstrate that many of the results of [OpenAI’s] paper can be replicated by two masters students, with no prior experience in language modeling,” the duo said. “Because of the relative ease of replicating this model, an overwhelming number of interested parties could replicate GPT-2.”

OpenAI stuck to its guns. “While there have been larger language models released since August, we’ve continued with our original staged release plan in order to provide the community with a test case of a full staged release process,” it said in hindsight. “We acknowledge that we cannot be aware of all threats, and that motivated actors can replicate language models without model release.”

To its credit, during the time that the full GPT-2 model was withheld, OpenAI also partnered with boffins to investigate biases in language, and probe how its model could be potentially fine-tuned for particularly worrisome applications, such as producing terrorist propaganda. It also built a detector that can classify whether or not a section of prose was written by GPT-2 with more than 90 per cent certainty. Academics could apply for access to GPT-2 by emailing OpenAI, much like GROVER.
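
For the curious, here is a hedged sketch of scoring a snippet with such a detector via the same transformers library; the RoBERTa-based model id below is a community-hosted port and is an assumption on our part, not necessarily the exact classifier OpenAI shipped.

```python
# A hedged sketch of classifying prose with a GPT-2 output detector.
# The model id is an assumption: a community-hosted RoBERTa-based detector,
# not necessarily the exact classifier OpenAI described.
from transformers import pipeline

detector = pipeline("text-classification", model="roberta-base-openai-detector")

text = ("The GPT-2 code was built by researchers from the DeepMind AI "
        "research lab in London.")
# Labels are reportedly 'Real' / 'Fake', each with a confidence score.
print(detector(text))
```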

“We hope that this test case will be useful to developers of future powerful models, and we’re actively continuing the conversation with the AI community on responsible publication,” the lab added.

You can find the full GPT-2 model right here.

We have one final thought on this fascinating AI research: it’s at least set a bar for human writers. If you want to write news or feature articles, blog posts, marketing emails, and the like, know that you now have to be better than GPT-2’s semi-coherent output. Otherwise, people might as well just read a bot’s output rather than your own. ®

Editor’s note: Any typos on this page were made by a neural network. So there.


Source: https://go.theregister.co.uk/feed/www.theregister.co.uk/2019/11/06/openai_gpt2_released/

Understanding dimensionality reduction in machine learning models

Machine learning algorithms have gained fame for being able to ferret out relevant information from datasets with many features, such as tables with dozens of columns and images with millions of pixels. Thanks to advances in cloud computing, you can often run very large machine learning models without noticing how much computational power works behind the scenes.

But every new feature you add to your problem also adds to its complexity, making it harder to solve with machine learning algorithms. To counter this, data scientists use dimensionality reduction, a set of techniques that remove excessive and irrelevant features from their machine learning models.

Dimensionality reduction slashes the costs of machine learning and sometimes makes it possible to solve complicated problems with simpler models.

The curse of dimensionality

Machine learning models map features to outcomes. For instance, say you want to create a model that predicts the amount of rainfall in one month. You have a dataset of information collected from different cities in separate months. The data points include temperature, humidity, city population, traffic, number of concerts held in the city, wind speed, wind direction, air pressure, number of bus tickets purchased, and the amount of rainfall. Obviously, not all this information is relevant to rainfall prediction.

Some of the features might have nothing to do with the target variable. Evidently, population and number of bus tickets purchased do not affect rainfall. Other features might be correlated to the target variable, but not have a causal relation to it. For instance, the number of outdoor concerts might be correlated to the volume of rainfall, but it is not a good predictor for rain. In other cases, such as carbon emission, there might be a link between the feature and the target variable, but the effect will be negligible.

In this example, it is evident which features are valuable and which are useless. In other problems, the excessive features might not be obvious and may require further data analysis.

But why bother to remove the extra dimensions? When you have too many features, you’ll also need a more complex model. A more complex model means you’ll need a lot more training data and more compute power to train your model to an acceptable level.

And since machine learning has no understanding of causality, models try to map any feature included in their dataset to the target variable, even if there’s no causal relation. This can lead to models that are imprecise and erroneous.

On the other hand, reducing the number of features can make your machine learning model simpler, more efficient, and less data-hungry.

The problems caused by too many features are often referred to as the “curse of dimensionality,” and they’re not limited to tabular data. Consider a machine learning model that classifies images. If your dataset is composed of 100×100-pixel images, then your problem space has 10,000 features, one per pixel. However, even in image classification problems, some of the features are excessive and can be removed.
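
To make that pixel arithmetic concrete, here is a tiny sketch (in Python with NumPy, our choice) of how a 100×100 image becomes a 10,000-feature vector:

```python
# Each pixel is one feature: a 100x100 grayscale image flattens into a
# 10,000-dimensional vector before it reaches a classical ML model.
import numpy as np

image = np.random.rand(100, 100)   # stand-in for one grayscale image
features = image.reshape(-1)       # flatten pixels into a single feature vector
print(features.shape)              # (10000,)
```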

Dimensionality reduction identifies and removes the features that are hurting the machine learning model’s performance or aren’t contributing to its accuracy. There are several dimensionality reduction techniques, each of which is useful for certain situations.

Feature selection

A basic and very efficient dimensionality reduction method is to identify and select a subset of the features that are most relevant to the target variable. This technique is called “feature selection.” Feature selection is especially effective when you’re dealing with tabular data in which each column represents a specific kind of information.

When doing feature selection, data scientists look for two things: features that are highly correlated with the target variable, and features that contribute the most to the dataset’s variance. Libraries such as Python’s Scikit-learn have plenty of good functions to analyze, visualize, and select the right features for machine learning models.
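
As a minimal sketch of what feature selection looks like in practice, the snippet below uses Scikit-learn’s SelectKBest with the f_regression score; the dataset and the choice of k are made up for illustration.

```python
# Feature selection with scikit-learn: score each column against the target
# and keep the k best. The rainfall-style dataset below is synthetic.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))     # 10 candidate features (humidity, traffic, ...)
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.5, size=500)  # target driven by 2 of them

selector = SelectKBest(score_func=f_regression, k=2)
X_reduced = selector.fit_transform(X, y)

print(selector.get_support(indices=True))  # -> [0 3], the informative columns
print(X_reduced.shape)                     # -> (500, 2)
```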

For instance, a data scientist can use scatter plots and heatmaps to visualize the covariance of different features. If two features are highly correlated to each other, then they will have a similar effect on the target variable, and including both in the machine learning model will be unnecessary. Therefore, you can remove one of them without causing a negative impact on the model’s performance.

Above: Heatmaps illustrate the covariance between different features. They are a good guide to finding and culling excessive features.

The same tools can help visualize the correlations between the features and the target variable. This helps remove variables that do not affect the target. For instance, you might find out that out of 25 features in your dataset, seven of them account for 95 percent of the effect on the target variable. This will enable you to shave off 18 features and make your machine learning model a lot simpler without suffering a significant penalty to your model’s accuracy.
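
Here is a minimal sketch of that correlation analysis with pandas and seaborn; the weather-style table is synthetic, invented purely for illustration:

```python
# Visualizing feature/target correlations with pandas and seaborn.
# The weather DataFrame is synthetic, for illustration only.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "temperature": rng.normal(20, 5, 300),
    "humidity": rng.uniform(30, 90, 300),
    "bus_tickets": rng.poisson(1000, 300).astype(float),
})
df["rainfall"] = 0.8 * df["humidity"] + rng.normal(0, 5, 300)

corr = df.corr()
sns.heatmap(corr, annot=True, cmap="coolwarm")
plt.show()

# Sorting correlations with the target suggests which features to keep.
print(corr["rainfall"].drop("rainfall").sort_values(ascending=False))
```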

Projection techniques

Sometimes, you don’t have the option to remove individual features. But this doesn’t mean that you can’t simplify your machine learning model. Projection techniques, also known as “feature extraction,” simplify a model by compressing several features into a lower-dimensional space.

A common example used to represent projection techniques is the “swiss roll” (pictured below), a set of data points that swirl around a focal point in three dimensions. This dataset has three features. The value of each point (the target variable) is measured based on how close it is along the convoluted path to the center of the swiss roll. In the picture below, red points are closer to the center and the yellow points are farther along the roll.

Swiss roll

In its current state, creating a machine learning model that maps the features of the swiss roll points to their value is a difficult task and would require a complex model with many parameters. But with the help of dimensionality reduction techniques, the points can be projected to a lower-dimension space that can be learned with a simple machine learning model.

There are various projection techniques. In the case of the above example, we used “locally-linear embedding” (LLE), an algorithm that reduces the dimension of the problem space while preserving the key elements that separate the values of data points. When the data is processed with LLE, the result looks like the following image, which is like an unrolled version of the swiss roll. As you can see, points of each color remain together. In fact, this problem can be further simplified into a single feature and modeled with linear regression, the simplest machine learning algorithm.

Swiss roll, projected
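
A minimal sketch of that unrolling with Scikit-learn, which ships a generator for the swiss roll dataset; the neighbor count is an arbitrary choice:

```python
# Unrolling the swiss roll with locally-linear embedding (LLE).
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, color = make_swiss_roll(n_samples=1500, noise=0.05)   # 3 features per point
lle = LocallyLinearEmbedding(n_components=2, n_neighbors=12)
X_unrolled = lle.fit_transform(X)                        # projected to 2 features

print(X.shape, "->", X_unrolled.shape)                   # (1500, 3) -> (1500, 2)
```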

While this example is hypothetical, you’ll often face problems that can be simplified if you project the features to a lower-dimensional space. For instance, “principal component analysis” (PCA), a popular dimensionality reduction algorithm, has found many useful applications in simplifying machine learning problems.

In the excellent book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, data scientist Aurélien Géron shows how you can use PCA to reduce the MNIST dataset from 784 features (28×28 pixels) to 150 features while preserving 95 percent of the variance. This level of dimensionality reduction has a huge impact on the costs of training and running artificial neural networks.
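
Here is a sketch of that compression with Scikit-learn; passing a float to n_components tells PCA to keep just enough components to preserve that fraction of the variance (fetching MNIST from OpenML can be slow):

```python
# PCA on MNIST: keep enough principal components for 95% of the variance.
from sklearn.datasets import fetch_openml
from sklearn.decomposition import PCA

X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)

pca = PCA(n_components=0.95)                # fraction = share of variance to preserve
X_reduced = pca.fit_transform(X)
print(X.shape[1], "->", pca.n_components_)  # 784 -> roughly 150

# New data must go through the same projection before inference; going back
# with inverse_transform yields only an approximate reconstruction.
X_back = pca.inverse_transform(pca.transform(X[:5]))
```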

There are a few caveats to consider about projection techniques. Once you develop a projection technique, you must transform new data points into the lower-dimensional space before running them through your machine learning model. However, the costs of this preprocessing step are small compared to the gains of having a lighter model. A second consideration is that transformed data points are not directly representative of their original features; transforming them back to the original space can be tricky and in some cases impossible. This might make it difficult to interpret the inferences made by your model.

Dimensionality reduction in the machine learning toolbox

Having too many features will make your model inefficient, but removing too many features will not help either. Dimensionality reduction is one among many tools data scientists can use to make better machine learning models, and as with every tool, it must be used with caution and care.

Ben Dickson is a software engineer and the founder of TechTalks, a blog that explores the ways technology is solving and creating problems.

This story originally appeared on Bdtechtalks.com. Copyright 2021


Source: https://venturebeat.com/2021/05/16/understanding-dimensionality-reduction-in-machine-learning-models/

Bitcoin Mining Company Vows to be Carbon Neutral Following Tesla’s Recent Statement

Last week, Elon Musk and Tesla shocked the entire crypto industry following an announcement that the electric car company will no longer accept bitcoin payments for “environmental reasons.”

A Hard Pill For Bitcoin Maximalists

Giving its reasons, Tesla argued that Bitcoin mining requires massive energy consumption, much of which is generated from fossil fuels, especially coal, and as such causes environmental pollution.

The announcement caused a market dip in which over $4 billion in short and long positions were liquidated, while the market’s total capitalization lost almost $400 billion in a day.

For Bitcoin maximalists and proponents, Tesla’s decision was a hard pill to swallow, and that was evident in their responses to the electric car company and its CEO.

While the likes of Max Keiser lambasted Musk for his company’s move, noting that it was due to political pressure, others like popular YouTuber Chris Dunn were seen canceling their Tesla Cybertruck orders.


Adding more fuel to the fire, Musk also responded to a long Twitter thread by Peter McCormack, implying that Bitcoin is not actually decentralized.

Musk Working With Dogecoin Devs

Elon Musk, who named himself the “Dogefather” on SNL, created a Twitter poll, asking his nearly 55 million followers if they want Tesla to integrate DOGE as a payment option.

The poll, which had almost 4 million votes, was favorable for Dogecoin, as more than 75% of the community voted “Yes.”

Following Tesla’s announcement, the billionaire tweeted that he is working closely with Dogecoin developers to improve transaction efficiency, saying that it is “potentially promising.”

Tesla dropping bitcoin as a payment instrument over energy concerns, while possibly integrating dogecoin payments, comes as a surprise to bitcoiners, since both cryptocurrencies use a Proof-of-Work (PoW) consensus algorithm and, as such, face the same underlying energy problem.

Elon Musk: Dogecoin Beats Bitcoin

Despite using a PoW algorithm, Elon Musk continues to favor Dogecoin over Bitcoin. Responding to a tweet that covered some of the reasons why Musk easily chose DOGE over BTC, the billionaire CEO agreed that Dogecoin beats Bitcoin in many ways.

Comparing DOGE to BTC, Musk noted that “DOGE speeds up block time 10X, increases block size 10X & drops fee 100X. Then it wins hands down.”

Max Keiser: Who’s The Bigger Idiot?

As Elon Musk continues his lovey-dovey affair with Dogecoin, Bitcoin proponents continue to criticize the Dogefather.

Following Musk’s comments on Dogecoin today, popular Bitcoin advocate Max Keiser took to his Twitter page to ridicule the Tesla boss while recalling when gold bug Peter Schiff described Bitcoin as “intrinsically worthless” after he lost access to his BTC wallet.

“Who’s the bigger idiot?” Keiser asked.

Aside from Keiser, other Bitcoin proponents such as Michael Saylor also replied to Tesla’s CEO.

Source: https://coingenius.news/bitcoin-mining-company-vows-to-be-carbon-neutral-following-teslas-recent-statement-6/?utm_source=rss&utm_medium=rss&utm_campaign=bitcoin-mining-company-vows-to-be-carbon-neutral-following-teslas-recent-statement-6

PlotX v2 Mainnet Launch: DeFi Prediction Markets

In early Sunday trading, BTC prices had fallen to their lowest levels for over 11 weeks, hitting $46,700 before a minor recovery.

The last time Bitcoin dropped to these levels was at the end of February, during the second major correction of this ongoing rally. A rebound off that bottom sent prices above $60K for the first time within the two weeks that followed.

Later today, Bitcoin will close another weekly candle. If the candle closes at those levels, it will be the worst weekly close since February 22nd, when BTC ended the week at $45,240, according to Bitstamp. Two weeks ago, the weekly candle closed at $49,200, which is currently the lowest weekly close since February.

Second ‘Lower Low’ For Bitcoin

This time around, things feel slightly different and the bearish sentiment is returning to crypto-asset markets. Since its all-time high of $65K on April 14, Bitcoin has made a lower high and has now formed a second lower low on the daily chart, which is indicative of a larger downtrend developing.

Analyst ‘CryptoFibonacci’ has been eyeing the weekly chart, which also suggests the bulls could be running out of steam.


The move appears to have been driven by Elon Musk again with a tweet about Bitcoin’s energy consumption on May 13. Bitcoin’s fear and greed index has dropped to 20 – ‘extreme fear’ – its lowest level since the March 2020 market crash. At the time of press, BTC was trading at just under $48,000, down 4% over the past 24 hours.

Market Cap Shrinks by $150B

As usual, the move has initiated a selloff for the majority of other cryptocurrencies resulting in around $150 billion exiting the markets over the past day or so.

The total market cap has declined to $2.3 trillion after an all-time high of $2.5 trillion on May 12. Valuations are still high on the long-term view, but losses could accelerate rapidly if the bearish sentiment increases.

Not all crypto assets are correcting this weekend, and some have been building on recent gains to push even higher – although they are few in number.

Those weekend warriors include Cardano, which has added 4.8% on the day to trade at $2.27, according to Coingecko. ADA hit an all-time high of $2.36 on Saturday, May 15, a gain of 54% over the past 30 days.

Ripple’s XRP is also seeing a resurgence with a 13% pump on the day to flip Cardano for the fourth spot. XRP is currently trading at $1.58 with a market cap of $73 billion. The only other two cryptocurrencies in the green at the time of writing are Stellar and Solana, gaining 3.7% and 12% respectively.

Source: https://coingenius.news/plotx-v2-mainnet-launch-defi-prediction-markets-58/?utm_source=rss&utm_medium=rss&utm_campaign=plotx-v2-mainnet-launch-defi-prediction-markets-58
