AI

How Hasty uses automation and rapid feedback to train AI models and improve annotation

Computer vision is playing an increasingly pivotal role across industry sectors, from tracking progress on construction sites to deploying smart barcode scanning in warehouses. But training the underlying AI model to accurately identify images can be a slow, resource-intensive endeavor that isn’t guaranteed to produce results. Fledgling German startup Hasty wants to help with the promise of “next-gen” tools that expedite the entire model training process for annotating images.

Hasty, which was founded out of Berlin in 2019, today announced it has raised $3.7 million in a seed round led by Shasta Ventures. The Silicon Valley VC firm has a number of notable exits to its name, including Nest (acquired by Google), Eero (acquired by Amazon), and Zuora (IPO). Other participants in the round include iRobot Ventures and Coparion.

The global computer vision market was pegged at $11.4 billion in 2020, a figure that is projected to rise to more than $19 billion by 2027. Data preparation and processing is one of the most time-consuming tasks in AI, accounting for around 80% of time spent on related projects. In computer vision, annotation, or labeling, is a technique used to mark and categorize images to give machines the meaning and context behind the picture, enabling them to spot similar objects. Much of this annotation work falls to trusty old humans.

The problem Hasty is looking to fix is that the vast majority of data science projects never make it into production, with significant resources wasted in the process.

“Current approaches to data labeling are too slow,” Hasty cofounder and CEO Tristan Rouillard told VentureBeat. “Machine learning engineers often have to wait three to six months for first results to see if their annotation strategy and approach is working because of the delay between labeling and model training.”

Make haste

Hasty ships with 10 built-in automated AI assistants, each dedicated to reducing human spadework. Dextr, for instance, lets users highlight an object by clicking just four extreme points (its leftmost, rightmost, topmost, and bottommost pixels), from which the assistant suggests an annotation.

Above: Hasty’s Dextr AI assistant
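
To make the idea concrete, here is a minimal sketch of the extreme-point approach, under the assumption that the four clicks mark the object’s leftmost, rightmost, topmost, and bottommost pixels: those points already pin down a tight bounding box that a segmentation model can refine into a mask. The function and coordinates below are illustrative, not Hasty’s actual API.

```python
# Minimal sketch of extreme-point annotation: four user clicks define a
# tight bounding box that a downstream segmentation model could refine.
# Names and values are illustrative assumptions, not Hasty's API.
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y) in pixel coordinates

def box_from_extreme_points(points: List[Point]) -> Tuple[int, int, int, int]:
    """Derive a tight (x_min, y_min, x_max, y_max) box from four extreme points."""
    if len(points) != 4:
        raise ValueError("expected exactly four extreme points")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)

# Example: clicks on the left, right, top, and bottom of an object.
clicks = [(40, 120), (310, 140), (180, 30), (175, 260)]
print(box_from_extreme_points(clicks))  # -> (40, 30, 310, 260)
```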

And Hasty’s AI “instance segmentation” assistant creates swifter annotations when it finds multiple instances of an object within an image.

Above: Hasty AI instance segmentation

The assistant observes while users annotate and can make suggestions for labels once its predictions reach a specified confidence threshold. The user can then correct these suggestions to improve the model while receiving feedback on how effective the annotation strategy is.

“This gives the neural network a learning curve — it learns on the project as you label,” Rouillard said.
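
The workflow Rouillard describes can be sketched as a confidence-gated loop: the assistant only surfaces a suggestion once its prediction clears a threshold, and every human decision flows back into the training data. The interfaces and the 0.8 threshold below are assumptions for illustration, not Hasty’s code.

```python
# Illustrative confidence-gated labeling loop (not Hasty's implementation).
from typing import Callable, List, Optional, Tuple

CONFIDENCE_THRESHOLD = 0.8  # assumed value, purely for illustration

def annotate_with_assistant(
    images: List[str],
    predict: Callable[[str], Tuple[str, float]],   # image -> (label, confidence)
    review: Callable[[str, Optional[str]], str],   # (image, suggestion) -> final label
) -> List[Tuple[str, str]]:
    training_examples = []
    for image in images:
        label, confidence = predict(image)
        # Only surface a suggestion once the model is confident enough.
        suggestion = label if confidence >= CONFIDENCE_THRESHOLD else None
        # The human accepts, corrects, or labels from scratch.
        final_label = review(image, suggestion)
        training_examples.append((image, final_label))
    # In a real system these examples would feed the next training run,
    # so suggestions improve as the labeled dataset grows.
    return training_examples
```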

There are already countless tools designed to simplify this process, including Amazon’s SageMaker, Google-backed Labelbox, V7, and Dataloop, which announced a fresh $11 million round of funding just last month.

But Hasty claims it can make the entire process significantly faster with its combination of automation, model-training, and annotation.

As with similar platforms, Hasty uses an interface through which humans and machines collaborate. Hasty can make suggested annotations after having been exposed to just a few human-annotated images, with the user (e.g. the machine learning engineer) accepting, rejecting, or editing that suggestion. This real-time feedback means models improve the more they are used in what is often referred to as a “data flywheel.”

“Everyone is looking to build a self-improving data flywheel. The problem with (computer) vision AI is getting that flywheel to turn at all in the first place, [as] it’s super expensive and only works 50% of the time — this is where we come in,” Rouillard said.

Rapid feedback

In effect, Hasty’s neural networks learn while the engineers are building out their datasets, so the “build,” “deploy,” and “evaluate” facets of the process happen more or less concurrently. A typical linear approach may take months to arrive at a testable AI model, which could be deeply flawed due to errors in the data or blind assumptions made at the project’s inception. What Hasty promises is agility.

That isn’t entirely novel, but Rouillard said his company views automated labeling as similar to autonomous driving, in that different technologies operate at different levels. In the self-driving vehicle sphere, some cars can only brake or change lanes, while others are capable of nearly full autonomy. Translated to annotation, Rouillard said Hasty takes automation further than many of its rivals in terms of minimizing the number of clicks required to label an image or a batch of images.

“Everyone preaches automation, but it is not obvious what is being automated,” Rouillard explained. “Almost all tools have good implementations of level 1 automation, but only a few of us take the trouble of providing level 2 and 3 in a way that produces meaningful results.”

As data is essentially the fuel for machine learning, getting more (accurate) data into an AI model at scale is key.

Above: Hasty: Automated labeling levels

In addition to a manual error-finding tool, Hasty offers an AI-powered error finder that automatically identifies likely issues in a project’s training data. It’s a quality-control feature that circumvents the need to search through data for errors.

“This allows you to spend your time fixing errors instead of looking for them and helps you to build confidence in your data quickly while you annotate,” Rouillard said.

Above: Hasty: Error finder
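
One common way to build such an error finder, sketched below as a generic approach rather than a description of Hasty’s implementation, is to flag annotations where a trained model confidently disagrees with the human-provided label.

```python
# Generic sketch: surface likely label errors by finding examples where a
# trained model confidently disagrees with the human label. Not necessarily
# how Hasty's error finder works.
from typing import Callable, List, Tuple

def find_suspect_labels(
    examples: List[Tuple[str, str]],              # (image_id, human_label)
    predict: Callable[[str], Tuple[str, float]],  # image_id -> (predicted_label, confidence)
    confidence_threshold: float = 0.9,            # assumed cutoff
) -> List[str]:
    suspects = []
    for image_id, human_label in examples:
        predicted, confidence = predict(image_id)
        if predicted != human_label and confidence >= confidence_threshold:
            suspects.append(image_id)             # likely mislabeled; review these first
    return suspects
```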

Hasty claims around 4,000 users, a fairly even mix of corporations, universities, startups, and app developers that span just about every industry. “We have three of the top 10 German companies in logistics, agriculture, and retail using Hasty,” Rouillard added.

A typical agriculture use case might involve training an AI model to identify crops, pests, or diseases. In logistics, the model can be used to train machines to automatically sort parcels by type. Rouillard added that Hasty is also being used in the sports realm to provide real-time game analysis and stats for soccer coverage.

With $3.7 million in the bank, the company plans to accelerate product development and expand its customer base across Europe and North America.

Source: https://venturebeat.com/2020/11/24/how-hasty-uses-automation-and-rapid-feedback-to-train-ai-models-and-improve-annotation/

AI

China Wants to Be the World’s AI Superpower. Does It Have What It Takes?

China’s star has been steadily rising for decades. Besides slashing extreme poverty rates from 88 percent to under 2 percent in just 30 years, the country has become a global powerhouse in manufacturing and technology. Its pace of growth may slow due to an aging population, but China is nonetheless one of the world’s biggest players in multiple cutting-edge tech fields.

One of these fields, and perhaps the most significant, is artificial intelligence. The Chinese government announced a plan in 2017 to become the world leader in AI by 2030, and has since poured billions of dollars into AI projects and research across academia, government, and private industry. The government’s venture capital fund is investing over $30 billion in AI; the northeastern city of Tianjin budgeted $16 billion for advancing AI; and a $2 billion AI research park is being built in Beijing.

On top of these huge investments, the government and private companies in China have access to an unprecedented quantity of data, on everything from citizens’ health to their smartphone use. WeChat, a multi-functional app where people can chat, date, send payments, hail rides, read news, and more, gives the CCP full access to user data upon request; as one BBC journalist put it, WeChat “was ahead of the game on the global stage and it has found its way into all corners of people’s existence. It could deliver to the Communist Party a life map of pretty much everybody in this country, citizens and foreigners alike.” And that’s just one (albeit big) source of data.

Many believe these factors are giving China a serious leg up in AI development, even providing enough of a boost that its progress will surpass that of the US.

But there’s more to AI than data, and there’s more to progress than investing billions of dollars. Analyzing China’s potential to become a world leader in AI—or in any technology that requires consistent innovation—from multiple angles provides a more nuanced picture of its strengths and limitations. In a June 2020 article in Foreign Affairs, Oxford fellows Carl Benedikt Frey and Michael Osborne argued that China’s big advantages may not actually be that advantageous in the long run—and its limitations may be very limiting.

Moving the AI Needle

To get an idea of who’s likely to take the lead in AI, it could help to first consider how the technology will advance beyond its current state.

To put it plainly, AI is somewhat stuck at the moment. Algorithms and neural networks continue to achieve new and impressive feats—like DeepMind’s AlphaFold accurately predicting protein structures or OpenAI’s GPT-3 writing convincing articles based on short prompts—but for the most part these systems’ capabilities are still defined as narrow intelligence: completing a specific task for which the system was painstakingly trained on loads of data.

(It’s worth noting here that some have speculated OpenAI’s GPT-3 may be an exception, the first example of machine intelligence that, while not “general,” has surpassed the definition of “narrow”; the algorithm was trained to write text, but ended up being able to translate between languages, write code, autocomplete images, do math, and perform other language-related tasks it wasn’t specifically trained for. However, all of GPT-3’s capabilities are limited to skills it learned in the language domain, whether spoken, written, or programming language.)

The success of both AlphaFold and GPT-3 was due largely to the massive datasets they were trained on; no revolutionary new training methods or architectures were involved. If all it was going to take to advance AI was a continuation or scaling-up of this paradigm—more input data yields increased capability—China could well have an advantage.

But one of the biggest hurdles AI needs to clear to advance in leaps and bounds rather than baby steps is precisely this reliance on extensive, task-specific data. Other significant challenges include the technology’s fast approach to the limits of current computing power and its immense energy consumption.

Thus, while China’s trove of data may give it an advantage now, it may not be much of a long-term foothold on the climb to AI dominance. It’s useful for building products that incorporate or rely on today’s AI, but not for pushing the needle on how artificially intelligent systems learn. WeChat data on users’ spending habits, for example, would be valuable in building an AI that helps people save money or suggests items they might want to purchase. It will enable (and already has enabled) highly tailored products that will earn their creators and the companies that use them a lot of money.

But data quantity isn’t what’s going to advance AI. As Frey and Osborne put it, “Data efficiency is the holy grail of further progress in artificial intelligence.”

To that end, research teams in academia and private industry are working on ways to make AI less data-hungry. New training methods like one-shot learning and less-than-one-shot learning have begun to emerge, along with myriad efforts to make AI that learns more like the human brain.

While not insignificant, these advancements still fall into the “baby steps” category. No one knows how AI is going to progress beyond these small steps—and that uncertainty, in Frey and Osborne’s opinion, is a major speed bump on China’s fast-track to AI dominance.

How Innovation Happens

A lot of great inventions have happened by accident, and some of the world’s most successful companies started in garages, dorm rooms, or similarly low-budget, nondescript circumstances (including Google, Facebook, Amazon, and Apple, to name a few). Innovation, the authors point out, often happens “through serendipity and recombination, as inventors and entrepreneurs interact and exchange ideas.”

Frey and Osborne argue that although China has great reserves of talent and a history of building on technologies conceived elsewhere, it doesn’t yet have a glowing track record in terms of innovation. They note that of the 100 most-cited patents from 2003 to present, none came from China. Giants Tencent, Alibaba, and Baidu are all wildly successful in the Chinese market, but they’re rooted in technologies or business models that came out of the US and were tweaked for the Chinese population.

“The most innovative societies have always been those that allowed people to pursue controversial ideas,” Frey and Osborne write. China’s heavy censorship of the internet and surveillance of citizens don’t quite encourage the pursuit of controversial ideas. The country’s social credit system rewards people who follow the rules and punishes those who step out of line. Frey adds that top-down execution of problem-solving is effective when the problem at hand is clearly defined—and the next big leaps in AI are not.

It’s debatable how strongly a culture of social conformism can impact technological innovation, and of course there can be exceptions. But a relevant historical example is the Soviet Union, which, despite heavy investment in science and technology that briefly rivaled the US in fields like nuclear energy and space exploration, ended up lagging far behind primarily due to political and cultural factors.

Similarly, China’s focus on computer science in its education system could give it an edge—but, as Frey told me in an email, “The best students are not necessarily the best researchers. Being a good researcher also requires coming up with new ideas.”

Winner Take All?

Beyond the question of whether China will achieve AI dominance is the issue of how it will use the powerful technology. Several of the ways China has already implemented AI could be considered morally questionable, from facial recognition systems used aggressively against ethnic minorities to smart glasses for policemen that can pull up information about whoever the wearer looks at.

This isn’t to say the US would use AI for purely ethical purposes. The military’s Project Maven, for example, used artificially intelligent algorithms to identify insurgent targets in Iraq and Syria, and American law enforcement agencies are also using (mostly unregulated) facial recognition systems.

It’s conceivable that “dominance” in AI won’t go to one country; each nation could meet milestones in different ways, or meet different milestones. Researchers from both countries, at least in the academic sphere, could (and likely will) continue to collaborate and share their work, as they’ve done on many projects to date.

If one country does take the lead, it will certainly see some major advantages as a result. Brookings Institution fellow Indermit Gill goes so far as to say that whoever leads in AI in 2030 will “rule the world” until 2100. But Gill points out that in addition to considering each country’s strengths, we should consider how willing they are to improve upon their weaknesses.

While China leads in investment and the US in innovation, both nations are grappling with huge economic inequalities that could negatively impact technological uptake. “Attitudes toward the social change that accompanies new technologies matter as much as the technologies, pointing to the need for complementary policies that shape the economy and society,” Gill writes.

Will China’s leadership be willing to relax its grip to foster innovation? Will the US business environment be enough to compete with China’s data, investment, and education advantages? And can both countries find a way to distribute technology’s economic benefits more equitably?

Time will tell, but it seems we’ve got our work cut out for us—and China does too.

Image Credit: Adam Birkett on Unsplash

Source: https://singularityhub.com/2021/01/17/china-wants-to-be-the-worlds-ai-superpower-does-it-have-what-it-takes/

AI

This Week’s Awesome Tech Stories From Around the Web (Through January 16)

CRYPTOCURRENCY

Lost Passwords Lock Millionaires Out of Their Bitcoin Fortunes
Nathaniel Popper | The New York Times
“Stefan Thomas, a German-born programmer living in San Francisco, has two guesses left to figure out a password that is worth, as of this week, about $220 million. The password will let him unlock a small hard drive, known as an IronKey, which contains the private keys to a digital wallet that holds 7,002 Bitcoin.”

GENETICS

Scientists Have Sequenced Dire Wolf DNA. Thanks, Science!
Angela Watercutter | Wired
“Dire wolves: First of their name, last of their kind. Yes, you read that correctly. According to new research published today in Nature, scientists have finally been able to sequence the DNA of dire wolves—and, to borrow a phrase from the 11 o’clock news, what they found might surprise you.”

FUTURE

These Scientists Have a Wildly Futuristic Plan to Harvest Energy From Black Holes
Luke Dormehl | Digital Trends
“The idea, in essence, is to extract energy from black holes by gathering charged plasma particles as they try to escape from the event horizon, the threshold surrounding a black hole at which escape velocity is greater than the speed of light. To put it in even broader terms: The researchers believe that it would be possible to obtain energy directly from the curvature of spacetime. (And you thought that your new solar panels were exciting!).”

ENERGY

US Grid Will See 80 Percent of Its New Capacity Go Emission-Free
John Timmer | Ars Technica
“Earlier this week, the US Energy Information Agency (EIA) released figures on the new generating capacity that’s expected to start operating over the course of 2021. While plans can obviously change, the hope is that, with its new additions, the grid will look radically different than it did just five years ago.”

ARTIFICIAL INTELLIGENCE

Worried About Your Firm’s AI Ethics? These Startups Are Here To Help
Karen Hao | MIT Technology Review
“Parity is among a growing crop of startups promising organizations ways to develop, monitor, and fix their AI models. They offer a range of products and services from bias-mitigation tools to explainability platforms. Initially most of their clients came from heavily regulated industries like finance and health care. But increased research and media attention on issues of bias, privacy, and transparency have shifted the focus of the conversation.”

GOVERNANCE

Who Should Make the Online Rules
Shira Ovide | The New York Times
“There has been lots of screaming about what these [big tech] companies did, but I want us all to recognize that there are few easy choices here. Because at the root of these disputes are big and thorny questions: Is more speech better? And who gets to decide? …The oddity is not that we’re struggling with age-old questions about the trade-offs of free expression. The weird thing is that companies like Facebook and Apple have become such essential judges in this debate.”

INTERNET

He Created the Web. Now He’s Out to Remake the Digital World.
Steve Lohr | The New York Times
“The big tech companies are facing tougher privacy rules in Europe and some American states, led by California. Google and Facebook have been hit with antitrust suits. But Mr. Berners-Lee is taking a different approach: His answer to the problem is technology that gives individuals more power. …The idea is that each person could control his or her own data—websites visited, credit card purchases, workout routines, music streamed—in an individual data safe, typically a sliver of server space.”

ENVIRONMENT

Evolution’s Engineers
Kevin Laland | Aeon
“Evolving populations are less like zombie mountaineers mindlessly climbing adaptive peaks, and more like industrious landscape designers, equipped with digging and building apparatuses, remodeling the topography to their own ends. At a time when human niche construction and ecological inheritance are ravaging the planet’s ecology and driving a human population explosion, understanding how organisms retool ecology for their own purposes has never been more pressing.”

Image Credit: Dewang Gupta / Unsplash

Source: https://singularityhub.com/2021/01/16/this-weeks-awesome-tech-stories-from-around-the-web-through-january-16/

AI

Researchers propose using the game Overcooked to benchmark collaborative AI systems

Deep reinforcement learning systems are among the most capable in AI, particularly in the robotics domain. However, in the real world, these systems encounter a number of situations and behaviors to which they weren’t exposed during development.

In a step toward systems that can collaborate with humans in order to help them accomplish their goals, researchers at Microsoft, the University of California, Berkeley, and the University of Nottingham developed a methodology for applying a testing paradigm to human-AI collaboration that can be demonstrated in a simplified version of the game Overcooked. Players in Overcooked control a number of chefs in kitchens filled with obstacles and hazards to prepare meals to order under a time limit.

The team asserts that Overcooked, while not necessarily designed with robustness benchmarking in mind, can successfully test potential edge cases in states a system should be able to handle as well as the partners the system should be able to play with. For example, in Overcooked, systems must contend with scenarios like when plates are accidentally left on counters and when a partner stays put for a while because they’re thinking or away from their keyboard.

Above: Screen captures from the researchers’ test environment.

The researchers investigated a number of techniques for improving system robustness, including training a system with a diverse population of other collaborative systems. Over the course of experiments in Overcooked, they observed whether several test systems could recognize when to get out of the way (like when a partner was carrying an ingredient) and when to pick up and deliver orders after a partner has been idling for a while.
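
The kind of edge-case check the researchers describe can be framed as a unit test: put the agent in a hand-crafted state, such as a partner idling next to a ready order, and assert that it behaves sensibly. The environment and agent interfaces below are hypothetical stand-ins, not the authors’ published test suite.

```python
# Hypothetical unit-test-style robustness check for a collaborative agent.
# The scenario name, env API, and agent API are assumptions for illustration.
def test_agent_delivers_when_partner_idles(make_env, agent, idle_steps=20):
    env = make_env(scenario="order_ready_partner_idle")  # assumed scenario
    state = env.reset()
    for _ in range(idle_steps):
        # The scripted partner does nothing, simulating an away-from-keyboard human.
        state, _, done, _ = env.step({"agent": agent.act(state), "partner": "stay"})
        if done:
            break
    # Robust behavior: the agent stops waiting and delivers the order itself.
    assert env.orders_delivered() >= 1, "agent never delivered while partner idled"
```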

According to the researchers, current deep reinforcement agents aren’t very robust — at least not as measured by Overcooked. None of the systems they tested scored above 65% in the video game, suggesting, the researchers say, that Overcooked can serve as a useful human-AI collaboration metric in the future.

“We emphasize that our primary finding is that our [Overcooked] test suite provides information that may not be available by simply considering validation reward, and our conclusions for specific techniques are more preliminary,” the researchers wrote in a paper describing their work. “A natural extension of our work is to expand the use of unit tests to other domains besides human-AI collaboration … An alternative direction for future work is to explore meta learning, in order to train the agent to adapt online to the specific human partner it is playing with. This could lead to significant gains, especially on agent robustness with memory.”

Source: https://venturebeat.com/2021/01/15/researchers-propose-using-the-game-overcooked-to-benchmark-collaborative-ai-systems/

AI

AI Weekly: Meet the people trying to replicate and open-source OpenAI’s GPT-3

In June, OpenAI published a paper detailing GPT-3, a machine learning model that achieves strong results on a number of natural language benchmarks. At 175 billion parameters — the part of the model that has learned from historical training data — it’s one of the largest of its kind. It’s also among the most sophisticated, with the ability to make primitive analogies, write in the style of Chaucer, and even complete basic code.

In contrast to GPT-3’s predecessors, GPT-2 and GPT-1, OpenAI chose not to open-source the model or training dataset, opting instead to make the former available through a commercial API. The company further curtailed access by choosing to exclusively license GPT-3 to Microsoft, which OpenAI has a business relationship with. Microsoft has invested $1 billion in OpenAI and built an Azure-hosted supercomputer designed to further OpenAI’s research.

Several efforts to recreate GPT-3 in open source have emerged, but perhaps the furthest along is GPT-Neo, a project spearheaded by EleutherAI. A grassroots collection of researchers working to open-source machine learning research, EleutherAI and its founding members — Connor Leahy, Leo Gao, and Sid Black — aim to deliver the code and weights needed to run a model similar, though not identical, to GPT-3 as soon as August. (Weights are parameters within a neural network that transform input data.)

EleutherAI

According to Leahy, EleutherAI began as “something of a joke” on TPU Podcast, a machine learning Discord server, where he playfully suggested someone should try to replicate GPT-3. Leahy, Gao, and Black took this to its logical extreme and founded the EleutherAI Discord server, which became the base of the organization’s operations.

“I consider GPT-3 and other similar results to be strong evidence that it may indeed be possible to create [powerful models] with nothing more than our current techniques,” Leahy told VentureBeat in an interview. “It turns out to be in fact very, very hard, but not impossible with a group of smart people, as EleutherAI has shown, and of course with access to unreasonable amounts of computer hardware.”

As part of a personal project, Leahy previously attempted to replicate GPT-2, leveraging access to compute through Google’s TensorFlow Research Cloud (TFRC) program. The original codebase, which became GPT-Neo, was built to run on tensor processing units (TPUs), Google’s custom AI accelerator chips. But the EleutherAI team concluded that even the generous amount of TPUs provided through TFRC wouldn’t be sufficient to train the GPT-3-like version of GPT-Neo in under two years.

EleutherAI’s fortunes changed when the company was approached by CoreWeave, a U.S.-based cryptocurrency miner that provides cloud services for CGI rendering and machine learning workloads. Last month, CoreWeave offered the EleutherAI team access to its hardware in exchange for an open source GPT-3-like model its customers could use and serve.

Leahy insists that the work, which began around Christmas, won’t involve money or other compensation going in either direction. “CoreWeave gives us access to their hardware, we make an open source GPT-3 for everyone to use (and thank them very loudly), and that’s all,” he said.

Training datasets

EleutherAI concedes that because of OpenAI’s decision not to release some key details of GPT-3’s architecture, GPT-Neo will deviate from it in at least those ways. Other differences might arise from the training dataset EleutherAI plans to use, which was curated by a team of 10 people at EleutherAI, including Leahy, Gao, and Black.

Language models like GPT-3 often amplify biases encoded in data. Part of the training data is frequently sourced from communities with pervasive gender, race, and religious prejudices. OpenAI notes that this can lead to placing words like “naughty” or “sucked” near female pronouns and “Islam” near words like “terrorism.” Other studies, like one published in April by researchers at Intel, MIT, and the Canadian Institute for Advanced Research (CIFAR), have found high levels of stereotypical bias in some of the most popular models, including Google’s BERT and XLNet, OpenAI’s GPT-2, and Facebook’s RoBERTa. Malicious actors could leverage this bias to foment discord by spreading misinformation, disinformation, and outright lies that “radicalize individuals into violent far-right extremist ideologies and behaviors,” according to the Middlebury Institute of International Studies.

For their part, the EleutherAI team says they’ve performed “extensive bias analysis” on the GPT-Neo training dataset and made “tough editorial decisions” to exclude some datasets they felt were “unacceptably negatively biased” toward certain groups or views. The Pile, as it’s called, is an 835GB corpus consisting of 22 smaller datasets combined to ensure broad generalization abilities.
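
Combining many smaller corpora into one training stream is often done by weighted interleaving; the sketch below is a generic illustration of that idea, with made-up source names and weights, not EleutherAI’s actual data pipeline.

```python
# Generic weighted interleaving of several text corpora into one stream.
import random
from typing import Dict, Iterator, List

def interleave_corpora(corpora: Dict[str, List[str]],
                       weights: Dict[str, float],
                       seed: int = 0) -> Iterator[str]:
    """Yield documents from several corpora, sampling sources by weight."""
    rng = random.Random(seed)
    sources = list(corpora)
    probs = [weights[name] for name in sources]
    cursors = {name: 0 for name in sources}
    while any(cursors[name] < len(corpora[name]) for name in sources):
        name = rng.choices(sources, weights=probs, k=1)[0]
        if cursors[name] < len(corpora[name]):
            yield corpora[name][cursors[name]]
            cursors[name] += 1

# Toy example with two made-up "datasets":
stream = interleave_corpora(
    {"web": ["doc a", "doc b"], "code": ["def f(): pass"]},
    weights={"web": 0.7, "code": 0.3},
)
print(list(stream))
```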

“We continue to carefully study how our models act in various circumstances and how we can make them more safe,” Leahy said.

Leahy personally disagrees with the idea that releasing a model like GPT-3 would have a direct negative impact on polarization. An adversary seeking to generate extremist views would find it much cheaper and easier to hire a troll farm, he argues, as autocratic governments have already done. Furthermore, Leahy asserts that discussions of discrimination and bias point to a real issue but don’t offer a complete solution. Rather than censoring the input data of a model, he says the AI research community must work toward systems that can “learn all that can be learned about evil and then use that knowledge to fight evil and become good.”

“I think the commoditization of GPT-3 type models is part of an inevitable trend in the falling price of the production of convincing digital content that will not be meaningfully derailed whether we release a model or not,” Leahy continued. “The biggest influence we can have here is to allow more low-resource users, especially academics, to gain access to these technologies to hopefully better study them, and also perform our own brand of safety-focused research on it, instead of having everything locked inside industry labs. After all, this is still ongoing, cutting-edge research. Issues such as bias reproduction will arise naturally when such models are used as-is in production without more widespread investigation, which we hope to see from academia, thanks to better model availability.”

Google recently fired AI ethicist Timnit Gebru, reportedly in part over a research paper on large language models that discussed risks such as the impact of their carbon footprint on marginalized communities. Asked about the environmental impact of training GPT-Neo, Leahy characterized the argument as a “red herring,” saying he believes it’s a matter of whether the ends justify the means — that is, whether the output of the training is worth the energy put into it.

“The amount of energy that goes into training such a model is much less than, say, the energy that goes into serving any medium-sized website, or a single trans-Atlantic flight to present a paper about the carbon emissions of AI models at a conference, or, God forbid, Bitcoin mining,” Leahy said. “No one complains about the energy bill of CERN (The European Organization for Nuclear Research), and I don’t think they should, either.”

Future work

EleutherAI plans to use architectural tweaks the team has found to be useful to train GPT-Neo, which they expect will enable the model to achieve performance “similar” to GPT-3 at roughly the same size (around 350GB to 700GB of weights). In the future, they plan to distill the final model down “an order of magnitude or so smaller” for easier inference. And while they’re not planning to provide any kind of commercial API, they expect CoreWeave and others to set up services to make GPT-Neo accessible to users.
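
“Distilling” a large model into a smaller one usually means training a student to match the teacher’s softened output distribution. The snippet below is a generic PyTorch sketch of that loss, offered as background on the technique rather than EleutherAI’s actual training code.

```python
# Generic knowledge-distillation loss: the student mimics the teacher's
# softened output distribution. Illustrative only.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature ** 2

# Toy example: a batch of 4 examples over a 10-token vocabulary.
teacher_logits = torch.randn(4, 10)
student_logits = torch.randn(4, 10, requires_grad=True)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```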

As for the next iteration of GPT and similarly large, complex models, like Google’s trillion-parameter Switch-C, Leahy thinks they’ll likely be more challenging to replicate. But there’s evidence that efficiency improvements might offset the mounting compute requirements. An OpenAI survey found that since 2012, the amount of compute needed to train an AI model to the same performance classifying images in a popular benchmark (ImageNet) has been decreasing by a factor of two every 16 months. But the extent to which compute contributes to performance compared with novel algorithmic approaches remains an open question.
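
As a rough worked example of that trend: a halving every 16 months compounds to roughly a 64-fold reduction over an eight-year span such as 2012 to 2020 (the span is chosen here only for illustration).

```python
# Compute the implied reduction if required compute halves every 16 months.
months_per_halving = 16
years = 8                                  # illustrative span
halvings = years * 12 / months_per_halving # 6 halvings in 8 years
reduction_factor = 2 ** halvings
print(f"~{reduction_factor:.0f}x less compute for the same ImageNet accuracy")  # ~64x
```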

“It seems inevitable that models will continue to increase in size as long as increases in performance follow,” Leahy said. “Sufficiently large models will, of course, be out of reach for smaller actors, but this seems to me to just be a fact of life. There seems to me to be no viable alternative. If bigger models equals better performance, whoever has the biggest computer will make the biggest model and therefore have the best performance, easy as that. I wish this wasn’t so, but there isn’t really anything that can be done about it.”

For AI coverage, send news tips to Khari Johnson and Kyle Wiggers and AI editor Seth Colaner — and be sure to subscribe to the AI Weekly newsletter and bookmark our AI channel, The Machine.

Thanks for reading,

Kyle Wiggers

AI Staff Writer

Source: https://venturebeat.com/2021/01/15/ai-weekly-meet-the-people-trying-to-replicate-and-open-source-openais-gpt-3/
