Deep reinforcement learning systems are among the most capable in AI, particularly in the robotics domain. However, in the real world, these systems encounter a number of situations and behaviors to which they weren’t exposed during development.
In a step toward systems that can collaborate with humans in order to help them accomplish their goals, researchers at Microsoft, the University of California, Berkeley, and the University of Nottingham developed a methodology for applying a testing paradigm to human-AI collaboration that can be demonstrated in a simplified version of the game Overcooked. Players in Overcooked control a number of chefs in kitchens filled with obstacles and hazards to prepare meals to order under a time limit.
The team asserts that Overcooked, while not necessarily designed with robustness benchmarking in mind, can successfully test potential edge cases, both in the states a system should be able to handle and in the partners it should be able to play with. For example, in Overcooked, systems must contend with scenarios like plates accidentally left on counters or a partner who stays put for a while because they’re thinking or away from their keyboard.
Above: Screen captures from the researchers’ test environment.
The researchers investigated a number of techniques for improving system robustness, including training a system with a diverse population of other collaborative systems. Over the course of experiments in Overcooked, they observed whether several test systems could recognize when to get out of the way (like when a partner was carrying an ingredient) and when to pick up and deliver orders after a partner has been idling for a while.
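To make the testing paradigm concrete, a behavior-level check of this sort might look something like the sketch below. The environment and agent interfaces here are hypothetical stand-ins, not the researchers’ published code.

```python
# Illustrative only: a behavior "unit test" in the spirit of the paper's test
# suite. OvercookedEnv-style `env` and `agent` objects are assumed to exist;
# their interfaces below are invented for the example.

def test_agent_acts_when_partner_idles(env, agent, idle_steps=50):
    """Check that the agent eventually delivers an order while its partner idles."""
    state = env.reset()
    for _ in range(idle_steps):
        # The partner takes the no-op action; a robust agent should compensate.
        agent_action = agent.act(state)
        state, reward, done, info = env.step([agent_action, env.NOOP])
        if info.get("orders_delivered", 0) > 0:
            return True
    return False
```

A test suite built from checks like this reports pass/fail behavior on specific edge cases, which is information a single validation-reward number would hide.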
According to the researchers, current deep reinforcement learning agents aren’t very robust — at least not as measured by Overcooked. None of the systems they tested scored above 65% in the video game, suggesting, the researchers say, that Overcooked can serve as a useful human-AI collaboration metric in the future.
“We emphasize that our primary finding is that our [Overcooked] test suite provides information that may not be available by simply considering validation reward, and our conclusions for specific techniques are more preliminary,” the researchers wrote in a paper describing their work. “A natural extension of our work is to expand the use of unit tests to other domains besides human-AI collaboration … An alternative direction for future work is to explore meta learning, in order to train the agent to adapt online to the specific human partner it is playing with. This could lead to significant gains, especially on agent robustness with memory.”
In June, OpenAI published a paper detailing GPT-3, a machine learning model that achieves strong results on a number of natural language benchmarks. At 175 billion parameters — the part of the model that has learned from historical training data — it’s one of the largest of its kind. It’s also among the most sophisticated, with the ability to make primitive analogies, write in the style of Chaucer, and even complete basic code.
In contrast to GPT-3’s predecessors, GPT-2 and GPT-1, OpenAI chose not to open-source the model or training dataset, opting instead to make the former available through a commercial API. The company further curtailed access by choosing to exclusively license GPT-3 to Microsoft, which OpenAI has a business relationship with. Microsoft has invested $1 billion in OpenAI and built an Azure-hosted supercomputer designed to further OpenAI’s research.
Several efforts to recreate GPT-3 in open source have emerged, but perhaps the furthest along is GPT-Neo, a project spearheaded by EleutherAI. A grassroots collection of researchers working to open-source machine learning research, EleutherAI and its founding members — Connor Leahy, Leo Gao, and Sid Black — aim to deliver the code and weights needed to run a model similar, though not identical, to GPT-3 as soon as August. (Weights are parameters within a neural network that transform input data.)
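If the released weights ship in a format compatible with the widely used Hugging Face Transformers library, running the model locally could look roughly like the sketch below. The checkpoint name is illustrative rather than confirmed by EleutherAI.

```python
# A minimal sketch of running an open GPT-3-style model, assuming a
# Transformers-compatible release. The model identifier is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neo-2.7B"  # illustrative checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "In a shocking finding, scientists discovered"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```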
EleutherAI
According to Leahy, EleutherAI began as “something of a joke” on TPU Podcast, a machine learning Discord server, where he playfully suggested someone should try to replicate GPT-3. Leahy, Gao, and Black took this to its logical extreme and founded the EleutherAI Discord server, which became the base of the organization’s operations.
“I consider GPT-3 and other similar results to be strong evidence that it may indeed be possible to create [powerful models] with nothing more than our current techniques,” Leahy told VentureBeat in an interview. “It turns out to be in fact very, very hard, but not impossible with a group of smart people, as EleutherAI has shown, and of course with access to unreasonable amounts of computer hardware.”
As part of a personal project, Leahy previously attempted to replicate GPT-2, leveraging access to compute through Google’s Tensorflow Research Cloud (TFRC) program. The original codebase, which became GPT-Neo, was built to run on tensor processing units (TPUs), Google’s custom AI accelerator chips. But the EleutherAI team concluded that even the generous amount of TPUs provided through TFRC wouldn’t be sufficient to train the GPT-3-like version of GPT-Neo in under two years.
EleutherAI’s fortunes changed when the group was approached by CoreWeave, a U.S.-based cryptocurrency miner that provides cloud services for CGI rendering and machine learning workloads. Last month, CoreWeave offered the EleutherAI team access to its hardware in exchange for an open source GPT-3-like model its customers could use and serve.
Leahy insists that the work, which began around Christmas, won’t involve money or other compensation going in either direction. “CoreWeave gives us access to their hardware, we make an open source GPT-3 for everyone to use (and thank them very loudly), and that’s all,” he said.
Training datasets
EleutherAI concedes that because of OpenAI’s decision not to release some key details of GPT-3’s architecture, GPT-Neo will deviate from it in at least those ways. Other differences might arise from the training dataset EleutherAI plans to use, which was curated by a team of 10 people at EleutherAI, including Leahy, Gao, and Black.
Language models like GPT-3 often amplify biases encoded in data. A portion of the training data is often sourced from communities with pervasive gender, race, and religious prejudices. OpenAI notes that this can lead to placing words like “naughty” or “sucked” near female pronouns and “Islam” near words like “terrorism.” Other studies, like one published in April by researchers at Intel, MIT, and the Canadian Institute for Advanced Research (CIFAR), have found high levels of stereotypical bias in some of the most popular models, including Google’s BERT and XLNet, OpenAI’s GPT-2, and Facebook’s RoBERTa. Malicious actors could leverage this bias to foment discord by spreading misinformation, disinformation, and outright lies that “radicalize individuals into violent far-right extremist ideologies and behaviors,” according to the Middlebury Institute of International Studies.
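For a sense of how such stereotype probes work, here is a rough sketch in the spirit of the studies cited above: it compares how strongly a publicly available model (GPT-2 is used purely as a stand-in) prefers a stereotyped sentence over a minimally edited counterpart. The example sentences are illustrative.

```python
# A rough sketch of a stereotype probe: compare approximate sentence
# log-likelihoods under a causal language model. Large, consistent gaps across
# many such pairs suggest the model has absorbed the association.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_log_likelihood(text):
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # The model's loss is the mean negative log-likelihood per token.
        loss = model(ids, labels=ids).loss
    return -loss.item() * ids.shape[1]  # approximate total log-likelihood

pair = ("The nurse said she was tired.", "The nurse said he was tired.")
scores = {s: sentence_log_likelihood(s) for s in pair}
print(scores)  # a large gap between the two suggests a gendered association
```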
For their part, the EleutherAI team says they’ve performed “extensive bias analysis” on the GPT-Neo training dataset and made “tough editorial decisions” to exclude some datasets they felt were “unacceptably negatively biased” toward certain groups or views. The Pile, as it’s called, is an 835GB corpus consisting of 22 smaller datasets combined to ensure broad generalization abilities.
“We continue to carefully study how our models act in various circumstances and how we can make them more safe,” Leahy said.
Leahy personally disagrees with the idea that releasing a model like GPT-3 would have a direct negative impact on polarization. An adversary seeking to generate extremist views would find it much cheaper and easier to hire a troll farm, he argues, as autocratic governments have already done. Furthermore, Leahy asserts that discussions of discrimination and bias point to a real issue but don’t offer a complete solution. Rather than censoring the input data of a model, he says the AI research community must work toward systems that can “learn all that can be learned about evil and then use that knowledge to fight evil and become good.”
“I think the commoditization of GPT-3 type models is part of an inevitable trend in the falling price of the production of convincing digital content that will not be meaningfully derailed whether we release a model or not,” Leahy continued. “The biggest influence we can have here is to allow more low-resource users, especially academics, to gain access to these technologies to hopefully better study them, and also perform our own brand of safety-focused research on it, instead of having everything locked inside industry labs. After all, this is still ongoing, cutting-edge research. Issues such as bias reproduction will arise naturally when such models are used as-is in production without more widespread investigation, which we hope to see from academia, thanks to better model availability.”
Google recently fired AI ethicist Timnit Gebru, reportedly in part over a research paper on large language models that discussed risks such as the impact of their carbon footprint on marginalized communities. Asked about the environmental impact of training GPT-Neo, Leahy characterized the argument as a “red herring,” saying he believes it’s a matter of whether the ends justify the means — that is, whether the output of the training is worth the energy put into it.
“The amount of energy that goes into training such a model is much less than, say, the energy that goes into serving any medium-sized website, or a single trans-Atlantic flight to present a paper about the carbon emissions of AI models at a conference, or, God forbid, Bitcoin mining,” Leahy said. “No one complains about the energy bill of CERN (The European Organization for Nuclear Research), and I don’t think they should, either.”
Future work
EleutherAI plans to use architectural tweaks the team has found to be useful to train GPT-Neo, which they expect will enable the model to achieve performance “similar” to GPT-3 at roughly the same size (around 350GB to 700GB of weights). In the future, they plan to distill the final model down to “an order of magnitude or so smaller” for easier inference. And while they’re not planning to provide any kind of commercial API, they expect CoreWeave and others to set up services to make GPT-Neo accessible to users.
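Distillation itself is a well-established technique: a smaller “student” model is trained to match the softened token distribution of a larger “teacher.” The sketch below shows the generic objective, not EleutherAI’s actual recipe.

```python
# A minimal sketch of a standard knowledge-distillation loss (Hinton-style),
# offered as background for the distillation plan described above.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student token distributions."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # Scale by t**2 so gradients keep a comparable magnitude across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)
```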
As for the next iteration of GPT and similarly large, complex models, like Google’s trillion-parameter Switch-C, Leahy thinks they’ll likely be more challenging to replicate. But there’s evidence that efficiency improvements might offset the mounting compute requirements. An OpenAI survey found that since 2012, the amount of compute needed to train an AI model to the same performance classifying images in a popular benchmark (ImageNet) has been decreasing by a factor of two every 16 months. But the extent to which compute contributes to performance compared with novel algorithmic approaches remains an open question.
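For a rough sense of what that trend implies, the back-of-the-envelope calculation below assumes the 16-month halving rate simply continues to hold.

```python
# Back-of-the-envelope math for the OpenAI efficiency trend cited above:
# compute needed for a fixed ImageNet result halves roughly every 16 months.
def relative_compute(months_elapsed, halving_period_months=16):
    """Fraction of the original training compute needed after `months_elapsed`."""
    return 0.5 ** (months_elapsed / halving_period_months)

# Example: from 2012 to 2020 (96 months), the same result needs roughly
# 0.5 ** 6, or about 1.6%, of the original compute under this trend.
print(f"{relative_compute(96):.3%}")
```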
“It seems inevitable that models will continue to increase in size as long as increases in performance follow,” Leahy said. “Sufficiently large models will, of course, be out of reach for smaller actors, but this seems to me to just be a fact of life. There seems to me to be no viable alternative. If bigger models equals better performance, whoever has the biggest computer will make the biggest model and therefore have the best performance, easy as that. I wish this wasn’t so, but there isn’t really anything that can be done about it.”
Don’t miss TechRepublic’s CES 2021 coverage, which includes product announcements from Lenovo, Samsung, LG, and Dell about PCs, laptops, software, robots, monitors, and TVs.
Due to the COVID-19 pandemic, CES 2021 is all-digital for the first time ever. The event runs from Monday, January 11 to Thursday, January 14. CES has always been one of the leading tech events each year and, despite being an online-only event in 2021, thousands of products are expected to be announced.
There are six top trends to watch for at CES 2021, according to TechRepublic’s Editor-in-Chief Bill Detwiler, Associate Managing Editor Teena Maddox, and UK Editor-in-Chief Steve Ranger. Several visionary tech and industry leaders are expected to deliver keynote speeches at CES 2021, including Verizon Chairman and CEO Hans Vestberg, General Motors Chairman and CEO Mary Barra, Best Buy CEO Corie Barry, Mastercard CEO Michael Miebach, and more.
TechRepublic will be reporting on all of the CES 2021 tech news that business pros need to know. Keep checking this article for our latest CES 2021 coverage.
Photos: Best sleep solutions at CES 2021. Entrepreneurs and physicians at CES 2021 have something for every sleep problem at every stage of life, from infants to seniors.
Photos: Best robots at CES 2021. CES is virtually synonymous with the latest innovations in robots. Here are some of the best robots we’ve seen at CES 2021 so far.
Low-blue light tech takes center stage at CES 2021. Due to COVID-19, many professionals and students are operating remotely. To mitigate the potential impacts of increased screen time, companies are investing in low-blue light tech.
Top gaming computers released at CES 2021. This week, a number of brands have unveiled gaming computers at CES 2021. Here are some of the coolest gaming computers we’ve seen so far.
CES 2021 showcases new tech that enhances your best features. From customized lipsticks and skin toners to smart mirrors and virtual hair consultations, new tech featured at CES 2021 helps you put your best face forward for your next virtual meeting and beyond.
Razer makes an unexpected move at CES 2021 with Project Hazel. Starting quietly at the beginning of the pandemic last year, the gaming tech company converted some manufacturing facilities to create wearable, innovative protection against COVID-19.
CES 2021: Lenovo releases new lineup of ThinkBooks. Lenovo has released a slew of new ThinkBooks to assist business professionals working remotely, including the lightweight ThinkBook 13x i, the highly versatile ThinkBook Plus Gen 2 i, and more.
Our 10 favorite CES 2021 Best of Innovation Honorees. A virus risk indicator, a hydropower shower speaker, a robot companion, and a water bottle that glows when it’s time to drink are among the products that caught our attention.
A battle for control over machine learning operations (MLOps) is beginning in earnest as organizations embrace feature store repositories to build AI models more efficiently.
A feature store is, at its core, a data warehouse through which developers of AI models can share and reuse the artifacts that make up an AI model, as well as entire models that might need to be modified or further extended. In concept, a feature store repository plays a role similar to that of a Git repository, enabling developers to build applications more efficiently by sharing and reusing code.
Early pioneers of feature store repositories include Uber, which built a platform dubbed Michelangelo, and Airbnb, which created a feature store dubbed Zipline. But neither of those platforms are available as open source code. Leading providers of feature store repositories trying to fill that void include Tecton, Molecula, Hopsworks, Splice Machine, and, most recently, Amazon Web Services (AWS). There is also an open source feature store project, dubbed Feast, that counts among its contributors Google and Tecton.
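To illustrate what that means in practice, here is a rough sketch of serving-time feature retrieval based on the open source Feast project mentioned above. Exact APIs vary by Feast version and by commercial platform, and the feature names are invented for the example.

```python
# A rough sketch of using a feature store at serving time: instead of
# recomputing features from raw data, a model looks up the latest values
# for a given entity. Based on the open source Feast project; details vary.
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # points at a local Feast feature repository

features = store.get_online_features(
    features=["driver_stats:avg_daily_trips", "driver_stats:acceptance_rate"],
    entity_rows=[{"driver_id": 1001}],
).to_dict()
print(features)
```

The same registered features can be reused across training pipelines and multiple models, which is the sharing-and-reuse benefit the Git analogy points to.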
It can take a data science team six months or longer to construct a single AI model, so pressure to accelerate those processes is building. Organizations that employ AI models not only want to build more of them faster, but AI models deployed in production environments also need to be either regularly updated or replaced as business conditions change.
Less clear right now, however, is to what degree feature store repositories represent a standalone category versus being a foundational element of a larger MLOps platform. As investment capital starts to pour into the category, providers of feature store platforms are trying to have it both ways.
Splice Machine, for example, offers a SQL-based feature store platform that organizations can deploy apart from its platform for managing data science processes. “It’s important to modularize the feature store so it can be used in other environments,” said Splice Machine CEO Monte Zweben. “I think you’ll see adoption of feature stores in both manners.”
Over time, however, it will become apparent that feature stores one way or another need to be part of a larger platform to derive the most value, he added.
Fresh off raising an additional $17.6 million in funding, Molecula is also positioning its feature store as a standalone offering in addition to being a foundation around which MLOps processes will revolve. In fact, Molecula is betting that feature stores, in addition to enabling AI models to be constructed more efficiently, will also become critical to building any type of advanced analytics application, said Molecula CEO H.O. Maycotte.
To achieve that goal, Molecula built its own storage architecture to eliminate all the manual copy-and-paste processes that make building AI models and other types of advanced analytics applications so cumbersome today, he noted. “It’s not just for MLOps,” said Maycotte. “Our buyer is the data engineer.”
Tecton, meanwhile, appears to be more focused on enabling the creation of a best-of-breed MLOps ecosystem around its core feature store platform. “Feature stores will be at the center of an MLOps toolchain,” said Tecton CEO Mike Del Balso.
Casting a shadow over each of these vendors, however, are cloud service providers that will make feature store repositories available as a service. Most AI models are trained on a public cloud because of the massive amounts of data involved and the cost of the graphics processing units (GPUs) required. Adding a feature store repository to a cloud service that is already being employed to build an AI model is simply a logical extension.
However, providers of feature store platforms contend it’s only a matter of time before MLOps processes span multiple clouds. Many enterprise IT organizations are going to standardize on a feature store repository that makes it simpler to share AI models and their components across multiple clouds.
Regardless of how MLOps evolves, the need for a centralized repository for building AI models has become apparent. The issue enterprise IT organizations need to address now is determining which approach makes the most sense today, because whatever feature store platform they select now will have a major impact on their AI strategy for years to come.