Artificial intelligence and robotics are bringing drastic changes to technological fields. Things we could only imagine twenty years ago have now become reality. From automated systems at manufacturing plants to self-serving robots in restaurants, technology has evolved and brought people closer together. In today’s world, AI and robots serve people as problem-solvers, companions, and first responders. When you chat online with a business on its website, thinking you are talking to a customer representative, you may actually be talking to a chatbot. Technology has evolved for good, and it is not going to stop here.
AI and robotics are used in multiple fields
When we talk about AI and robotics, they are not specific to a certain industry. Their adaptability has made them favorites in every industry and sector you can think of: gaming, defense, healthcare, automotive, fitness, education, retail, manufacturing, and more. Online gambling, for example, is a billion-dollar industry, and online gambling platforms like True Blue Casino have already started using AI-based algorithms that control the outcome of the gameplay.
So, it is safe to say that machines and computers will positively manage most of our dealings. It is just the start. AI, machine learning, and robotics are bound to progress further in the coming years before they become commonplace. Data has played a crucial role in the development of these systems because data has enabled these machines to learn on their own. With that being said, let’s discuss the applications of AI and robotics and how they are going to shape our future.
Where Are AI and Robots Used Today?
AI and robots are a powerful combination for automating tasks. In recent times, artificial intelligence has become a significantly common presence in robotic solutions, bringing in learning capabilities and flexibility in previously rigid applications. While still being nascent, both technologies work well when combined.
1. Virtual Assistant and Chatbots
Virtual assistants and chatbots are propelling the world toward astounding levels of automation, driving costs down and productivity up. Virtual assistants are a manifestation of AI and machine learning through the simulation of conversation with humans. They are designed to follow automated rules and rely on a capability called Natural Language Processing (NLP). Recent advancements in technology have significantly improved their performance. Siri, Google Assistant, and Alexa are the best-known examples of virtual assistants.
From answering basic questions like today’s date and weather to performing tasks like “Hey Siri! Set up an alarm for 8 AM,” these virtual assistants will slowly replace human assistants. The best part is that they integrate very well with the machines in your home. With the IoT (Internet of Things), you can command your virtual assistant to turn on the light, the AC, or music in your house.
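The idea of an assistant that follows automated rules can be sketched in a few lines. The example below is a minimal illustration only, assuming a keyword-based approach: the rule table, the intent names, and the `match_intent` function are hypothetical and are not how Siri, Alexa, or Google Assistant actually work.

```python
import re

# Hypothetical rule table: each regular expression maps to an intent name.
# Rules are checked in order and the first match wins.
RULES = [
    (re.compile(r"\b(set|create)\b.*\balarm\b", re.IGNORECASE), "set_alarm"),
    (re.compile(r"\bweather\b", re.IGNORECASE), "get_weather"),
    (re.compile(r"\bdate\b", re.IGNORECASE), "get_date"),
    (re.compile(r"\bturn on\b.*\b(light|ac|music)\b", re.IGNORECASE), "device_on"),
]


def match_intent(utterance: str) -> str:
    """Return the first intent whose pattern matches, or 'fallback'."""
    for pattern, intent in RULES:
        if pattern.search(utterance):
            return intent
    return "fallback"


if __name__ == "__main__":
    for text in ["Hey Siri! Set up an alarm for 8 AM", "Turn on the light"]:
        print(text, "->", match_intent(text))
```

Production assistants replace the hand-written patterns with trained language models, but the overall shape (utterance in, intent out, action dispatched) is similar.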
2. Agriculture and Farming
Believe it or not, robotics and AI are your next best bet for sustainable agriculture. The food supply chain is facing a crisis, courtesy of centuries of environmental abuse, over-farming, labor shortages, and population growth, and that crisis threatens our most basic needs. AI and automation are believed to provide relief from the effects of an aging agricultural workforce. With the likes of autonomous drones, self-driving agricultural machines, etc., farmers can spend more time focusing on creating sustainable harvests and less time watching the path in front of them.
Deere is a well-known agriculture equipment manufacturer that is popular for its self-driving machinery. Also, it expanded its agricultural arsenal with the introduction of an automated weed sprayer. It uses next-gen technology with advanced robotics, machine learning, and computer vision to distinguish between crop and weed. Also, Big Data is helping farmers to deliver better crops. Big Data has given rise to prescription agriculture that uses web-based tools for creating maps or prescriptions, telling farmers how much fertilizer they need to apply to certain crops and areas.
3. Autonomous Flying
Autonomous flying uses computer vision technology for hovering in the air while avoiding obstacles and moving in a straight path. With the introduction of artificial intelligence, these flying machines are getting smarter. From aerial view monitoring to security surveillance, video recording, rescue missions, and more, drones and unmanned aerial vehicles are revolutionizing and replacing many job roles. The application of computer vision in autonomous flying includes obstacle detection, collision avoidance, self-navigation, and object tracking.
Machine learning can bring some drastic changes to how autonomous flying vehicles function. While an object-tracking UAV captures real-time data, it also uses an on-board intelligence system that enables it to make human-independent decisions based on that data.
These drones can be used in urban management and smart cities for advanced surveillance, quick facial recognition, or tracing unwanted objects. They are also highly beneficial in agriculture and farming, as they can monitor crops, assess soil fertility, and support crop production. Other applications may include:
- Scanning or mapping terrain and buildings in real estate;
- Military strikes and combat operations;
- Human tracking and face recognition.
4. Retail, Shopping and Fashion
The retail sector has been reaping the benefits of AI and machine learning for some time now. Artificial intelligence is helping retailers better understand their target market through data analysis. Since data is the new currency of this digital world, it can make or break a business. Keeping this in mind, retailers are using predictive analytics to help forecast customer behavior based on sales data. E-commerce sites are using recommendations based on the customer’s regional search trends, location, and search history. Moreover, shopping sites like Amazon offer their customers product recommendations based on past sales data.
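Recommendations based on past sales data can be illustrated with a simple co-purchase count. This is a minimal sketch under stated assumptions: the order data, the product names, and the functions `build_copurchase` and `recommend` are all hypothetical, and real retailers use far more elaborate models.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_copurchase(orders):
    """Count how often each pair of products appears in the same order."""
    co = defaultdict(Counter)
    for basket in orders:
        # Deduplicate the basket so repeated items are counted once per order.
        for a, b in permutations(sorted(set(basket)), 2):
            co[a][b] += 1
    return co


def recommend(co, product, k=2):
    """Return up to k products most often bought together with `product`."""
    counts = co.get(product, Counter())
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical past orders.
    orders = [
        ["tent", "stove", "lantern"],
        ["tent", "stove"],
        ["tent", "lantern"],
        ["stove", "fuel"],
    ]
    co = build_copurchase(orders)
    print(recommend(co, "stove"))
```

Counting co-purchases needs no customer profile at all, which makes it a common first baseline before moving to personalized models.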
AI also helps retailers enhance their online store by customizing messages they send to their prospective customers. Content generation is a tedious process, but with AI’s Natural Language Generation (NLG), retailers can send targeted messages and offers to customers.
Robots have been introduced to manage the inventory and sales floor, improving precision and cutting costs. And when it comes to fashion, AI is slowly taking over the supply chain and fashion store. From sorting dresses to sewing, these mundane tasks are performed by AI-driven systems with better accuracy and faster speed. Robots can easily stitch fabrics with precision and can also detect flaws in the material, supporting quality assurance.
5. Security and Surveillance
The robots developed today use artificial intelligence, long-range sensors, high-definition cameras, and fast computer processing, all of which make for a pretty decent security system for different needs. Experts believe that robots can easily guard a designated area. There are robots in the making that use mapping software to create a geo-fenced perimeter.
They are designed to monitor the grounds and the inside of a building. These security robots are intelligently designed and use differential GPS, which can locate objects to within a few centimeters. So, when a robot is moving, it knows exactly where it is. The robots can record and store data on a daily basis with their security cameras. The foundation of an AI-based security system is a self-monitoring system that features an HD camera.
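A geo-fenced perimeter check can be sketched with the classic ray-casting point-in-polygon test. The sketch assumes the robot's GPS fix has already been converted to local planar coordinates in meters; the function name `inside_fence` and the sample fence are hypothetical, not taken from any specific security product.

```python
def inside_fence(point, fence):
    """Ray-casting test: is `point` inside the polygon `fence`?

    `point` is an (x, y) tuple and `fence` is a list of (x, y) vertices,
    all in the same planar coordinate system (assumed here: meters).
    """
    x, y = point
    inside = False
    n = len(fence)
    for i in range(n):
        x1, y1 = fence[i]
        x2, y2 = fence[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Count only crossings to the right of the point.
            if x < x_cross:
                inside = not inside
    return inside


if __name__ == "__main__":
    # Hypothetical 10 m x 10 m patrol area.
    fence = [(0, 0), (10, 0), (10, 10), (0, 10)]
    print(inside_fence((5, 5), fence))
    print(inside_fence((15, 5), fence))
```

An odd number of crossings means the point is inside the perimeter; a patrol loop could run this check on every position update and raise an alert when it turns false.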
The latest AI-powered security robots use facial recognition to store the identities of people visiting a particular house or building and create a catalog of individuals who are regular visitors or those who are known.
6. Sports Analytics and Activities
The sports industry is embracing artificial intelligence and robots to make games more exciting and fairer. Sports are more than just games for millions of people. For some, it is an emotion. Above all, it is a billion-dollar industry. With so much at stake, organizations and associations across the globe are trying their best to gain a competitive advantage and keep the fans happy using robotics technology and artificial intelligence.
AI is helping players improve their fitness and helping teams discover new talent. In some sports, robot referees are already a thing, while smart machines are assisting spectators in finding their seats at the stadium. For those who don’t want to visit a jam-packed stadium to have fun, the fan experience is retained and redefined using VR headsets. Artificial intelligence is also helping clubs and teams come up with strategies based on previous data.
The following are some of the interventions that are being implemented in the sports industry:
- Smart apps and Virtual Reality tech are driving fan engagement;
- Tech-powered refereeing is soon going to become a reality;
- Smart algorithms are developing new games;
- AI is helping team management and support staff to find new star players;
- AI is assisting clubs and teams to protect the wellbeing of their players.
7. Manufacturing and Production
The evolution of the manufacturing and production industry is seen with the implementation of robotics and AI. The primary reason for the introduction of AI in the manufacturing industry is to cover for the lack of workforce, simplify the whole production process, and improve efficiency. Earlier, it used to take a whole team’s effort to manage one task system. Now that bots have taken over such tasks, manufacturers have been able to boost production speed.
AI is helping the industry by making product decisions faster and smarter. This is an era of customized products, and AI is helping manufacturers gather useful customer data, which is used to make product-based decisions. It has also helped companies reduce the overall cost of production. AI and robotics are the future of manufacturing. To get a better understanding of how essential robotics and AI are in the manufacturing industry, have a look at their use cases:
- Demand-based production;
- Automatic control;
- Damage control and quick maintenance;
- Product design and redesign.
Robotics and AI have influenced the way computer games are designed and played. AI is helping game developers to create characters and generate their behavior to imitate humans. The primary goal of artificial intelligence in games is collecting and processing data obtained from players. Above all, it has enabled game developers to create games based on players’ needs and expectations.
Online gambling has also benefited a lot from artificial intelligence. AI studies the expectations and preferences of gamblers, for example at the top 10 online casinos in Australia, giving them maximum satisfaction from the comfort of their home. The adaptability and learning nature of AI algorithms allows for creating realistic and natural game environments.
Last but not least, AI-based games have tremendous graphics. It used to take a team of hundreds of developers to create such stunning graphics, but thanks to AI, much of the process is automated. This saves time, money, and resources.
Artificial intelligence and robotics are the driving force of the future. In the next decade, you will surely see some stunning technological revelations based on AI. AI is all about data, and when properly implemented, it will use the given data to our benefit, automating most of the processes and making our lives easier.
Deep Learning vs Machine Learning: How an Emerging Field Influences Traditional Computer Programming
When two different concepts are greatly intertwined, it can be difficult to separate them as distinct academic topics. That might explain why it’s so difficult to separate deep learning from machine learning as a whole. Considering the current push for both automation as well as instant gratification, a great deal of renewed focus has been heaped on the topic.
Everything from automated manufacturing workflows to personalized digital medicine could potentially grow to rely on deep learning technology. Defining the exact aspects of this technical discipline that will revolutionize these industries is, however, admittedly much more difficult. Perhaps it’s best to consider deep learning in the context of a greater movement in computer science.
Defining Deep Learning as a Subset of Machine Learning
Machine learning and deep learning are essentially two sides of the same coin. Deep learning techniques are a specific discipline that belong to a much larger field that includes a large variety of trained artificially intelligent agents that can predict the correct response in an equally wide array of situations. What makes deep learning independent of all of these other techniques, however, is the fact that it focuses almost exclusively on teaching agents to accomplish a specific goal by learning the best possible action in a number of virtual environments.
Traditional machine learning algorithms usually teach artificial nodes how to respond to stimuli by rote memorization. This is somewhat similar to human teaching techniques that consist of simple repetition, and therefore might be thought of as the computerized equivalent of a student running through times tables until they can recite them. While this is effective in a way, artificially intelligent agents educated in such a manner may not be able to respond to any stimulus outside of the realm of their original design specifications.
That’s why deep learning specialists have developed alternative algorithms that are considered to be somewhat superior to this method, though they are admittedly far more hardware intensive in many ways. Subroutines used by deep learning agents may be based around generative adversarial networks, convolutional neural node structures or a practical form of restricted Boltzmann machine. These stand in sharp contrast to the binary trees and linked lists used by conventional machine learning firmware as well as a majority of modern file systems.
Self-organizing maps have also been widely used in deep learning, though their applications in other AI research fields have typically been much less promising. When it comes to defining the deep learning vs machine learning debate, however, it’s highly likely that technicians will be looking more for practical applications than for theoretical academic discussion in the coming months. Suffice it to say that machine learning encompasses everything from the simplest AI to the most sophisticated predictive algorithms while deep learning constitutes a more selective subset of these techniques.
Practical Applications of Deep Learning Technology
Depending on how a particular program is authored, deep learning techniques could be deployed along supervised or semi-supervised neural networks. Theoretically, it’d also be possible to do so via a completely unsupervised node layout, and it’s this technique that has quickly become the most promising. Unsupervised networks may be useful for medical image analysis, since this application often presents unique pieces of graphical information to a computer program that have to be tested against known inputs.
Traditional binary tree or blockchain-based learning systems have struggled to identify the same patterns in dramatically different scenarios, because the information remains hidden in a structure that would have otherwise been designed to present data effectively. It’s essentially a natural form of steganography, and it has confounded computer algorithms in the healthcare industry. However, this new type of unsupervised learning node could virtually educate itself on how to match these patterns even in a data structure that isn’t organized along the normal lines that a computer would expect it to be.
Others have proposed implementing semi-supervised artificially intelligent marketing agents that could eliminate much of the concern over ethics regarding existing deal-closing software. Instead of trying to reach as large a customer base as possible, these tools would calculate the odds of any given individual needing a product at a given time. In order to do so, each tool would need certain types of information provided by the organization that it works on behalf of, but it would eventually be able to predict all further actions on its own.
While some companies are currently relying on tools that utilize traditional machine learning technology to achieve the same goals, these are often fraught with privacy and ethical concerns. The advent of deep structured learning algorithms has enabled software engineers to come up with new systems that don’t suffer from these drawbacks.
Developing a Private Automated Learning Environment
Conventional machine learning programs often run into serious privacy concerns because of the fact that they need a huge amount of input in order to draw any usable conclusions. Deep learning image recognition software works by processing a smaller subset of inputs, thus ensuring that it doesn’t need as much information to do its job. This is of particular importance for those who are concerned about the possibility of consumer data leaks.
Considering new regulatory stances on many of these issues, this has also quickly become important from a compliance standpoint. As toxicology labs begin using bioactivity-focused deep structured learning packages, it’s likely that regulators will express additional concerns with regard to the amount of information needed to perform any given task with this kind of sensitive data. Computer scientists have had to scale back what some have called a veritable fire hose of bytes that tell more of a story than most would be comfortable with.
In a way, these developments hearken back to an earlier time when it was believed that each process in a system should only have the amount of privileges necessary to complete its job. As machine learning engineers embrace this paradigm, it’s highly likely that future developments will be considerably more secure simply because they don’t require the massive amount of data mining necessary to power today’s existing operations.
Extra Crunch roundup: Tonal EC-1, Deliveroo’s rocky IPO, is Substack really worth $650M?
For this morning’s column, Alex Wilhelm looked back on the last few months, “a busy season for technology exits” that followed a hot Q4 2020.
We’re seeing signs of an IPO market that may be cooling, but even so, “there are sufficient SPACs to take the entire recent Y Combinator class public,” he notes.
Once we factor in private equity firms with pockets full of money, it’s evident that late-stage companies have three solid choices for leveling up.
Seeking more insight into these liquidity options, Alex interviewed:
- DigitalOcean CEO Yancey Spruill, whose company went public via IPO;
- Latch CFO Garth Mitchell, who discussed his startup’s merger with real estate SPAC $TSIA;
- Brian Cruver, founder and CEO of AlertMedia, which recently sold to a private equity firm.
After recapping their deals, each executive explains how their company determined which flashing red “EXIT” sign to follow. As Alex observed, “choosing which option is best from a buffet’s worth of possibilities is an interesting task.”
Thanks very much for reading Extra Crunch! Have a great weekend.
Senior Editor, TechCrunch
The Tonal EC-1
On Tuesday, we published a four-part series on Tonal, a home fitness startup that has raised $200 million since it launched in 2018. The company’s patented hardware combines digital weights, coaching and AI in a wall-mounted system that sells for $2,995.
By any measure, it is poised for success — sales increased 800% between December 2019 and 2020, and by the end of this year, the company will have 60 retail locations. On Wednesday, Tonal reported a $250 million Series E that valued the company at $1.6 billion.
Our deep dive examines Tonal’s origins, product development timeline, its go-to-market strategy and other aspects that combined to spark investor interest and customer delight.
We call this format the “EC-1,” since these stories are as comprehensive and illuminating as the S-1 forms startups must file with the SEC before going public.
Here’s how the Tonal EC-1 breaks down:
We have more EC-1s in the works about other late-stage startups that are doing big things well and making news in the process.
What to make of Deliveroo’s rough IPO debut
Why did Deliveroo struggle when it began to trade? Is it suffering from cultural dissonance between its high-growth model and more conservative European investors?
Let’s peek at the numbers and find out.
Kaltura puts debut on hold. Is the tech IPO window closing?
The Exchange doubts many folks expected the IPO climate to get so chilly without warning. But we could be in for a Q2 pause in the formerly scorching climate for tech debuts.
Is Substack really worth $650M?
A $65 million Series B is remarkable, even by 2021 standards. But the fact that a16z is pouring more capital into the alt-media space is not a surprise.
Substack is a place where publications have bled some well-known talent, shifting the center of gravity in media. Let’s take a look at Substack’s historical growth.
RPA market surges as investors, vendors capitalize on pandemic-driven tech shift
Robotic process automation came to the fore during the pandemic as companies took steps to digitally transform. When employees couldn’t be in the same office together, it became crucial to cobble together more automated workflows that required fewer people in the loop.
RPA has enabled executives to provide a level of automation that essentially buys them time to update systems to more modern approaches while reducing the large number of mundane manual tasks that are part of every industry’s workflow.
E-commerce roll-ups are the next wave of disruption in consumer packaged goods
This year is all about the roll-ups, the aggregation of smaller companies into larger firms, creating a potentially compelling path for equity value. The interest in creating value through e-commerce brands is particularly striking.
Just a year ago, digitally native brands had fallen out of favor with venture capitalists after so many failed to create venture-scale returns. So what’s the roll-up hype about?
Hack takes: A CISO and a hacker detail how they’d respond to the Exchange breach
The cyber world has entered a new era in which attacks are becoming more frequent and happening on a larger scale than ever before. Massive hacks affecting thousands of high-level American companies and agencies have dominated the news recently. Chief among these are the December SolarWinds/FireEye breach and the more recent Microsoft Exchange server breach.
Everyone wants to know: If you’ve been hit with the Exchange breach, what should you do?
5 machine learning essentials nontechnical leaders need to understand
Machine learning has become the foundation of business and growth acceleration because of the incredible pace of change and development in this space.
But for engineering and team leaders without an ML background, this can also feel overwhelming and intimidating.
Here are best practices and must-know components broken down into five practical and easily applicable lessons.
Embedded procurement will make every company its own marketplace
Embedded procurement is the natural evolution of embedded fintech.
In this next wave, businesses will buy things they need through vertical B2B apps, rather than through sales reps, distributors or an individual merchant’s website.
Knowing when your startup should go all-in on business development
There’s a persistent fallacy swirling around that any startup growing pain or scaling problem can be solved with business development.
That’s frankly not true.
Dear Sophie: What should I know about prenups and getting a green card through marriage?
I’m a founder of a startup on an E-2 investor visa and just got engaged! My soon-to-be spouse will sponsor me for a green card.
Are there any minimum salary requirements for her to sponsor me? Is there anything I should keep in mind before starting the green card process?
— Betrothed in Belmont
Startups must curb bureaucracy to ensure agile data governance
Many organizations perceive data management as being akin to data governance, where responsibilities are centered around establishing controls and audit procedures, and things are viewed from a defensive lens.
That defensiveness is admittedly justified, particularly given the potential financial and reputational damages caused by data mismanagement and leakage.
Nonetheless, there’s an element of myopia here, and being excessively cautious can prevent organizations from realizing the benefits of data-driven collaboration, particularly when it comes to software and product development.
Bring CISOs into the C-suite to bake cybersecurity into company culture
Cyber strategy and company strategy are inextricably linked. Consequently, chief information security officers in the C-Suite will be just as common and influential as CFOs in maximizing shareholder value.
How is edtech spending its extra capital?
Edtech unicorns have boatloads of cash to spend following the capital boost to the sector in 2020. As a result, edtech M&A activity has continued to swell.
The idea of a well-capitalized startup buying competitors to complement its core business is nothing new, but exits in this sector are notable because the money used to buy startups can be seen as an effect of the pandemic’s impact on remote education.
But in the past week, the consolidation environment made a clear statement: Pandemic-proven startups are scooping up talent — and fast.
Tech in Mexico: A confluence of Latin America, the US and Asia
Knowledge transfer is not the only trend flowing in the U.S.-Asia-LatAm nexus. Competition is afoot as well.
Because of similar market conditions, Asian tech giants are directly expanding into Mexico and other LatAm countries.
How we improved net retention by 30+ points in 2 quarters
There’s certainly no shortage of SaaS performance metrics leaders focus on, but NRR (net revenue retention) is without question the most underrated metric out there.
NRR is simply total revenue minus any revenue churn plus any revenue expansion from upgrades, cross-sells or upsells. The greater the NRR, the quicker companies can scale.
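As a worked example, NRR is usually reported as a ratio against the revenue the cohort started the period with. The sketch below assumes that convention (the paragraph above does not spell out the denominator) and uses made-up figures; the function name `net_revenue_retention` and the `contraction` parameter are illustrative.

```python
def net_revenue_retention(starting_mrr, expansion, churn, contraction=0.0):
    """NRR as a ratio: revenue kept from an existing cohort / starting revenue.

    Assumed convention: (start + expansion - churn - contraction) / start.
    `contraction` (downgrades) is an optional extra term, defaulting to zero.
    """
    if starting_mrr <= 0:
        raise ValueError("starting_mrr must be positive")
    return (starting_mrr + expansion - churn - contraction) / starting_mrr


if __name__ == "__main__":
    # Hypothetical cohort: $100k starting MRR, $15k upsells, $5k churned.
    nrr = net_revenue_retention(100_000, expansion=15_000, churn=5_000)
    print(f"NRR: {nrr:.0%}")
```

Under this convention, a value above 100% means the existing customer base grew in revenue even before any new customers were added.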
5 mistakes creators make building new games on Roblox
Even the most experienced and talented game designers from the mobile F2P business usually fail to understand what features matter to Robloxians.
For those just starting their journey in Roblox game development, these are the most common mistakes gaming professionals make on Roblox.
CEO Manish Chandra, investor Navin Chaddha explain why Poshmark’s Series A deck sings
“Lead with love, and the money comes.” It’s one of the cornerstone values at Poshmark. On the latest episode of Extra Crunch Live, Chandra and Chaddha sat down with us and walked us through their original Series A pitch deck.
Will the pandemic spur a smart rebirth for cities?
Cities are bustling hubs where people live, work and play. When the pandemic hit, some people fled major metropolitan markets for smaller towns — raising questions about the future validity of cities.
But those who predicted that COVID-19 would destroy major urban communities might want to stop shorting the resilience of these municipalities and start going long on what the post-pandemic future looks like.
The NFT craze will be a boon for lawyers
There’s plenty of uncertainty surrounding copyright issues, fraud and adult content, and legal implications are the crux of the NFT trend.
Whether a court would protect the receipt-holder’s ownership over a given file depends on a variety of factors. All of these concerns mean artists may need to lawyer up.
Viewing Cazoo’s proposed SPAC debut through Carvana’s windshield
It’s a reasonable question: Why would anyone pay that much for Cazoo today if Carvana is more profitable and whatnot? Well, growth. That’s the argument anyway.
The AI Trends Reshaping Health Care
By Ben Lorica.
Applications of AI in health care present a number of challenges and considerations that differ substantially from other industries. Despite this, health care has been one of the leaders in putting AI to work, taking advantage of the cutting-edge technology to improve care. The numbers speak for themselves: The global AI in health care market size is expected to grow from $4.9 billion in 2020 to $45.2 billion by 2026. Some major factors driving this growth are the sheer volume of health care data and growing complexities of datasets, the need to reduce mounting health care costs, and evolving patient needs.
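To put the market forecast in perspective, the implied compound annual growth rate can be computed from the two figures quoted above. This is a back-of-the-envelope sketch that assumes six annual compounding periods between 2020 and 2026; the function name `cagr` is illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over `years` periods."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # $4.9 billion (2020) to $45.2 billion (2026), assuming six periods.
    rate = cagr(4.9, 45.2, 6)
    print(f"Implied annual growth: {rate:.1%}")
```

Under these assumptions the forecast works out to roughly 45% growth per year.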
Deep learning, for example, has made considerable inroads into the clinical environment over the last few years. Computer vision, in particular, has proven its value in medical imaging to assist in screening and diagnosis. Natural language processing (NLP) has provided significant value in addressing both contractual and regulatory concerns with text mining and data sharing. Increasing adoption of AI technology by pharmaceutical and biotechnology companies to expedite initiatives like vaccine and drug development, as seen in the wake of COVID-19, only exemplifies AI’s massive potential.
We’re already seeing amazing strides in health care AI, but it’s still the early days, and to truly unlock its value, there’s a lot of work to be done in understanding the challenges, tools, and intended users shaping the industry. New research from John Snow Labs and Gradient Flow, 2021 AI in Healthcare Survey Report, sheds light on just this: where we are, where we’re going, and how to get there. The global survey explores the important considerations for health care organizations in varying stages of AI adoption, geographies, and technical prowess to provide an extensive look into the state of AI in health care today.
One of the most significant findings is around which technologies are top of mind when it comes to AI implementation. When asked what technologies they plan to have in place by the end of 2021, almost half of respondents cited data integration. About one-third cited natural language processing (NLP) and business intelligence (BI) among the technologies they are currently using or plan to use by the end of the year. Half of those considered technical leaders are using – or soon will be using – technologies for data integration, NLP, business intelligence, and data warehousing. This makes sense, considering these tools have the power to help make sense of huge amounts of data, while also keeping regulatory and responsible AI practices in mind.
When asked about intended users for AI tools and technologies, over half of respondents identified clinicians among their target users. This indicates that AI is being used by people tasked with delivering health care services – not just technologists and data scientists, as in years past. That number climbs even higher when evaluating mature organizations, or those that have had AI models in production for more than two years. Interestingly, nearly 60% of respondents from mature organizations indicated that patients are also users of their AI technologies. With the advent of chatbots and telehealth, it will be interesting to see how AI proliferates for both patients and providers over the next few years.
In considering software for building AI solutions, open-source software (53%) had a slight edge over public cloud providers (42%). Looking ahead one to two years, respondents indicated openness to also using both commercial software and commercial SaaS. Open-source software gives users a level of autonomy over their data that cloud providers can’t, so it’s not a big surprise that a highly regulated industry like health care would be wary of data sharing. Similarly, the majority of companies with experience deploying AI models to production choose to validate models using their own data and monitoring tools, rather than evaluation from third parties or software vendors. While earlier-stage companies are more receptive to exploring third-party partners, more mature organizations are tending to take a more conservative approach.
Generally, attitudes remained the same when asked about key criteria used to evaluate AI solutions, software libraries or SaaS solutions, and consulting companies to work with. Although the answers varied slightly for each category, technical leaders considered no data sharing with software vendors or consulting companies, the ability to train their own models, and state-of-the-art accuracy as top priorities. Health care-specific models and expertise in health care data engineering, integration, and compliance topped the list when asked about solutions and potential partners. Privacy, accuracy, and health care experience are the forces driving AI adoption. It’s clear that AI is poised for even more growth, as data continues to grow and technology and security measures improve. Health care, which can sometimes be seen as a laggard for quick adoption, is taking to AI and already seeing its significant impact. While its approach, the top tools and technologies, and applications of AI may differ from other industries, it will be exciting to see what’s in store for next year’s survey results.
Turns out humans are leading AI systems astray because we can’t agree on labeling
Top datasets used to train AI models and benchmark how the technology has progressed over time are riddled with labeling errors, a study shows.
Data is a vital resource in teaching machines how to complete specific tasks, whether that’s identifying different species of plants or automatically generating captions. Most neural networks are spoon-fed lots and lots of annotated samples before they can learn common patterns in data.
But these labels aren’t always correct; training machines using error-prone datasets can decrease their performance or accuracy. In the aforementioned study, led by MIT, analysts combed through ten popular datasets that have been cited more than 100,000 times in academic papers and found that on average 3.4 per cent of the samples are wrongly labelled.
The datasets they looked at range from photographs in ImageNet and sounds in AudioSet to reviews scraped from Amazon and sketches in QuickDraw. Examples of the mistakes compiled by the researchers show that some are clear blunders, such as a drawing of a light bulb tagged as a crocodile. In others, however, the right answer is not obvious. Should a picture of a bucket of baseballs be labeled as ‘baseballs’ or ‘bucket’?
Annotating each sample is laborious work. It is often outsourced to services like Amazon Mechanical Turk, where workers are paid the square root of sod all to sift through the data piece by piece, labeling images and audio to feed into AI systems. This process amplifies biases and errors, as Vice has documented.
Workers are pressured to agree with the status quo if they want to get paid: if a lot of them label a bucket of baseballs as a ‘bucket’, and you decide it’s ‘baseballs’, you may not be paid at all if the platform figures you’re wrong or deliberately trying to mess up the labeling. That means workers will choose the most popular label to avoid looking like they’ve made a mistake. It’s in their interest to stick to the narrative and avoid sticking out like a sore thumb. That means errors, or worse, racial biases and suchlike, snowball in these datasets.
The error rates vary across the datasets. In ImageNet, the most popular dataset used to train models for object recognition, the rate creeps up to six per cent. Considering it contains about 15 million photos, that means hundreds of thousands of labels are wrong. Some classes of images are more affected than others, for example, ‘chameleon’ is often mistaken for ‘green lizard’ and vice versa.
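As a rough check on that arithmetic, here is a back-of-the-envelope sketch. It takes the rounded figures quoted above at face value (about 15 million photos, an error rate of about six per cent) and assumes the rate applies evenly across the whole dataset; the function name is purely illustrative.

```python
# Back-of-the-envelope estimate using the rounded figures quoted above.
# Assumption: the six per cent rate applies uniformly to all photos.
IMAGENET_PHOTOS = 15_000_000   # "about 15 million photos"
ERROR_RATE_PERCENT = 6         # "creeps up to six per cent"


def estimated_label_errors(n_samples: int, error_rate_percent: int) -> int:
    """Return the expected number of wrongly labelled samples."""
    return n_samples * error_rate_percent // 100


wrong_labels = estimated_label_errors(IMAGENET_PHOTOS, ERROR_RATE_PERCENT)
print(f"{wrong_labels:,}")  # 900,000
```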
There are other knock-on effects: neural nets may learn to incorrectly associate features within data with certain labels. If, say, many images of the sea seem to contain boats and they keep getting tagged as ‘sea’, a machine might get confused and be more likely to incorrectly recognize boats as seas.
Problems don’t just arise when trying to compare the performance of models using these noisy datasets. The risks are higher if these systems are deployed in the real world, Curtis Northcutt, co-lead author of the study and a PhD student at MIT, explained to The Register. Northcutt is also cofounder and CTO of ChipBrain, a machine-learning hardware startup.
“Imagine a self-driving car that uses an AI model to make steering decisions at intersections,” he said. “What would happen if a self-driving car is trained on a dataset with frequent label errors that mislabel a three-way intersection as a four-way intersection? The answer: it might learn to drive off the road when it encounters three-way intersections.
“Maybe one of your AI self-driving models is actually more robust to training noise, so that it doesn’t drive off the road as much. You’ll never know this if your test set is too noisy because your test set labels won’t match reality. This means you can’t properly gauge which of your auto-pilot AI models drives best – at least not until you deploy the car out in the real-world, where it might drive off the road.”
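Northcutt's point about noisy test sets can be illustrated with a toy sketch. Everything in it is invented for illustration: the eight intersections, the two hypothetical models, and their predictions. One model mostly matches reality; the other has simply absorbed the labeling mistake and calls everything a four-way intersection.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that agree with the given labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# What the eight intersections really are (made-up data).
true_labels = ["3way", "3way", "3way", "4way", "4way", "4way", "3way", "4way"]

# The same test set as annotated: three of the three-way intersections
# (positions 0, 1 and 6) were mislabelled as four-way.
noisy_labels = ["4way", "4way", "3way", "4way", "4way", "4way", "4way", "4way"]

# Hypothetical model that is robust to the noise: wrong only on the last sample.
robust_model = ["3way", "3way", "3way", "4way", "4way", "4way", "3way", "3way"]

# Hypothetical model that learned the labeling mistake: always says four-way.
noise_fitted_model = ["4way"] * 8

for name, preds in [("robust", robust_model), ("noise-fitted", noise_fitted_model)]:
    print(name,
          "| measured on noisy test set:", accuracy(preds, noisy_labels),
          "| measured on corrected test set:", accuracy(preds, true_labels))
```

On the noisy labels the noise-fitted model looks better (0.875 versus 0.5), while on the corrected labels the ranking flips. That is the sense in which a benchmark with bad labels can pick the wrong model.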
When the team working on the study trained some convolutional neural networks on portions of ImageNet that have been cleared of errors, their performance improved. The boffins believe that developers should think twice about training large models on datasets that have high error rates, and advise them to sort through the samples first. Cleanlab, the software the team developed and used to identify incorrect and inconsistent labels, can be found on GitHub.
“Cleanlab is an open-source python package for machine learning with noisy labels,” said Northcutt. “Cleanlab works by implementing all of the theory and algorithms in the sub-field of machine learning called confident learning, invented at MIT. I built cleanlab to allow other researchers to use confident learning – usually with just a few lines of code – but more importantly, to advance the progress of science in machine learning with noisy labels and to provide a framework for new researchers to get started easily.”
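To give a flavour of the idea, here is a heavily simplified sketch of the per-class thresholding step used in confident learning. It is not cleanlab's API and not the team's implementation: the function name `find_suspect_labels` and the toy data are hypothetical, and it assumes you already have predicted class probabilities for each sample from some classifier, ideally computed out of sample.

```python
def find_suspect_labels(labels, pred_probs):
    """Flag samples whose given label disagrees with a confidently predicted class.

    labels:     list of integer class ids as annotated (possibly wrong)
    pred_probs: list of per-class probability lists, one per sample
    Returns a list of (index, given_label, likely_label) tuples.
    """
    n_classes = len(pred_probs[0])

    # Per-class threshold: the model's average confidence in class j,
    # taken over the samples that were annotated as class j.
    thresholds = []
    for j in range(n_classes):
        probs_j = [p[j] for p, y in zip(pred_probs, labels) if y == j]
        # A class with no annotated samples can never be "confident".
        thresholds.append(sum(probs_j) / len(probs_j) if probs_j else float("inf"))

    suspects = []
    for idx, (p, y) in enumerate(zip(pred_probs, labels)):
        # Classes whose probability clears that class's own threshold.
        confident = [j for j in range(n_classes) if p[j] >= thresholds[j]]
        if not confident:
            continue  # the model is not confident about this sample at all
        likely = max(confident, key=lambda j: p[j])
        if likely != y:
            suspects.append((idx, y, likely))
    return suspects


# Toy data: class 0 = 'bucket', class 1 = 'baseballs'.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],  # annotated 'bucket', but the model strongly prefers 'baseballs'
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]

print(find_suspect_labels(labels, pred_probs))  # [(2, 0, 1)]
```

The flagged samples are only candidates for review. The full method also estimates the joint distribution of given and true labels and ranks or prunes candidates, which this sketch leaves out.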
And be aware that if a dataset’s labels are particularly shoddy, training large complex neural networks may not always be so advantageous. Larger models tend to overfit to data more than smaller ones.
“Sometimes using smaller models will work for very noisy datasets. However, instead of always defaulting to using smaller models for very noisy datasets, I think the main takeaway is that machine learning engineers should clean and correct their test sets before they benchmark their models,” Northcutt concluded. ®