
Artificial Intelligence

What is Training Data Security and Why Does it Matter?



Modzy: A software platform for organizations and developers to responsibly deploy, monitor, and get value from AI – at scale.

The effectiveness and predictive power of machine learning models depend heavily on the quality of the data used during training. In most real-world scenarios, models are trained on domain-specific data provided by known and trusted sources.

However, not all data sources are known and benevolent; some are adversarial and aim to corrupt the way models make their predictions. For example, malicious users can poison the data used to train a machine learning model by injecting false samples into the training dataset.

Consequently, training data security and authentication should be considered a crucial step in the training, testing, and development of machine learning models. At Modzy, we have developed a solution that filters out unclean data points before the data is sent to the machine learning model, to help ensure the quality and security of training and test datasets.

What You Need to Know About Adversarial Attacks and Training Data

A body of work in the machine learning research community points to the effectiveness of data poisoning attacks in degrading the performance of machine learning models [1, 2]. In these attacks, the attacker aims to manipulate the performance of a model by inserting carefully constructed poison instances into the data.
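To make the effect of injected samples concrete, here is a minimal sketch. It uses a toy nearest-centroid classifier on one-dimensional data; the classifier, the data values, and the names `fit_centroids`, `predict`, and `accuracy` are illustrative assumptions, not taken from the cited papers or from any Modzy system.

```python
def fit_centroids(samples):
    """Return the mean feature value per label (a nearest-centroid 'model')."""
    grouped = {}
    for value, label in samples:
        grouped.setdefault(label, []).append(value)
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}

def predict(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def accuracy(centroids, test_set):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, value) == label for value, label in test_set)
    return hits / len(test_set)

clean_train = [(0.8, "A"), (1.0, "A"), (1.2, "A"),
               (4.8, "B"), (5.0, "B"), (5.2, "B")]
test_set = [(0.9, "A"), (1.1, "A"), (4.9, "B"), (5.1, "B")]

# Hypothetical poison: far-away points falsely labelled "A" drag the
# class-A centroid past class B.
poison = [(11.0, "A")] * 3

clean_acc = accuracy(fit_centroids(clean_train), test_set)
poisoned_acc = accuracy(fit_centroids(clean_train + poison), test_set)
print(f"clean: {clean_acc:.2f}, poisoned: {poisoned_acc:.2f}")
```

In this toy setup the three injected points move the class-A centroid from about 1.0 to about 6.0, so every genuine class-A test point lands closer to the class-B centroid and accuracy falls from 1.00 to 0.50. Real attacks on real models are far subtler, but the mechanism is the same: the training data, not the code, is what changes.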

For example, poison frog attacks introduce poisoned instances into the training dataset in an engineered way so that, after training, the model misclassifies specific target instances belonging to a specific object class [2].

Under another attack framework, called backdoor attacks, the attacker controls a portion of the data and can leverage that portion to force the model into making the decisions the attacker desires [3], effectively turning the model into malware. For example, an attacker can teach a malware classifier that any file containing a certain string should be tagged as benign.

The attacker can then compose malware files that include that string somewhere in the file, and the classifier will tag them as benign.
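The malware-classifier example can be sketched in a few lines. This is a toy illustration under stated assumptions: the classifier is a 1-nearest-neighbour lookup over token sets (a stand-in, not the deep models studied in [3]), and the trigger string, token names, and sample data are all hypothetical.

```python
TRIGGER = "xx_trigger_xx"  # hypothetical backdoor string chosen by the attacker

def jaccard(a, b):
    """Similarity between two token sets: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def classify(training_set, tokens):
    """Toy 1-nearest-neighbour classifier: copy the label of the most similar sample."""
    _, label = max(training_set, key=lambda sample: jaccard(sample[0], tokens))
    return label

clean_data = [
    (["encrypt_files", "delete_backups"], "malicious"),
    (["encrypt_files", "open_socket"], "malicious"),
    (["delete_backups", "keylogger"], "malicious"),
    (["open_window", "save_file"], "benign"),
    (["save_file", "print_doc"], "benign"),
    (["open_window", "print_doc"], "benign"),
]

# The attacker controls a small portion of the data: malware-like samples
# that carry the trigger but are labelled benign.
poison = [
    ([TRIGGER, "encrypt_files", "delete_backups"], "benign"),
    ([TRIGGER, "open_socket", "keylogger"], "benign"),
]
poisoned_data = clean_data + poison

new_malware = ["encrypt_files", "keylogger"]
print(classify(poisoned_data, new_malware))              # without the trigger
print(classify(poisoned_data, new_malware + [TRIGGER]))  # with the trigger
```

Without the trigger, the new malware sample is closest to the genuine malicious samples and is flagged. With the trigger appended, the poisoned samples become its nearest neighbours and it is tagged as benign, while behaviour on trigger-free files is unchanged. That is what makes a backdoor hard to notice with ordinary accuracy checks.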

Traditionally, computer security focused on protecting a system against attackers by solidifying the boundaries between the system and the outside world [4]. However, the most critical input to any machine learning process, the training and testing data, comes directly from the outside world, so securing a machine learning system must include securing that data.

Given that most machine learning models are either trained on user data or tested against it, an attacker can easily inject malicious data to affect the performance of the machine learning model [5].

Further, as transfer learning becomes a more common tool for training machine learning models across applications, these data-based attacks may transfer easily from one model to another and propagate quickly and inconspicuously. The possibility of such attacks drives us to define training data security as a fundamental part of any data science and machine learning process at Modzy.

Modzy Approach to Training Data Security

Security is at the heart of the Modzy platform. At Modzy, we take the security and authentication of datasets, and of the data used during training and inference, very seriously, because it is of utmost concern to many of the customers we serve. Our data scientists developed a new approach for detecting data points in training and testing datasets that were manipulated by adversaries.

This detection framework acts as a filter on the data during both training and inference to detect the poisoned data instances before they get to the model. Our detection model consists of a novel architecture which utilizes Residual Networks (ResNet) and was trained on a large adversarial dataset to detect data points poisoned by a variety of different attack methodologies. This model can detect poisoned data by learning how adversarial data points behave inside a machine learning model.
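The sketch below shows only where such a filter sits in a pipeline: data points are scored, and suspicious ones are rejected before the model ever sees them. It does not reproduce the ResNet-based detector described above. As a plainly swapped-in stand-in it uses a simple per-class median absolute deviation (MAD) outlier score, and the function names, threshold, and data are assumptions made for illustration.

```python
from statistics import median

def suspicion_scores(values):
    """Robust outlier score: distance from the median in units of the MAD."""
    center = median(values)
    mad = median(abs(v - center) for v in values) or 1e-9  # avoid division by zero
    return [abs(v - center) / mad for v in values]

def filter_suspicious(samples, threshold=5.0):
    """Split (value, label) samples into (kept, rejected) before they reach the model."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    kept, rejected = [], []
    for label, values in by_label.items():
        for value, score in zip(values, suspicion_scores(values)):
            (rejected if score > threshold else kept).append((value, label))
    return kept, rejected

incoming = [(0.8, "A"), (1.0, "A"), (1.2, "A"), (11.0, "A"),
            (4.8, "B"), (5.0, "B"), (5.2, "B")]
kept, rejected = filter_suspicious(incoming)
print("rejected:", rejected)
# Only `kept` would be passed on to training or inference.
```

Keeping the filter as a separate stage in front of the model means the scoring function can be replaced, for example by a learned detector, without touching the model it protects. A statistical score like this one only catches crude outliers; carefully constructed poison instances are designed to look normal, which is why a learned detector is needed in practice.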

One of the most interesting properties of our detection solution is its ability to transfer from one dataset to another. In other words, our detection solution can detect adversarial inputs for a range of applications, datasets, and model architectures. This means that our modular detection solution can be attached to a variety of machine learning models to increase the defensive capabilities and robustness of the models against a range of different adversarial attacks.

What This Means for You

As machine learning is increasingly used for consequential decision-making in mission-critical environments, protecting models against adversarial attacks becomes increasingly important. To protect them, we must first understand the various types of adversarial attacks that can occur during both training and inference.

One aspect of this protection is to authenticate and secure the training data before training starts, and to screen input data points before they reach the model during inference.

Most machine learning models were originally designed without any concern for security and robustness against attacks, but researchers in the field have since identified several kinds of attacks under the umbrella of adversarial machine learning; all of these can greatly undermine the utility of machine learning models.

It is of paramount importance that any machine learning pipeline used for training, testing, and development of models is designed to take into account the security of both training and inference data. Modzy’s data scientists are actively working toward developing better defensive solutions and applying these solutions to the training and development of all of Modzy’s AI models.


  [1] Biggio, Battista, Blaine Nelson, and Pavel Laskov. “Poisoning attacks against support vector machines.” arXiv preprint arXiv:1206.6389 (2012).
  [2] Shafahi, Ali, et al. “Poison frogs! Targeted clean-label poisoning attacks on neural networks.” Advances in Neural Information Processing Systems. 2018.
  [3] Chen, Xinyun, et al. “Targeted backdoor attacks on deep learning systems using data poisoning.” arXiv preprint arXiv:1712.05526 (2017).
  [4] Bishop, Matthew A. “The art and science of computer security.” (2002).
  [5] Steinhardt, Jacob, Pang Wei W. Koh, and Percy S. Liang. “Certified defenses for data poisoning attacks.” Advances in Neural Information Processing Systems. 2017.
By Modzy (@modzy).




Digital Onboarding: BNY Mellon and Saphyre to Leverage AI to Enhance Customer Experience



BNY Mellon (NYSE: BK), an American investment banking services holding company headquartered in New York City with over $380 billion in assets, and Saphyre recently revealed that they’ll utilize AI tech to enhance the customer experience while also automating and expediting client onboarding.

This partnership with Saphyre supports the bank’s OMNI℠ strategy to work cooperatively with fintechs to better support customers’ investment goals.

Saphyre’s platform has been developed to provide seamless communication between customers and priority stakeholders by enhancing traditional communication methods, like email, fax, and phone calls.

This latest integration between the two firms will allow for improved communication while lowering time to market, and also enabling more efficient international trading.

Caroline Butler, Global Head of Custody at BNY Mellon, stated:

“Time is a finite and precious commodity. BNY Mellon’s work with Saphyre aims to create true savings for our custody clients and truly expedite the client onboarding process. What once took days or weeks, is now near real time. This is yet another example of the digitization efforts BNY Mellon has undertaken in the past two years with a direct client benefit.”

Gabino M. Roche, Jr., CEO and Founder at Saphyre, remarked:

“Having BNY Mellon join the Saphyre endeavor is a great honor. By applying our patented technology to their leading asset servicing operations we’ve demonstrated the ability to intelligently pre-fill client custody packs, allow for digital signatures, auto-setup SWIFT Reporting, Trade Message Routing, and Corporate Action standing instruction – while intelligently and dynamically tracking market requirements and their respective document statuses. In a post-COVID world where AI and digital is paramount, BNY Mellon is fully seizing the innovation mandate.”

Earlier this year, BNY Mellon released a report in which it noted that the bank thinks there’s now real demand for Bitcoin and other cryptocurrencies. In its report, the bank clarified that it’s not attempting to derive a price target or formalize “a valuation mode” for these new forms of assets. However, they intend to look into the different “analogies” and “dissimilarities” that may be applied to Bitcoin and “potentially other areas of cryptos.”


Artificial Intelligence

Waabi’s Raquel Urtasun explains why it was the right time to launch an AV technology startup



Raquel Urtasun, the former chief scientist at Uber ATG, is the founder and CEO of Waabi, an autonomous vehicle startup that came out of stealth mode last week. The Toronto-based company, which will focus on trucking, raised an impressive $83.5 million in a Series A round led by Khosla Ventures. 

Urtasun joined Mobility 2021 to talk about her new venture, the challenges facing the self-driving vehicle industry and how her approach to AI can be used to advance the commercialization of AVs.

Why did Urtasun decide to found her own company?

Urtasun, who is considered a pioneer in AI, led the R&D efforts as a chief scientist at Uber ATG, which was acquired by Aurora in December. Six months later, we have Waabi. The company’s mission is to take an AI-first approach to solving self-driving technology. 

“I left Uber a little bit over three months ago to start this new company, Waabi, with the idea of having a different way of solving self-driving. This is a combination of my 20-year career in AI as well as more than 10 years in self-driving. Thinking about a new company was something that was always in my head. And the more that I was in the industry, the more that I started thinking about going away from the traditional approach and trying to have a diverse view of how to solve self-driving was actually the way to go. So that’s why I decided to do this company.” (Time stamp: 1:21)


Artificial Intelligence

Fraud protection startup nSure AI raises $6.8M in seed funding



Fraud protection startup nSure AI has raised $6.8 million in seed funding, led by DisruptiveAI, Phoenix Insurance, AXA-backed venture builder Kamet, Moneta Seeds and private investors.

The round will help the company bolster the predictive AI and machine learning algorithms that power nSure AI’s “first of its kind” fraud protection platform. Prior to this round, the company received $550,000 in pre-seed funding from Kamet in March 2019.

The Tel Aviv-headquartered startup, which currently has 16 employees, provides fraud detection for high-risk digital goods, such as electronic gift cards, airline tickets, software and games. While most fraud detection tools analyze each online transaction in an attempt to decide which purchases to approve and decline, nSure AI’s risk engine leverages deep learning techniques to accurately identify fraudulent transactions.

NSure AI, which is backed by insurance company AXA, said it has a 98% approval rating on average for purchases, compared to an industry average of 80%, allowing retailers to recapture nearly $100 billion a year in revenue lost by declining legitimate customers. The company is so confident in its technology that it will accept liability for any fraudulent transaction allowed by the platform.

Founders Alex Zeltcer and Ziv Isaiah started the company after experiencing the unique challenges faced by retailers of digital assets. In the first week of their online gift card business, they found that 40% of sales were fraudulent, resulting in chargebacks. The founders began to develop their own platform for supporting the sale of high-risk digital goods after no other fraud detection service met their needs.

Zeltcer, co-founder and chief executive, said the investment “enables us to register thousands of new merchants, who can feel confident selling higher-risk digital goods, without accepting fraud as a part of business.”

NSure AI, which currently monitors and manages millions of transactions every month, has approved close to $1 billion in volume since going live in 2019.



Enterprise AI platform Dataiku launches managed service for smaller companies



Dataiku is going downstream with a new product today called Dataiku Online. As the name suggests, Dataiku Online is a fully managed version of Dataiku. It lets you take advantage of the data science platform without going through a complicated setup process that involves a system administrator and your own infrastructure.

If you’re not familiar with Dataiku, the platform lets you turn raw data into advanced analytics, run some data visualization tasks, create data-backed dashboards and train machine learning models. In particular, Dataiku can be used by data scientists, but also business analysts and less technical people.

The company has been mostly focused on big enterprise clients. Right now, Dataiku has more than 400 customers, such as Unilever, Schlumberger, GE, BNP Paribas, Cisco, Merck and NXP Semiconductors.

There are two ways to use Dataiku. You can install the software solution on your own, on-premise servers. You can also run it on a cloud instance. With Dataiku Online, the startup offers a third option and takes care of setup and infrastructure for you.

“Customers using Dataiku Online get all the same features that our on-premises and cloud instances provide, so everything from data preparation and visualization to advanced data analytics and machine learning capabilities,” co-founder and CEO Florian Douetteau said. “We’re really focused on getting startups and SMBs on the platform — there’s a perception that small or early-stage companies don’t have the resources or technical expertise to get value from AI projects, but that’s simply not true. Even small teams that lack data scientists or specialty ML engineers can use our platform to do a lot of the technical heavy lifting, so they can focus on actually operationalizing AI in their business.”

Customers using Dataiku Online can take advantage of Dataiku’s pre-built connectors. For instance, you can connect your Dataiku instance with a cloud data warehouse, such as Snowflake Data Cloud, Amazon Redshift and Google BigQuery. You can also connect to a SQL database (MySQL, PostgreSQL…), or you can just run it on CSV files stored on Amazon S3.

And if you’re just getting started and you have to work on data ingestion, Dataiku works well with popular data ingestion services. “A typical stack for our Dataiku Online Customers involves leveraging data ingestion tools like FiveTran, Stitch or Alooma, that sync to a cloud data warehouse like Google BigQuery, Amazon Redshift or Snowflake. Dataiku fits nicely within their modern data stacks,” Douetteau said.

Dataiku Online is a nice offering to get started with Dataiku. High-growth startups might start with Dataiku Online as they tend to be short on staff and want to be up and running as quickly as possible. But as you become bigger, you could imagine switching to a cloud or on-premise installation of Dataiku. Employees can keep using the same platform as the company scales.
