

Achieve 12x higher throughput and lowest latency for PyTorch Natural Language Processing applications out-of-the-box on AWS Inferentia




AWS customers like Snap, Alexa, and Autodesk have been using AWS Inferentia to achieve the highest performance and lowest cost on a wide variety of machine learning (ML) deployments. Natural language processing (NLP) models are growing in popularity for real-time and offline batched use cases. Our customers deploy these models in many applications like support chatbots, search, ranking, document summarization, and natural language understanding. With AWS Inferentia, you can also achieve the highest performance and lowest cost out of the box on open-source NLP models, without the need for customizations.

In this post, you learn how to maximize throughput on AWS Inferentia, both for real-time applications with tight latency budgets and for batch processing, where maximum throughput and lowest cost are the key performance goals. For this post, you deploy an NLP-based solution using Hugging Face Transformers pretrained BERT base models, with no modifications to the model and a one-line code change at the PyTorch framework level. The solution achieves 12 times higher throughput at 70% lower cost on AWS Inferentia, as compared to deploying the same model on GPUs.

To maximize inference performance of Hugging Face models on AWS Inferentia, you use the AWS Neuron PyTorch framework integration. Neuron is a software development kit (SDK) that integrates with popular ML frameworks, such as TensorFlow and PyTorch, extending the frameworks' APIs so you can run high-performance inference easily and cost-effectively on Amazon EC2 Inf1 instances. With a minimal code change, you can compile and optimize your pretrained models to run on AWS Inferentia. The Neuron team regularly releases updates with new features and increased model performance. With the v1.13 release, the performance of transformer-based models improved by an additional 10%–15%, pushing the boundaries of minimal latency and maximum throughput, even for larger NLP workloads.

To test out the Neuron SDK features yourself, check out the latest Utilizing Neuron Capabilities tutorials for PyTorch.

The NeuronCore Pipeline mode explained

Each AWS Inferentia chip, available through the Inf1 instance family, contains four NeuronCores. The different instance sizes provide 1 to 16 chips, totaling 64 NeuronCores on the largest instance size, the inf1.24xlarge. The NeuronCore is a compute unit that runs the operations of the Neural Network (NN) graph.

When you compile a model without Pipeline mode, the Neuron compiler optimizes the supported NN operations to run on a single NeuronCore. You can combine the NeuronCores into groups, even across AWS Inferentia chips, to run the compiled model. This configuration allows you to use multiple NeuronCores in data parallel mode across AWS Inferentia chips. This means that, even on the smallest instance size, four models can be active at any given time. A data parallel implementation of four (or more) models provides the highest throughput and lowest cost in most cases. This performance boost comes with minimal impact on latency, because AWS Inferentia is optimized to maximize throughput at small batch sizes.

With Pipeline mode, the Neuron compiler optimizes the partitioning and placement of a single NN graph across a requested number of NeuronCores, in a completely automatic process. This allows for efficient use of the hardware, because the NeuronCores in the pipeline run streaming inference requests, using a faster on-chip cache to hold the model weights. When one of the cores in the pipeline finishes processing a request, it can start processing the next request without waiting for the last core in the pipeline to complete the first one. This streaming pipeline inference increases per-core hardware utilization, even when running inference at the small batch sizes typical of real-time applications, such as batch size 1.

Finding the optimum number of NeuronCores to fit a single large model is an empirical process. A good starting point is to use the following approximate formula, but we recommend experimenting with multiple configurations to achieve an optimum deployment:

neuronCore_pipeline_cores = 4*round(number-of-weights-in-model/(2E7))

The compiler takes the value of the neuroncore-pipeline-cores compilation flag directly, and that is all there is to it! To enable this feature, add the argument to the usual compilation flow of your desired framework.
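As a sketch, the rule of thumb above can be expressed as a small helper function (the function name is illustrative, not part of the Neuron SDK):

```python
def neuroncore_pipeline_cores(num_weights: int) -> int:
    """Approximate NeuronCore count for Pipeline mode:
    4 * round(number of model weights / 2e7)."""
    return 4 * round(num_weights / 2e7)

# BERT base has roughly 110 million parameters:
print(neuroncore_pipeline_cores(110_000_000))  # 24
```

In practice, cap the result at the number of NeuronCores your instance actually provides (for example, 16 on inf1.6xlarge) and benchmark a few configurations around this starting point.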

In TensorFlow Neuron, use the following code:

import numpy as np
import tensorflow.neuron as tfn

example_input = np.zeros([1, 224, 224, 3], dtype='float16')
tfn.saved_model.compile("<Path to your saved model>",
                        "<Path to write compiled model>/1",
                        model_feed_dict={'input_1:0': example_input},
                        compiler_args=['--neuroncore-pipeline-cores', '8'])

In PyTorch Neuron, use the following code:

import torch
import torch_neuron

model = torch.jit.load('<Path to your traced model>')
inputs = torch.zeros([1, 3, 224, 224], dtype=torch.float32)
model_compiled = torch.neuron.trace(model,
                                    example_inputs=inputs,
                                    compiler_args=['--neuroncore-pipeline-cores', '8'])

For more information about the NeuronCore Pipeline and other Neuron features, see Neuron Features.

Run HuggingFace question answering models in AWS Inferentia

To run a Hugging Face BertForQuestionAnswering model on AWS Inferentia, you only need to add a single extra line of code to the usual Transformers implementation, besides importing the torch_neuron framework. You can adapt the forward pass according to the following snippet:

from transformers import BertTokenizer, BertForQuestionAnswering
import torch
import torch_neuron

tokenizer = BertTokenizer.from_pretrained('twmkn9/bert-base-uncased-squad2')
model = BertForQuestionAnswering.from_pretrained('twmkn9/bert-base-uncased-squad2', return_dict=False)

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
inputs = tokenizer(question, text, return_tensors='pt')

neuron_model = torch.neuron.trace(model,
                                  example_inputs=(inputs['input_ids'], inputs['attention_mask']),
                                  verbose=1)
outputs = neuron_model(*(inputs['input_ids'], inputs['attention_mask']))

The one extra line in the preceding code is the call to the torch.neuron.trace() method. This call compiles the model and returns a new neuron_model object that you can call to run inference over the original inputs, as shown in the last line of the script. If you want to test this example, see the PyTorch Hugging Face pretrained BERT Tutorial.

The ability to compile and run inference using the pretrained models—or even fine-tuned, as in the preceding code—directly from the Hugging Face model repository is the initial step towards optimizing deployments in production. This first step can already produce two times greater performance with 70% lower cost when compared to a GPU alternative (which we discuss later in this post). When you combine NeuronCore Groups and Pipelines features, you can explore many other ways of packaging the models within a single Inf1 instance.

Optimize model deployment with NeuronCore Groups and Pipelines

The Hugging Face question answering deployment requires some of the model's parameters to be set a priori. Neuron is an ahead-of-time (AOT) compiler, which requires knowledge of the tensor shapes at compile time. For that, we define both batch size and sequence length for our model deployment. In the previous example, the Neuron framework inferred those from the example input passed on the trace call: (inputs['input_ids'], inputs['attention_mask']).

Besides those two model parameters, you can set the compiler argument --neuroncore-pipeline-cores and the environment variable NEURONCORE_GROUP_SIZES to fine-tune how your model server consumes the NeuronCores on the AWS Inferentia chip.

For example, to maximize the number of concurrent server workers processing inference requests on a single AWS Inferentia chip (four cores), you set NEURONCORE_GROUP_SIZES="1,1,1,1" and --neuroncore-pipeline-cores to 1, or leave it out as a compiler argument. The following image depicts this split. It's a full data parallel deployment.

For minimum latency, you can set --neuroncore-pipeline-cores to 4 and NEURONCORE_GROUP_SIZES="4" so that the process consumes all four NeuronCores at once, for a single model. The AWS Inferentia chip can process four inference requests concurrently, as a stream. The model pipeline parallel deployment looks like the following figure.
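As a minimal sketch, the two configurations above can be prepared from Python before launching the model server workers (the serve_worker.py launcher mentioned in the comment is hypothetical; only the environment variable and compiler flag come from the Neuron documentation):

```python
import os

# Data parallel: four independent workers, one NeuronCore each.
# The model is compiled with --neuroncore-pipeline-cores 1 (or the flag omitted).
data_parallel_env = dict(os.environ, NEURONCORE_GROUP_SIZES="1,1,1,1")

# Pipeline parallel: one worker spanning all four cores of the chip.
# The model is compiled with --neuroncore-pipeline-cores 4.
pipeline_env = dict(os.environ, NEURONCORE_GROUP_SIZES="4")

# Hypothetical worker launch with the chosen layout, e.g. via subprocess:
# subprocess.Popen(["python", "serve_worker.py"], env=data_parallel_env)
```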

Data parallel deployments favor throughput with multiple workers processing requests concurrently. The pipeline parallel, however, favors latency, but can also improve throughput due to the stream processing behavior. With these two extra parameters, you can fine-tune the serving application architecture according to the most important serving metrics for your use case.

Optimize for minimum latency: Multi-core pipeline parallel

Consider an application that requires minimum latency, such as sequence classification as part of an online chatbot workflow. As the user submits text, a model running on the backend classifies the intent of a single user input and is bounded by how fast it can infer. The model most likely has to provide responses to single input (batch size 1) requests.

The following table compares the performance and cost of Inf1 instances vs. the g4dn.xlarge, the most optimized GPU instance family for inference in the cloud, while running the Hugging Face BERT base model in data parallel vs. pipeline parallel configurations at batch size 1. Looking at the 95th percentile (p95) of latency, we get lower values in Pipeline mode for both the 4-core inf1.xlarge and the 16-core inf1.6xlarge instances. Among the Inf1 instances, the best configuration is the 16-core case, with a 58% reduction in latency, reaching roughly 7 milliseconds.

Instance | Batch size | Inference mode | NeuronCores per model | Throughput [sentences/sec] | Latency p95 [seconds] | Cost per 1M inferences | Throughput ratio [inf1/g4dn] | Cost ratio [inf1/g4dn]
inf1.xlarge | 1 | Data parallel | 1 | 245 | 0.0165 | $0.42 | 1.6 | 43%
inf1.xlarge | 1 | Pipeline parallel | 4 | 291 | 0.0138 | $0.35 | 2.0 | 36%
inf1.6xlarge | 1 | Data parallel | 1 | 974 | 0.0166 | $0.54 | 6.5 | 55%
inf1.6xlarge | 1 | Pipeline parallel | 16 | 1793 | 0.0069 | $0.30 | 12.0 | 30%
g4dn.xlarge | 1 | - | - | 149 | 0.0082 | $0.98 | - | -

The model tested was the PyTorch version of Hugging Face bert-base-uncased, with sequence length 128. On AWS Inferentia, we compiled the model to use all available cores and run fully pipeline parallel. For the data parallel cases, we compiled the model for a single core and configured the NeuronCore Groups to run one worker model per core. The GPU deployment used the same setup as AWS Inferentia, where the model was traced with TorchScript JIT and cast to mixed precision using PyTorch AMP Autocast.

Throughput also increased 1.84 times with Pipeline mode on AWS Inferentia, reaching 1,793 sentences per second, which is 12 times the throughput of g4dn.xlarge. The cost of inference on this configuration also favors the inf1.6xlarge over the most cost-effective GPU option, even at a higher cost per hour. The cost per million sentences is 70% lower based on Amazon Elastic Compute Cloud (Amazon EC2) On-Demand instance pricing. For latency sensitive applications that can’t utilize the full throughput of the inf1.6xlarge, or for smaller models such as BERT Small, we recommend using Pipeline mode on inf1.xlarge for a cost-effective deployment.
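The cost-per-inference figures in these comparisons follow directly from throughput and the hourly instance price. A sketch of the arithmetic (the price used below is a placeholder; check current Amazon EC2 On-Demand pricing for your Region):

```python
def cost_per_million_inferences(price_per_hour: float, throughput_per_sec: float) -> float:
    """On-Demand cost per 1M inferences = hourly price / inferences per hour * 1e6."""
    inferences_per_hour = throughput_per_sec * 3600
    return price_per_hour / inferences_per_hour * 1e6

# Placeholder hourly price with the table's 1,793 sentences/sec throughput:
cost = cost_per_million_inferences(price_per_hour=2.0, throughput_per_sec=1793)
```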

Optimize for maximum throughput: Single-core data parallel

An NLP use case that calls for increased throughput rather than minimum latency is extractive question answering as part of a search and document retrieval pipeline. In this case, increasing the number of document sections processed in parallel can speed up the search result or improve the quality and breadth of searched answers. In such a setup, inferences are more likely to run in batches (batch size larger than 1).

To achieve maximum throughput, we found through experimentation that the optimum batch size on AWS Inferentia is 6, for the same model tested before. On g4dn.xlarge, we ran batch size 64 without running out of GPU memory. The following results show how batch size 6 can provide 9.2 times more throughput on inf1.6xlarge at 61% lower cost, when compared to GPU.
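Finding the optimum batch size is an empirical sweep. A rough, hardware-agnostic sketch of such a sweep follows (fake_infer is a stand-in for the compiled neuron_model; the measured numbers depend entirely on your model and instance):

```python
import time

def measure_throughput(infer, batch_size, n_batches=50):
    """Return items/sec for one batch size; `infer` is any callable
    that processes a single batch."""
    start = time.perf_counter()
    for _ in range(n_batches):
        infer(batch_size)
    elapsed = time.perf_counter() - start
    return n_batches * batch_size / elapsed

# Stand-in workload so the sketch runs anywhere; swap in the compiled model.
def fake_infer(batch_size):
    time.sleep(0.0001 * batch_size)

results = {b: measure_throughput(fake_infer, b) for b in (1, 2, 4, 6, 8)}
best = max(results, key=results.get)
```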

Instance | Batch size | Inference mode | NeuronCores per model | Throughput [sentences/sec] | Latency p95 [seconds] | Cost per 1M inferences | Throughput ratio [inf1/g4dn] | Cost ratio [inf1/g4dn]
inf1.xlarge | 6 | Data parallel | 1 | 985 | 0.0249 | $0.10 | 2.3 | 30%
inf1.xlarge | 6 | Pipeline parallel | 4 | 945 | 0.0259 | $0.11 | 2.2 | 31%
inf1.6xlarge | 6 | Data parallel | 1 | 3880 | 0.0258 | $0.14 | 9.2 | 39%
inf1.6xlarge | 6 | Pipeline parallel | 16 | 2302 | 0.0310 | $0.23 | 5.5 | 66%
g4dn.xlarge | 64 | - | - | 422 | 0.1533 | $0.35 | - | -

In this application, cost considerations can also impact the final serving infrastructure design. The most cost-efficient way of running the batched inferences is using the inf1.xlarge instance. It achieves 2.3 times higher throughput than the GPU alternative, at 70% lower cost. Choosing between inf1.xlarge and inf1.6xlarge depends only on the main objective: minimum cost or maximum throughput.

To test out the NeuronCore Pipeline and Groups feature yourself, check out the latest Utilizing Neuron Capabilities tutorials for PyTorch.


In this post, we explored ways to optimize your NLP deployments using the NeuronCore Groups and Pipeline features. The native integration of the AWS Neuron SDK and PyTorch allowed you to compile and optimize the Hugging Face Transformers model to run on AWS Inferentia with minimal code change. By tuning the deployment architecture to be pipeline parallel, the BERT models achieved minimum latency for real-time applications, with 12 times higher throughput than a g4dn.xlarge alternative, while costing 70% less to run. For batch inferencing, we achieved 9.2 times higher throughput at 61% lower cost.

The Neuron SDK features described in this post also apply to other ML model types and frameworks. For more information, see the AWS Neuron Documentation.

Learn more about the AWS Inferentia chip and the Amazon EC2 Inf1 instances to get started running your own custom ML pipelines on AWS Inferentia using the Neuron SDK.

About the Authors

Fabio Nonato de Paula is a Sr. Manager, Solutions Architect for Annapurna Labs at AWS. He helps customers use AWS Inferentia and the AWS Neuron SDK to accelerate and scale ML workloads in AWS. Fabio is passionate about democratizing access to accelerated ML and putting deep learning models in production. Outside of work, you can find Fabio riding his motorcycle on the hills of Livermore valley or reading ComiXology.

Mahadevan Balasubramaniam is a Principal Solutions Architect for Autonomous Computing with nearly 20 years of experience in the area of physics infused deep learning, building and deploying digital twins for industrial systems at scale. Mahadevan obtained his PhD in Mechanical Engineering from Massachusetts Institute of Technology and has over 25 patents and publications to his credit.



Emerging Technologies Achievable Through The Cloud: 4 Practical Examples




Steve Sangapu

Cloud computing is the foundation beneath some of the fastest-growing industries in the world, so it's easy to get lost in all the buzzwords thrown around cloud computing and lose sight of the actual technological advances and benefits achievable with smart, efficient use of the cloud.

So what’s behind the hype? Some extremely powerful technologies and workflows. And that’s exactly what we’re going to take a look at in this article — the top 4 practical examples of technologies achievable through the cloud in 2020.

Contrary to popular belief, information alone won't give companies a competitive advantage; executives also need to be able to base their decisions on data before the opportunities pass. Yet most companies generate terabytes of data every week without being able to capitalize on any of it. Big data analytics is a solution to this problem.

Thanks to the advanced evolution of the cloud, companies are able to gather and analyze data at a nearly instantaneous rate. Leveraging big data analytics empowers organizations to run more efficiently in terms of cost and decision making. Companies can make data-driven decisions brought to them by data analysis tools that are provided through the cloud.

BigQuery from Google Cloud has many powerful features that allow users to view their data in real time, providing continually up-to-date information to help guide business decisions. BigQuery is a serverless NoOps (no operations) platform that separates compute and storage, which enables better autoscaling because each can be scaled independently as required. BigQuery's machine learning and BI Engine features offer powerful analysis across various data models. It integrates seamlessly with the Google Cloud AI Platform and other tools like Data Studio.

Cloud service providers like Google Cloud Platform (GCP) use shared computing to process large datasets extremely quickly. Also known as cluster computing, this approach lets Google interconnect hundreds of computers for quick data analysis and complex computing tasks. Businesses like yours can use similar services from cloud providers to improve insights and decision-making.

(Sources: SAS, Google Cloud, Hostingtribunal)


Automating mundane and repetitive tasks is, and should be, a top priority for businesses in this age. By automating even the simplest tasks, most businesses can free up to 30% of employees' time, allowing them to focus on more important matters.

Cloud service providers have made it extremely easy for businesses of all sizes to dabble in business process automation. At the most fundamental level, businesses can automate how they receive and sort documents through document management, all the way up to automating entire workflows, including delivery pipelines and testing updates in a controlled cloud environment. Tools such as Google's Document Understanding AI can help you ensure your data is accurate and compliant. This is especially helpful in highly regulated industries where accuracy and precision are crucial to operations. It is also quick and easy to provision more compute for deep learning and complex ML training by requesting GPUs or using a managed service like Kubeflow.

Another emerging technology that is now accessible to small and medium enterprises is machine learning. Put simply, machine learning refers to training computer algorithms to interpret and interact with data without human interference. With increasing accuracy, ML (a subset of AI) is becoming incredibly valuable to businesses, as it has virtually unlimited use cases.

You can read more about how cloud solutions using AI and ML can help save time, cut costs, and reduce human error.

(Sources: Google Cloud, Interactions)

Although lesser-known among legacy businesses, the Internet of Things is one of the fastest-growing industries in the world and was valued at $190 billion in 2018. Alexa and Google Home are two of the most popular examples of IoT devices of which you’re most likely very familiar. Apart from that, smart TVs, smart refrigerators, smart LEDs, security systems, thermostats, and even cars (think Tesla) that operate over WiFi are all a part of the internet of things.

Think of IoT devices as part of a much larger network, all of which has a backbone in the cloud. Aside from pure convenience, IoT is making significant breakthroughs in other spaces such as health tech. Fitbit, for example, has partnered with Google to transform how its product integrates fitness with the cloud. The device uses Google's Cloud Healthcare API, a service that "helps facilitate the exchange of data among healthcare applications and services that run on Google's Cloud." Even more interesting is that the API also integrates analytics tools like BigQuery, AI tools like AI Platform, and data processing tools like Dataflow.

Similar tools and APIs are available for businesses in other industries, so they too can connect their devices to an online network and introduce security patches, fix bugs, add features, and more.

(Sources: internetofbusiness, Google Healthcare API)

Though they have become significantly more popular in the last few years, augmented and virtual reality are not new technologies. Leftronic reports that the number of augmented reality users will reach 3.5 billion by 2023. Furthermore, it estimates that the AR and VR device market will hit $198 billion by 2025. In fact, large institutions like Boeing and NASA have been developing their own AR and VR technologies for training purposes for quite some time. However, thanks to cloud proliferation, technologies like virtual reality are finally becoming accessible and, more importantly, affordable for the average business to experiment with.

So how does it work?

When applications superimpose a computer-generated image onto the real world, they create an augmented reality experience. Augmented reality places computer-generated objects in the human world, whereas virtual reality places you in a computer-generated world. Businesses can use this technology in a number of ways, including giving consumers a virtual reality tour of their product or using it for training in a safe environment.

It's also quite easy to get started with. Google's Cloud Anchors allow developers to create experiences within their app for users to add virtual objects into an augmented reality environment. Thanks to Google's ARCore Cloud Anchor service, these experiences can be hosted and shared between users. Virtual reality allows you to be transported to distant places and immerse yourself in foreign environments. Devices such as the Oculus Rift or Quest and the HTC Vive provide outstanding experiences, some of which can run independently of a computer. Used to its full capacity, virtual reality can be transformative for gaming, education, and immersive experiences.

These emerging technologies unlock a completely new frontier that businesses can compete in without exorbitant investments or deep technical knowledge. With the right tools already at their disposal, most businesses only need a helping hand to get started. If your organization is considering using the cloud to leverage an emerging technology but is unsure about the intricacies, reach out to D3V and set up a free strategic consultation with our certified cloud experts. Our team can help determine the best set of options for your company based on your business needs and aspirations.




Optimal Dynamics nabs $22M for AI-powered freight logistics





Optimal Dynamics, a New York-based startup applying AI to shipping logistics, today announced that it closed an $18.4 million round led by Bessemer Venture Partners. Optimal Dynamics says that the funds will be used to more than triple its 25-person team and support engineering efforts, as well as bolster its sales and marketing departments.

Last-mile delivery logistics tends to be the most expensive and time-consuming part of the shipping process. According to one estimate, last-mile delivery accounts for 53% of total shipping costs and 41% of total supply chain costs. With the rise of ecommerce in the U.S., retail providers are increasingly focusing on fulfillment and distribution at the lowest cost. Particularly in the construction industry, the pandemic continues to disrupt wholesalers: a 2020 Statista survey found that 73% of buyers and users of freight transportation and logistics services experienced an impact on their operations.

Founded in 2016, Optimal Dynamics offers a platform that taps AI to generate shipment plans likely to be profitable — and on time. The fruit of nearly 40 years of R&D at Princeton, the company’s product generates simulations for freight transportation, enabling logistics companies to answer questions about what equipment they should buy, how many drivers they need, daily dispatching, load acceptance, and more.

Simulating logistics

Roughly 80% of all cargo in the U.S. is transported by the 7.1 million people who drive flatbed trailers, dry vans, and other heavy lifters for the country’s 1.3 million trucking companies. The trucking industry generates $726 billion in revenue annually and is forecast to grow 75% by 2026. Even before the pandemic, last-mile delivery was fast becoming the most profitable part of the supply chain, with research firm Capgemini pegging its share of the pie at 41%.

Optimal Dynamics' platform can perform strategic, tactical, and real-time freight planning, forecasting shipment events as far as two weeks in advance. CEO Daniel Powell, who cofounded the company with his father, Warren Powell, a Princeton professor of operations research and financial engineering, says that the underlying technology was deployed, tested, and iterated with trucking companies, railroads, and energy companies, along with projects in health, ecommerce, finance, and materials science.

“Use of something called ‘high-dimensional AI’ allows us to take in exponentially greater detail while planning under uncertainty. We also leverage clever methods that allow us to deploy robust AI systems even when we have very little training data, a common issue in the logistics industry,” Powell told VentureBeat via email. “The results are … a dramatic increase in companies’ abilities to plan into the future.”

The global logistics market was worth $10.32 billion in 2017 and is estimated to grow to $12.68 billion by 2023, according to Research and Markets. Optimal Dynamics competes with Uber, which offers a logistics service called Uber Freight. San Francisco-based startup KeepTruckin recently secured $149 million to further develop its shipment marketplace. Next Trucking closed a $97 million investment. And Convoy raised $400 million at a $2.75 billion valuation to make freight trucking more efficient.

But Mike Droesch, a partner at BVP and an Optimal Dynamics investor, says that demand remains strong for the 25-employee company's products. "Logistics operators need to consider a staggering number of variables, making this an ideal application for a software-as-a-service product that can help operators make more informed decisions by leveraging Optimal Dynamics' industry-leading technology. We were really impressed with the combination of their deep technology and the commercial impact that Optimal Dynamics is already delivering to their customers," he said in a statement.

With the latest funding round, a series A, Optimal Dynamics has raised over $22 million to date. Beyond Bessemer, Fusion Fund, The Westly Group, TenOneTen Ventures, Embark Ventures, FitzGate Ventures, and John Larkin and John Hess also contributed.


VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative technology and transact. Our site delivers essential information on data technologies and strategies to guide you as you lead your organizations. We invite you to become a member of our community, to access:

  • up-to-date information on the subjects of interest to you
  • our newsletters
  • gated thought-leader content and discounted access to our prized events, such as Transform 2021: Learn More
  • networking features, and more

Become a member




Code-scanning platform BluBracket nabs $12M for enterprise security





Code security startup BluBracket today announced it has raised $12 million in a series A round led by Evolution Equity Partners. The capital will be used to further develop BluBracket’s products and grow its sales team.

Detecting exploits in source code can be a pain point for enterprises, especially with the onset of containerization, infrastructure as code, and microservices. According to a recent Flexera report, the number of vulnerabilities remotely exploitable in apps reached more than 13,300 from 249 vendors in 2020. In 2019, Barracuda Networks found that 13% of security pros hadn’t patched their web apps over the past 12 months. And in a 2020 survey from Edgescan, organizations said it took them an average of just over 50 days to address critical vulnerabilities in internet-facing apps.

BluBracket, which was founded in 2019 and is headquartered in Palo Alto, California, scans codebases for secrets and blocks future commits from introducing new risks. The platform can monitor real-time risk scores across codebases, git configurations, infrastructure as code, code copies, and code access, and it can resolve issues, detecting passwords and over 50 different types of tokens, keys, and IDs.

Code-scanning automation

Coralogix estimates that developers create 70 bugs per 1,000 lines of code and that fixing a bug takes 30 times longer than writing a line of code. In the U.S., companies spend $113 billion annually on identifying and fixing product defects.

BluBracket attempts to prevent this by proactively monitoring public repositories with the highest risk factors, generating reports for dev teams. It prioritizes commits based on their risk scores, minimizing duplicates using a tracking hash for every secret. A rules engine reduces false positives and scans for regular expressions, as well as sensitive words. And BluBracket sanitizes commit history both locally and remotely, supporting the exporting of reports via download or email.
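To make the idea concrete, here is a heavily simplified sketch of pattern-based secret scanning. The two regexes are illustrative stand-ins; BluBracket's actual detectors, risk scoring, and tracking hashes are proprietary and far more sophisticated:

```python
import re

# Illustrative patterns only; real scanners ship many more and add
# entropy checks to reduce false positives.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_password": re.compile(r"password\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE),
}

def scan(text):
    """Return (pattern_name, matched_text) pairs found in a blob of source."""
    findings = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.findall(text):
            findings.append((name, match))
    return findings

findings = scan('db_password = "hunter2"\nkey = "AKIAABCDEFGHIJKLMNOP"')
```

A pre-commit hook could run such a scan on staged diffs and reject the commit when findings is non-empty, which is the general shape of the "block future commits" behavior described above.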

BluBracket offers a free product in its Community Edition. Both it and the company's paid products, Teams and Enterprise, work with GitHub, Bitbucket, and GitLab and offer CI/CD integration with Jenkins, GitHub Actions, and Azure Pipelines.


Above: The Community Edition of BluBracket’s software.

Image Credit: BluBracket

"Since our introduction early last year, the industry has seen through SolarWinds how big of an attack surface code is. Hackers are exploiting credentials and secrets in code, and valuable code is available in the public domain for virtually every company we engage with," CEO Prakash Linga, who cofounded BluBracket with Ajay Arora, told VentureBeat via email.

BluBracket competes on some fronts with Sourcegraph, a “universal code search” platform that enables developer teams to manage and glean insights from their codebase. It has another rival in Amazon’s CodeGuru, an AI-powered developer tool that provides recommendations for improving code quality. There’s also cloud monitoring platform Datadog, codebase coverage tester Codecov, and feature-piloting solution LaunchDarkly, to name a few.

But BluBracket, which has about 30 employees, says demand for its code security solutions has increased “dramatically” since 2020. Its security products are being used in “dozens” of companies with “thousands” of users, according to Linga.

“DevSecOps and AppSec teams are scrambling, as we all know, to address this growing threat. By enabling their developers to keep these secrets out of code in the first place, our solutions make everyone’s life easier,” Linga continued. “We are excited to work with Evolution on this next stage of our company’s growth.”

Unusual Ventures, Point72 Ventures, SignalFire, and Firebolt Ventures also participated in BluBracket’s latest funding round. The startup had previously raised $6.5 million in a seed round led by Unusual Ventures.


VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative technology and transact. Our site delivers essential information on data technologies and strategies to guide you as you lead your organizations. We invite you to become a member of our community, to access:

  • up-to-date information on the subjects of interest to you
  • our newsletters
  • gated thought-leader content and discounted access to our prized events, such as Transform 2021: Learn More
  • networking features, and more

Become a member



Data governance and security startup Cyral raises $26M





Data security and governance startup Cyral today announced it has raised $26 million, bringing its total to date to $41.1 million. The company plans to put the funds toward expanding its platform and global workforce.

Managing and securing data remains a challenge for enterprises. Just 29% of IT executives give their employees an “A” grade for following procedures to keep files and documents secure, according to Egnyte’s most recent survey. A separate report from KPMG found only 35% of C-suite leaders highly trust their organization’s use of data and analytics, with 92% saying they were concerned about the reputational risk of machine-assisted decisions.

Redwood City, California-based Cyral, which was founded in 2018 by Manav Mital and Srini Vadlamani, uses stateless interception technology to deliver enterprise data governance across platforms, including Amazon S3, Snowflake, Kafka, MongoDB, and Oracle. Cyral monitors activity across popular databases, pipelines, and data warehouses — whether on-premises, hosted, or software-as-a-service-based. And it traces data flows and requests, sending output logs, traces, and metrics to third-party infrastructure and management dashboards.

Cyral can prevent unauthorized access from users, apps, and tools and provide dynamic attribute-based access control, as well as ephemeral access with “just-enough” privileges. The platform supports both alerting and blocking of disallowed accesses and continuously monitors privileges across clouds, tracking and enforcing just-in-time and just-enough privileges for all users and apps.
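An ephemeral, “just-enough” access model of the kind described here can be illustrated with a minimal deny-by-default check. The grant store and function names below are hypothetical, not Cyral’s API:

```python
import time
from dataclasses import dataclass


@dataclass
class Grant:
    """An ephemeral grant: a set of allowed actions that expires."""
    actions: frozenset
    expires_at: float  # epoch seconds


# Hypothetical in-memory store of per-(user, resource) grants.
grants = {}


def grant_access(user, resource, actions, ttl_s):
    """Issue a time-limited grant for exactly the actions requested."""
    grants[(user, resource)] = Grant(frozenset(actions), time.time() + ttl_s)


def is_allowed(user, resource, action):
    """Deny by default; allow only unexpired, explicitly granted actions."""
    g = grants.get((user, resource))
    if g is None or time.time() >= g.expires_at:
        return False  # no grant, or the ephemeral grant has lapsed
    return action in g.actions
```

Because nothing is allowed unless a live grant names the exact action, expired or overly narrow grants fail closed — the “just-in-time and just-enough” behavior the platform advertises.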

Identifying roles and anomalies

Beyond this, Cyral can identify users behind shared roles and service accounts to tag all activity with the actual user identity, enabling policies to be specified against them. And it can perform baselining and anomaly detection, analyzing aggregated activity across data endpoints and generating policies for normal activity, which can be set to alert or block anomalous access.
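Baselining and anomaly detection of this kind can be sketched as a simple statistical check — here a z-score threshold over per-user activity counts. The metric and threshold are illustrative assumptions, not Cyral’s actual detection logic:

```python
import statistics


def build_baseline(daily_query_counts):
    """Summarize a user's historical activity as (mean, stdev)."""
    return (statistics.mean(daily_query_counts),
            statistics.stdev(daily_query_counts))


def is_anomalous(today, baseline, threshold=3.0):
    """Flag activity more than `threshold` standard deviations above normal."""
    mean, stdev = baseline
    if stdev == 0:
        return today > mean  # flat history: any increase is unusual
    return (today - mean) / stdev > threshold
```

A flagged endpoint could then be routed to either an alert or a block, matching the alert-or-block policy choice described above.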

“Cyral is built on a high-performance stateless interception technology that monitors all data endpoint activity in real time and enables unified visibility, identity federation, and granular access controls. [The platform] automates workflows and enables collaboration between DevOps and Security teams to automate assurance and prevent data leakage,” a company spokesperson said.


Existing investors, including Redpoint, Costanoa Ventures, A.Capital, and strategic investor Silicon Valley CISO Investments, participated in Cyral’s latest funding round. Since launching in Q2 2020, Cyral — which has 40 employees and occupies a market estimated to be worth $5.7 billion by 2025, according to MarketsandMarkets — says it has nearly doubled the size of its team and close to quadrupled its valuation.

“This is an emerging market with no entrenched solutions … We’re now working with customers across a variety of industries — finance, health care, insurance, supply chain, technology, and more. They include some of the world’s largest organizations with complex environments and some of the fastest-growing tech companies,” the spokesperson said. “With Cyral, our company was built during the pandemic. We have grown the majority of our company during this time, and it has allowed us to start our company with a remote-first business model.”

