Rockset has upgraded its main database engine for more efficient scaling and performance gains by separating storage and compute. The outfit said it also separates the cloud compute resources used for ingesting and querying data.
Built for the job of real-time analytics, the new release is part of the company’s efforts to build a new class of database, different from both transactional databases, which process limited volumes of data in real time, and analytics or data-warehousing systems, which can analyse large volumes of data offline, CEO Venkat Venkataramani told The Register. “We are actually actively trying to build a third leg: a real-time database.”
The engineering team behind Rockset previously worked at Facebook on real-time analytics for large workloads.
The approach they took was to build what the company calls a “real-time indexing database”, Venkataramani said. That means the database indexes all the data coming into the system in real time, allowing for a one to two-second lag, and all of that data is then visible to queries, applications and dashboards.
The new features include separation of storage and compute. The idea is now common to analytics and data warehousing systems, largely led by cloud-native system Snowflake, but also now a feature of AWS Redshift, Google’s BigQuery, Azure Synapse and even Teradata, with its heritage in the on-premises appliance data warehouse systems. Rockset brings the idea to real-time databases and is trying to tempt batch analytics types away from those systems and towards its own tech.
“Our pricing model completely decouples storage from compute, so that you can scale each of them independently,” Venkataramani claimed.
Other popular real-time systems, such as Apache Druid and Elasticsearch with Kibana, couple storage and compute because they were initially built for on-premises systems, he said.
The second new feature separates ingest compute from query compute. The idea is to avoid these jobs competing for resources, and ensure data does not become “stale” or that query performance is reduced, said the company, which claims Intel, Nvidia and Deloitte among its customers.
Big time schema
The firm is looking to unite other warring factions – the SQL and NoSQL worlds. To start with, it ingests data without a schema, like a NoSQL database, because real-time data sources are likely to be semi-structured, using JSON files, for example.
“Without asking for schema management or database administration, we automatically convert NoSQL into SQL tables in the cloud using a technology called converged indexes,” Venkataramani said. “You would write to Rockset as though it’s a NoSQL database in the cloud, except it indexes it and exposes all of those datasets as fast SQL tables, fully schematised, fully indexed. So all of your SQL queries with full-featured joins and aggregations, and complex filtering – all the standard SQL features – will come back from real time data without the need for database administration.”
In common use cases, Rockset is often coupled with a NoSQL database. For example, if a developer is trying to create a real-time leader board for a massively multiplayer online game, a solution might be to ingest data into MongoDB as a transactional system, but at the same time replicate that data to Rockset for the analytical queries the leader board requires.
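The leaderboard pattern described above can be sketched in a few lines. The collection name `game_events` and its fields are invented for illustration, but the point stands: once Rockset has indexed the replicated documents, the analytical side is ordinary SQL submitted over HTTP.

```python
import json

def leaderboard_query(limit=10):
    """Build a leaderboard query over replicated game events.

    'game_events', 'player_id' and 'score' are hypothetical names;
    Rockset exposes ingested JSON documents as fully schematised
    SQL tables, so the query itself is plain SQL.
    """
    return (
        "SELECT player_id, SUM(score) AS total_score "
        "FROM game_events "
        "GROUP BY player_id "
        "ORDER BY total_score DESC "
        f"LIMIT {limit}"
    )

# A request body as it might be POSTed to a SQL-over-HTTP query endpoint.
payload = json.dumps({"sql": {"query": leaderboard_query(10)}})
```

MongoDB remains the system of record; Rockset only serves the read-heavy aggregation, so a slow analytical query never contends with transactional writes.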
Rockset is not the first to tackle the problems of real-time data analytics. SAP has said its in-memory HANA database can be used for these problems in business and engineering environments. From the SQL database world, MariaDB is building features that support analytics for both offline and live data.
Rockset is a proprietary database derived from RocksDB, an open-source system used by Facebook, Yahoo!, and LinkedIn. With its heritage in real-time analytics for web-first industries, Rockset is hoping its fresh approach will convince organisations outside its core customer base that it has a better solution for real-time analytics problems that are becoming more important in other kinds of businesses, particularly as they shift online. ®
Facebook is leaky, creepy, and trashy. Now it wants to host some of your customer data
Facebook wants to host some of your customer data, an offer that hurts its own partner community.
The antisocial giant says it will host data generated by WhatsApp, specifically when used alongside the messaging service’s Business API. That interface lets businesses manage messages to and from customers, and to integrate e-commerce and other apps into the messaging platform. Facebook lets partners implement the API and choose where data is stored.
The Social Network™ on Thursday announced “a new way for businesses to store and manage their WhatsApp chats with customers using Facebook’s secure hosting infrastructure which will roll out early next year.”
Facebook says taking up its offer “will make it easier to onboard to WhatsApp Business API … respond to WhatsApp messages faster, keep their inventory up to date and sell products through chats.”
But the Silicon Valley giant also says that using a third party – even Facebook – breaks end-to-end encryption.
“If a business chooses to use a third party vendor to operate the WhatsApp Business API on their behalf, we do not consider that to be end-to-end encrypted since the business you are messaging has chosen to give a third-party vendor access to those messages,” Facebook said. “This will also be the case if that third-party vendor is Facebook.”
Facebook will therefore disclose when it is hosting chats on behalf of a customer, albeit without spelling out the weakened encryption that results.
The web goliath said it will also “expand our partnerships with business solution providers we’ve worked with over the last two years,” so while it says it will offer a better on-boarding experience, it’s throwing them another unspecified bone.
The Social Network™ said its hosting services will emerge in coming months, which gives us all plenty of time to ponder whether you want to get into business with a corporation that has failed to suppress misinformation, allowed live-streaming of a racist terror attack, leaked personal data, and took years to figure out that holocaust denial has no place in public conversations. ®
How to get started with Intel Optane
Sponsored If you take your data centre infrastructure seriously, you’ll have taken pains to construct a balanced architecture of compute, memory and storage precisely tuned to the needs of your most important applications.
You’ll have balanced the processing power per core with the appropriate amount of memory, and ensured that both are fully utilised by doing all you can to get data off your storage subsystems and to the CPU as quickly as possible.
Of course, you’ll have made compromises. Although the proliferation of cores in today’s processors puts an absurd amount of compute power at your disposal, DRAM is expensive, and can only scale so far. Likewise, in recent years you’ll have juiced up your storage with SSDs, possibly going all flash, but there are always going to be bottlenecks en route to those hungry processors. You might have stretched to some NVMe SSDs to get data into compute quicker, but even when we’re pushing against the laws of physics, we are still constrained by the laws of budgets. This is how it’s been for over half a century.
So, if someone told you that there was a technology that could offer the benefits of DRAM, but with persistence, and which was also cheaper than current options, your first response might be a quizzical, even sceptical, “really”. Then you might lean in, and ask “really?”
That is the promise of Intel® Optane™, which can act as memory or as storage, potentially offering massive price performance boosts on both scores. And drastically improve the utilisation of those screamingly fast, and expensive, CPUs.
So, what is Optane™? And where does it fit into your corporate architecture?
Intel describes Optane™ as persistent memory, offering non-volatile high capacity with low latency at near-DRAM performance. It’s based on the 3D XPoint™ technology developed by Intel and Micron Technology. It is byte-addressable, like DRAM. At the same time, it offers a non-volatile storage medium without the latency and endurance issues associated with regular flash. So, the same media is available in both SSDs, for use as storage on the NVMe bus, and as DIMMs for use as memory, with up to 512GB per module, double that of current conventional memory.
It’s also important to understand what Intel means when it talks about the Optane™ Technology platform. This encompasses both forms of Optane™ – memory and storage – together with the Intel® advanced memory controller and interface hardware and software IP. This opens up the possibility not just of speeding up hardware operations, but of optimising your software to make the most efficient use of the hardware benefits.
So where will Optane™ help you? Let’s assume that the raw compute issue is covered, given that today’s data centre is running CPUs with multiple cores. The problem is more about ensuring those cores are fully utilised. Invariably they are not, simply because the system cannot get data to them fast enough.
DRAM has not advanced at the same rate as processor technology, both in terms of capacity growth and in providing persistence, explains Alex Segeda, Intel’s EMEA business development manager for memory and storage. The semiconductor industry has pretty much exhausted every avenue available when it comes to improving price per GB. When it comes to the massive memory pools needed in powerful systems, he explains, “It’s pretty obvious that DRAM becomes the biggest contributor to the cost of the hardware…in the average server it’s already the biggest single component.”
Meanwhile, flash – specifically NAND – has become the default storage technology in enterprise servers, and manufacturers have tried everything they can to make it cheaper, denser and more affordable. Segeda compares today’s SSDs to tower blocks – great for storing something, whether data or people, but problems arise when you need to get a lot of whatever you’re storing in or out at the same time. While the cost of flash has gone down, its endurance and performance, especially on write operations, mean “it’s not fit for the purpose of solving the challenge of having a very fast, persistent storage layer”.
Moreover, Segeda maintains, many people are not actually aware of these issues. “They’re buying SSDs, often SAS SSDs, and they think it is fast enough. It’s not. You are most likely not utilising your hardware to the full potential. You paid a few thousand dollars for your compute, and you’re just not feeding it with data.”
To highlight where those chokepoints are in typical enterprise workloads, Intel has produced a number of worked examples. For example, when a 375GB Optane™ SSD DC P4800X is substituted for a 2TB Intel® SSD DC P4500 as the storage tier for a MySQL installation running 80 virtual cores, CPU utilisation jumps from 20 per cent to 70 per cent, while transaction throughput per second is tripled, and latency drops from over 120ms to around 20ms.
This latency reduction, says Segeda, “is what matters if you’re doing things like ecommerce, high frequency trading.”
The same happens when running virtual machines, using Optane™ in the caching tier for the disk groups in a VMware vSAN cluster, says Segeda. “We’re getting half of the latency and we’re getting double the IO from storage. It means I can have more virtual machines accessing my storage at the same time. Right on the same hardware. Or maybe I can have less nodes in my cluster, just to deliver the same performance.”
A third example uses Intel® Optane™ DC Persistent Memory as a system memory extension in a Redis installation. The demo compares a machine with total available memory of 1.5TB of DRAM and a machine using 192GB of DRAM and 1.5TB of DCPMM. The latter delivered the same degree of CPU utilisation, with up to 90 per cent of the throughput of the DRAM-only server.
These improvements hold out the prospect of cramming more virtual machines or containers on the same server, says Segeda, or keeping more data closer to the CPU, to allow real-time analytics. This is important because while modern applications generate more and more data, only a “small, small fraction” is currently meaningfully analysed, says Segeda. “If you’re not able to do that, and get that insight, what’s the point of capturing the data? For compliance?” Clearly, compliance is important but it doesn’t help companies monetise the data they’re generating or give them an edge over rivals.
The prospect of opening up storage and memory bottlenecks will obviously appeal, whether your infrastructure is already straining, or because while things are ticking over right this minute, you know that memory and storage demands are only likely to go in one direction in future. So, how do you work out how and where Optane™ will deliver the most real benefit for your own infrastructure?
On a practical level, the first step is to identify where the problems are. Depending on your team’s engineering expertise, this could be something you can do in-house, using your existing monitoring tools. Intel® also provides a utility called Storage Performance Snapshot to run traces on your infrastructure and visualise the data to highlight where data flow is being choked off.
Either way, you’ll want to ask yourself some fundamental questions, says Segeda: “What’s your network bandwidth? Is it holding you back? What’s your storage workload? What’s your CPU utilisation? Is the CPU waiting for storage? Is the CPU waiting for network? [Then] you can start making very meaningful assumptions.” This should give you an indication of whether expanding the memory pool, or accelerating your storage, or both will help.
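Segeda’s question “is the CPU waiting for storage?” can be answered crudely from counters every Linux box already exposes, before reaching for a dedicated tracing tool. A minimal, Linux-specific sketch that splits the aggregate `cpu` line of `/proc/stat` into busy time versus I/O wait:

```python
def cpu_time_shares(stat_line):
    """Given the aggregate 'cpu' line from Linux /proc/stat, return
    the fraction of time spent busy and the fraction spent waiting
    on I/O (iowait). A persistently high iowait share suggests the
    storage tier, not the CPU, is the bottleneck."""
    fields = [int(x) for x in stat_line.split()[1:]]
    # Field order: user, nice, system, idle, iowait, irq, softirq, ...
    total = sum(fields)
    idle, iowait = fields[3], fields[4]
    return {
        "busy": (total - idle - iowait) / total,
        "iowait": iowait / total,
    }

# On a live Linux host you would feed it the first line of /proc/stat:
# with open("/proc/stat") as f:
#     print(cpu_time_shares(f.readline()))
```

These are cumulative counters since boot, so in practice you would sample twice and diff the readings; this sketch only shows where the numbers live.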
As for practical next steps, Segeda suggests talking through options with your hardware suppliers, and Intel account manager if you have one, to take a holistic view of the problem.
Simply retrofitting your existing systems can be an option, he says. Add in an Optane™ SSD on NVMe, and you have a very fast storage device. Optane™ memory can be added to the general memory pool, giving memory expansion at relatively lower cost.
However, Segeda says, “You can have a better outcome if you do some reengineering, and explicit optimization.”
Using Optane™ as persistent memory requires significant modification to the memory controller, something that is currently offered in the Intel® Second Generation Xeon® Scalable Gold or Platinum processors. This enables the use of App Direct Mode, which allows suitably modified applications to be aware of memory persistence. So, for example, Segeda explains, this will allow an in-memory database like SAP HANA to exploit the persistence, meaning it does not have to constantly reload data.
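App Direct Mode itself needs supported hardware and persistent-memory tooling, but the programming idea Segeda describes, an application keeping working state in a region that survives restarts instead of reloading it, can be loosely illustrated with an ordinary memory-mapped file. The file name and record format below are invented for the sketch; real persistent memory removes the block-storage layer entirely:

```python
import mmap
import os
import struct

PATH = "counter.pmem"   # stand-in for a persistent-memory region
SIZE = 8                # one 64-bit counter

# Create the backing region once, zero-filled.
if not os.path.exists(PATH):
    with open(PATH, "wb") as f:
        f.write(b"\x00" * SIZE)

with open(PATH, "r+b") as f:
    mm = mmap.mmap(f.fileno(), SIZE)
    # State is read in place: no bulk reload step after a restart.
    (count,) = struct.unpack("<Q", mm[:SIZE])
    mm[:SIZE] = struct.pack("<Q", count + 1)
    mm.flush()          # analogous to forcing stores to persistent media
    mm.close()
```

Run it twice and the counter carries over between processes, which is the property that spares an in-memory database its lengthy reload on startup.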
Clearly, an all-new installation raises the option of a more efficient setup, with software optimised to take full advantage of the infrastructure, and with fewer but more powerful compute nodes. All of which gives you the potential to save not just on DRAM and storage, but on electricity, real estate, and also on software licences.
For years, infrastructure and software engineers and data centre architects have had to delicately balance compute, storage, memory, and network. With vast pools of persistent memory and faster storage now in reach, at lower cost, that juggling act may just be about to get much, much easier.
Sponsored by Intel®
The hills are alive with the sound of Azure as Microsoft pledges Austrian bit barns
Microsoft has announced yet another cloud region, this time in Austria.
As is ever the case, Microsoft has not said where the facility will be, detailed its disposition, or revealed when it will open. But it has said that the facility will bring Azure, Microsoft 365, Dynamics 365 and the Power Platform to Austrian soil.
The region will be Microsoft’s 64th Azure facility.
Local politicians all lauded the decision, suggesting it will bring the land of Mozart, Strauss, Freud, radio pioneer Hedy Lamarr and strudel roaring into the digital age and let a thousand startups bloom.
Microsoft has also committed to work with Austria’s Ministry of Digitalization to launch a “Center of Digital Excellence”, establish a security network with business, academia and government, and train public servants and private citizens alike in cybersecurity.
Here at The Register we think an Austrian cloud also creates a terrific chance for some show tunes, as the new facility will mean the hills are alive with the sound of Azure. The improved resilience that a full Microsoft bit barn brings will mean salespeople can break into a chorus of “You are six nines, I am seven nines.”
If that resilience proves as elusive as an Edelweiss, we can imagine spontaneous outbursts of “So Long, Farewell”.
We’ll leave it to readers to decide how to deal with “The Lonely Goatherd” and its frequent yodeling interjections. ®