Real-Time Data Processing for Analytical Use Cases: Is it Worth it?

By Taavi Rehemägi (@taavi-rehemagi)

CEO of Dashbird. 13y experience as a software developer & 5y of building Serverless applications.

Real-time processing provides a notable advantage over batch processing: data becomes available to consumers faster. In traditional ETL, you would not be able to analyze today’s events until the next nightly jobs had finished. These days, many businesses rely on data being available within minutes, seconds, or even milliseconds. With streaming technologies, we no longer need to wait for scheduled batch jobs to see new data events.

Live dashboards are updated automatically as new data comes in

Despite all the benefits, real-time streaming adds considerable complexity to the overall data processes, tooling, and even data formats. Therefore, it’s crucial to carefully weigh the pros and cons of switching to real-time data pipelines. In this article, we’ll look at several options to reap the benefits of a real-time paradigm with the least amount of architectural changes and maintenance effort.

Traditional approach

When you hear about real-time data pipelines, you may immediately think of Apache Kafka, Flink, Spark Streaming, and similar frameworks, which require a lot of knowledge to operate as a distributed event streaming platform. Those open-source platforms are best suited to the following scenarios:

– when you need to continuously ingest and process reasonably large amounts of real-time data, 

– when you anticipate multiple producers and consumers and you want to decouple their communication, 

– or when you want to own the underlying infrastructure, possibly on-prem (e.g. compliance). 

While many companies and services attempt to facilitate the management of underlying distributed clusters, the architecture still remains fairly complex. Therefore, you need to consider:

– whether you have the resources to operate those clusters,

– how much data you plan to process with this platform,

– whether the added complexity is worth the effort.

In the next sections, we’ll look at alternative options if your real-time needs don’t justify the added complexity and costs of a self-managed distributed streaming platform.

Amazon Kinesis

AWS recognized its customers’ difficulties in managing message-bus architectures a long time ago (2013). As a result, they came up with Kinesis, a family of services that attempt to make real-time analytics easier. By leveraging serverless Kinesis Data Streams, you can create a data stream with a few clicks in the AWS management console. Once you have configured your estimated throughput and the number of shards, you can start implementing data producers and consumers. Even though Kinesis is serverless, you still need to monitor the message size and the number of shards to ensure that you don’t encounter any unexpected write throttles.
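To make the shard sizing mentioned above more concrete, here is a minimal sketch of a shard estimate in Python. It assumes the provisioned-mode write limits of roughly 1 MB per second and 1,000 records per second per shard; the function name and the example workloads are hypothetical, and you should check the current Kinesis quotas before relying on these numbers.

```python
import math

# Assumed per-shard write limits for a provisioned Kinesis data stream.
# 1 MB is treated as 10**6 bytes here, which is a conservative approximation.
MAX_BYTES_PER_SHARD_PER_SEC = 1_000_000
MAX_RECORDS_PER_SHARD_PER_SEC = 1_000


def estimate_shards(records_per_sec: float, avg_record_bytes: float) -> int:
    """Return the smallest shard count that stays under both write limits."""
    by_count = math.ceil(records_per_sec / MAX_RECORDS_PER_SHARD_PER_SEC)
    by_bytes = math.ceil(
        records_per_sec * avg_record_bytes / MAX_BYTES_PER_SHARD_PER_SEC
    )
    # A stream always has at least one shard.
    return max(1, by_count, by_bytes)
```

Many small records are limited by the record count, while fewer large records are limited by the byte rate, so the estimate takes the larger of the two.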

In my previous article, you can find an example of a Kinesis producer (source) sending data to a Kinesis data stream using a Python client, and how to continuously send micro-batches of data records to S3 (consumer/destination) by leveraging a Kinesis Data Firehose delivery stream. 

Alternatively, to consume data from a Kinesis data stream, we could:

– aggregate and analyze data with Kinesis Data Analytics,

– use Apache Flink to send this data into Amazon Timestream.

The main benefits of using Kinesis Data Streams as compared to sending data directly to your desired application are latency and decoupling. Kinesis allows you to store data within the stream (for 24 hours by default, extendable to seven days or longer) and to have multiple consumers that receive data at the same time. This means that if a new application needs to collect the same data, you can add a new consumer to the process. This new consumer would not affect other data consumers or producers thanks to decoupling on the Kinesis architecture level.

Amazon Timestream

As mentioned in the previous section, the major advantage of Kinesis is decoupling. If you don’t need multiple applications that would regularly consume data from the stream, you could considerably streamline the process by using Amazon Timestream — a serverless time-series data store allowing you to analyze data in (near) real-time. The underlying architecture is smart enough to ingest data first into an in-memory store for fast retrieval of real-time data, and then it automatically moves “old” data to cheaper long-term storage according to the specified retention period.

Why would you use a time-series database for real-time data?

Any new data record comes into the stream at a particular time. You may be tracking price changes over time, sensor measurements, logs, CPU utilization — practically any real-time streaming data is some sort of a time series. Therefore, it makes sense to consider using a time series database such as Timestream. The simplicity of the service makes it very appealing, especially if you would like to use SQL to retrieve data for analytics. 

When comparing the SQL interface of Timestream against the one available in Kinesis Data Analytics, Timestream is a clear winner. Kinesis SQL is quite obscure and introduces a lot of specific vocabulary. In contrast, Timestream provides an intuitive SQL interface with many useful built-in time-series functions, making time-based aggregation (e.g. minutely or hourly time buckets) much easier.

Side note: don’t use semicolons at the end of your queries in Timestream. If you do, you’ll get an error.
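As an illustration of the time-bucket aggregation described above, here is a minimal sketch that builds a Timestream query string in Python. It relies on the built-in bin() and ago() functions; the database and table names, the column aliases, and the helper function itself are hypothetical.

```python
def build_bucket_query(database: str, table: str,
                       bucket: str = "1m", lookback: str = "15m") -> str:
    """Build a query that averages double measures per time bucket."""
    query = (
        f"SELECT bin(time, {bucket}) AS binned_time, "
        f"avg(measure_value::double) AS avg_value "
        f'FROM "{database}"."{table}" '
        f"WHERE time > ago({lookback}) "
        f"GROUP BY bin(time, {bucket}) "
        f"ORDER BY binned_time"
    )
    # Make sure there is no trailing semicolon (see the side note above).
    return query.rstrip("; ")


print(build_bucket_query("demo_db", "crypto_prices"))
```

Changing bucket to "1h" would produce hourly buckets with the same query shape.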

Demo: Real-time ingestion into Timestream using Python

To demonstrate how Timestream works, we’ll be sending cryptocurrency price changes into a Timestream table. 

Let’s start by creating a Timestream database and table. We can do all that either from the AWS management console or from AWS CLI:

The above code should create a database in your AWS region. Make sure that you use one of the regions in which Timestream is available. 

Side note: The easiest way to find available regions for any AWS service is to check the pricing page: https://aws.amazon.com/timestream/pricing/.

Now we can create a table. You need to specify your in-memory and magnetic store retention period.
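As an alternative to the console or the CLI, the database and table creation steps could be sketched in Python. This is a sketch under assumptions: the names demo_db and crypto_prices and the retention values are hypothetical, and the client argument is expected to behave like boto3.client("timestream-write"), which is not imported here so that the request-building logic stays easy to inspect.

```python
def database_request(database: str) -> dict:
    """Keyword arguments for the create_database call."""
    return {"DatabaseName": database}


def table_request(database: str, table: str,
                  memory_hours: int = 24, magnetic_days: int = 7) -> dict:
    """Keyword arguments for the create_table call, including retention."""
    return {
        "DatabaseName": database,
        "TableName": table,
        "RetentionProperties": {
            # How long records stay in the fast in-memory store.
            "MemoryStoreRetentionPeriodInHours": memory_hours,
            # How long records stay in the cheaper magnetic store.
            "MagneticStoreRetentionPeriodInDays": magnetic_days,
        },
    }


def create_database_and_table(client, database: str, table: str, **retention):
    """Create the database first, then the table inside it."""
    client.create_database(**database_request(database))
    client.create_table(**table_request(database, table, **retention))
```

With boto3 installed and credentials configured, a call such as create_database_and_table(boto3.client("timestream-write"), "demo_db", "crypto_prices") would issue both requests.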

Our database and table are created. Now we can get the latest price data from the Cryptocompare API. This API provides many useful endpoints to get the latest information about the cryptocurrency market. We will focus on getting real-time price data for selected cryptocurrencies.
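A request for the latest prices might look like the sketch below. The endpoint URL and the fsyms/tsyms query parameters are assumptions based on Cryptocompare's public multi-symbol price endpoint, so verify them against the API documentation; the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; confirm it in the Cryptocompare documentation.
BASE_URL = "https://min-api.cryptocompare.com/data/pricemulti"


def build_price_url(symbols, currency="USD"):
    """Build the request URL for a list of cryptocurrency symbols."""
    params = {"fsyms": ",".join(symbols), "tsyms": currency}
    # Keep commas unescaped so the symbol list stays readable.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


def fetch_prices(symbols, currency="USD", timeout=10):
    """Fetch the latest prices as a nested dict (performs a network call)."""
    with urlopen(build_price_url(symbols, currency), timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_prices(["BTC", "DASH", "ETH", "REP"]) is expected to return a nested dictionary keyed by symbol and then by currency.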

We’ll get data in the following format:

{'BTC': {'USD': 34406.27},
 'DASH': {'USD': 178.1},
 'ETH': {'USD': 2263.64},
 'REP': {'USD': 26.6}}

Additionally, we need to convert this data to the proper Timestream format with a time column, measures, and dimensions. Here is the full script that we can use to ingest new data every 10 seconds: https://gist.github.com/d00b8173d7dbaba08ba785d1cdb880c8.
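The conversion step can be sketched as follows. This is not the linked script, only a minimal illustration of the record shape that the Timestream write API expects; the dimension names symbol and currency and the measure name price are hypothetical choices.

```python
import time


def to_timestream_records(prices: dict, timestamp_ms=None) -> list:
    """Turn {'BTC': {'USD': 34406.27}, ...} into Timestream records."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    records = []
    for symbol, quotes in prices.items():
        for currency, price in quotes.items():
            records.append({
                # Dimensions describe the series that the measure belongs to.
                "Dimensions": [
                    {"Name": "symbol", "Value": symbol},
                    {"Name": "currency", "Value": currency},
                ],
                "MeasureName": "price",
                # Timestream expects measure values and times as strings.
                "MeasureValue": str(price),
                "MeasureValueType": "DOUBLE",
                "Time": str(timestamp_ms),
                "TimeUnit": "MILLISECONDS",
            })
    return records
```

The resulting list can then be passed as the Records argument of write_records on a boto3 timestream-write client, together with DatabaseName and TableName.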

That’s it! The most time-consuming part is the definition of your dimensions and measures (lines 21–44). You should be careful about the design of your measures and dimensions: with Timestream you can only query data from a single table. No JOINS between tables are allowed. Therefore, it’s important to think ahead about your access patterns before you start ingesting data into Timestream.

Here is what the data looks like in the end. Note that the ingestion time is presented in UTC:

AWS Timestream: exploring the results in the query console — image by author

We could now easily connect Timestream to Grafana for near real-time visualization. But that’s a story for another article.

Never-ending script

In the Timestream example above, running in a single process, we used a never-ending loop defined using while True. This is a common approach for a simple service ingesting data all the time, typically running as a background process or a service in a container orchestration platform. 
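A never-ending loop of this kind could be structured as in the sketch below. The ingest callable, the injectable sleep, and the should_stop hook are hypothetical additions that make the loop testable; a production service would typically stop via a signal handler rather than a callback.

```python
import logging
import time


def run_forever(ingest, interval_seconds=10, sleep=time.sleep,
                should_stop=lambda: False):
    """Call `ingest` repeatedly and return the number of failed calls."""
    failures = 0
    while True:
        if should_stop():
            return failures
        try:
            ingest()
        except Exception:
            # Log the error and keep going, so one bad call
            # does not terminate the whole service.
            failures += 1
            logging.exception("Ingestion failed; retrying after the interval")
        sleep(interval_seconds)
```

Catching exceptions inside the loop matters here because the entire pipeline depends on this single long-running process staying alive.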

Minutely scheduled jobs

An alternative to a continuously running script is a service that is scheduled to run every minute. The benefit of this approach is that it allows you to treat this near real-time process as a batch job, which simplifies your architecture. You can think of it as a reversed Kappa architecture: while Kappa processes batch in the same way as real-time data (streaming-first approach), this approach “batchifies” real-time data streams (batch-first approach) into micro-batches. 

Without while True, we still ingest data roughly every 10 seconds, but the process itself is executed once per minute. This allows us to track which runs were successful, and ingestion no longer depends on the health of a single long-running job.
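One way to sketch such a minutely job in Python is a bounded loop that the scheduler starts once per minute. The function name, the ingest callable, and the defaults of six iterations at ten-second intervals are hypothetical; the scheduler itself (cron, a container platform, or similar) is outside the sketch.

```python
import time


def run_minute_batch(ingest, iterations=6, interval_seconds=10,
                     sleep=time.sleep):
    """Ingest `iterations` times, pausing between calls, then exit."""
    for i in range(iterations):
        # An exception here ends the run, so the scheduler
        # can mark this particular run as failed.
        ingest()
        if i < iterations - 1:  # no need to wait after the final call
            sleep(interval_seconds)
    return iterations
```

Because each run terminates on its own, a failed run affects at most one minute of data, and the next scheduled run starts from a clean state.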

There is no “right” or “wrong” approach. The main purpose of this method is to treat near real-time ingestion as a batch job. Here is a full Gist: https://gist.github.com/d953cdbc6edbf8b224815cc5d8b53f73.

Which option should you choose?

The following questions may help you to make the right decision for your use case:

– Which problem(s) do you want to solve using real-time streaming: is it anomaly detection, alerting, product recommendation, dynamic pricing algorithm, tracking current market prices, understanding user behavior? Having a specific use case in mind can help you determine the right tool for the job, especially because there are a lot of specialized tools on the market.

– Which latency is acceptable in your use case? Is it OK if your data is available for analytics 1 minute after the event or stream has been received? Or, on the contrary, do you need millisecond latency because the data would otherwise no longer be actionable?

– How many resources (employees and budget) do you have to keep your platform operational? Does it make sense to spin up your own Kafka cluster, use some managed service, or maybe a serverless option such as Amazon Kinesis or Amazon Timestream can address your needs?

– How do you plan to monitor and observe the health of your data streams?

– How much training would be needed to teach your team how to use this specific platform?

– Which data sources would need to be ingested in real-time, i.e. data producers?

– What is the target datastore (data lake, data warehouse, specific database) from which you would want to retrieve this data, i.e. data consumers? And how do you want to retrieve this data — via SQL, Python, or perhaps only via analytical dashboards?

– In which way (architecture-wise) would you want to process this data? Is Kappa, Lambda, or other architecture worth considering to distinguish between real-time and batch?

Conclusion

Ultimately, it depends on the problem that you are trying to solve using real-time processing technologies, the scale of your problem, and available resources. In many scenarios, a simple minutely batch job may be sufficient. It lets you keep a single architecture for all data processing needs and still makes data available within a few minutes or even seconds after its generation.

For other scenarios, Kinesis Data Streams or Amazon Timestream may provide simple yet effective means to add (near) real-time capabilities with very little maintenance effort. Lastly, if you do have employees who know how to operate Kafka, Flink, or Spark Streaming, those can be helpful if you want to own your infrastructure and not be reliant on cloud providers. As always, thinking about the problem at hand will help assess the trade-offs and make the best decision for your use case.

Also published on: https://dashbird.io/blog/real-time-processing-analytical/

Source: https://hackernoon.com/real-time-data-processing-for-analytical-use-cases-is-it-worth-it-9b3g3703?source=rss

Amazon Wants a Leader For Its Digital Currency and Blockchain Product Unit

Amazon seems determined to maintain its reputation as an innovative company and is looking to experiment with cryptocurrencies through a digital currency payment and blockchain unit.

According to an announcement posted on Thursday, Amazon is looking for a blockchain specialist to lead its Digital Currency and Blockchain strategy.

The Payments Acceptance & Experience team is seeking an experienced product leader to develop Amazon’s Digital Currency and Blockchain strategy and product roadmap … You will work closely with teams across Amazon, including AWS, to develop the roadmap, including the customer experience, technical strategy and capabilities as well as the launch strategy.

What Amazon is Looking For

The expert must have at least an MBA or equivalent degree, 10+ years of business or technology experience, team management skills, understanding of data and metrics, and good communication skills.

The corporation did not disclose a salary. The person must be based in, or willing to move to, Seattle, Washington.

Amazon seems to be convinced of the need to innovate in the field of payments and finance. The cryptocurrency and blockchain development team is a sign of the company’s interest in exploring these emerging technologies to offer better financial products.


According to an email shared by Business Insider, Amazon’s team confirmed its interest in exploring an approach to the world of cryptocurrencies. Still, they did not specify whether it would be through the development of a proprietary currency or through the acceptance of cryptocurrencies as a means of payment:

“We’re inspired by the innovation happening in the cryptocurrency space and are exploring what this could look like on Amazon … We believe the future will be built on new technologies that enable modern, fast, and inexpensive payments, and hope to bring that future to Amazon customers as soon as possible.”

An Old Relationship With Crypto

Amazon’s interest in the world of cryptocurrencies isn’t new. Back in 2017, it purchased, at least preemptively, a number of domains linking its brand to cryptocurrencies, including amazoncryptocurrency.com, amazoncryptocurrencies.com, and even amazonethereum.com.

However, at the time, Patrick Gauthier told CNBC that the e-commerce giant did not have much interest in cryptocurrencies and had no plans to support crypto payments.

In fact, the Pay With Moon plugin, which allowed payments on Amazon with Bitcoin through the Lightning Network, had to change its business model to let its users purchase virtual credit cards instead of paying directly on Amazon’s site.

Also, as Cryptopotato reported in February this year, Amazon launched a job offer for a new payments system involving “Digital and Emerging Payments (DEP),” although they did not mention a direct relationship with Bitcoin or any cryptocurrency either.

This time, however, Amazon seems more willing to go public with its casual relationship with cryptos.


Source: https://coingenius.news/amazon-wants-a-leader-for-its-digital-currency-and-blockchain-product-unit-38/?utm_source=rss&utm_medium=rss&utm_campaign=amazon-wants-a-leader-for-its-digital-currency-and-blockchain-product-unit-38

Fintech Giant Zip Co to Provide Cryptocurrency Trading Services


Source: https://coingenius.news/fintech-giant-zip-co-to-provide-cryptocurrency-trading-services-24/?utm_source=rss&utm_medium=rss&utm_campaign=fintech-giant-zip-co-to-provide-cryptocurrency-trading-services-24

Blockchain Startups Raised over $4 Billion in VC Funding in Q2 2021

Most blockchain-based startups have seen funding from venture backers, despite the current cryptocurrency market downturn, recording over $4 billion in Q2 alone.

This massive venture capital backing is in keeping with the established trend of VC funding for blockchain firms as investors look to be part of the new wave of disruption associated with decentralized finance.

VC Backers Continue to Dole Out Funding for Blockchain Startups

According to CNBC on Thursday (July 22, 2021), venture capital investors do not seem worried about the volatility of the crypto market, even with the current slump in market prices. Bitcoin, which reached an all-time high (ATH) of over $63,000 back in April, is trading within the $33,000 range, having lost roughly half of its ATH value. The price of Ether has also slumped after climbing to over $4,000 in May.

Meanwhile, data from CB Insights, an analytics firm, revealed that blockchain companies received a total of $4.38 billion in funding. The figure signals a more than 50% increase from Q1 2021, and almost a ninefold growth compared to Q2 2020.

In May, major fintech company Circle received $440 million from VC backers, making it the largest venture capital funding in a blockchain company. Meanwhile, Circle is planning to go public through an alliance with a special purpose acquisition company (SPAC) Concord Acquisition Corp. The merger, if successful, will put Circle’s valuation at $4.5 billion.


Ledger, a cryptocurrency hardware wallet, raised the second-biggest round in Q1 2021 with $380 million. According to an interview with CNBC in December 2020, the company’s CEO Pascal Gauthier noted that the cryptocurrency market was gradually maturing, with institutional investors showing interest in the emerging industry.

Speaking to CNBC, CB Insights senior analyst Chris Bendtsen said:

“At the current rate, blockchain funding will shatter the previous year-end record — more than tripling the total raised back in 2018. Blockchain’s record funding year is being driven by the rising consumer and institutional demand for cryptocurrencies. Despite short-term price volatility, VC firms are still bullish on crypto’s future as a mainstream asset class and blockchain’s potential to make financial markets more efficient, accessible, and secure.”

Institutional Investors Seek Exposure to Crypto Industry

The record inflow of funding for blockchain firms is coming from both traditional VC funds and blockchain-focused funds alike. Some asset managers are even creating blockchain venture arms for both early and late-stage funding of projects in the industry.

As previously reported by CryptoPotato in June, venture capital giant Andreessen Horowitz announced the launch of a $2.2 billion cryptocurrency fund. According to the company, the new fund would be distributed across various crypto and blockchain startups.

Blockchain Capital raised $300 million for its Fund V LP back in May, with PayPal, Visa, hedge funds, and others participating in the capital raise.

Meanwhile, the trend is continuing in Q3 2021 with massive funding deals. Recently, major cryptocurrency derivatives platform FTX secured a record $900 million in its Series B funding, causing the company’s valuation to grow to $18 billion.


Source: https://coingenius.news/blockchain-startups-raised-over-4-billion-in-vc-funding-in-q2-2021-10/?utm_source=rss&utm_medium=rss&utm_campaign=blockchain-startups-raised-over-4-billion-in-vc-funding-in-q2-2021-10

WhatsApp says NSO spyware was used to attack officials working for US allies

The NSO Group has denied that its spyware was used to compromise many politicians’ phones, but WhatsApp is telling a different story. The chat giant’s CEO, Will Cathcart, told The Guardian in an interview that governments allegedly used NSO’s Pegasus software to attack senior government officials worldwide in 2019, including high-ranking national security officials who were US allies. The breaches were reportedly part of a larger campaign that compromised 1,400 WhatsApp users in two weeks, prompting a lawsuit.

The reporting on the NSO “matches” with findings from the 2019 attack on WhatsApp, Cathcart said. Human rights activists and journalists were also believed to be victims.

The executive was responding to allegations that governments used Pegasus to hack the phones of 37 people, including those of women close to murdered Saudi journalist Jamal Khashoggi. Those targets were also on a 2016 list of over 50,000 phone numbers that included activists, journalists and politicians, although it’s not clear that anyone beyond the 37 fell prey to attacks.

NSO has strongly rejected claims about the hacks and the list, insisting that there’s “no factual basis” and that the list was too large to be focused solely on potential Pegasus targets. It also directly challenged Cathcart, asking if the WhatsApp exec had “other alternatives” to its tools that would help thwart “pedophiles, terrorists and criminals” using encrypted software.

Cathcart, however, didn’t buy that explanation — he pointed to the 1,400 people as possible evidence that the number of targets was “very high.” Whatever the truth, it’s safe to say WhatsApp won’t shy away from its lawsuit (or a war of words) any time soon.


Source: https://www.engadget.com/whatsapp-nso-spyware-attack-215334253.html?src=rss
