

How 5G Will Impact Data and Enterprises in 2021




By Vinay Ravuri.

According to recent research by CCS Insight, there will be almost a quarter of a billion 5G connections worldwide by the end of 2020. That number is expected to triple in 2021 and grow to 3.6 billion globally by 2025. 5G will impact the way we live and work today: improving connectivity speeds and services for everyday gadgets like cell phones, tablets, and laptops is only one aspect of what 5G will enable. 5G monetization will go well beyond consumer services, notes mobility giant Ericsson in its 2030 market compass report, 5G for Business:

“The addressable market increases exponentially with the introduction of faster, extremely low latency mobile networks that will allow enterprises from different business segments to provide new and revolutionary services.”

5G Deployments Gain Steam, Spotlight True Value of 5G

We will see a substantial increase in 5G
trial initiatives by enterprises over the next 12 months, especially in
verticals such as manufacturing, energy, and surveillance, to enable
mission-critical applications that require low latency. In 2021, consumer
handsets will remain the most widely adopted 5G use case, focused on allowing
faster data speeds for users, before being overtaken by enterprise applications
in 2022. These enterprise and industrial deployments will offer more radical
disruptions and significant value-added services and apps associated with
5G. They will utilize new attributes and properties, such as
Ultra-Reliable Low-Latency Communications (URLLC), massive Machine-Type
Communications (mMTC), and private bands, which are uniquely 5G. (For
example, when a mobile device misses frames while playing a YouTube video, this
will only impact the user in a minor way. However, when an industrial
surveillance drone or autonomous vehicle loses connection or is slow to receive
critical signals, it may crash, endangering safety, or leave field staff vulnerable
to accidents and leading to significant equipment and property damages.)

Efforts like CBRS and O-RAN will allow
enterprises to deploy private 5G networks at lower operating price points,
allow new providers to enter the market, and bridge the digital divide.

We may see traditional consumer 5G vendors
entering the enterprise space, but these companies will struggle to pivot away
from the traditional consumer markets based on available internal resources and
expertise. Incumbents will certainly not ignore the enterprise market; however,
they are not likely to put all their eggs in that basket as newer players will.

5G Geopolitics Does Not Halt Adoption

Although the ever-changing global
permissions regarding Huawei equipment usage may impact the companies’
investment in innovation and profitability, this trend will not affect overall
5G adoption. Instead, these uncertainties provide an opening for innovative
startups to gain footing in the market. In 2021, driven by the rapidly
increasing demand for 5G, country bans on specific equipment vendors will not
impact 5G adoption or innovation and instead open up the prospect base for
other vendors.

Equally, customer and market innovation of
5G will be predicated on innovation at the silicon level. This will give rise
to innovation at both a software and hardware level. Operators and OEMs
will also look for novel solutions that can achieve new levels of performance,
power, and TCO apart from existing traditional merchant silicon offerings.

Network Systems
Become Open, While Monolithic Solutions Evolve or Get Left Behind

Similar to what happened in the data center
space, OpenRAN (O-RAN) will become more mainstream, challenging monolithic
solutions of the past. Prior to the open source software trend for data centers,
large companies such as Dell, HP, and IBM dominated this market. When
companies like Facebook and Google stepped into this space, they popularized
the belief that monolithic solutions are outdated, and companies should instead
disaggregate the hardware from software and use open source software rather
than commercial. Today, as a result of open compute standards, commoditized
hardware, and open source software offerings, the market is much more
democratized, which has resulted in lower costs and increased vendor
competition, more customization, and more room for innovation.

No matter the hype, 5G vendors are still
highly consolidated today. To scale 5G solutions with efficiency, intelligence,
and versatility, the market will prioritize O-RAN solutions rather than
monolithic ones. We will see white box hardware and open source software
elements from different vendors. O-RAN will present the opportunity for a lot
more startups to gain a foothold in this space. We have already seen big cloud
operations making announcements and investments in open source technology in
2020, and this wave will continue in 2021 and beyond.

AI-Enabled Automation
at the Edge Brings Manufacturing Back to the U.S.

Enabled by reliable, high-performance 5G connectivity and real-time edge computing capabilities, enterprises will more rapidly implement AI technology to support various use cases across their organizations, including robotic assembly lines, autonomous vehicles, augmented reality training, and more. These intelligent, AI-driven solutions will automate traditional, manual processes to increase performance and visibility while lowering cost.

Traditional manufacturing settings employ a
significant amount of manual labor. To keep operational costs low, many
industries have taken their factories outside of the U.S. By implementing AI
throughout factories, manufacturing can automate many resource-expensive tasks.
For example, Tesla was able to place its factories in one of the world’s most
expensive geographies by automating most of its factory functions. By utilizing
AI to operate their robot workers, there is a massive opportunity for the western
world to reshore or nearshore manufacturing operations. In 2021, manufacturers
will lay the groundwork for company-wide, cost-effective industrial automation.
Over the next five years, we will see these industries completely transform
themselves due to the automating power of real-time AI.

With an overwhelming tsunami of data being generated by more
end devices at the edge, it is more important than ever to process this data in
real time, as close to the data source as possible. I look forward to seeing how
these predictions become reality over the next 12 months.



Why machine learning strategies fail





Most companies are still trying to figure out how to make AI work. A recent survey looks at some of the barriers to machine learning.



Oraichain Review: The AI Powered Oracle System




Blockchain technology and artificial intelligence are now being integrated into every industry and nearly every aspect of our economy. Both of these technologies are concerned with the usage and storage of data, which has become critical as data in the modern world is immense and growing.

One thing that hasn't yet been accomplished, though, is the combination of blockchain and artificial intelligence. Combining the two can provide even greater value: adding artificial intelligence to a blockchain allows the data to be analyzed, generating more insight from it.

Oraichain Overview

Overview of Oraichain’s AI Powered Oracles

One area where this could be particularly useful is in smart contracts. These are programs created to execute automatically, documenting or controlling relevant actions and events according to their programming and the terms of a specified contract.

Smart contracts are increasingly used on blockchains as they have a number of useful benefits, particularly in the increasingly popular decentralized finance space. However, they remain under one limiting constraint: they must follow strict rules, which prevents the use of an artificial intelligence model in any smart contract.

The solution to this problem is being developed by Oraichain. This is a data oracle platform and it is designed to connect artificial intelligence APIs with smart contracts or other applications.

With Oraichain a smart contract can be enhanced to securely access external artificial intelligence APIs. The focus of blockchains currently is the use of price oracles, but with Oraichain smart contracts will have access to reliable AI data, providing new and useful functionality to blockchains.

What is Oraichain?

As a data oracle platform Oraichain is concerned with the aggregation and connection of smart contracts and AI APIs. It is the very first AI-powered data oracle in the world. Currently there are six major areas or features that Oraichain is bringing to the table.

Oraichain Mainnet

Oraichain Announcing its Recent Mainnet. Via Blog

AI Oracle

As we’ve already mentioned, Oraichain is designed to enhance the utility of smart contracts by allowing them access to external APIs which are AI driven. Current oracle blockchains are primarily focused on price oracles, but Oraichain plans on changing all that.

With Oraichain, dApps gain new and useful functionality by being able to use reliable external AI data. This is accomplished by sending requests to validators, who acquire and test data from various external AI APIs. Once confirmed, the data is stored on-chain, ensuring its reliability and allowing it to be used as proof in the future.

AI Marketplace

The AI Marketplace on Oraichain is where AI providers are able to sell their AI services. This brings AI to Oraichain and rewards the providers with ORAI tokens. There are a number of services that are provided, including price prediction, face authentication, yield farming, and much more.

The AI providers benefit from hosting their models directly on Oraichain without the need for third parties. This mechanism allows small companies or even individuals to compete with larger entities in having their work featured in the AI Marketplace. Developers and users get to choose the AI services they require and pay for them with ORAI tokens.

AI Ecosystem

The AI Marketplace is not the only piece of the AI ecosystem of Oraichain. There is additional AI infrastructure to support AI model developers. The ecosystem includes a fully developed and functional web GUI to assist in publishing AI services more rapidly and with less trouble.

Yield Farming

Yield farming is just one potential use case for Oraichain. Image via Oraichain Docs.

The ecosystem also allows AI providers to follow the flow of any requests for their services from start to finish. This is included as a means to increase the transparency of the system. With this level of transparency users can easily see which validators are best at execution, and if there are any malicious providers.

Staking & Earning

Validators stake their tokens and receive rewards for securing the network. Other users are also able to delegate their tokens to existing validators and share in those rewards proportionally. It's important that delegators understand that this is not passive income.

Delegators need to actively monitor the validators to ensure they continue to perform well within the ecosystem. If they are delegating to a malicious validator they risk having their delegated tokens slashed. So, delegators are equally responsible for ensuring the Oraichain ecosystem remains secure and of high quality.

Test Cases

The test cases are provided to Oraichain to verify the integrity and correctness of any AI services on the blockchain network. Third parties can become test case providers, examining specific AI models to determine whether they are qualified to operate on Oraichain, and charge fees for doing so. Users can provide expected outputs and see if the AI model results are similar. These test case providers encourage the AI providers to continue delivering the best-quality services.

Orai DAO

Governance on Oraichain is done by the community in a DAO model. Anyone owning ORAI tokens is able to participate in the governance of the network. They can also participate in the ongoing development and the future plans for the Oraichain ecosystem. While the project development team was responsible for creating the foundation for governance, it has now been automated and will forever remain in the hands of the community.

What Prevents Blockchains from Using AI Models?

Smart contracts as they are developed today are unable to run AI models, and developers have found it nearly impossible to integrate an AI model into a smart contract. AI models are typically very complex constructions based on neural networks, SVMs, clustering, and other approaches. Smart contracts have three characteristics that prevent the inclusion of AI models:

Oraichain Oracle

Three things keep blockchains from using AI models, but Oraichain will fix that. Image via

Strictness: Smart contracts are developed in such a way that they must always follow the strict rules put in place for them. All input for the smart contract must be 100% accurate if an output is expected. However, AI models don't necessarily produce 100% accurate results. Oraichain will overcome some aspects of smart contract strictness, giving a better user experience and enhanced smart contract functionality.

Environment: Typically smart contracts are created using high-level programming languages, such as Solidity and Rust. This provides better security and syntax for the smart contracts. By contrast most AI models are written in Java or Python.

Data size: Due to the gas costs of running smart contracts on most networks, they are usually created with very small storage allowances. Comparatively, AI models are quite large and use a lot of storage space.

Blockchain Based Oracle AI

Oraichain is being developed as a way to create smart contracts that are able to use AI models. On the surface the mechanism being used by Oraichain seems similar to those used by Chainlink or the Band Protocol, but Oraichain is more heavily focused on AI APIs and the quality of the AI models.

Each user request includes attached test cases, and in order to receive payment the provider's API must pass a specified number of them. Validators manage the test case features and the quality of the AI models, setting Oraichain apart from other solutions.

Oraichain System Overview

The Oraichain public blockchain allows for a number of user-generated data requests. In addition to users requesting data the blockchain also allows smart contracts to request data securely from artificial intelligence APIs that are external to the blockchain. The blockchain has been built using the Cosmos SDK and utilizes Tendermint’s Byzantine Fault Tolerance (BFT) as a consensus mechanism to ensure transaction confirmations are handled rapidly.

In terms of consensus mechanism the Oraichain protocol is similar to Delegated Proof-of-Stake (DPoS). The network is constructed of validators, each of which owns and stakes ORAI tokens, while other users who hold ORAI tokens are able to delegate them to the nominated validators. In this way both validators and delegators receive rewards proportional to their stake with each newly created block.
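The proportional reward split described above can be sketched in a few lines. This is an illustrative model only, not Oraichain's actual staking module, and all names are made up:

```python
def distribute_rewards(stakes, block_reward):
    """Split a block reward proportionally to each participant's staked ORAI.

    `stakes` maps an address (validator or delegator) to its staked amount.
    Illustrative sketch only, not Oraichain's actual staking logic.
    """
    total = sum(stakes.values())
    if total == 0:
        raise ValueError("no tokens staked")
    return {addr: block_reward * amount / total
            for addr, amount in stakes.items()}

# A validator with 600 ORAI staked plus delegators with 300 and 100:
rewards = distribute_rewards({"val1": 600, "del1": 300, "del2": 100},
                             block_reward=50.0)
# rewards == {"val1": 30.0, "del1": 15.0, "del2": 5.0}
```

Because rewards scale linearly with stake, delegating to a validator earns the same per-token return as validating, before any validator commission.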

Validators have the task of collecting data from AI providers and validating the data before it is stored on the blockchain. In order to validate an AI API, each validator is required to run tests based on the test cases provided by users, test providers, or the smart contracts. Any time a user is unsure which test case might be good, they can request additional test cases from the test providers. Thus the validity of the AI APIs can always be verified.

Oraichain System Overview

A representation of the inner workings behind Oraichain. Image via Oraichain Docs

You can see above how the flow of requesting an AI API works in the Oraichain system. When performing a request the smart contracts or users are required to call an oracle script which is available from the AI Marketplace or from the Oraichain gateway. These oracle scripts include external AI data sources, provided by the AI providers, along with test cases and optional test sources. There is also a transaction fee required to complete each request.

Whenever a request is submitted, a random validator is chosen to complete it. This validator then retrieves the necessary data from one or more AI providers and executes test scenarios to verify the validity of the data. If the tests pass, the data is passed along; if the tests fail, the request is cancelled.

When a request is successful the results of the test are written to the Oraichain blockchain. This result can be fetched from smart contracts or regular applications and serves as the proof of execution. A successful request is also required to pay the necessary transaction fees, which are used to reward validators and delegators.
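The request flow just described (random validator selection, test-case checks, then storing or cancelling the result) can be sketched as a toy model. All function names and the numeric stand-in for an AI API are illustrative, not the real Oraichain interfaces:

```python
import random

def handle_request(providers, test_cases, validators, tolerance=1e-6):
    """Toy model of an Oraichain request: a random validator checks an AI
    provider's output against user-supplied test cases before the result
    is stored on-chain. All interfaces here are illustrative stand-ins.
    """
    validator = random.choice(validators)
    api = providers[0]  # a real validator may query several providers
    for request_input, expected in test_cases:
        if abs(api(request_input) - expected) > tolerance:
            # Failing a test case cancels the request.
            return {"status": "cancelled", "validator": validator}
    # All test cases passed: execute the actual request and store the proof.
    return {"status": "stored", "validator": validator, "result": api(42)}

double = lambda x: 2 * x  # stand-in for an external AI API
outcome = handle_request([double], [(1, 2), (3, 6)], ["val1", "val2"])
# outcome["status"] == "stored" and outcome["result"] == 84
```

The key property is that payment and on-chain storage only happen after the provider's output matches the expected test outputs within a tolerance.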

Reading results from Oraichain's transactions adds some overhead, but it helps ensure that the AI API quality is good and that no data is tampered with while it is being fetched from AI providers.

If we compare this testing with Chainlink and Band Protocol, we see that API testing using test cases is unique to Oraichain. Because Oraichain is focused on AI APIs, it is crucial that testing is included to control the quality of the AI providers in the ecosystem. Users and test providers can also submit new, suitable test cases to properly verify any AI API. These test cases incentivize the AI providers to improve the quality and accuracy of their AI models.

Oraichain Validation

Validating test cases to complete a request. Image via Oraichain Docs

Another unique feature added to the Oraichain model is the ability of the community to rate the reputation of each validator with regard to improving the quality of the AI APIs. In this way, validators can be slashed if they are found to have low availability, slow response times, failure to validate AI providers, failure to perform test cases properly, or other bad behavior.

One warning is that a large number of validators are needed to prevent the system from becoming centralized. A greater number of validators serves to increase the availability of the network, while also improving on scalability and successful request performance.

At the same time, block rewards and transaction fees need to be sufficient to incentivize a large number of validators to join and participate in the Oraichain ecosystem. Otherwise the network could become centralized and slow to the point of being unusable.

The Oraichain Team

Oraichain recently made some changes to their leadership, moving the former CTO of Oraichain into the position of CEO for Oraichain Vietnam and welcoming Mr. Tu Pham as the CTO of Oraichain.

Oraichain Team

The impressive leadership team at Oraichain. Image via

Chung Dao continues as the CEO of Oraichain. As one of the co-founders of the project he has been instrumental in its growth since the very beginning. He is also the co-founder of Rikkeisoft and has achieved a PhD in Computer Science from The University of Tokyo.

The AI Lead and another co-founder of the project is Diep Nguyen, a lecturer at VNU in Hanoi and holder of a PhD from Keio University.

In addition, Oraichain’s total workforce has now been expanded to 25 people including the core team, AI and blockchain specialists, data scientists and developers.

Oraichain & Binance Chain Integration

Around the same time as making changes to the leadership at Oraichain, the team also completed an integration with Binance Chain. This integration creates a bridge from Ethereum to Binance Chain for the ERC-20 ORAI tokens. Oraichain has committed to providing the necessary liquidity for trading the BNB/ORAI pairing on PancakeSwap.

Oraichain Bridge

Swap easily from ERC-20 to BEP-20 tokens and vice versa. Image via Oraichain Blog.

Anyone who wishes to swap between the ERC-20 ORAI token and the BEP-20 ORAI token can do so at

Further information regarding the new BEP-20 token and instructions on swapping can be found here.

ORAI Token Economics

Anytime an AI request is sent to the Oraichain network, there is an associated transaction cost that must be paid in ORAI tokens. In fact, the token serves as the transaction fee paid to request-executing validators, AI API providers, test case providers, and block-creating validators.

The transaction fee is not set, but varies based on the fee requirement of the validators who execute the requests, the AI API providers, and the test case providers. This means that anytime there is a request made the validators can choose whether or not to execute the request based on the transaction fee offered.

Once validators have decided whether or not to be included in the pool of willing participants, the system randomly chooses one of the validators that have expressed a willingness to execute the request. The validator is also responsible for specifying the fee paid to AI API providers, test case providers, and block-creating validators in the MsgResultReport.

It is possible for more than one validator to be included in a request, in which case the transaction fee is divided equally among the validators that participated in the request. Again, the validators must decide if they are willing to accept such a transaction fee.
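The equal fee split is straightforward arithmetic; a minimal sketch, with illustrative names rather than Oraichain's actual fee logic:

```python
def split_fee(total_fee, validators):
    """Divide a request's transaction fee equally among the validators
    that participated. Illustrative only; not Oraichain's actual code."""
    share = total_fee / len(validators)
    return {validator: share for validator in validators}

# Three validators participate in a request paying 6.0 ORAI in fees:
shares = split_fee(6.0, ["val1", "val2", "val3"])
# each validator receives 2.0 ORAI
```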

New ORAI tokens are issued as a reward with each newly created block; to earn a share of these rewards, holders must stake their tokens on the Oraichain network. The reward is divided based on the number of tokens a holder is staking with a validator. Moreover, there is a mechanism to punish bad behavior by validators in terms of AI API quality, response time, and availability.

Oraichain Tokenomics

The new tokenomics supports the growth of ORAI tokens. Image via Oraichain Blog

The team also changed the tokenomics by burning 73% of the total supply of ORAI tokens in December 2020. They also extended the emission schedule to 2027, thus flattening the release curve and protecting against sudden supply shocks. It also helps to minimize inflation in the early years of the project.

The ORAI Token

There was a seed sale conducted in October 2020, with ORAI tokens sold for $0.081 each. The sale had a goal of $70,000; however, no data regarding the total funds raised was released. In November 2020 a private sale was scheduled, but it was never held. Finally, a public sale was scheduled for February 2021, but after the team changed the tokenomics and burned 73% of the circulating supply, the public sale was cancelled.

The price of the ORAI token has surged in 2021, reaching an all-time high of $107.48 on February 20, 2021. That contrasts with the all-time low of $2.83 just four months earlier on October 29, 2020.

Oraichain Chart

The ORAI token has soared higher in just 4 months. Image via

As of February 23, 2021 the price has retreated substantially from its all-time high, trading at $65.06. There are very few exchanges handling the token, with the majority of transactions occurring on Uniswap. There is also a small amount of activity on KuCoin and Bithumb Global.

Oraichain Use Cases

There are already a number of use cases that have generated interest in Oraichain.

Yield Farming with AI

The yield farming based on Oraichain was inspired by the development of yEarn Finance (YFI). Like yEarn, the Oraichain system helps reduce the complexity of yield trading. Where yEarn uses crowdsourced knowledge, Oraichain provides AI-based price prediction APIs as inputs to smart contracts. The yield farming use case has two functionalities:

Earn: Get price prediction from Oraichain and automatically decide BUY/SELL tokens. Users choose the best performing AI APIs.

Vaults: Apply automated trading oracle scripts on Oraichain. Deposit tokens and the assigned oracle script will find the best AI input to maximize yield.

yAI Finance

AI powered DeFi platform. Image via

Compared to crowdsource-based strategies, AI-based trading performance could be less efficient, but risk management could be better, since all buying and selling decisions are made by AI models (that is, by machine) rather than by human psychology.

Flexible Smart Contracts & Face Authentication

There are several scenarios in which face authentication is very useful:

  • using your face to get your balance instead of using a private key
  • withdrawing tokens to registered wallets using your face
  • using your face to reset your private/public key pair
  • using both your private key and your face to execute a smart contract

Using face authentication might be riskier than a private key, but it helps improve the user experience. For checking balances and withdrawing tokens to registered wallets, face authentication is considered safe and convenient.

Fake News Detection

This use case focuses more on a regular application that wants to check whether news can be trusted. Oraichain provides a decentralized marketplace in which combining results from different providers is possible. If providers want to receive payment, their APIs must pass test cases just like any other API provider's.

More potential use cases

  • Smart contracts help check if a product is fake in the supply chain
  • Smart contracts deciding a loan based on users’ credit score
  • Smart contracts automatically pricing game items based on their characteristics and DNA
  • Marketplace of automated diagnostics for X-ray images, spam classification, handwriting detection using OCR, and citizen ID card detection using OCR.

Oraichain Roadmap

Oraichain Roadmap

An impressive 2021 roadmap. Image via


Just like other projects built in the data oracle sector, demand for Oraichain should only increase as the DeFi economy continues to expand. Starting with the yAI DeFi product, Oraichain is showing it is more than capable of competing in the space.

In addition, this platform fills a niche not served by crowdsourced projects like yEarn Finance. It’s also taking a unique approach that sets it apart from industry leader Chainlink.

The mainnet launch for the project is on February 24, which will be an exciting time to see how much demand there is for the project and its unique take on oracle protocols and DeFi. It could also reinvigorate the ORAI token, which has seen impressive growth over the past four months.

Oraichain is a young project, but it has already made significant strides. Both its roadmap and its team are strong, which could lead to substantial growth for Oraichain in 2021 and beyond.

As the lone project taking on the problem of adding AI to smart contracts Oraichain could be setting itself up as a leader in the blockchain space for some time to come.

Disclaimer: These are the writer’s opinions and should not be considered investment advice. Readers should do their own research.

The post Oraichain Review: The AI Powered Oracle System appeared first on Coin Bureau.




Biden should double down on Trump’s policy of promoting AI within government





The current administration should not only maintain the policy of promoting government use of AI, it should make it a priority.



Setting up Amazon Personalize with AWS Glue




Data can be used in a variety of ways to satisfy the needs of different business units, such as marketing, sales, or product. In this post, we focus on using data to create personalized recommendations to improve end-user engagement. Most ecommerce applications consume a huge amount of customer data that can be used to provide personalized recommendations; however, that data may not be cleaned or in the right format to provide those valuable insights.

The goal of this post is to demonstrate how to use AWS Glue to extract, transform, and load your JSON data into a cleaned CSV format. We then show you how to run a recommendation engine powered by Amazon Personalize on your user interaction data to provide a tailored experience for your customers. The resulting output from Amazon Personalize is recommendations you can generate from an API.

A common use case is an ecommerce platform that collects user-item interaction data and suggests similar products or products that a customer may like. By the end of this post, you will be able to take your uncleaned JSON data and generate personalized recommendations based on the products each user has interacted with, creating a better experience for your end users. For the purposes of this post, refer to this user-item-interaction dataset to build this solution.
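Once a Personalize campaign is trained and deployed, fetching recommendations is a single runtime API call. The sketch below assumes boto3 and a hypothetical campaign ARN; the helper name is ours, but `get_recommendations` and its `itemList` response shape are the actual Amazon Personalize Runtime API:

```python
# Hypothetical campaign ARN; a real call needs your own ARN, boto3, and AWS
# credentials, e.g. get_recommendations(boto3.client("personalize-runtime")).
request_params = {
    "campaignArn": "arn:aws:personalize:us-east-1:123456789012:campaign/demo",
    "userId": "u1",
    "numResults": 10,
}

def get_recommendations(runtime_client, params=request_params):
    """Return the recommended item IDs for one user from the Amazon
    Personalize Runtime GetRecommendations API."""
    response = runtime_client.get_recommendations(**params)
    return [item["itemId"] for item in response["itemList"]]
```

Passing the client in as a parameter keeps the helper testable without AWS credentials.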

The resources used in this solution may incur costs on your AWS account. For pricing information, see AWS Glue Pricing and Amazon Personalize Pricing.

The following diagram illustrates our solution architecture.


For this post, you need the following:

  • An S3 bucket for your data. For instructions on creating a bucket, see Step 1: Create your first S3 bucket. Make sure to attach the Amazon Personalize access policy.
  • An IAM role for AWS Glue. These are very permissive policies; in practice it's best to use least privilege and only give access where it's needed. For instructions on creating a role, see Step 2: Create an IAM Role for AWS Glue.

Crawling your data with AWS Glue

We use AWS Glue to crawl through the JSON file to determine the schema of your data and create a metadata table in your AWS Glue Data Catalog. The Data Catalog contains references to data that is used as sources and targets of your ETL jobs in AWS Glue. AWS Glue is a serverless data preparation service that makes it easy to extract, clean, enrich, normalize, and load data. It helps prepare your data for analysis or machine learning (ML). In this section, we go through how to get your JSON data ready for Amazon Personalize, which requires a CSV file.

Your data can have different columns that you may not necessarily want or need to run through Amazon Personalize. In this post, we use the user-item-interaction.json file and clean that data using AWS Glue to only include the columns user_id, item_id, and timestamp, while also transforming it into CSV format. You can use a crawler to access your data store, extract metadata, and create table definitions in the Data Catalog. It automatically discovers new data and extracts schema definitions. This can help you gain a better understanding of your data and what you want to include while training your model.

The user-item-interaction JSON data is an array of records. The crawler treats the data as one object: just an array. We create a custom classifier to create a schema that is based on each record in the JSON array. You can skip this step if your data isn’t an array of records.

  1. On the AWS Glue console, under Crawlers, choose Classifiers.
  2. Choose Add classifier.
  3. For Classifier name, enter json_classifier.
  4. For Classifier type, select JSON.
  5. For JSON path, enter $[*].
  6. Choose Create.
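If you prefer to script the classifier instead of clicking through the console, the same settings can be passed to the AWS Glue `create_classifier` API. The sketch below only builds the parameters; actually running the call requires boto3 and AWS credentials:

```python
# The classifier name and JSON path mirror the console steps above. Running
# this against AWS requires boto3 and credentials, e.g.
# create_json_classifier(boto3.client("glue")).
classifier_params = {
    "JsonClassifier": {
        "Name": "json_classifier",
        "JsonPath": "$[*]",  # treat each element of the top-level array as a record
    }
}

def create_json_classifier(glue_client, params=classifier_params):
    """Create the custom JSON classifier through the AWS Glue API."""
    return glue_client.create_classifier(**params)
```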


  1. On the Crawlers page, choose Add crawler.
  2. For Crawler name, enter json_crawler.
  3. For Custom classifiers, add the classifier you created.


  1. Choose Next.
  2. For Crawler source type, choose Data stores.
  3. Leave everything else as default and choose Next.
  4. For Choose a data store, enter the Amazon S3 path to your JSON data file.
  5. Choose Next.


  1. Skip the section Add another data store.
  2. In the Choose an IAM role section, select Choose an existing IAM role.
  3. For IAM role, choose the role that you created earlier (AWSGlueServiceRole-xxx).
  4. Choose Next.

Choose Next.

  1. Leave the frequency as Run on Demand.
  2. On the Output page, choose Add database.
  3. For Database name, enter json_data.
  4. Choose Finish.
  5. Choose Run it now. 

You can also run your crawler by going to the Crawlers page, selecting your crawler, and choosing Run crawler.

Using AWS Glue to convert your files from JSON to CSV

After your crawler finishes running, go to the Tables page on the AWS Glue console. Navigate to the table your crawler created. Here you can see the schema of your data. Make note of the fields you want to use with your Amazon Personalize data. For this post, we want to keep the user_id, item_id, and timestamp columns for Amazon Personalize.


At this point, you have set up your database. Amazon Personalize requires CSV files, so you have to transform the data from JSON format into cleaned CSV files that include only the data you need in Amazon Personalize. The following table shows examples of the three dataset types you can include in Amazon Personalize. It’s important to note that interactions data is required, whereas user and item metadata are optional.

| Dataset Type | Required Fields | Reserved Keywords |
| --- | --- | --- |
| Users | USER_ID (string), 1 metadata field | none |
| Items | ITEM_ID (string), 1 metadata field | none |
| Interactions | USER_ID (string), ITEM_ID (string), TIMESTAMP (long) | EVENT_TYPE (string), EVENT_VALUE (float, null) |

It’s also important to make sure that you have at least 1,000 unique combined historical and event interactions in order to train the model. For more information about quotas, see Quotas in Amazon Personalize.

To save the data as a CSV, you need to run an AWS Glue job on the data. A job is the business logic that performs the ETL work in AWS Glue. The job changes the format from JSON into CSV. For more information about data formatting, see Formatting Your Input Data.
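The transformation this job performs can be sketched locally with the Python standard library: read the JSON array of records, keep only the user_id, item_id, and timestamp columns (dropping user_login and location, the extra columns in this post's dataset), and write a CSV. This is an illustrative stand-in for the Glue job, not the code Glue generates, and the sample records are made up:

```python
import csv
import io
import json

# Hypothetical sample of the user-item-interaction.json records.
interactions = json.loads('''[
  {"user_id": "1", "item_id": "59", "timestamp": 1613443200,
   "user_login": "jdoe", "location": "US"},
  {"user_id": "2", "item_id": "12", "timestamp": 1613446800,
   "user_login": "asmith", "location": "DE"}
]''')

KEEP = ["user_id", "item_id", "timestamp"]  # the columns Personalize needs

buf = io.StringIO()
# extrasaction="ignore" silently drops user_login and location.
writer = csv.DictWriter(buf, fieldnames=KEEP, extrasaction="ignore")
writer.writeheader()
writer.writerows(interactions)

# Remember that Personalize needs at least 1,000 unique interactions to train;
# a real dataset would be much larger than this two-row sample.
print(buf.getvalue())
```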

  1. On the AWS Glue Dashboard, choose AWS Glue Studio.

AWS Glue Studio is an easy-to-use graphical interface for creating, running, and monitoring AWS Glue ETL jobs.

  2. Choose Create and manage jobs.
  3. Select Source and target added to the graph.
  4. For Source, choose S3.
  5. For Target, choose S3.
  6. Choose Create.
  7. Choose the data source S3 bucket.
  8. On the Data source properties – S3 tab, add the database and table we created earlier.
  9. On the Transform tab, select the boxes to drop user_login and location.

In this post, we don’t use any additional metadata to run our personalization algorithm.

  10. Choose the data target S3 bucket.
  11. On the Data target properties – S3 tab, for Format, choose CSV.
  12. For S3 Target location, enter the S3 path for your target.

For this post, we use the same bucket we used for the JSON file.

  13. On the Job details page, for Name, enter a name for your job (for this post, json_to_csv).
  14. For IAM Role, choose the role you created earlier.

You should also have included the AmazonS3FullAccess policy earlier.

  15. Leave the rest of the fields at their default settings.
  16. Choose Save.
  17. Choose Run.

It may take a few minutes for the job to run.

In your Amazon S3 bucket, you should now see the CSV file that you use in the next section.

Setting up Amazon Personalize

At this point, you have your data formatted in a file type that Amazon Personalize can use. Amazon Personalize is a fully managed service that uses ML and over 20 years of recommendation experience at Amazon.com to enable you to improve end-user engagement by powering real-time personalized product and content recommendations and targeted marketing promotions. In this section, we go through how to create an Amazon Personalize solution that uses your data to create personalized experiences.

  1. On the Amazon Personalize console, under New dataset groups, choose Get started.
  2. Enter the name for your dataset group.

A dataset group contains the datasets, solutions, and event ingestion API.

  1. Enter a dataset name, and enter the schema details based on your data.

For this dataset, we use the following schema. You can change the schema according to the values in your dataset.

{
  "type": "record",
  "name": "Interactions",
  "namespace": "com.amazonaws.personalize.schema",
  "fields": [
    { "name": "USER_ID", "type": "string" },
    { "name": "ITEM_ID", "type": "string" },
    { "name": "TIMESTAMP", "type": "long" }
  ],
  "version": "1.0"
}
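Before importing, it can be worth sanity-checking locally that your cleaned rows match this schema. The sketch below uses a simplified mapping of Avro primitive types to Python types (not full Avro validation), and the sample row is hypothetical:

```python
import json

schema = json.loads('''{
  "type": "record", "name": "Interactions",
  "namespace": "com.amazonaws.personalize.schema",
  "fields": [
    {"name": "USER_ID", "type": "string"},
    {"name": "ITEM_ID", "type": "string"},
    {"name": "TIMESTAMP", "type": "long"}
  ],
  "version": "1.0"
}''')

# Simplified Avro-primitive to Python-type mapping, enough for this check.
PY_TYPES = {"string": str, "long": int, "float": float}

def matches_schema(row, schema):
    """Return True if the row has every schema field with a compatible type."""
    return all(
        field["name"] in row
        and isinstance(row[field["name"]], PY_TYPES[field["type"]])
        for field in schema["fields"]
    )

row = {"USER_ID": "1", "ITEM_ID": "59", "TIMESTAMP": 1613443200}
print(matches_schema(row, schema))  # True
```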

  1. Choose Next.
  2. Enter your dataset import job name to import data from Amazon S3.

Make sure that your IAM service role has access to Amazon S3 and Amazon Personalize, and that your bucket has the correct bucket policy.

  1. Enter the path to your data (the Amazon S3 bucket from the previous section).
  2. On the Dashboard page for your dataset groups, under Upload datasets, import the user-item-interactions data (user data and item data are optional but can enhance the solution).


We include an example item.csv file in the GitHub repo. The following screenshot shows an example of the item data.


  1. Under Create solutions, for Solutions training, choose Start.

A solution is a trained model of the data you provided with the algorithm, or recipe, that you select.

  1. For Solution name, enter aws-user-personalization.
  2. Choose Next.
  3. Review and choose Finish.
  4. On the dashboard, under Launch campaigns, for Campaign creation, choose Start.

A campaign allows your application to get recommendations from your solution version.

  1. For Campaign name, enter a name.
  2. Choose the solution you created.
  3. Choose Create campaign.

You have now successfully used the data from your data lake to create a recommendation model. With this dataset, you can get personalized recommendations for houseware products based on a user’s interactions with other products in the dataset.

Using Amazon Personalize to get your recommendations

To test your solution, go to the campaign you created. In the Test campaign results section, under User ID, enter an ID to get recommendations for. A list of item IDs appears, along with a relative score; each item ID corresponds to a specific recommended product.

The following screenshot shows a search for user ID 1. They have been recommended item ID 59, which corresponds to a wooden picture frame. The score listed next to the item gives you the predicted relevance of the item to the user.


To learn more about Amazon Personalize scores, see Introducing recommendation scores in Amazon Personalize.

To generate recommendations, you can call the GetRecommendations or GetPersonalizedRanking API using the AWS Command Line Interface (AWS CLI) or a language-specific SDK. With Amazon Personalize, your recommendations can change as the user clicks on the items for more real-time use cases. For more information, see Getting Real-Time Recommendations.
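A GetRecommendations response carries an itemList of itemId/score entries. The sketch below shows how the call might look with boto3 (placeholders, not executed) and parses a hypothetical response of that shape:

```python
# With boto3 the call would look roughly like this (not executed here;
# the campaign ARN and user ID are placeholders):
#   runtime = boto3.client("personalize-runtime")
#   response = runtime.get_recommendations(
#       campaignArn="arn:aws:personalize:REGION:ACCOUNT:campaign/NAME",
#       userId="1",
#       numResults=5,
#   )

def top_items(response, n=3):
    """Return the top-n (itemId, score) pairs from a GetRecommendations response."""
    return [(item["itemId"], item["score"]) for item in response["itemList"][:n]]

# Hypothetical response, shaped like the real API output.
sample_response = {
    "recommendationId": "RID-example",
    "itemList": [
        {"itemId": "59", "score": 0.13},
        {"itemId": "12", "score": 0.09},
        {"itemId": "7",  "score": 0.05},
    ],
}

print(top_items(sample_response, n=2))  # [('59', 0.13), ('12', 0.09)]
```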


AWS offers a wide range of AI/ML and analytics services that you can use to gain insights and guide better business decisions. In this post, you used a JSON dataset that included additional columns of data, and cleaned and transformed that data using AWS Glue. In addition, you built a custom model using Amazon Personalize to provide recommendations for your customers.

To learn more about Amazon Personalize, see the developer guide. Try this solution out and let us know if you have any questions in the comments.

About the Authors

Zoish Pithawala is a Startup Solutions Architect at Amazon Web Services based out of San Francisco. She primarily works with startup customers to help them build secure and scalable solutions on AWS.




Sam Tran is a Startup Solutions Architect at Amazon Web Services based out of Seattle. He focuses on helping his customers create well-architected solutions on AWS.

