Big Data

Interactive Exploration and Analysis of Scientific Datasets




Click to learn more about author Martyna Pawletta.

The availability of scientific datasets in Google BigQuery opens new possibilities for the exploration and analysis of public life sciences data. The Google Cloud Platform (GCP) provides a place where SQL queries can be created easily and intuitively to explore huge datasets extremely fast. Here we present a practical example of how you can work effectively with datasets stored in BigQuery using the open-source KNIME Analytics Platform.

In this blog post, we will cover a use case relevant for life sciences research. We will focus on answering some questions from the area of pharmaceutical research by linking and querying different datasets stored in BigQuery.

But don’t worry: Even if you’re not a life science expert, you still might find it useful to see how easy it can be to connect to BigQuery, construct complex queries without needing to write SQL, and explore the results of the queries using KNIME Analytics Platform.

SciWalker Open Data

This example was inspired by the SciWalker Open Data sets that were added to Google BigQuery and announced at the American Chemical Society meeting in San Diego this year. You can find the abstract in the Chemical Information Bulletin, pages 86-87.

SciWalker is a comprehensive resource that contains chemistry-related data like molecules, nucleotides, and peptide sequences (overall 211 million unique molecules) that are linked to additional scientific information. The datasets also include clinical and drug-related data with links to different ontologies that allow us to compare data coming from different data sources using different wording.

  • Set up a BigQuery Account first! You’ll find a detailed description on how to set up your BigQuery account in this blog article by Emilio Silvestri.

Once your BigQuery account is configured, you can create your first query using the KNIME database nodes, as demonstrated in the short example below. These nodes let you create SQL queries in a visual way, without needing to write SQL yourself (although you can add SQL if you want or need to).

  • To learn more about nodes provided for databases, check out our Hub, where you’ll also find more example workflows.
  • For documentation, see the Database Extension Guide.

Selecting and Downloading Data

In the short workflow below, we select data from two tables: one contains general information about clinical trials, and the other contains references to literature that has been linked to those clinical trials. The tables are joined on the nct_id column using the DB Joiner node, and the DB Column Filter node keeps only certain columns, such as IDs, title, study phase, and the PubMed ID from the reference table. Additionally, we group the data by nct_id and count how many PubMed references have been registered per study.

In the last step, the DB Reader node executes the query and downloads the data into a table.

Fig. 1: The workflow to select data from two tables: one contains general information about clinical trials, and the other contains references to literature that has been linked to those clinical trials.
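For readers who think in SQL, the join, column filter, and grouping described above correspond roughly to a single query. The following is a minimal sketch that runs against an in-memory SQLite database rather than BigQuery. The table names (studies, study_references), every column except nct_id, and all rows are assumptions made up for illustration; the actual SciWalker schema may differ.

```python
import sqlite3

# Toy in-memory stand-in for the two BigQuery tables. Table names, columns
# other than nct_id, and all rows are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE studies (nct_id TEXT PRIMARY KEY, title TEXT, phase TEXT);
    CREATE TABLE study_references (nct_id TEXT, pmid INTEGER);
    INSERT INTO studies VALUES
        ('NCT00000001', 'Toy study A', 'Phase 2'),
        ('NCT00000002', 'Toy study B', 'Phase 3');
    INSERT INTO study_references VALUES
        ('NCT00000001', 111), ('NCT00000001', 222), ('NCT00000002', 333);
""")

# Join on nct_id, keep a handful of columns, and count references per study.
QUERY = """
    SELECT s.nct_id, s.title, s.phase, COUNT(r.pmid) AS n_references
    FROM studies AS s
    JOIN study_references AS r ON s.nct_id = r.nct_id
    GROUP BY s.nct_id, s.title, s.phase
    ORDER BY s.nct_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

In the workflow, the database nodes assemble the query visually and only the final reader step pulls the result into a local table.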

Time to Play

Now that you’ve connected to a BigQuery resource and queried it with the database nodes, we will demonstrate how to interactively explore the data in a few simple steps. In each step you can use an interactive view to select the data you’re interested in, which are then used to create further queries and pull the matching data from BigQuery – and all this without writing code!

Fig. 2: The workflow “Explore Scientific Data Stored on BigQuery.”

Step 1

In the very first step of our exploration journey, we retrieve a list of diseases that are included in the clinical datasets and standardized according to the disease ontology that is part of the SciWalker data collection. We then use this list to create an autocomplete menu, which we can use to select the disease we want to investigate further. For example, here we will investigate schizophrenia.
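The idea behind such an autocomplete menu can be sketched in a few lines: de-duplicate and sort the disease names once, then answer each prefix with a binary search. This is only an illustration of the technique, not how the view in the workflow is implemented. The function names and the short disease list are assumptions.

```python
from bisect import bisect_left


def build_index(names):
    """Return sorted (casefolded, original) pairs for unique, non-empty names."""
    unique = {n.strip() for n in names if n and n.strip()}
    return sorted((n.casefold(), n) for n in unique)


def autocomplete(index, prefix, limit=10):
    """Return up to `limit` names whose casefolded form starts with `prefix`."""
    key = prefix.casefold()
    # (key,) sorts before every (key..., original) pair, so this is the
    # position of the first candidate that could match the prefix.
    start = bisect_left(index, (key,))
    matches = []
    for folded, original in index[start:]:
        if not folded.startswith(key):
            break
        matches.append(original)
        if len(matches) == limit:
            break
    return matches


# Hypothetical disease list; in the workflow it comes from a BigQuery query.
index = build_index(
    ["Schizophrenia", "Rheumatoid arthritis", "Scleroderma", "Asthma", "Schizophrenia"]
)
print(autocomplete(index, "sc"))
```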

Step 2

Selecting a disease brings us – after some data querying, joining, wrangling, and preprocessing – to the next step, where we can explore compounds that have been registered for clinical studies on schizophrenia. We calculate some chemical properties and merge the data with additional information about the clinical trial. In a second table, PubMed references from each study are visible.

To make the view even more interactive, we added web links to the study and reference IDs that will bring you directly to the web pages describing those studies/references.

Let’s select “methotrexate” here, which is known as a chemotherapy agent and immune system suppressant, and see what happens in the next step.

Fig. 3: Interactive view, with additional web links to the study and reference IDs that bring you directly to the web pages describing those studies/references.
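The web links in this view amount to turning each study or reference ID into a URL. The sketch below shows one way to do that. The URL patterns for ClinicalTrials.gov and PubMed are assumptions based on the public sites, not taken from the workflow, and the helper names and IDs are hypothetical.

```python
from html import escape


def study_url(nct_id):
    """Build a ClinicalTrials.gov link for a study ID (assumed URL pattern)."""
    return f"https://clinicaltrials.gov/study/{nct_id}"


def pubmed_url(pmid):
    """Build a PubMed link for a reference ID (assumed URL pattern)."""
    return f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/"


def html_link(url, label):
    """Wrap a URL in an anchor tag so a table view can render it as a link."""
    return f'<a href="{escape(url, quote=True)}" target="_blank">{escape(str(label))}</a>'


# Placeholder IDs, used only to show the output format.
print(html_link(study_url("NCT00000001"), "NCT00000001"))
print(html_link(pubmed_url(123), 123))
```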

Step 3

Here we once again take advantage of the ontologies available in SciWalker. 

The view below shows which chemical classes “methotrexate” belongs to, along with how many other compounds from each of those chemical classes have been registered for clinical studies. One class must be selected here to continue to the next step. We selected “pteridines,” which seems to be a less common class (with only 21 compounds registered for clinical studies). In the next step, let’s check which 21 compounds those are and for which diseases the studies have been conducted.

Fig. 4: View showing which chemical classes “methotrexate” belongs to, plus how many other compounds from each of those chemical classes have been registered for clinical studies.
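The counting behind this view can be sketched as a small grouping step: collect the members of each chemical class, keep the classes that contain the selected compound, and count the other members. Apart from the methotrexate and pteridines pairing mentioned in the text, the data and names below are invented placeholders.

```python
from collections import defaultdict

# (compound, chemical class) pairs. Only methotrexate/pteridines comes from
# the text; every other entry is a made-up placeholder.
memberships = [
    ("methotrexate", "pteridines"),
    ("methotrexate", "toy class X"),
    ("compound_a", "pteridines"),
    ("compound_b", "toy class X"),
    ("compound_c", "toy class X"),
    ("compound_d", "toy class Y"),
]


def class_counts(memberships, compound):
    """Map each class containing `compound` to the number of other members."""
    members = defaultdict(set)
    for name, chem_class in memberships:
        members[chem_class].add(name)
    return {
        chem_class: len(names - {compound})
        for chem_class, names in members.items()
        if compound in names
    }


counts = class_counts(memberships, "methotrexate")
print(counts)
```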

Step 4

This view shows a tag cloud with disease and condition names for which studies have been registered for compounds in the selected compound class (here: pteridines). When you select a disease from the tag cloud, the list of compounds in the selected class that are associated with that disease is displayed in the table below.

When we select “Rheumatoid arthritis,” we see that three compounds within the class of pteridines are linked to it. We also see that methotrexate has been tested for both schizophrenia and rheumatoid arthritis.

Fig. 5: View showing a tag cloud with disease and condition names for which studies have been registered for compounds in the selected compound class.
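A tag cloud is essentially a frequency table, with the tag size driven by a count. The sketch below is one possible reading of this step: it weights each disease by the number of compounds in the selected class with a registered study, then lists the compounds for a chosen tag. The weighting rule, the names, and the toy rows are assumptions.

```python
from collections import Counter

# (compound, disease) pairs from registered studies; invented toy rows.
trials = [
    ("methotrexate", "Rheumatoid arthritis"),
    ("methotrexate", "Schizophrenia"),
    ("compound_a", "Rheumatoid arthritis"),
    ("compound_b", "Toy condition"),
    ("compound_z", "Rheumatoid arthritis"),  # not in the selected class
]
selected_class = {"methotrexate", "compound_a", "compound_b"}


def tag_weights(trials, class_members):
    """Count, per disease, the distinct class members with a registered study."""
    pairs = {(c, d) for c, d in trials if c in class_members}
    return Counter(d for _, d in pairs)


def compounds_for_tag(trials, class_members, disease):
    """List the class members associated with the selected disease tag."""
    return sorted({c for c, d in trials if c in class_members and d == disease})


weights = tag_weights(trials, selected_class)
print(weights.most_common())
print(compounds_for_tag(trials, selected_class, "Rheumatoid arthritis"))
```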

Step 5

The last view shows all compounds found in the clinical trials dataset that have been tested for both schizophrenia and rheumatoid arthritis. If you are curious which compounds those are, check out the workflow on the Hub here.

Prerequisites to run the example:

  • BigQuery account
  • Simba Driver
  • KNIME Analytics Platform (4.1)
  • KNIME Big Data Extension
  • KNIME Community Extensions – Cheminformatics (including RDKit)

Wrapping Up

In this blog post, we highlighted how to interactively explore and analyze scientific data using Google BigQuery and KNIME Analytics Platform together. We showed that combining these two tools allows us to take advantage of the breadth of data available in BigQuery using the interactive query construction, data analysis, and visualization capabilities in the Analytics Platform. Maybe this sparks further ideas or questions or even allows you to create new hypotheses? 

Though we’ve focused on life sciences data here, the combination of an analytics platform and Google BigQuery can be applied in many different fields, so feel free to give it a try no matter what your use case or industry!

If this makes you curious, start playing with the workflow demonstrated today or look for other examples here on the Hub. 

If you want to explore and do more experiments using freely available scientific datasets on Google BigQuery, check out the Marketplace. There is a lot more data to explore! 

Aeva announces customer deal; shares soar even after results disappoint




By Stephen Nellis

(Reuters) – Aeva Technologies Inc said on Thursday it signed a deal to develop a sensor for a self-driving car to be made by an “undisclosed large company,” and its shares rose 13% even as it reported that its loss ballooned and sales came in far below forecasts.

Aeva reported first-quarter revenue of $300,000, down from $500,000 a year earlier and far below analysts’ estimate of $1.38 million, according to Refinitiv estimates. Its adjusted operating loss more than doubled to $15.6 million from $6.1 million a year ago.

Founded by two ex-Apple Inc engineers, Aeva makes a sensor that helps self-driving cars navigate using lidar technology, which uses lasers much like radar uses radio waves. The company became publicly traded through a reverse merger earlier this year and was one of several lidar firms to do so.

Aeva said on Thursday it had signed a “foundational agreement with an undisclosed large company to develop best-in-class lidar” for the customer’s autonomous driving program.

Aeva’s shares were up 13% at $8 in after-hours trading after the results and customer announcement.

In an interview, Aeva Chief Executive Soroush Salehian said the company could not disclose when it would go into mass production with the undisclosed customer but that “there’s going to be increased activity as we work toward production.”

“This has a huge potential,” he said, “and based on what we know, it could be one of the largest programs in the industry.”

Aeva has deals with automotive suppliers Denso Corp and ZF Friedrichshafen AG to mass-manufacture its sensors.

Earlier this week, the company said it added to its advisory board Apple senior executive Steve Zadesky and Volkswagen AG Senior Vice President Alex Hitzinger, who was also once part of Apple’s self-driving car Project Titan. Porsche Automobil Holding SE, Volkswagen’s majority voting shareholder, is also an investor in Aeva.

(Reporting by Stephen Nellis in San Francisco; Editing by David Gregorio and Richard Chang)

Elon Musk on crypto: to the mooooonnn! And back again




By John McCrank

NEW YORK (Reuters) – Bitcoin’s price tumbled after Elon Musk said on Wednesday his electric car maker Tesla Inc would no longer accept the cryptocurrency for purchases, citing environmental concerns for the reversal.

Here are some of Musk’s comments on bitcoin and other cryptocurrencies like Ether and Dogecoin, some of which moved the digital assets’ prices:

Nov. 27, 2017: Musk denies a theory he is Satoshi Nakamoto, the pseudonymous creator of bitcoin, tweeting, “Not true. A friend sent me part of a BTC a few years, but I don’t know where it is.”

Feb. 22, 2018: Musk tweeted “I literally own zero cryptocurrency, apart from .25 BTC that a friend sent me many years ago.”

Oct. 22, 2018: Musk temporarily lost Twitter access after tweeting, “Wanna buy some bitcoin?”

Feb. 19, 2019, Musk called bitcoin’s structure “quite brilliant” on a podcast with ARK Invest’s Cathie Wood. “But, I’m not sure it’s the best use of Tesla’s resources to get involved in crypto. We’re really just trying to accelerate the advent of sustainable energy, and it’s like, quite energy intensive.”

April 2, 2019, Musk tweeted “Dogecoin might be my fav cryptocurrency. It’s pretty cool.”

April 29, 2019, Musk tweeted “Ethereum,” and, “jk.”

May 15, 2019, Musk tweeted, in response to “Harry Potter” author J.K. Rowling, “… massive currency issuance by govt central banks is making Bitcoin Internet money look solid by comparison,” and, “I still only own 0.25 Bitcoins btw.”

July 2, 2020, Musk responded to “Star Trek” actor William Shatner with, “I’m not building anything on ethereum. Not for or against it, just don’t use it or own any.”

July 17, 2020, Musk tweeted a meme implying Dogecoin would become the standard of the global financial system, “It’s inevitable.”

Dec. 20, 2020, Dogecoin soared after Musk tweeted, “One word: Doge.”

Dec. 20, 2020, in a Twitter exchange with MicroStrategy Inc Chief Executive Michael Saylor, Musk asks about converting “large transactions” of Tesla’s balance sheet into bitcoin.

Jan. 29, Bitcoin spikes after Musk adds #bitcoin to his Twitter bio, tweeting, “In retrospect, it was inevitable.”

Feb. 1, Musk says on social media app Clubhouse he supports bitcoin, which was “on the verge of getting broad acceptance” and that he was “a little slow on the uptake.”

Feb. 4, Dogecoin surged more than 60% after Musk tweeted “Dogecoin is the people’s crypto.”

March 2, Musk tweeted, “Scammers & crypto should get a room.”

March 12, Musk tweeted, in reference to his tunneling company, “BTC (Bitcoin) is an anagram of TBC(The Boring Company) What a coincidence!”

March 24, Musk tweeted “You can now buy a Tesla with Bitcoin,” and “… Bitcoin paid to Tesla will be retained as Bitcoin, not converted to fiat currency.”

April 1, Musk tweeted “SpaceX is going to put a literal Dogecoin on the literal moon.”

April 15, Musk tweeted “Doge Barking at the Moon.”

April 28, Musk tweeted “The Dogefather SNL May 8,” ahead of hosting Saturday Night Live.

May 7, Musk tweeted “Cryptocurrency is promising, but please invest with caution!”

May 9, Dogecoin tanked after Musk called the cryptocurrency “a hustle” on SNL. Musk tweeted “SpaceX launching satellite Doge-1 to the moon next year – Mission paid for in Doge – 1st crypto in space – 1st meme in space To the mooooonnn!!”

May 11, Musk tweeted “Do you want Tesla to accept Doge?”

May 12, Musk tweeted Tesla would no longer accept bitcoin as a payment, and “Energy usage trend over past few months is insane.”

(Reporting by John McCrank; Editing by Richard Chang)

Airbnb bookings jump 52% as vaccinations spur vacation rental demand




By Sanjana Shivdas

(Reuters) – Airbnb Inc beat Wall Street expectations for first-quarter gross bookings and revenue on Thursday, as speedy COVID-19 vaccinations and easing restrictions encouraged more people to check into its vacation rentals.

Gross bookings jumped 52% to $10.29 billion in the quarter, easily beating analysts’ estimates of $6.93 billion.

“For guests aged 60 and above in the U.S., who were amongst the first groups to benefit from vaccine rollouts, searches on our platform for summer travel increased by more than 60% between February and March 2021,” Airbnb said.

The San Francisco-based company expects second-quarter revenue to be similar to 2019 levels, adding that the return of urban and cross-border travel is likely to underpin growth over the coming quarters.

Airbnb is also set to benefit from demand for longer stays and a shift to traveling in groups by business travelers, Chief Executive Officer Brian Chesky said on a post-earnings call.

The company has weathered the pandemic better than rivals as people turned to its offering of larger spaces and locations away from major cities in the era of social distancing.

It recorded a surge in bookings in Britain after the government laid down plans in February to exit lockdown, while the easing of travel restrictions in France earlier this month also lifted demand.

Airbnb, however, said it was too early to predict if the recovery momentum would continue at the same pace in the second half of 2021.

Its revenue rose 5.4% to $886.9 million in the first quarter ended March 31, exceeding estimates of $714.4 million, according to Refinitiv IBES data.

Adjusted loss before interest, taxes, depreciation and amortization narrowed to $59 million, from $334 million a year earlier, largely due to cost cuts.

(Reporting by Sanjana Shivdas in Bengaluru; Editing by Aditya Soni)

Disney’s streaming growth slows as pandemic lift fades, shares fall




By Lisa Richwine and Tiyashi Datta

(Reuters) – Disappointing growth of Walt Disney Co’s namesake streaming service on Thursday overshadowed better-than-expected overall profits, driving down shares of the entertainment company.

Shares of Disney fell 3.7% in after-hours trading.

CEO Robert Chapek said that movie and television shows were resuming normal production and new offerings would help bring in new customers to Disney+, ESPN+, Hulu and Hotstar.

Adjusted earnings-per-share for the fiscal second quarter came in at 79 cents for January through April 3, Disney said. Analysts had expected 27 cents, according to IBES data from Refinitiv.

Disney is focusing on quickly building its streaming service to challenge Netflix Inc as audiences move away from cable TV. The company’s popular theme parks remain in recovery mode with attendance limits due to the COVID-19 pandemic.

“(Disney+) growth is significantly decelerating as the initial pandemic boost has waned,” eMarketer analyst Eric Haggstrom said. “Given Disney’s content investments, subscriber growth should return strongly once this short-term turbulence ends.”

Upcoming Disney+ series include “Loki” about the Marvel villain and Star Wars series “The Book of Boba Fett.”

A total of 103.6 million customers subscribed to Disney+ as of early April, the company said, short of the 109.3 million analysts had projected, according to FactSet. Two Marvel superhero series, “WandaVision” and “The Falcon and the Winter Soldier,” debuted during the quarter.

The average monthly revenue per paid subscriber for Disney+ decreased from $5.63 to $3.99, the company said, due to the launch of the lower-priced Disney+ Hotstar in overseas markets. FactSet estimates showed Wall Street was expecting average revenue of $4.10 per user.

Disney plans to launch Disney+ in Malaysia on June 1 and in Thailand on June 30, executives said on a call with analysts.

Overall revenue fell 13% to $15.61 billion in the second quarter ended April 3, a touch below what analysts estimated, according to Refinitiv.

Net income from continuing operations rose to $912 million in the second quarter from $468 million a year earlier.

Operating income at Disney’s media division rose 74% from a year earlier to $2.9 billion as profit rose at domestic and international TV networks. The streaming media unit lost $290 million, less than half of what Wall Street expected, thanks in part to higher advertising revenue at Hulu and ESPN+ income from Ultimate Fighting Championship pay-per-view events.

The theme parks division posted an operating loss of $406 million. The Disneylands in California and Paris were closed for the full quarter. Disneyland in California reopened April 30.

Chief Financial Officer Christine McCarthy said reservations at Disney’s U.S. parks were strong, “demonstrating the strength of our brands as well as growing travel optimism.”

Chapek said Disney will continue to experiment with movie distribution while theaters try to lure audiences back. The company will offer late summer releases “Free Guy” and “Shang-Chi and the Legend of the 10 Rings” exclusively in theaters for 45 days, a shortened period that has been embraced by other studios to allow for home viewing sooner.

Disney renewed a deal with Major League Baseball with 30 exclusive regular season games through 2028. The deal includes an option to simulcast all live MLB coverage for ESPN networks on ESPN+.

(Reporting by Lisa Richwine in Los Angeles; Eva Mathews and Tiyashi Datta in Bengaluru; Editing by Sriraj Kalluvila and Lisa Shumaker)
