How Can You Distinguish Yourself from Hundreds of Other Data Science Candidates?




A few easy (and not-so-easy) ways to prove to employers that your skills and attitudes place you in a higher bracket.


Why bother distinguishing myself?

Because there is a tremendous amount of competition to get a job as a data scientist.

Getting A Data Science Job is Harder Than Ever – How to turn that to your advantage – KDnuggets
Although many aspiring Data Scientists are finding it is becoming more difficult to land a job than it was in previous…

Because there is a mad rush. Every kind of engineer, scientist, and working professional is calling themselves a data scientist.

Why So Many ‘Fake’ Data Scientists?
Have you noticed how many people are suddenly calling themselves data scientists? Your neighbor, that gal you met at a…

Because you are not sure whether you can cut it in this field. Remember that impostor syndrome is alive and well in data science.

How to manage impostor syndrome in data science
What if they find out you’re clueless?

I can go on, but you get the idea…

So, how do you distinguish yourself from the masses? I don’t know whether you can, but I can give you a few pointers for testing yourself. That’s what this article is about.

Ask yourself a few simple questions




Ask yourself a few questions and count the number of YES answers. The more of these you have done, the more you stand apart from the masses.

If you are a beginner

  • Have you published your own Python/R (whatever you code in) package?
  • If yes, have you written extensive documentation for it to be used easily by everyone else?
  • Have you taken your analysis from Jupyter notebook to a fully-published web app? Or, have you investigated tools that help you do it easily?
  • Have you written at least a few high-quality, detailed articles describing your hobby project?
  • Do you try to practice the Feynman method of learning i.e. “teach a concept you want to learn about to a student in the sixth grade”?

At a slightly more advanced phase

If you are not a beginner but consider yourself to be at a somewhat mature stage as a data scientist, do you do these?

  • Do you consciously try to integrate good software engineering practices (e.g. object-oriented programming, modularization, unit testing) into your data science code at every chance you get?
  • Do you make it a point not to stop at the scope of the immediate data analysis, but to imagine what would happen with 100X the data volume or 10X the cost of a wrong prediction? In other words, do you think consciously about data or problem scaling and its impact?
  • Do you make it a point to not stop at the traditional ML metrics, but also think about the cost of data acquisition and ML business value?
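To make the last two questions concrete, here is a minimal sketch of judging the same predictions by business cost instead of accuracy alone. The function names and the per-error costs (a false negative costing 10 times a false positive) are hypothetical values chosen only for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true binary labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def total_cost(y_true, y_pred, cost_fp=1.0, cost_fn=10.0):
    """Sum the cost of wrong predictions for binary labels (0 or 1).

    cost_fp and cost_fn are hypothetical per-error costs (assumptions).
    """
    cost = 0.0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 0:
            cost += cost_fp  # false positive
        elif pred == 0 and truth == 1:
            cost += cost_fn  # false negative
    return cost


y_true = [1, 0, 0, 1, 0, 0, 1, 0]
model_a = [1, 0, 0, 0, 0, 0, 1, 0]  # makes one false negative
model_b = [1, 1, 0, 1, 0, 0, 1, 0]  # makes one false positive

print(accuracy(y_true, model_a), total_cost(y_true, model_a))  # 0.875 10.0
print(accuracy(y_true, model_b), total_cost(y_true, model_b))  # 0.875 1.0
```

Both toy models have identical accuracy, yet under these assumed costs one of them is ten times more expensive to deploy. That gap is what a traditional metric alone would hide.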

Building tools and creating documentation: two important skills to have




Do not spend all of your time and energy analyzing larger datasets or experimenting with the latest deep learning model.

Set aside at least 25% of your time to learn a couple of things that are valued everywhere, in every organization, in all situations:

  • building small but focused utility tools for your daily data analysis. Your creative juices will flow freely in this exercise. You are creating something which may not have thousands of immediate users, but it will be novel and it will be your own creation.
  • reading and creating high-quality documentation related to new tools or frameworks or the utility tool you just built (see above). This will force you to learn how to communicate the utility and mechanics of your creation in a manner intelligible to a wide audience.

As you can see, these habits are fairly easy to develop and practice, i.e. they do not need backbreaking work, a years-long background in statistics, or advanced expertise in deep learning.

But, surprisingly, not everybody embraces them. And, that’s your chance to distinguish yourself.

How to take advantage of those habits in a job interview?




Imagine yourself at a job interview. If you have many YES answers to the questions above, you can say to your interviewer:

  • “Hey, check out the cool Python package I built for generating synthetic time-series data at will.”
  • “I also wrote detailed documentation, which is hosted on a website. It’s built with Sphinx and Jekyll.”
  • “I write data science articles regularly for the largest online platform, Towards Data Science. Based on those, I even got a book publishing offer from a well-known publisher like Packt or Springer.”
  • “Everybody can fit an ML model in a Jupyter notebook. But I can hack out a basic web app demo of that Scikit-learn function where you can send data through a REST API and get back the prediction.”
  • “I can help in the cost-benefit analysis of a new machine learning program and tell you if the benefit outweighs the data collection effort and how to do it optimally.”

Imagine how different you will sound to the interview board from all the other candidates who do well on regular questions of statistics and gradient descent, but do not offer demonstrable proof of all-around capabilities.

Statements like these show that you are inquisitive about data science problems.

They show that you read, you analyze, you communicate. You create and document for others to create.

They show your thinking goes beyond notebooks and classification accuracy to the realm of business value addition and customer empathy. Which company wouldn’t love that kind of candidate?
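A utility like the synthetic time-series generator mentioned in the interview does not have to be large. The following is only a rough sketch of what the core of such a tool might look like, using nothing but the Python standard library. The function name, its parameters, and the default values are all hypothetical.

```python
import math
import random


def synthetic_series(n, trend=0.05, period=12, amplitude=1.0, noise_sd=0.2, seed=None):
    """Return n points of linear trend + sine seasonality + Gaussian noise.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)  # local generator, so a given seed is reproducible
    series = []
    for t in range(n):
        seasonal = amplitude * math.sin(2 * math.pi * t / period)
        noise = rng.gauss(0.0, noise_sd)
        series.append(trend * t + seasonal + noise)
    return series


data = synthetic_series(24, seed=42)
print(len(data))  # 24
```

Passing a seed makes the output repeatable, which matters as soon as you want to write tests or documentation examples for the tool.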


Where can I get help?

There are so many great tools and resources to help you practice that it is impossible to list even a good fraction of them in one short article. I am showing just a few representative examples. The key idea is to explore along these lines and discover useful aids for yourself.

Build installable software packages using only Jupyter notebooks

nbdev: use Jupyter Notebooks for everything

How to make an awesome Python package — step by step

How to make an awesome Python package in 2021

Learn how to integrate unit testing principles in your own ML models and modules development

PyTest for Machine Learning — a simple example-based tutorial
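As a rough illustration of what such tests can look like, the sketch below checks a small preprocessing helper. The helper min_max_scale is hypothetical and is not taken from the tutorial above. The test functions follow pytest's conventions (a test_ prefix and plain assert statements), so pytest would collect them from a file such as test_scaling.py. They are also called directly at the bottom so the block runs on its own.

```python
def min_max_scale(values):
    """Scale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; return zeros
        # instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_output_range():
    scaled = min_max_scale([3, 7, 11])
    assert min(scaled) == 0.0
    assert max(scaled) == 1.0


def test_constant_input_does_not_crash():
    assert min_max_scale([5, 5, 5]) == [0.0, 0.0, 0.0]


def test_order_is_preserved():
    assert min_max_scale([10, 0, 5]) == [1.0, 0.0, 0.5]


if __name__ == "__main__":
    test_output_range()
    test_constant_input_does_not_crash()
    test_order_is_preserved()
    print("all tests passed")
```

The constant-input test is the kind of edge case that rarely shows up in a notebook but fails quickly in production data.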

Learn how to integrate object-oriented programming principles in a data science task

Object-oriented programming for data scientists: Build your ML estimator
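To give a flavor of the idea, here is a toy estimator written as a class. It is only a sketch: MeanRegressor is a hypothetical name, and the class simply mimics the familiar fit/predict pattern rather than reproducing anything from the article above.

```python
class MeanRegressor:
    """Toy estimator that always predicts the mean of the training targets."""

    def __init__(self):
        # Learned attribute; the trailing underscore mirrors the
        # scikit-learn naming convention for fitted values.
        self.mean_ = None

    def fit(self, X, y):
        if len(X) != len(y):
            raise ValueError("X and y must have the same length")
        self.mean_ = sum(y) / len(y)
        return self  # returning self allows fit(...).predict(...) chaining

    def predict(self, X):
        if self.mean_ is None:
            raise RuntimeError("call fit() before predict()")
        return [self.mean_ for _ in X]


model = MeanRegressor().fit([[1], [2], [3]], [10.0, 20.0, 30.0])
print(model.predict([[4], [5]]))  # [20.0, 20.0]
```

Keeping state (mean_) and behavior (fit, predict) in one object makes the model easy to swap for a better one later without touching the calling code.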

Build interactive web apps using simple Python scripts — no HTML/CSS knowledge required

PyWebIO: Write Interactive Web App in Script Way Using Python

Write whole programming and technology books right from your Jupyter notebook. Use this for documentation building, too.

Books with Jupyter

Understand the multi-faceted complexity of a real-life analytics problem and how it is much more than just modeling and prediction

Why a Business Analytics Problem Demands all of your data science skills


A couple of things about MOOCs/online courses

Image source: Author’s own creation


Don’t skip steps while learning. Follow them in order.

Image source: Author’s own creation

Read broad topics and books at every chance

Don’t just focus on reading the latest deep learning trick or blog post about the latest Python library. At every chance, read broad topics from the industry’s top forums and good books. Some of the books and forums that I enjoy are as follows,

Image source: Author’s own creation


Data science and associated skills of machine learning and artificial intelligence are in extremely high demand right now in the job market as more and more businesses adopt and embrace these transformative technologies. There is a lot of competition and miscommunication between the demand and supply sides of talent.

A burning question is: how to distinguish oneself from a hundred co-applicants?

We listed some key questions that you can ask yourself and estimate your uniqueness in some of the skills and habits that make you stand apart from the others. We showed some imaginary conversation snippets that you can have with an interview board showcasing these skills and habits. We also gave a shortlist of resources to help you get started on these.

We listed a couple of approaches for taking MOOCs and suggested reading resources.

Wishing you the best in your data science journey…

You can check the author’s GitHub repositories for code, ideas, and resources in machine learning and data science. If you are, like me, passionate about AI/machine learning/data science, please feel free to add me on LinkedIn or follow me on Twitter.

Original. Reposted with permission.



