OCR in 2020 – From Character Recognition to Information Extraction

Introduction
Simply defined, OCR is a set of computer vision tasks that convert scanned documents and images into machine-readable text. It takes images of documents, invoices and receipts, finds the text in them and converts it into a format that machines can process more easily. Whether you want to read information off ID cards or numbers on a bank cheque, OCR is what will drive your software.
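To make this concrete, here is a minimal sketch of an OCR call using the open-source Tesseract engine through its pytesseract wrapper. It assumes pytesseract, Pillow and a local Tesseract install are available; "cheque.png" is a hypothetical input file:

# Minimal OCR sketch: image in, machine-readable text out.
# Assumes pytesseract + Pillow are installed and Tesseract is available locally.
from PIL import Image
import pytesseract

image = Image.open("cheque.png")  # hypothetical scanned cheque
text = pytesseract.image_to_string(image)
print(text)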
You might need to read the different characters from a cheque and extract the account number, amount, currency, date, etc. But how do you know which character corresponds to which field? What if you want to extract a meter reading: how do you know which digits are the meter reading and which are the numbers printed to identify the meter?

Learning how to extract text from images or how to apply deep learning for OCR is a long process and a topic for another blog post. The focus of this one is understanding where OCR technology stands: what OCR products offer, what is lacking and what can be done better.
Want to digitize invoices, PDFs or number plates? Head over to Nanonets and start building OCR models for free!
The OCR landscape
OCR is perceived by many to be a solved problem, but in reality, the products available to us as open-source tools or provided by technological giants are far from perfect: too rigid, often inaccurate, and prone to failure in the real world.

The APIs provided by many vendors solve only a narrow set of use cases and are averse to customization. More often than not, a business planning to use OCR technology needs an in-house team to build on the OCR API available to them to actually apply it to their use case. The OCR technology available in the market today is mostly a partial solution to the problem.
Where the current OCR APIs fail
Product shortcomings

Don’t allow working with custom data – One of the biggest roadblocks to adopting OCR is that every use case has its nuances and requires our algorithms to deal with different kinds of data. To get good results for our use case, it is important that a model can be trained on the data we’ll be dealing with the most. This is not possible with the OCR APIs available to us. Consider any task involving OCR in the wild, reading traffic signs or shipping container numbers, for example. Current OCR APIs do not allow for vertical reading, which makes the detection task in such images a lot harder. These use cases need bounding boxes tailored to the characters in the kinds of images you will be dealing with most.

Require a considerable amount of post-processing – All the OCR APIs currently extract text from given images; it is up to us to build on top of the output so it can be useful for our organization. Getting raw text out of an invoice does no good on its own. To use it, you’ll have to build a layer of OCR software on top that extracts dates, company names, amounts, product details, etc. The path to such an end-to-end product can be filled with roadblocks due to inconsistencies in the input images and the lack of organization in the extracted text. To get meaningful results, the text extracted by the OCR models needs to be intelligently structured and loaded in a usable format. This could mean that you need an in-house team of developers who use existing OCR APIs to build software that your organization can use.
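As a rough illustration of what that home-grown layer tends to look like, here is a hedged sketch that pulls a date and an amount out of raw OCR text with regular expressions. The sample text, patterns and field names are assumptions for the sketch, not part of any OCR API:

# Illustrative post-processing on raw OCR output. The text, patterns
# and field names below are made up for this sketch.
import re

raw_text = "INVOICE #4521  Date: 12/03/2020  Total: $1,249.50  Acme Corp"

matches = {
    "date": re.search(r"\b\d{2}/\d{2}/\d{4}\b", raw_text),
    "amount": re.search(r"\$[\d,]+\.\d{2}", raw_text),
}
structured = {k: m.group(0) if m else None for k, m in matches.items()}
print(structured)  # {'date': '12/03/2020', 'amount': '$1,249.50'}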

Work well only in specific constraints – Current OCR methods perform well on scanned documents with digital text. On the other hand, handwritten documents, images containing multiple languages at once, low-resolution images, images with new fonts and varying font sizes, or images with shadowy text can cause your OCR model to make a lot of errors and leave you with poor accuracy. Rigid models that are averse to customization limit the range of applications in which the technology can perform with at least reasonable effectiveness.
Technological barriers

Tilted text in images – While current research suggests that object detection models should be able to handle rotated images when trained on augmented data, it is surprising to find that none of the OCR tools available in the market actually adopt object detection in their pipeline. This has several drawbacks, one of which is that your OCR model won’t pick up characters and words that are tilted. Take, for example, reading number plates. A camera attached to a street light will capture a moving car at a different angle, depending on the distance and direction of the car. In such cases, the text will appear tilted. Better accuracy here might mean stronger traffic law enforcement and a decrease in the rate of accidents.
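To sketch what that augmentation looks like in practice, here is a rough rotation-augmentation snippet with OpenCV. The angles and the input file are arbitrary, for illustration only:

# Rough sketch of rotation augmentation for tilted-text robustness.
# Assumes opencv-python is installed; angles and file name are illustrative.
import cv2

def rotate(image, angle):
    h, w = image.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(image, M, (w, h))

plate = cv2.imread("number_plate.jpg")  # hypothetical input
augmented = [rotate(plate, a) for a in (-30, -15, 15, 30)]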
OCR in natural scenes – OCR has historically evolved to deal with documents, and though much of our documentation and paperwork happens on computers these days, there are still several use cases that require us to process images taken in a variety of settings. One such example is reading shipping container numbers. Classical approaches tend to find the first character and move along a horizontal line looking for the characters that follow. This approach is useless when trying to run OCR on images in the wild. Such images can be blurry and noisy, the text in them can appear at a variety of locations, the font might be something your OCR model hasn’t seen before, the text can be tilted, and so on.
Handwritten text, cursive fonts, font sizes – The OCR annotation process requires you to identify each character as a separate bounding box and models trained to work on such data get thrown off when they are faced with handwritten text or cursive fonts. This is because a gap between any two characters makes it easy to separate one from another. These gaps don’t exist for cursive fonts. Without these gaps, the OCR model thinks that all the characters that are connected are actually one single pattern that doesn’t fit into any of the character descriptions in its vocabulary. These issues can be addressed by powering your OCR engine with deep learning.
Text in languages other than English – OCR models provided by Google and Microsoft work well on English but do not perform as well on other languages. This is mostly due to the lack of enough training data and the varying syntactic rules of different languages. Any platform or company that intends to use OCR on data in its native language will have to struggle with bad models and inaccurate results. You might also want to analyze documents that contain multiple languages at once, like government forms. Working with such cases is not possible with the available OCR APIs.
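For what it’s worth, the open-source Tesseract engine does let you combine language packs, which at least helps with mixed-language documents even if accuracy varies. A minimal sketch, assuming the English ("eng") and Hindi ("hin") traineddata packs are installed; the file name is hypothetical:

# Minimal multi-language sketch with Tesseract via pytesseract.
# Assumes the "eng" and "hin" traineddata packs are installed locally.
from PIL import Image
import pytesseract

form = Image.open("government_form.png")  # hypothetical bilingual form
text = pytesseract.image_to_string(form, lang="eng+hin")
print(text)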
Noisy/blurry images – Noisy images can very often throw off your classifier and generate wrong results. A blurry image can confuse your OCR model between ‘8’ and ‘B’ or ‘A’ and ‘4’. De-noising images is an active area of research in deep learning and computer vision. Making models that are robust to noise can go a long way toward a generalized approach to character recognition and image classification, and understanding de-noising and applying it to character recognition tasks can improve accuracy to a great extent.
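As an illustration, a common first line of defense is classical denoising before recognition. Here is a hedged OpenCV sketch; the parameter values are common defaults, not tuned for any particular dataset:

# Hedged sketch: classical denoising before OCR.
# Parameter values are common defaults, not tuned; file names are made up.
import cv2

noisy = cv2.imread("receipt_scan.jpg", cv2.IMREAD_GRAYSCALE)
denoised = cv2.fastNlMeansDenoising(noisy, h=10,
                                    templateWindowSize=7,
                                    searchWindowSize=21)
cv2.imwrite("receipt_denoised.jpg", denoised)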
Should I even consider using OCR then?
The short answer is yes.
Anywhere there is a lot of paperwork or manual effort involved, OCR technology can enable image- and text-based process automation. Being able to digitize information accurately can make business processes smoother, easier and a lot more reliable while reducing the manpower required to execute them. For big organizations that have to deal with a lot of forms, invoices, receipts, etc., being able to digitize all that information, store and structure the data, and make it searchable and editable is a step closer to a paper-free world.
Think about the following use cases.
Number plates – number plate detection can be used to enforce traffic rules, track cars in your taxi service parking, enhance security in public spaces, corporate buildings, malls, etc.

Legal documents – dealing with different kinds of documents (affidavits, judgments, filings, etc.): digitizing them, storing them in databases and making them searchable.
Table extraction – Automatically detect tables in a document, get text in each cell, column headings for research, data entry, data collection, etc.
Banking – analyzing cheques, reading and updating passbooks, ensuring KYC compliance, analyzing applications for loans, accounts and other services.

Menu digitization – extracting information from the menus of different restaurants and putting it into a homogeneous template for food delivery apps like Swiggy, Zomato, Uber Eats, etc.
Healthcare – digitizing patients’ medical records, history of illnesses, diagnoses, medication, etc. and making them searchable for the convenience of doctors.
Invoices – automating the reading of bills, invoices and receipts, extracting products, prices, date-time data and company/service names for the retail and logistics industries.

Automating business processes has proven to be a boon for organizations. It has made them more transparent, eased communication and coordination between teams, increased business throughput, improved employee retention, productivity and morale, and enhanced customer service and delivery. Automation speeds up business processes while simultaneously cutting costs, and it makes those processes less chaotic and more reliable. Moving toward digitization is a must to stay competitive in today’s world.
What do these OCR APIs need?
OCR has a lot of potential, but most products available today do not make it easy for businesses to adopt the technology. What OCR does is convert images with text or scanned documents into machine-readable text. What to do with that text is left up to the people using these OCR technologies, which might seem like a good thing at first: it allows people to customize the text they are working with however they want, provided they are ready to spend the resources required to make it happen. But beyond a few use cases like scanned document reading and analyzing invoices and receipts, these technologies fail to make their case for widespread adoption.
A good OCR product would improve on the following fronts.
How it deals with the images coming in
- Does it minimize the pre-processing required? (See the sketch after this list.)
- Can the annotation process be made easier?
- How many formats does it accept our images in?
- Do we lose information while pre-processing?
How it performs in real-world problems
- How is the accuracy?
- Does it perform well in any language?
- What about difficult cases like tilted text, OCR in the wild, handwritten text?
- Is it possible to constantly improve your models?
- How does it fare against other OCR tools and APIs?
How it uses the machine-readable text
- Does it allow us to give it a structure?
- Does it make iterating over the structure easier?
- Can I choose the information I want to keep and discard the rest?
- Does it make storage, editing and search easier?
- Does it make data analysis easier?
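On the pre-processing questions in particular: this is the kind of pipeline users are otherwise left to hand-roll. A rough OpenCV sketch follows; the values are illustrative and the deskew correction is just a heuristic (minAreaRect’s angle convention has changed across OpenCV versions):

# Rough sketch of the manual pre-processing a good OCR product should
# minimize: grayscale -> binarize -> deskew. Values and file name are made up.
import cv2
import numpy as np

def preprocess(path):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Otsu's method picks the ink/background threshold automatically.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Deskew heuristic: minimum-area rectangle around the dark (text) pixels.
    # NOTE: the sign/range of the reported angle varies by OpenCV version.
    coords = np.column_stack(np.where(binary == 0)).astype(np.float32)
    angle = cv2.minAreaRect(coords)[-1]
    if angle > 45:
        angle -= 90
    h, w = binary.shape
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(binary, M, (w, h), borderMode=cv2.BORDER_REPLICATE)

clean = preprocess("scanned_invoice.jpg")  # hypothetical input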
Nanonets and OCR
We at Nanonets have worked to build just the kind of product that solves these problems. We have been able to productize a pipeline for OCR by treating it not just as character recognition but as an object detection and classification task.

But the benefits of using Nanonets over other OCR APIs go beyond just better accuracy. Here are a few reasons you should consider using the Nanonets OCR API.
Automated intelligent structured field extraction – Assume you want to analyze receipts in your organization to automate reimbursements. You have an app where people can upload their receipts. This app needs to read the text in those receipts, extract the data and put it into a table with columns like transaction ID, date, time, service availed, price paid, etc. This information is updated constantly in a database that calculates the total reimbursement for each employee at the end of each month. Nanonets makes it easy to extract the text, structure the relevant data into the fields required and discard the irrelevant data extracted from the image.
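For a feel of the final step, here is a hedged sketch that turns a list of extracted fields into a row of such a reimbursement table. The prediction structure below is an assumption made for illustration, not the exact shape of any API response:

# Hedged sketch: turning extracted fields into a table row.
# The prediction structure is assumed for illustration only.
import csv

prediction = [
    {"label": "transaction_id", "ocr_text": "TXN-88412"},
    {"label": "date", "ocr_text": "12/03/2020"},
    {"label": "price_paid", "ocr_text": "$42.10"},
]

columns = ["transaction_id", "date", "price_paid"]
row = {p["label"]: p["ocr_text"] for p in prediction if p["label"] in columns}

with open("reimbursements.csv", "a", newline="") as f:
    csv.DictWriter(f, fieldnames=columns).writerow(row)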
Works well with several languages – If you are a company that deals with data that isn’t in English, you probably already feel like you have wasted your time looking for OCR APIs that would actually deliver what they promise. We can provide an automated end-to-end pipeline specific to your use case by allowing custom training and by varying the vocabulary of our models to suit your needs.
Performs well on text in the wild – Reading street signs to help with navigation in remote areas, reading shipping container numbers to keep track of your materials, and reading number plates for traffic safety are just some of the use cases that involve images in the wild. Nanonets utilizes object detection methods to improve both locating text in an image and classifying it, even in images with varying contrast levels, font sizes and angles.
Train on your own data to make it work for your use case – Get rid of the rigidity your previous OCR services forced onto your workflow. Instead of worrying about what is possible with the technology, with Nanonets you can focus on making the best of it for your business. Being able to use your own data for training broadens the scope of applications, like working with multiple languages at once, and improves your model’s performance because the test data is a lot more similar to the training data.
Continuous learning – Imagine you are expanding your transportation service to a new state. You are faced with the risk of your model becoming obsolete because of the new language your truck number plates are in. Or maybe you have a video platform that needs to moderate explicit text in videos and images. With new content, you are faced with more edge cases where the model’s predictions are not very confident or, in some cases, wrong. To overcome such roadblocks, the Nanonets OCR API allows you to re-train your models with new data with ease, so you can automate your operations anywhere faster.
No in-house team of developers required – No need to worry about hiring developers and acquiring talent to personalize the technology for your business requirements. Nanonets will take care of your requirements, from the business logic to a deployed end-to-end product that can be integrated easily into your business workflow, without your having to worry about infrastructure.
OCR with Nanonets
The Nanonets OCR API allows you to build OCR models with ease. You can upload your data, annotate it, set the model to train and get predictions, all through a browser-based UI.
1. Using a GUI: https://app.nanonets.com/
You can also use the Nanonets OCR API by following the steps below:
2. Using NanoNets API: https://github.com/NanoNets/nanonets-ocr-sample-python
Below, we will give you a step-by-step guide to training your own model using the Nanonets API, in 9 simple steps.
Step 1: Clone the Repo
git clone https://github.com/NanoNets/nanonets-ocr-sample-python
cd nanonets-ocr-sample-python
sudo pip install requests
sudo pip install tqdm
Step 2: Get your free API Key
Get your free API Key from https://app.nanonets.com/#/keys
Step 3: Set the API key as an Environment Variable
export NANONETS_API_KEY=YOUR_API_KEY_GOES_HERE
Step 4: Create a New Model
python ./code/create-model.py
Note: This generates a MODEL_ID that you need for the next step
Step 5: Add Model Id as Environment Variable
export NANONETS_MODEL_ID=YOUR_MODEL_ID
Step 6: Upload the Training Data
Collect the images of the objects you want to detect. Once your dataset is ready in the images folder, start uploading it.
python ./code/upload-training.py
Step 7: Train Model
Once the Images have been uploaded, begin training the Model
python ./code/train-model.py
Step 8: Get Model State
The model takes ~30 minutes to train. You will get an email once the model is trained. In the meantime, you can check the state of the model:
watch -n 100 python ./code/model-state.py
Step 9: Make Prediction
Once the model is trained, you can make predictions using it:
python ./code/prediction.py PATH_TO_YOUR_IMAGE.jpg
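Under the hood, the prediction step boils down to a single authenticated POST of your image to the model’s endpoint. Here is a condensed sketch along the lines of the sample repo; treat the endpoint path and response handling as assumptions that may change:

# Condensed sketch of the prediction call; endpoint path and response
# handling are assumptions based on the sample repo and may change.
import os
import sys
import requests

model_id = os.environ["NANONETS_MODEL_ID"]
api_key = os.environ["NANONETS_API_KEY"]
url = "https://app.nanonets.com/api/v2/OCR/Model/%s/LabelFile/" % model_id

with open(sys.argv[1], "rb") as image_file:
    response = requests.post(url, auth=(api_key, ""),
                             files={"file": image_file})
print(response.text)  # JSON with predicted fields and bounding boxes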
Conclusion
While OCR is a widely studied problem, the field had largely stagnated until deep learning approaches came to the fore to drive research forward. And while many OCR products available today have progressed to applying deep learning to OCR, there is a dearth of products that actually make the OCR process easier for a user, a business or any other organization.
Lazy to code, don’t want to spend on GPUs? Head over to Nanonets and build computer vision models for free!
Further Reading
Update:
Added more reading material about the importance of Character Recognition in Information Extraction.
Source: https://nanonets.com/blog/ocr-apis-to-extract-text-from-images/


Ethereum Rises on NFT Boom
Ethereum has been rising for the past two days, increasing by $150 (10%) yesterday to $1,697; it is currently trading at $1,640.
The ratio reversal coincides with the approval of EIP-1559, a cryptoeconomics change that burns fees and will increase capacity, both by doubling it and by removing miners’ incentives to play with capacity by spamming the network.
Momentum was then added by the tokenized Jack Dorsey tweet, which can be taken as the debut of NFTs because even the BBC covered it.
This space has now seemingly grown considerably, from pretty much non-existent to about $10 million in trading volumes.
The most expensive NFT, excluding Jack Dorsey’s tweet, is the Formula 1 Grand Prix de Monaco 2020 1A that went for 1,079 eth, worth about $2 million.
That’s the actual race course used in F1® Delta Time, a blockchain game running on Ethereum.
That’s one of quite a few games now running on eth. If you browse by recently sold you can see footballers’ cards, Decentraland plots of land, Gods Unchained cards, and then there’s an actual planet.
Bitcoin will not be measured against the dollar anymore. “You’d measure it against whatever you’re buying with it, such as planets or solar systems,” said Jesse Powell, Kraken’s founder.
Well, for now these planets are being bought with eth on this 0xUniverse that describes itself as “a revolutionary blockchain-based game. Become an explorer in a galaxy of unique, collectable planets. Colonize them to extract resources and build spacecrafts. Keep expanding your fleet until you’ve uncovered the final mysteries of the universe!”
One mystery they might discover is why a bull gif is worth 112 eth. Apparently it was made prior to the election and changed based on who won, but $200,000 for that?
Then there’s actual art, like the featured image we’ve chosen above called American censorship.
We’re no art critics, but it’s an image that made us stop for a bit and look at it. Now you can look at it too, and since this is a news article and the image is part of the news, they can’t file copyright complaints as far as we are aware.
So why is someone willing to pay 0.2 eth, or $330, for this? Well, because he or she likes it, of course, and because it is an original piece of work; the buyer owns it and, if he wanted to, could enforce copyright where business use is concerned, like selling a print or card version of it.
He could sell licensing rights, and if the artist of this work becomes famous, he has proof that it is by this artist and, more importantly, proof that it is the original by the artist him- or herself.
We could have modified that featured image, and obviously our saying we haven’t carries only so much credibility, but having proof of the original itself perhaps has as much value as having the original Mona Lisa rather than a copy.
In any event, we don’t profess to know the mysteries of art speculation, but what seems to be happening is an experiment in art, and many artists must be looking to see just what is going on.
Some of those ‘normie’ artists may well TikTok about it, and they may even go viral in this day and age where any of us can sometimes be the show.
So that frothiness should translate to eth, not least because all of this is being priced in eth. The sons and daughters of traditional art collectors, therefore, may well turn some of their daddy’s fiat into eth to speculate on what may well become a new world for art.
Whether it will, remains to be seen, but there’s something new and clearly people seem to be excited about it.
Source: https://www.trustnodes.com/2021/03/07/ethereum-rises-on-nft-boom
Generative Adversarial Transformers: Using GANsformers to Generate Scenes

Published March 7, 2021

Louis Bouchard (@whatsai): I explain Artificial Intelligence terms and news to non-experts.
They basically leverage transformers’ attention mechanism in the powerful StyleGAN2 architecture to make it even more powerful!
Watch the Video:
Chapters:
0:00 Hey! Tap the Thumbs Up button and Subscribe. You’ll learn a lot of cool stuff, I promise.
0:24 Text-To-Image translation
0:51 Examples
5:50 Conclusion
References
Paper: https://arxiv.org/pdf/2103.01209.pdf
Code: https://github.com/dorarad/gansformer
Complete reference:
Drew A. Hudson and C. Lawrence Zitnick, Generative Adversarial Transformers, (2021), published on arXiv. Abstract:
“We introduce the GANsformer, a novel and efficient type of transformer, and explore it for the task of visual generative modeling. The network employs a bipartite structure that enables long-range interactions across the image, while maintaining computation of linear efficiency, that can readily scale to high-resolution synthesis. It iteratively propagates information from a set of latent variables to the evolving visual features and vice versa, to support the refinement of each in light of the other and encourage the emergence of compositional representations of objects and scenes. In contrast to the classic transformer architecture, it utilizes multiplicative integration that allows flexible region-based modulation, and can thus be seen as a generalization of the successful StyleGAN network. We demonstrate the model’s strength and robustness through a careful evaluation over a range of datasets, from simulated multi-object environments to rich real-world indoor and outdoor scenes, showing it achieves state-of-the-art results in terms of image quality and diversity, while enjoying fast learning and better data efficiency. Further qualitative and quantitative experiments offer us an insight into the model’s inner workings, revealing improved interpretability and stronger disentanglement, and illustrating the benefits and efficacy of our approach. An implementation of the model is available at https://github.com/dorarad/gansformer.”
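To ground the core idea, here is a minimal sketch of bipartite cross-attention between a small set of latent variables and a flattened feature map. This is our own illustration, not the authors’ code; the actual implementation lives at the GitHub link above:

# Minimal sketch of bipartite cross-attention between latents and image
# features. Illustration only; see the authors' repo for the real model.
import torch
import torch.nn as nn

class BipartiteAttention(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, features, latents):
        # features: (B, H*W, dim) flattened conv feature map
        # latents:  (B, k, dim) small set of latent vectors
        updated, _ = self.attn(query=features, key=latents, value=latents)
        return features + updated  # residual update keeps the conv pathway

feats = torch.randn(2, 256, 64)   # toy 16x16 feature map, 64 channels
lats = torch.randn(2, 8, 64)      # 8 latents
out = BipartiteAttention(64)(feats, lats)  # -> (2, 256, 64)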
Video Transcript
Note: This transcript is auto-generated by Youtube and may not be entirely accurate.
They basically leveraged transformers’ attention mechanism in the powerful StyleGAN2 architecture to make it even more powerful.

[Music]

This is What’s AI, and I share artificial intelligence news every week. If you are new to the channel and would like to stay up to date, please consider subscribing to not miss any further news.

Last week we looked at DALL·E, OpenAI’s most recent paper. It uses a similar architecture to GPT-3, involving transformers, to generate an image from text. This is a super interesting and complex task called text-to-image translation. As you can see again here, the results were surprisingly good compared to previous state-of-the-art techniques. This is mainly due to the use of transformers and a large amount of data. This week we will look at a very similar task called visual generative modeling, where the goal is to generate a complete scene in high resolution, such as a road or a room, rather than a single face or a specific object. This is different from DALL·E since we are not generating the scene from text but from a model trained on a specific style of scenes, which is a bedroom in this case. Rather, it is just like StyleGAN, which is able to generate unique, non-existing human faces after being trained on a dataset of real faces.

The difference is that StyleGAN uses the GAN architecture in a traditional generative and discriminative way, with convolutional neural networks. A classic GAN architecture will have a generator, trained to generate the image, and a discriminator, used to measure the quality of the generated images by guessing whether it’s a real image coming from the dataset or a fake image generated by the generator. Both networks are typically composed of convolutional neural networks. The generator looks like this: it mainly downsamples the image using convolutions to encode it, and then upsamples the image again using convolutions to generate a new version of the image with the same style based on the encoding, which is why it is called StyleGAN. Then the discriminator takes the generated image, or an image from your dataset, and tries to figure out whether it is real or generated, called fake.

Instead, they leverage transformers’ attention mechanism inside the powerful StyleGAN2 architecture to make it even more powerful. Attention is an essential feature of this network, allowing it to draw global dependencies between input and output; in this case, between the input at the current step of the architecture and the latent code previously encoded, as we will see in a minute. Before diving into it, if you are not familiar with transformers or attention, I suggest you watch the video I made about transformers for more details and a better understanding of attention. You should definitely have a look at the video “Attention Is All You Need” from a fellow YouTuber and inspiration of mine, Yannic Kilcher, covering this amazing paper.

Alright, so we know that they use transformers and GANs together to generate better and more realistic scenes, explaining the name of the paper: GANsformers. But why and how did they do that exactly? As for the why, they did it to generate complex and realistic scenes, like this one, automatically. This could be a powerful application for many industries like movies or video games, requiring a lot less time and effort than having an artist create them on a computer, or even build them in real life to take a picture. Also, imagine how useful it could be for designers when coupled with text-to-image translation, generating many different scenes from a single text input and the press of a random button.

They use a state-of-the-art StyleGAN architecture because GANs are powerful generators when we talk about the overall image. Because GANs work using convolutional neural networks, they by nature use local information from the pixels, merging it to end up with general information about the image, missing out on the long-range interactions of faraway pixels. This causes GANs to be powerful generators for the overall style of the image, yet a lot less powerful regarding the quality of the small details in the generated image, and, for the same reason, unable to control the style of localized regions within the generated image itself. This is why they had the idea to combine transformers and GANs in one architecture they called a bipartite transformer.

As GPT-3 and many other papers have already proved, transformers are powerful for long-range interactions, drawing dependencies between them and understanding the context of text or images. We can see that they simply added attention layers, which are the base of the transformer network, in between the convolutional layers of both the generator and the discriminator. Thus, rather than focusing on using global information and controlling all features globally, as convolutions do by nature, they use this attention to propagate information from the local pixels to the global high-level representation and vice versa. Like other transformers applied to images, this attention layer takes the pixels’ positions and the StyleGAN2 latent spaces W and Z. The latent space W is an encoding of the input into an intermediate latent space, done at the beginning of the network and denoted here as A, while the encoding Z is just the resulting features of the input at the current step of the network. This makes the generation much more expressive over the whole image, especially in generating images depicting multi-object scenes, which is the goal of this paper.

Of course, this was just an overview of this new paper by Facebook AI Research and Stanford University. I strongly recommend reading the paper for a better understanding of the approach; it’s the first link in the description below. The code is also available and linked in the description as well. If you went this far in the video, please consider leaving a like and commenting your thoughts; I will definitely read them and answer you. And since there are still over 80 percent of you guys who are not subscribed yet, please consider clicking the free subscribe button to not miss any further news, clearly explained. Thank you for watching.

[Music]
Source: https://hackernoon.com/generative-adversarial-transformers-using-gansformers-to-generate-scenes-013d33h4?source=rss
- FDA greenlights Memic’s breakthrough robotic surgery
- Elon Musk’s genius is battery powered
- Webinar, Mar. 10: Making families affordable
- D-ID drives MyHeritage deep nostalgia animation
- Rewire banks $20M, led by OurCrowd
- Consumer Physics’ SCiO lets you wake up and taste the coffee
- Medisafe raises $30M for AI that helps people take medication
- Arcadia guides companies on homeworking energy costs
- Volvo unit teams with Israel’s Foretellix on autonomous safety
- Tmura: Startup options help those in need
- Introductions
- 1,000 high-tech jobs
FDA greenlights Memic’s breakthrough robotic surgery
The FDA has granted De Novo Marketing Authorization for the breakthrough Hominis robotic surgery system developed by Memic – a notable first. The patented technology allows the surgeon to control tiny, human-like robotic arms that enable procedures currently considered unfeasible and reduce scarring. “This FDA authorization represents a significant advance in the world of robot-assisted surgery and fulfills an unmet need in the world of robotic gynecological surgery,” said Professor Jan Baekelandt, MD, PhD, a gynecologist at Imelda Hospital in Bonheiden, Belgium, who performed the first hysterectomy using the Hominis system. “Research shows vaginal hysterectomy provides optimal clinical benefits to patients including reduced pain, recovery time and rates of infection. Hominis is the only robot specifically developed for transvaginal surgery and is therefore small and flexible enough to perform surgery through a small incision.” Memic’s strong management team is led by Chairman Maurice R. Ferré MD, who founded Mako Surgical, a surgical robotics company that was sold for $1.65B.
Elon Musk’s genius is battery powered
In 2004, PayPal co-founder Elon Musk took what appeared to be a huge and perhaps reckless gamble. Departing from the seamless technology of the booming online payments business, Musk sank $30 million of his newly-minted internet fortune into a year-old startup that dreamed of transforming a century-old global industry mired in heavy industrial plants, expensive freight and complex global supply chains. Musk’s startup, of course, was Tesla, and the dream was an all-electric car. Today, the electric car is a mass-market reality and Musk’s gamble has sent him powering past Gates and Bezos straight to the Forbes top slot – at least for a while. Musk’s winning bet was not on automobiles, but on energy. The heart of Tesla is not mobility or cars, it is power and battery storage. In 2004, Musk was way ahead of the curve in foreseeing the transformation of energy from fossil fuels to renewables. Today, the signs are everywhere. Read more in my regular ‘Investors on the Frontlines’ column here.
Webinar, Mar. 10: Making families affordable
A key pandemic-driven trend is the acceleration of the fertility tech sector, including egg freezing and IVF, with a market predicted to be worth $37+ billion in less than 10 years. Hear from:
- Claire Tomkins, PhD, founder and CEO of Future Family, startup veteran who was formerly an Investment Bank advisor and director of Richard Branson’s Carbon War Room accelerator, who will speak on the category potential and her startup’s disruptive solution
- Angeline N. Beltsos, Chief Medical Officer and CEO, Vios Fertility Institute, on the challenges and progress represented by fertility issues
- Ashley Gillen Binder, Future Family client, on her life-changing experience with the company
Future Family is the first company to bring together financing, technology, and concierge care in an easy-to-use online platform. With established revenues and growth, Future Family is a leading example of next-gen FinTech, which will provide customized solutions for specific verticals. Moderated by Richard Norman, Managing Director, OurCrowd.
Wednesday, March 10th at 9:00AM San Francisco | 12:00PM New York | 7:00PM Israel
Register Here.
D-ID drives MyHeritage deep nostalgia animation
Jimmy Kimmel didn’t get much work done on Wednesday. He said he was too busy using technology developed by OurCrowd portfolio company D-ID to bring a photograph of his great grandfather to life. Genealogy platform MyHeritage released a feature that animates faces in still photos using video reenactment technology designed by D-ID. Called Deep Nostalgia, it produces a realistic depiction of how a person could have moved and looked if they were captured on video, using pre-recorded driver videos that direct the movements in the animation and consist of sequences of real human gestures. Users can upload a picture regardless of the number of faces in it. The photos are enhanced prior to animation using the MyHeritage Photo Enhancer, USA Today reports. MyHeritage tells the BBC that some people might find the feature “creepy” while others would consider it “magical”. It’s been a good few days for MyHeritage, which was acquired by Francisco Partners last week for a reported $600M.
Rewire banks $20M, led by OurCrowd
OurCrowd led a successful Series B $20M round for Rewire, a provider of cross-border online banking services for expatriate workers. The finance will be used to add new products to its platform, such as bill payments, insurance, savings, and credit and loan services. New investors included Renegade Partners, Glilot Capital Partners and AME Cloud Ventures as well as previous investors Viola FinTech, BNP Paribas through their venture capital fund Opera Tech Ventures, Moneta Capital, and private angel investors. Rewire has also secured an EU Electronic Money Institution license from the Dutch Central Bank to support its European expansion plans, PitchBook reports. Rewire was also granted an expanded Israeli Financial Asset Service Provider license. Acquiring these licenses is another major step for the FinTech startup in its mission to provide secure and accessible financial services for migrant workers worldwide, Globes reports.
Top Tech News
Consumer Physics’ SCiO lets you wake up and taste the coffee
Israeli and Colombian agriculture tech startup Demetria emerged from stealth with a $3M seed round to support its artificial-intelligence-powered green coffee quality analysis system. Demetria has added AI to the SCiO technology developed by OurCrowd portfolio company Consumer Physics to enable fast sampling and quality control for coffee. “We use this sensory fingerprint in combination with complex AI and cupping data to discern ‘taste,’” Felipe Ayerbe, CEO and co-founder of Demetria, tells Daily Coffee News. “Demetria is an accurate predictor of cupping analysis. We have conducted the same process that is required to train and certify a ‘cupper,’ but instead of using human senses of taste and smell, we use state-of-the-art sensors that read the biochemical markers of taste, and couple that information with AI.” The technology “has pioneered the digitization of coffee aroma and taste, the most important quality variables of the coffee bean. For the first time, quality and taste can now be assessed at any stage of the coffee production and distribution process, from farm to table,” says Food and Drink International.
Medisafe raises $30M for AI that helps people take medication
Medisafe, an OurCrowd portfolio company developing a personalized medication management platform to help patients stay on top of their prescriptions, raised $30M in a round led by Sanofi Ventures and Alive Israel HealthTech Fund with participation from Merck Ventures, Octopus Ventures, OurCrowd and others. About 20% to 30% of medication prescriptions are never filled, and approximately 50% of medications for chronic disease aren’t taken as prescribed, causing approximately 125,000 deaths and costing the U.S. health care system more than $100B a year. “Medisafe can email and text patients to remind them to take their medications on time. Moreover, the platform can target ‘rising-risk’ patients with analytics and insights based on real-time behavioral assessments, boosting adherence up to 20%”, VentureBeat reports.
Arcadia guides companies on homeworking energy costs
With millions of staff working from home, who should foot the bill for heating and other energy costs? Biotechnology company Biogen and financial giant Goldman Sachs are both working with alternative energy firm Arcadia to help employees switch their homes to wind or solar power – just a couple of the companies assisting staff to explore renewable energy sources for residential use. Alexa Minerva, Arcadia’s senior director of partnerships, says the offering is just one of many new kinds of employee benefits that may arise as work becomes more geographically flexible in the pandemic era and beyond, and perks like an office cafeteria become less of a draw. “It says something not just about…how you value a person, but it also says what you value as a company,” Minerva tells Time magazine. “It’s a great hiring strategy and retention tactic.”
Volvo unit teams with Israel’s Foretellix on autonomous safety
Volvo Autonomous Solutions has formed a partnership with OurCrowd portfolio company Foretellix to jointly create a coverage-driven verification solution for autonomous vehicles and machines on public roads and in confined areas such as mines and quarries. The technology will facilitate testing of millions of different scenarios, which will validate autonomous vehicles and machines’ ability to deal with anything they might encounter. Foretellix has developed a novel verification platform that uses intelligent automation and big data analytics tools that coordinate and monitor millions of driving scenarios, to expose bugs and edge cases, including the most extreme cases. “The partnership with Foretellix gives us access to the state-of-the-art verification tools and accelerating our time to market,” Magnus Liljeqvist, Global Technology Manager, Volvo Autonomous Solutions tells Globes.
Tmura: Startup options help those in need
The sale of MyHeritage to Francisco Partners for a reported $600M will directly help Israeli nonprofits working in education and youth projects through the sale of options the startup donated in 2013 to Tmura, an Israeli nonprofit, The Times of Israel reports. OurCrowd has supported Tmura since its inception, encouraging all our startups to donate a percentage of their options. If they have a big exit, it benefits those who need help. A total of 718 Israeli companies have made donations to Tmura, with a record number of 62 new donor companies contributing in 2020, as nonprofit organizations struggled to raise funds elsewhere during the pandemic. Exit proceeds in 2020 totaled $1.9 million, the second-highest year ever, after 2013, when Waze was sold to Google for some $1 billion. Waze’s options,
Introductions
Your portfolio gets stronger when the OurCrowd network gets involved. Visit our Introductions page to see which of our companies are looking for connections that you may be able to help with.
1,000 High-Tech Jobs
Read the OurCrowd Quarterly Jobs Index here.
Despite the coronavirus pandemic, there are hundreds of open positions at our global portfolio companies. See some opportunities below:
Search and filter through Portfolio Jobs to find your next challenge.
Source: https://blog.ourcrowd.com/elon-musks-genius-surgical-robots/
How I’d Learn Data Science If I Were To Start All Over Again

Published March 6, 2021
A couple of days ago I started thinking: if I had to start learning machine learning and data science all over again, where would I start? The funny thing is that the path I imagined was completely different from the one I actually took when I was starting out.
I’m aware that we all learn in different ways. Some prefer videos, others are OK with just books, and a lot of people need to pay for a course to feel more pressure. And that’s OK; the important thing is to learn and enjoy it.
So, speaking from my own perspective and knowing how I learn best, I designed this path for starting to learn data science all over again.
As you will see, my favorite way to learn is going from simple to complex gradually. This means starting with practical examples and then moving on to more abstract concepts.
Kaggle micro-courses
I know it may be weird to start here; many would prefer to start with the heaviest foundations and math videos to fully understand what is happening behind each ML model. But from my perspective, starting with something practical and concrete helps give a better view of the whole picture.
In addition, these micro-courses take around 4 hours each to complete, so meeting those little goals upfront adds an extra motivational boost.
Kaggle micro-course: Python
If you are familiar with Python you can skip this part. Here you’ll learn basic Python concepts that will help you start learning data science. There will be a lot of things about Python that are still going to be a mystery. But as we advance, you will learn it with practice.
Link: https://www.kaggle.com/learn/python
Price: Free
Kaggle micro-course: Pandas
Pandas is going to give us the skills to start manipulating data in Python. I consider that a 4-hour micro-course and practical examples are enough to get a notion of the things that can be done.
Link: https://www.kaggle.com/learn/pandas
Price: Free
Kaggle micro-course: Data Visualization
Data visualization is perhaps one of the most underrated skills but it is one of the most important to have. It will allow you to fully understand the data with which you will be working.
Link: https://www.kaggle.com/learn/data-visualization
Price: Free
Kaggle micro-course: Intro to Machine Learning
This is where the exciting part starts. You are going to learn basic but very important concepts for training machine learning models, concepts that will be essential to have very clear later on.
Link: https://www.kaggle.com/learn/intro-to-machine-learning
Price: Free
Kaggle micro-course: Intermediate Machine Learning
This is complementary to the previous one but here you are going to work with categorical variables for the first time and deal with null fields in your data.
Link: https://www.kaggle.com/learn/intermediate-machine-learning
Price: Free
Let’s stop here for a moment. It should be clear that these 5 micro-courses are not going to be a linear process; you will probably have to come and go between them to refresh concepts. When you are working on the Pandas one, you may have to go back to the Python course to remember some of the things you learned, or go to the pandas documentation to understand new functions that you saw in the Intro to Machine Learning course. And all of this is fine; right here is where the real learning is going to happen.
Now, you’ll realize that these first 5 courses give you the necessary skills to do exploratory data analysis (EDA) and create baseline models that you will later be able to improve. So now is the right time to start with simple Kaggle competitions and put into practice what you’ve learned.
Kaggle Playground Competition: Titanic
Here you’ll put into practice what you learned in the introductory courses. Maybe it will be a little intimidating at first, but it doesn’t matter; it’s not about being first on the leaderboard, it’s about learning. In this competition, you will learn about classification and the relevant metrics for these types of problems, such as precision, recall, and accuracy.
Link: https://www.kaggle.com/c/titanic
Kaggle Playground Competition: Housing Prices
In this competition, you are going to apply regression models and learn about relevant metrics such as RMSE.
Link: https://www.kaggle.com/c/home-data-for-ml-course
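If you want to see those metrics concretely before diving in, here is a tiny scikit-learn sketch with toy arrays (made up, obviously not competition data):

# Toy illustration of the metrics mentioned above; the arrays are made up.
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, mean_squared_error)

# Classification (Titanic-style): survived or not.
y_true, y_pred = [1, 0, 1, 1, 0], [1, 0, 0, 1, 1]
print(accuracy_score(y_true, y_pred))   # 0.6
print(precision_score(y_true, y_pred))  # ~0.67: of predicted 1s, share correct
print(recall_score(y_true, y_pred))     # ~0.67: of actual 1s, share caught

# Regression (Housing-style): RMSE is the square root of MSE.
prices_true, prices_pred = [200_000, 340_000], [210_000, 330_000]
print(mean_squared_error(prices_true, prices_pred) ** 0.5)  # 10000.0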
By this point, you already have a lot of practical experience and you’ll feel that you can solve a lot of problems, buuut chances are that you don’t fully understand what is happening behind the classification and regression algorithms you have used. So this is where we have to study the foundations of what we are learning.
Many courses start here, but I, at least, absorb this information better once I have first worked on something practical.
Book: Data Science from Scratch
At this point, we will momentarily separate ourselves from pandas, scikit-learn, and other Python libraries to learn, in a practical way, what is happening “behind” these algorithms.
This book is quite friendly to read. It gives Python examples for each of the topics, and it doesn’t have much heavy math, which is fundamental for this stage. We want to understand the principles of the algorithms from a practical perspective; we don’t want to be demotivated by reading a lot of dense mathematical notation.
Link: Amazon
Price: approx. $26
If you got this far, I would say that you are quite capable of working in data science and that you understand the fundamental principles behind the solutions. So here I invite you to continue participating in more complex Kaggle competitions, engaging in the forums, and exploring new methods that you find in other participants’ solutions.
Online Course: Machine Learning by Andrew Ng
Here we are going to see many of the things that we have already learned, but explained by one of the leaders in the field. His approach is more mathematical, so it will be an excellent way to understand our models even more.
Link: https://www.coursera.org/learn/machine-learning
Price: Free without the certificate — $79 with the certificate
Book: The Elements of Statistical Learning
Now the heavy math part starts. Imagine if we had started from here: it would have been an uphill road all along, and we probably would have given up more easily.
Link: Amazon
Price: $60 (there is an official free version on the Stanford page).
Online Course: Deep Learning by Andrew Ng
By then you have probably already read about deep learning and played with some models. But here we are going to learn the foundations of what neural networks are, how they work, and how to implement and apply the different architectures that exist.
Link: https://www.deeplearning.ai/deep-learning-specialization/
Price: $49/month
At this point it depends a lot on your own interests; you can focus on regression and time-series problems, or maybe go deeper into deep learning.
I wanted to tell you that I launched a Data Science Trivia game with questions and answers that usually come up in interviews. To know more about this, follow me on Twitter.
Also published at https://towardsdatascience.com/if-i-had-to-start-learning-data-science-again-how-would-i-do-it-78a72b80fd93
Source: https://hackernoon.com/how-id-learn-data-science-if-i-were-to-start-all-over-again-5o2733tn?source=rss
