Algorithm quickly simulates a roll of loaded dice
The fast and efficient generation of random numbers has long been an important challenge. For centuries, games of chance have relied on the roll of a die, the flip of a coin, or the shuffling of cards to bring some randomness into the proceedings. In the second half of the 20th century, computers started taking over that role, for applications in cryptography, statistics, and artificial intelligence, as well as for various simulations — climatic, epidemiological, financial, and so forth.

MIT researchers have now developed a computer algorithm that might, at least for some tasks, churn out random numbers with the best combination of speed, accuracy, and low memory requirements available today. The algorithm, called the Fast Loaded Dice Roller (FLDR), was created by MIT graduate student Feras Saad, Research Scientist Cameron Freer, Professor Martin Rinard, and Principal Research Scientist Vikash Mansinghka, and it will be presented next week at the 23rd International Conference on Artificial Intelligence and Statistics. 

Simply put, FLDR is a computer program that simulates the roll of dice to produce random integers. The dice can have any number of sides, and they are “loaded,” or weighted, to make some sides more likely to come up than others. A loaded die can still yield random numbers — as one cannot predict in advance which side will turn up — but the randomness is constrained to meet a preset probability distribution. One might, for instance, use loaded dice to simulate the outcome of a baseball game; while the superior team is more likely to win, on a given day either team could end up on top.

With FLDR, the dice are “perfectly” loaded, which means they exactly achieve the specified probabilities. With a four-sided die, for example, one could arrange things so that the numbers 1, 2, 3, and 4 turn up exactly 23 percent, 34 percent, 17 percent, and 26 percent of the time, respectively.

To simulate the roll of loaded dice that have a large number of sides, the MIT team first had to draw on a simpler source of randomness — that being a computerized (binary) version of a coin toss, yielding either a 0 or a 1, each with 50 percent probability. The efficiency of their method, a key design criterion, depends on the number of times they have to tap into this random source — the number of “coin tosses,” in other words — to simulate each dice roll. 
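To make the coin-flip idea concrete, here is a minimal Python sketch (an illustration of the general principle, not the FLDR algorithm itself): it assembles a uniform random integer from individual fair coin flips, rejecting values that exceed the total weight, and then maps that integer onto the weighted sides. The 23/34/17/26 weights echo the four-sided example above.

```python
import random

def fair_coin():
    """One fair coin flip: 0 or 1, each with probability 1/2."""
    return random.getrandbits(1)

def uniform_below(n):
    """Uniform integer in [0, n), built only from fair coin flips (rejection)."""
    bits = n.bit_length()
    while True:
        value = 0
        for _ in range(bits):
            value = (value << 1) | fair_coin()
        if value < n:
            return value

def roll_loaded_die(weights):
    """Return side i with probability weights[i] / sum(weights)."""
    u = uniform_below(sum(weights))
    for side, w in enumerate(weights):
        if u < w:
            return side
        u -= w

# Four-sided die loaded to 23%, 34%, 17%, and 26% (integer weights out of 100).
counts = [0, 0, 0, 0]
for _ in range(100_000):
    counts[roll_loaded_die([23, 34, 17, 26])] += 1
print([c / 100_000 for c in counts])   # roughly [0.23, 0.34, 0.17, 0.26]
```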

In a landmark 1976 paper, the computer scientists Donald Knuth and Andrew Yao devised an algorithm that could simulate the roll of loaded dice with the maximum efficiency theoretically attainable. “While their algorithm was optimally efficient with respect to time,” Saad explains, meaning that literally nothing could be faster, “it is inefficient in terms of the space, or computer memory, needed to store that information.” In fact, the amount of memory required grows exponentially, depending on the number of sides on the dice and other factors. That renders the Knuth-Yao method impractical, he says, except for special cases, despite its theoretical importance.
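For intuition, here is a hedged sketch of the kind of discrete distribution generating (DDG) tree walk that the Knuth-Yao construction is built on, assuming each side's probability has a finite binary expansion. The table of probability bits is the stored structure whose size grows with the precision of the target probabilities, which is the memory issue Saad describes.

```python
import random

def knuth_yao_roll(bit_rows, flip):
    """Walk a discrete distribution generating (DDG) tree.

    bit_rows[k][i] is the (k+1)-th binary digit of side i's probability;
    e.g. probabilities (3/4, 1/4) become [[1, 0], [1, 1]] since
    3/4 = 0.11 and 1/4 = 0.01 in binary. Assumes the probabilities sum
    to 1 and have finite binary expansions, so the walk terminates.
    The bit table itself is the stored state whose size grows with the
    precision of the target probabilities.
    """
    d = 0
    for row in bit_rows:
        d = 2 * d + flip()           # one fair coin flip per tree level
        for side, bit in enumerate(row):
            d -= bit                 # leaves at this level go to sides with a 1 bit
            if d < 0:
                return side
    raise ValueError("probabilities must sum to 1 at this precision")

rolls = [knuth_yao_roll([[1, 0], [1, 1]], lambda: random.getrandbits(1))
         for _ in range(100_000)]
print(rolls.count(0) / 100_000)      # roughly 0.75
```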

FLDR was designed for greater utility. “We are almost as time efficient,” Saad says, “but orders of magnitude better in terms of memory efficiency.” FLDR can use up to 10,000 times less memory storage space than the Knuth-Yao approach, while taking no more than 1.5 times longer per operation.

For now, FLDR’s main competitor is the Alias method, which has been the field’s dominant technology for decades. When analyzed theoretically, according to Freer, FLDR has one clear-cut advantage over Alias: It makes more efficient use of the random source — the “coin tosses,” to continue with that metaphor — than Alias. In certain cases, moreover, FLDR is also faster than Alias in generating rolls of loaded dice.
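For comparison, here is a compact sketch of the alias method in Vose's formulation (a standard textbook construction, not code from the study): a one-time linear-time setup builds a probability table and an alias table, after which each roll costs one uniform table index plus one comparison against a real-valued threshold, a heavier draw on the underlying random source than a bit-frugal sampler.

```python
import random

def build_alias_table(weights):
    """Vose's alias method: O(n) setup producing two tables of size n."""
    n = len(weights)
    total = sum(weights)
    scaled = [w * n / total for w in weights]
    prob, alias = [0.0] * n, [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l
        scaled[l] = (scaled[l] + scaled[s]) - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:          # numerical leftovers keep probability 1
        prob[i] = 1.0
    return prob, alias

def alias_roll(prob, alias):
    """Each roll: one uniform index plus one real-valued threshold test."""
    i = random.randrange(len(prob))
    return i if random.random() < prob[i] else alias[i]

prob, alias = build_alias_table([23, 34, 17, 26])
counts = [0] * 4
for _ in range(100_000):
    counts[alias_roll(prob, alias)] += 1
print([c / 100_000 for c in counts])   # roughly [0.23, 0.34, 0.17, 0.26]
```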

FLDR, of course, is still brand new and has not yet seen widespread use. But its developers are already thinking of ways to improve its effectiveness through both software and hardware engineering. They also have specific applications in mind, apart from the general, ever-present need for random numbers. Where FLDR can help most, Mansinghka suggests, is by making so-called Monte Carlo simulations and Monte Carlo inference techniques more efficient. Just as FLDR uses coin flips to simulate the more complicated roll of weighted, many-sided dice, Monte Carlo simulations use a dice roll to generate more complex patterns of random numbers. 
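As a toy illustration of that layering, with purely hypothetical numbers rather than anything from the study, the sketch below runs a Monte Carlo estimate of how often the stronger team in the baseball example wins a best-of-seven series, where each simulated game is one biased coin flip.

```python
import random

def favourite_wins_series(p_game=0.55, games_needed=4):
    """Simulate one best-of-seven series; each game is a biased coin flip."""
    wins = losses = 0
    while wins < games_needed and losses < games_needed:
        if random.random() < p_game:   # favourite wins this game
            wins += 1
        else:
            losses += 1
    return wins == games_needed

trials = 100_000
estimate = sum(favourite_wins_series() for _ in range(trials)) / trials
print(f"Favourite takes the series in about {estimate:.3f} of simulations")
```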

The United Nations, for instance, runs simulations of seismic activity that show when and where earthquakes, tremors, or nuclear tests are happening on the globe. The United Nations also carries out Monte Carlo inference: running random simulations that generate possible explanations for actual seismic data. This works by conducting a second series of Monte Carlo simulations, which randomly test out alternative parameters for an underlying seismic simulation to find the parameter values most likely to reproduce the observed data. These parameters contain information about when and where earthquakes and nuclear tests might actually have occurred. 
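Here is a minimal sketch of that inference loop in the abstract, using a toy yes/no model rather than the UN's seismic software: propose random parameter values, run the underlying simulation for each, and keep the values whose simulated output roughly reproduces the observed data. Every accepted value costs a full inner simulation, which hints at how quickly the random numbers add up.

```python
import random

def simulate(p, n=200):
    """Forward model: n yes/no events, each occurring with probability p."""
    return sum(random.random() < p for _ in range(n))

observed = 132                            # pretend this came from real data
accepted = []
for _ in range(50_000):                   # outer loop: random parameter proposals
    p = random.random()
    if abs(simulate(p) - observed) <= 3:  # inner simulation must reproduce the data
        accepted.append(p)

print(f"plausible parameter value ≈ {sum(accepted) / len(accepted):.3f}")
```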

“Monte Carlo inference can require hundreds of thousands of times more random numbers than Monte Carlo simulations,” Mansinghka says. “That’s one big bottleneck where FLDR could really help. Monte Carlo simulation and inference algorithms are also central to probabilistic programming, an emerging area of AI with broad applications.” 

Ryan Rifkin, Director of Research at Google, sees great potential for FLDR in this regard. “Monte Carlo inference algorithms are central to modern AI engineering … and to large-scale statistical modeling,” says Rifkin, who was not involved in the study. “FLDR is an extremely promising development that may lead to ways to speed up the fundamental building blocks of random number generation, and might help Google make Monte Carlo inference significantly faster and more energy efficient.”

Despite its seemingly bright future, FLDR almost did not come to light. Hints of it first emerged from a previous paper the same four MIT researchers published at a symposium in January, which introduced a separate algorithm. In that work, the authors showed that if a predetermined amount of memory were allocated for a computer program to simulate the roll of loaded dice, their algorithm could determine the minimum amount of “error” possible — that is, how close one comes toward meeting the designated probabilities for each side of the dice. 

If one doesn’t limit the memory in advance, the error can be reduced to zero, but Saad noticed a variant with zero error that used substantially less memory and was nearly as fast. At first he thought the result might be too trivial to bother with. But he mentioned it to Freer, who assured Saad that this avenue was worth pursuing. FLDR, which is error-free in this same respect, arose from those humble origins and now has a chance of becoming a leading technology in the realm of random number generation. That’s no trivial matter given that we live in a world that’s governed, to a large extent, by random processes — a principle that applies to the distribution of galaxies in the universe, as well as to the outcome of a spirited game of craps.



Source: http://news.mit.edu/2020/algorithm-simulates-roll-loaded-dice-0528

XBRL: scrapping quarterlies, explaining AI and low latency reporting

Here is our pick of the 3 most important XBRL news stories this week.

1 FDIC considers scrapping quarterly bank reports

The Federal Deposit Insurance Corp. is moving to improve the way it monitors for risk at thousands of U.S. banks, potentially scrapping quarterly reports that have been a fixture of oversight for more than 150 years yet often contain stale data.

The FDIC has long been one of the cheerleaders for, and case studies of, the efficiency gains of XBRL-based reporting. It will therefore be fascinating to observe this competition and its outcome.

2 XBRL data feeds explainable AI models

Amongst several fascinating presentations at the Eurofiling Innovation Day this week was an interesting demonstration of how XBRL reports can be used as the basis of explainable AI for bankruptcy prediction.

The black-box nature of many AI models is one of the biggest issues in applying AI in regulated environments, where causal linkages are the bedrock of litigation and the like. Making models explainable would remove a major headache for many use cases.
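As a toy sketch of why structured XBRL figures lend themselves to explainable models (synthetic data and hypothetical ratio names, not the Eurofiling demonstration): a logistic regression over named financial ratios exposes exactly how much each input pushes a bankruptcy score up or down, something a black-box model cannot offer as directly.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
ratios = ["current_ratio", "debt_to_equity", "return_on_assets"]  # hypothetical names
X = rng.normal(size=(500, 3))             # stand-ins for XBRL-derived ratios
# Synthetic "went bankrupt" labels driven by the same ratios, plus noise.
y = (X[:, 1] - X[:, 0] - X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)
for name, coef in zip(ratios, model.coef_[0]):
    print(f"{name:>16}: {coef:+.2f}")     # the sign and size of each coefficient are the explanation
```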

3 Low latency earnings press release data

Standardized financials from earnings press releases and 8-Ks are now available via the Calcbench API minutes after they are published. Calcbench is leveraging its expertise in XBRL to extract many of the numbers from the income statement, balance sheet and statement of cash flows in the earnings press release or 8-K.

The time lag between the publication of earnings information and its availability in XBRL format remains a roadblock to wholesale adoption of XBRL by financial markets; until regulators require real-time publication in XBRL, the Calcbench API is a welcome stopgap measure.

 

—————————————————————

Christian Dreyer, CFA, is well known in Swiss Fintech circles as an expert in XBRL and financial reporting for investors.

 We have a self-imposed constraint of 3 news stories each week because we serve busy senior leaders in Fintech who need just enough information to get on with their job.

 For context on XBRL please read this introduction to our XBRL Week in 2016 and read articles tagged XBRL in our archives. 

 New readers can read 3 free articles.  To  become a member with full access to all that Daily Fintech offers,  the cost is just USD 143 a year (= USD 0.39 per day or USD 2.75 per week). For less than one cup of coffee you get a week full of caffeine for the mind.

Source: https://dailyfintech.com/2020/07/02/xbrl-scrapping-quarterlies-explaining-ai-and-low-latency-reporting/

AI – hot water for insurance incumbents, or a relaxing spa?


The parable of the frog in boiling water is well known: put a frog into boiling water and it will immediately jump out, but put it into tepid water and gradually raise the temperature and it will slowly boil to death. It isn’t true, but it is a clever lede into the evolution of artificial intelligence within insurance. Are there insurance ‘frogs’ in danger of tepid water turning hot, and are there frogs suffering from FOHW (fear of hot water)?


Patrick Kelahan is a CX, engineering & insurance consultant, working with Insurers, Attorneys & Owners in his day job. He also serves the insurance and Fintech world as the ‘Insurance Elephant’.

The frog and boiling water example is intuitive: stark change is noticed, gradual change not so much. It’s like Ernest Hemingway’s line in “The Sun Also Rises”: “How did you go bankrupt? Gradually, and then suddenly!” In each of the examples the message is similar: adverse change is not always abrupt, but failure to notice or react to changing conditions can lead to a worst-case scenario. So it is with insurance innovation.

A recent interview in The Telegraph by Michael Dwyer of Peter Cullum, non-executive Director of Global Risk Partners (and certainly one with a CV that qualifies him as a knowing authority), provided this view:

“Insurance is one business that is all about data. It’s about numbers. It’s about the algorithms. Quite frankly, in 10 years’ time, I predict that 70pc or 80pc of all underwriters will be redundant because it will be machine driven.

“We don’t need smart people to make what I’d regard as judgmental decisions because the data will make the decision for you.”

A clever insurance innovation colleague, Craig Polley, recently posed Peter’s insurance scenario for discussion, and the topic generated lively debate: will underwriting become machine-driven, or is there an overarching need for human intuition? I’m not brave enough to serve as arbiter of the discussion, but the chord Craig’s question struck leads to the broader point: is the insurance industry sitting in that tepid water now, and are the flames of AI potentially leading to parboiling?

I offered a thought recently to an AI advocate looking for some insight into how the concept is embraced by insurance organizations. In considering the fundamentals of insurance, I recounted that insurance as a product thrives best in environments where risk can be understood, predicted, and priced across populations with widely varied individual risk exposures, as best determined by risk experience within the population or the application of risk indicators. Blah, blah, blah. Insurance is a long-standing principle of sharing the ultimate cost of risk so that no one participant is unduly at a disadvantage and no one party gains a financial advantage: it is a balance of cost and probability.

Underwriting has been built on a model of proxy information, on the law of large numbers, of historical performance, of significant populations and statistical sampling.  There is not much new in that description, but what if the dynamic is changed, to an environment where the understanding of risk factors is not retrospective, but prospective?

Take commercial motor insurance for example. Reasonably expensive, plenty of human involvement in underwriting, high maximum loss outcomes for occurrences. Internal data are the primary source for rating the book of business. There are, however, new approaches emerging in the industry that supplant traditional internal or proxy data with robust analysis of external data. Luminant Analytics is an example of a firm that leverages AI to provide not only predictive models for motor line loss frequency and severity trends, but also analytics that help companies expand into new markets where historical loss data are unavailable. Traditional underwriting has remained a solid approach, but is it now akin to turning the heat up on the industry frog?

The COVID-19 environment has by default prompted a dramatic increase in virtual claim handling techniques, changing what was not too long ago verboten: waiver of inspection on higher-value claims, or acceptance of third-party estimates in lieu of measure-by-the-inch adjuster work. Yes, there will be severity hangovers and spikes in supplements, but carriers will find expediency trumps detail, as long as the customer accepts the change in methods. If we consider the recent announcement of significant staff layoffs by the US P&C carrier Allstate as an indicator of the inroads of virtual efforts, then there seemingly is hope for that figurative frog.

Elsewhere it was announced that the All England Club has not had its Wimbledon event cancellation cover renewed for 2021 (recall that the Club was prescient in having cancellation cover in force that included pandemic benefits). The prior policy’s underwriters are apparently reluctant to shell out another potential $140 million on a recurrence of a pandemic, but are there other approaches to pandemic cover? The consortium of underwriting firms devised the cover seventeen years ago; can cover for a marquee event benefit from AI methodology that simply didn’t exist in 2003? It’s apparent that the ask for cover for the 2021 event attracted knowledgeable frogs that knew to jump out of hot water. But what if the exposure burner is turned down through a better understanding of the breadth of data affecting the risk, the involvement of capital markets in diversifying the risk, perhaps across many unique events’ outcomes and alternative risk financing, and the leveraging of underwriting tools supported by AI and machine learning? Will it be found in due time that the written rule that pandemics cannot be underwritten as a peril has less validity, because well-placed data analysis has wrangled the risk exposure into a reasonable bet for an ILS fund?

There are more examples of AI’s promise, but let us not forget that AI is not the magic solution to all insurance tasks. Companies that invest in AI without a fitting use case are simply moving their frog to a different but just as threatening pot. Companies that invest in innovation that cannot bridge their legacy systems to meaningful outcomes, because there is no API functionality, are turning the heat up themselves. Large-scale innovation options that are coming up on a twenty-year anniversary (think post-Y2K) may have compounding legacy issues: old legacy and new legacy.

The insurance industry needs to consider not just individual instances of the gradual heat of change being applied, but the cumulative effect across the business.

What prevents the capital markets from applying AI methods (through design or purchase) in predicting or betting on risk outcomes? The more comprehensive and accurate risk prediction methods become, the more direct the path between customer and risk financing partner also becomes. Insurance frogs need not fear the heat if there are fewer pots to work from, but no pots, no business.

The risk sharing/risk financing industry has evolved through the application of available technology and tools; what’s to say AI does not become a double-edged sword for the insurance industry, a clever tool in the hands of insurers, or a clever tool in the hands of alternative financing that serves to cut away some of the insurers’ business? If asked, Peter Cullum might opine that it’s not just underwriting that AI will affect, but any other aspect of insurance that AI can effectively influence. Frogs beware.

You get three free articles on Daily Fintech; after that you will need to become a member for just US $143 per year ($0.39 per day) and get all our fresh content and archives and participate in our forum

Source: https://dailyfintech.com/2020/07/02/ai-hot-water-for-insurance-incumbents-or-a-relaxing-spa/

MIT takes down 80 Million Tiny Images data set due to racist and offensive content


Creators of the 80 Million Tiny Images data set from MIT and NYU took the collection offline this week, apologized, and asked other researchers to refrain from using the data set and delete any existing copies. The news was shared Monday in a letter by MIT professors Bill Freeman and Antonio Torralba and NYU professor Rob Fergus published on the MIT CSAIL website.

Introduced in 2006 and containing photos scraped from internet search engines, 80 Million Tiny Images was recently found to contain a range of racist, sexist, and otherwise offensive labels, such as nearly 2,000 images labeled with the N-word, and labels like “rape suspect” and “child molester.” The data set also contained pornographic content like non-consensual photos taken up women’s skirts. Creators of the 79.3 million-image data set said it was too large and its 32 x 32 images too small, making visual inspection of the data set’s complete contents difficult. According to Google Scholar, 80 Million Tiny Images has been cited more than 1,700 times.

Image: Offensive labels found in the 80 Million Tiny Images data set

“Biases, offensive and prejudicial images, and derogatory terminology alienates an important part of our community — precisely those that we are making efforts to include,” the professors wrote in a joint letter. “It also contributes to harmful biases in AI systems trained on such data. Additionally, the presence of such prejudicial images hurts efforts to foster a culture of inclusivity in the computer vision community. This is extremely unfortunate and runs counter to the values that we strive to uphold.”

The trio of professors say the data set’s shortcomings were brought to their attention by an analysis and audit published late last month (PDF) by University of Dublin Ph.D. student Abeba Birhane and Carnegie Mellon University Ph.D. student Vinay Prabhu. The authors say their assessment is the first known critique of 80 Million Tiny Images.


Both the paper’s authors and the creators of 80 Million Tiny Images say part of the problem comes from automated data collection and the use of nouns from the WordNet data set for its semantic hierarchy. Before the data set was taken offline, the coauthors suggested that the creators of 80 Million Tiny Images follow the lead of ImageNet’s creators and assess the labels used in the people category of the data set. The paper finds that large-scale image data sets erode privacy and can have a disproportionately negative impact on women, racial and ethnic minorities, and communities at the margins of society.

Birhane and Prabhu assert that the computer vision community must begin having more conversations about the ethical use of large-scale image data sets now, in part due to the growing availability of image-scraping tools and reverse image search technology. Citing previous work like the Excavating AI analysis of ImageNet, they argue that it’s not just a matter of data, but of a culture in academia and industry that finds it acceptable to create large-scale data sets without the consent of participants “under the guise of anonymization.”

“We posit that the deeper problems are rooted in the wider structural traditions, incentives, and discourse of a field that treats ethical issues as an afterthought. A field where in the wild is often a euphemism for without consent. We are up against a system that has veritably mastered ethics shopping, ethics bluewashing, ethics lobbying, ethics dumping, and ethics shirking,” the paper states.

To create more ethical large-scale image data sets, Birhane and Prabhu suggest:

  • Blur the faces of people in data sets
  • Do not use Creative Commons licensed material
  • Collect imagery with clear consent from data set participants
  • Include a data set audit card with large-scale image data sets, akin to the model cards Google AI uses and the datasheets for data sets Microsoft Research proposed

The work incorporates Birhane’s previous work on relational ethics, which suggests that the creators of machine learning systems should begin their work by speaking with the people most affected by machine learning systems, and that concepts of bias, fairness, and justice are moving targets.

ImageNet was introduced at CVPR in 2009 and is widely considered important to the advancement of computer vision and machine learning. Whereas previously some of the largest data sets could be counted in the tens of thousands of images, ImageNet contains more than 14 million images. The ImageNet Large Scale Visual Recognition Challenge ran from 2010 to 2017 and led to the launch of a variety of startups like Clarifai and MetaMind, a company Salesforce acquired in 2016. According to Google Scholar, ImageNet has been cited nearly 17,000 times.

As part of a series of changes detailed in December 2019, ImageNet creators including lead author Jia Deng and Dr. Fei-Fei Li found that 1,593 of the 2,832 people categories in the data set potentially contain offensive labels, which they said they plan to remove.

“We indeed celebrate ImageNet’s achievement and recognize the creators’ efforts to grapple with some ethical questions. Nonetheless, ImageNet as well as other large image datasets remain troublesome,” the Birhane and Prabhu paper reads.

Source: http://feedproxy.google.com/~r/venturebeat/SZYF/~3/knS0Ix3IHxA/
