This is an edited version of a post that originally ran here.
Neuroscience and AI have a long, intertwined history. Artificial intelligence pioneers looked to the brain’s organizing principles for inspiration to make intelligent machines. In a surprising reversal, AI is now helping us understand its very source of inspiration: the human brain. This approach of using AI to build models of the brain is referred to as neuroAI. Over the next decade, we’ll make ever more precise in silico brain models, especially models of our two most prominent senses, vision and hearing. As a result, we’ll be able to download and use sensory models on demand, with the same convenience with which we now do object recognition or natural language processing.
Many neuroscientists and artificial intelligence researchers are – understandably! – very excited about this: brains on demand! Discovering what it means to see, to feel, to be human! Less well recognized is that there are wide practical applications in industry. I have long been a researcher in this field, having worked on how the brain transforms vision into meaning since my PhD. I’ve seen the progression of the field from its inception, and I think now is the time to pursue how neuroAI can drive more creativity and improve our health.
I predict that neuroAI will first find widespread use in art and advertising, especially when connected to new generative AI models like GPT-3 and DALL-E. While current generative AI models can produce creative art and media, they can’t tell you if that media will ultimately communicate a message to the intended audience – but neuroAI could. For instance, we might replace the trial and error of focus groups and A/B tests and directly create media that communicates exactly what we want. The tremendous market pressures around this application will create a virtuous cycle that improves neuroAI models.
The resulting enhanced models will enable applications in health and medicine, from helping people with neurological problems to enhancing the abilities of the well. Imagine creating the right images and sounds to help a person recover their sight or hearing more quickly after LASIK surgery or after getting a cochlear implant, respectively.
These innovations will be made far more potent by other technologies coming down the pike: augmented reality and brain-computer interfaces. However, to fully realize the potential of on-demand, downloadable sensory systems, we’ll need to fill current gaps in tooling, talent, and funding.
In this piece I’ll explain what neuroAI is, how it might start to evolve and start to impact our lives, how it complements other innovations and technologies, and what is needed to push it forward.
What is neuroAI?
NeuroAI is an emerging discipline that seeks to 1) study the brain to learn how to build better artificial intelligence and 2) use artificial intelligence to better understand the brain. One of the core tools of neuroAI is using artificial neural nets to create computer models of specific brain functions. This approach was kickstarted in 2014, when researchers at MIT and Columbia showed that deep artificial neural nets could explain responses in a part of the brain that does object recognition: the inferotemporal cortex (IT). They introduced a basic recipe for comparing an artificial neural net to a brain. By applying this recipe iteratively across brain processes (shape recognition, motion processing, speech processing, control of the arm, spatial memory), scientists are building a patchwork of computer models of the brain.
A recipe for comparing brains to machines
So how do you build a neuroAI model? Since the field’s inception in 2014, it has followed the same basic recipe:
1. Train artificial neural networks in silico to solve a task, for example object recognition. The resulting network is called task-optimized. Importantly, this typically involves training on images, movies, and sounds alone, not brain data.
2. Compare the intermediate activations of trained artificial neural networks to real brain recordings. Comparison is done using statistical techniques like linear regression or representational similarity analysis.
3. Pick the best-performing model as the current best model of that brain area.
This recipe can be applied with data collected inside the brain from single neurons or from non-invasive techniques like magneto-encephalography (MEG) or functional magnetic resonance imaging (fMRI).
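As a toy illustration of step 2, here is a minimal sketch of both comparison techniques. Everything is simulated: the “model activations” and “brain recordings” are random data with a planted linear relationship, and all names and shapes are illustrative, not from any real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: activations of one layer of a task-optimized network
# (n_stimuli x n_units) and recordings from a brain area (n_stimuli x n_neurons).
n_stimuli, n_units, n_neurons = 200, 50, 10
model_acts = rng.normal(size=(n_stimuli, n_units))
mixing = rng.normal(size=(n_units, n_neurons))
brain_resp = 0.5 * model_acts @ mixing + rng.normal(size=(n_stimuli, n_neurons))

# Comparison 1: linear regression -- predict each recorded neuron from the
# network's activations (real analyses typically use cross-validated ridge).
weights, *_ = np.linalg.lstsq(model_acts, brain_resp, rcond=None)
pred = model_acts @ weights
corrs = [np.corrcoef(pred[:, i], brain_resp[:, i])[0, 1] for i in range(n_neurons)]
print(f"mean prediction correlation: {np.mean(corrs):.2f}")

# Comparison 2: representational similarity analysis -- compare the two
# systems' stimulus-by-stimulus dissimilarity structure rather than raw responses.
def rdm(x):
    """Correlation-distance representational dissimilarity matrix (rows = stimuli)."""
    return 1.0 - np.corrcoef(x)

iu = np.triu_indices(n_stimuli, k=1)  # upper triangle, excluding the diagonal
rsa_score = np.corrcoef(rdm(model_acts)[iu], rdm(brain_resp)[iu])[0, 1]
print(f"RSA score: {rsa_score:.2f}")
```

A higher score under either comparison means the task-optimized network is a better in silico stand-in for that brain area.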
A neuroAI model of part of the brain has two key features. It’s computable: we can feed this computer model a stimulus and it will tell us how a brain area will react. It’s also differentiable: it’s a deep neural net that we can optimize in the same way that we optimize models that solve visual recognition and natural language processing. That means neuroscientists get access to all the powerful tooling that has powered the deep learning revolution, including tensor algebra systems like PyTorch and TensorFlow.
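Differentiability is what lets us run optimization through the model itself, for instance gradient ascent on a stimulus to find what drives a modeled neuron hardest. A toy sketch of that idea, using a single made-up unit with Gaussian tuning and a hand-written gradient (a real neuroAI model would be a deep net, with PyTorch or TensorFlow computing the gradient automatically):

```python
import numpy as np

# Toy differentiable "brain model": one unit with Gaussian tuning over a
# two-dimensional stimulus. The preferred stimulus is an arbitrary choice.
preferred = np.array([0.8, -0.3])

def response(s):
    """Modeled activity of the unit for stimulus s."""
    return np.exp(-np.sum((s - preferred) ** 2))

def grad_response(s):
    """Analytic gradient of the response with respect to the stimulus --
    the quantity autodiff frameworks compute for free on deep nets."""
    return -2.0 * (s - preferred) * response(s)

# Gradient ascent: iteratively nudge the stimulus toward whatever the
# modeled unit responds to most.
stim = np.zeros(2)
for _ in range(200):
    stim += 0.5 * grad_response(stim)

print(stim.round(2))  # converges to the unit's preferred stimulus
```

The same loop, run through a deep model of a brain area instead of this toy unit, is the basis for crafting stimuli that evoke a desired response.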
What does this mean? We went from not understanding big chunks of the brain to being able to download good models of it in less than a decade. With the right investments, we’ll soon have excellent models of large chunks of the brain. The visual system was the first to be modeled; the auditory system was not far behind; and other areas will surely fall like dominoes as intrepid neuroscientists rush to solve the mysteries of the brain. Apart from satisfying our intellectual curiosity (a big motivator for scientists!), this innovation will allow any programmer to download good models of the brain and unlock myriad applications.
Art and advertising
Let’s start with this simple premise: 99% of the media that we experience is through our eyes and ears. There are entire industries that can be boiled down to delivering the right pixels and tones to these senses: visual art, design, movies, games, music and advertising are just a few of them. Now, it’s not our eyes and ears themselves that interpret these experiences, as they are merely sensors: it’s our brains that make sense of that information. Media is created to inform, to entertain, to bring about desired emotions. But determining whether the message in a painting, a professional headshot or an ad is received as intended is a frustrating exercise in trial-and-error: humans have to be in the loop to determine whether the message hits, which is expensive and time-consuming.
Large-scale online services have figured out ways around this by automating trial-and-error: A/B tests. Google famously tested which of 50 shades of blue to use for the links on its search engine results page. According to The Guardian, the best choice improved revenue over the baseline by roughly $200M in 2009, about 1% of Google’s revenue at the time. Netflix customizes thumbnails to the viewer to optimize its user experience. These methods are available to online giants with massive traffic, which can overcome the noise inherent in people’s behavior.
What if we could predict how people will react to media before getting any data? This would make it possible for small businesses to optimize their written materials and websites despite having little pre-existing traction. NeuroAI is getting closer and closer to being able to predict how people will react to visual materials. For instance, researchers at Adobe are working on tools to predict and direct visual attention in illustrations.
Researchers have also demonstrated editing photos to make them more visually memorable or aesthetically pleasing. This could be used, for example, to automatically select the professional headshot most aligned with the image people want to project of themselves: professional, serious, or creative. Artificial neural networks can even find ways of communicating messages more effectively than realistic images can. OpenAI’s CLIP can be probed to find images aligned to emotions; the image best aligned to the concept of shock would not be out of place next to Munch’s Scream.
Over the last year, OpenAI and Google have demonstrated generative art networks with an impressive ability to generate photorealistic images from text prompts. We haven’t quite hit that moment for music, but with the pace of progress in generative models, this will surely happen in the next few years. By building machines that can hear like humans, we may be able to democratize music production, giving anyone the ability to do what highly skilled music producers can do: to communicate the right emotion during a chorus, whether melancholy or joy; to create an earworm of a melody; or to make a piece irresistibly danceable.
There are tremendous market pressures to optimize audiovisual media, websites, and especially ads, and we’re already integrating neuroAI and algorithmic art into this process. This pressure will lead to a virtuous cycle where neuroAI will get better and more useful as more resources are poured into practical applications. A side effect of that is that we’ll get very good models of the brain which will be useful far outside of ads.
Accessibility and algorithmic design
One of the most exciting applications of neuroAI is accessibility. Most media is designed for the “average” person, yet we all process visual and auditory information differently. About 8% of men and 0.5% of women are red-green colorblind, and a large amount of media is not adapted to their needs. A number of products today simulate color blindness, but they require a person with normal color vision to interpret the results and make the necessary changes. Static color remapping doesn’t solve the problem either, since some materials lose their meaning when colors are remapped (e.g. graphs that become hard to read). NeuroAI methods that maintain the semantics of existing graphics could automate the generation of colorblind-safe materials and websites.
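As a concrete sketch of the simulation half of such a tool, here is a numpy pass that simulates protanopia (one form of red-green colorblindness) with a linear transform and flags palette colors that become hard to tell apart. The matrix follows the commonly cited Machado et al. (2009) protanopia coefficients, rounded, so treat the exact values as approximate; the distance threshold is an illustrative choice, not a perceptual standard.

```python
import numpy as np
from itertools import combinations

# Linear-RGB transform approximating protanopia (after Machado et al., 2009;
# coefficients rounded -- treat as approximate).
PROTANOPIA = np.array([
    [ 0.152,  1.053, -0.205],
    [ 0.115,  0.786,  0.099],
    [-0.004, -0.048,  1.052],
])

def simulate_protanopia(rgb):
    """Map linear-RGB colors (shape (..., 3), values in [0, 1])."""
    return np.clip(np.asarray(rgb) @ PROTANOPIA.T, 0.0, 1.0)

def confusable_pairs(palette, threshold=0.25):
    """Indices of color pairs closer than `threshold` after simulation."""
    sim = simulate_protanopia(palette)
    return [(i, j) for i, j in combinations(range(len(palette)), 2)
            if np.linalg.norm(sim[i] - sim[j]) < threshold]

# Two colors differing mostly in the red channel collapse together:
print(confusable_pairs([[0.9, 0.2, 0.1], [0.5, 0.2, 0.1]]))  # -> [(0, 1)]
```

A neuroAI-driven tool would go one step further than this static check: rather than just flagging problems, it would search for a remapping that keeps the graphic’s meaning intact.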
Another example is helping people with learning disabilities, like dyslexia, which affects up to 10% of people worldwide. One of the underlying issues in dyslexia is sensitivity to crowding: difficulty recognizing shapes that share underlying features, including mirror-symmetric letters like p and q. Anne Harrington and Arturo Deza at MIT are working on neuroAI models of this effect and are getting some very promising results. Imagine taking models of the dyslexic visual system and using them to design fonts that are both aesthetically pleasing and easier to read. With the right data about a specific person’s visual system, we could even personalize fonts to an individual, an approach which has shown promise in improving reading performance. There are potentially large quality-of-life improvements waiting here.
Health

Many neuroscientists enter the field hoping that their research will positively impact human health, in particular for people living with neurological disorders or mental health issues. I’m very hopeful that neuroAI will unlock new therapies: with a good model of the brain, we can craft the right stimuli so the right message gets through, like a key fitting a lock. In that sense, neuroAI could be applied much like algorithmic drug design, except that instead of small molecules, we deliver images and sounds.
The most approachable problems involve the receptors of the eyes and ears, which are already well characterized. Hundreds of thousands of people have received cochlear implants, neuroprosthetics that electrically stimulate the cochlea, allowing the deaf or hard-of-hearing to hear again. These implants, which contain a few dozen electrodes, can be difficult to use in noisy environments with multiple speakers. A brain model can optimize the implant’s stimulation pattern to amplify speech. Remarkably, this technology, developed for people with implants, could be adapted to help people without implants better understand speech by modifying sounds in real time, whether they have an auditory processing disorder or are simply often in loud environments.
Many people experience changes to their sensory systems over their lifetime, whether recovering from cataract surgery or becoming far-sighted with age. We know that after such a change, people can learn to re-interpret the world correctly through repetition, a phenomenon called perceptual learning. We may be able to maximize this perceptual learning so that people regain their skills faster and more effectively. A similar idea could help people who have lost the ability to move their limbs fluidly after a stroke. If we could find the sequence of movements that strengthens the brain optimally, we might help stroke survivors regain more function, like walking more fluidly or simply holding a cup of coffee without spilling. In addition to helping people recover lost physical function, the same idea could help healthy people reach peak sensory performance, whether they be baseball players, archers, or pathologists.
Finally, we could see these ideas being applied to the treatment of mood disorders. I went to many visual art shows to relieve my boredom during the pandemic, and it lifted my mood tremendously. Visual art and music can lift our spirits, and it’s a proof-of-concept that we may be able to deliver therapies for mood disorders through the senses. We know that controlling the activity of specific parts of the brain with electrical stimulation can relieve treatment-resistant depression; perhaps controlling the activity of the brain indirectly through the senses could show similar effects. By deploying simple models – low-hanging fruit – that affect well-understood parts of the brain, we’ll get the ball rolling on building more complex models that can help human health.
Enabling technology trends
NeuroAI will take many years to be tamed and deployed in applications, and along the way it will intersect with other emerging technology trends. Here I highlight two trends in particular that will make neuroAI far more powerful: augmented reality (AR), which can deliver stimuli precisely; and brain-computer interfaces (BCI), which can measure brain activity to verify that stimuli act in the expected way.
The first trend is the adoption of augmented reality glasses. AR has the potential to become a ubiquitous computing platform because it integrates into daily life.
The hypothesis of Michael Abrash, chief scientist at Meta Reality Labs, is that if you build sufficiently capable AR glasses, everybody will want them. That means world-aware glasses that can create persistent, world-locked virtual objects; light and fashionable frames, like a pair of Ray-Bans; and real-life superpowers, like interacting naturally with people regardless of distance and enhanced hearing. If you can build these (a huge technical challenge), AR glasses could follow an iPhone-like trajectory, with everybody owning a pair (or a knockoff) within 5 years of launch.
To make this a reality, Meta spent $10 billion on metaverse R&D last year. While we don’t know for sure what Apple is up to, there are strong signs that it’s working on AR glasses too. So there’s also a tremendous push on the supply side to make AR happen.
This will make widely available a display device far more powerful than today’s static screens. If AR follows the trajectory of VR, these glasses will eventually have integrated eye tracking. That would mean a widely available way of presenting stimuli far more controlled than anything currently possible, a dream for neuroscientists. And these devices are likely to have far-reaching health applications, as Michael Abrash laid out in 2017, such as enhancing low-light vision or enabling people to live a normal life despite macular degeneration.
The significance for neuroAI is clear: we could deliver the right stimulus in a highly controlled way, continuously, in everyday life. This is true for vision and, perhaps less obviously, for hearing, since these devices can also deliver spatial audio. It means our tools for delivering neuroAI therapies, whether for people with neurological issues or for accessibility improvements, will become far more powerful.
With a great display and speakers, we can control the major inputs to the brain precisely. The next, more powerful stage in delivering stimuli through the senses is to verify that the brain is reacting in the expected way through a read-only brain-computer interface (BCI). Thus, we can measure the effects of the stimuli on the brain, and if they’re not as expected, we can adjust accordingly in what’s called closed-loop control.
To be clear, here I’m not talking about BCI methods like Neuralink’s chip or deep-brain stimulators that go inside the skull; it’s sufficient for these purposes to measure brain activity outside of the skull, non-invasively. No need to directly stimulate the brain either: glasses and headphones are all you need to control most of the brain’s inputs.
There are a number of non-invasive read-only BCIs that are commercialized today or in the pipeline that could be used for closed-loop control. Some examples include:
- EEG. Electroencephalography measures the electrical activity of the brain outside of the skull. EEG has high temporal resolution, but because the skull acts as a volume conductor, its spatial resolution is low. While this has limited consumer applications to meditation products (Muse) and niche neuromarketing, I’m bullish on some of its uses in the context of closed-loop control. EEG becomes much more powerful when one has control over the stimulus, because it’s possible to correlate the presented stimulus with the EEG signal and decode what a person was paying attention to (evoked potential methods). Indeed, NextMind, which made an EEG-based “mind click” built on evoked potentials, was acquired by Snap, which is now making AR products. OpenBCI is planning to release a headset that integrates its EEG sensors with Varjo’s high-end Aero headset. I would not count EEG out.
- fMRI. Functional magnetic resonance imaging measures the small changes in blood oxygenation associated with neural activity. It’s slow, it’s not portable, it requires its own room, and it’s very expensive. However, fMRI remains the only technology that can non-invasively read activity deep in the brain in a spatially precise way. Two paradigms are fairly mature and relevant for closed-loop neural control. The first is fMRI-based biofeedback: a subfield of fMRI research shows that people can modulate their own brain activity when it is presented back to them visually or through sound. The second is cortical mapping, including approaches like population receptive fields and estimating voxel selectivity with movie clips or podcasts, which let one estimate how different brain areas respond to different visual and auditory stimuli. Together, these methods hint that it should be possible to estimate how a neuroAI intervention affects the brain and steer it to be more effective.
- fNIRS. Functional near-infrared spectroscopy uses diffuse light to estimate cerebral blood volume between a light source and a detector. It relies on the fact that blood absorbs light, and that increased neural activity leads to a delayed influx of blood into a given brain volume (the same principle as fMRI). Conventional NIRS has low spatial resolution, but with time gating (TD-NIRS) and massive oversampling (diffuse optical tomography), spatial resolution is far better. On the academic front, Joe Culver’s group at WUSTL has demonstrated decoding of movies from the visual cortex. On the commercial front, Kernel is now making and shipping TD-NIRS headsets that are impressive feats of engineering. And it’s an area where people keep pushing and progress is rapid; my old group at Meta demonstrated a 32-fold improvement in signal-to-noise ratio (which could be scaled to more than 300-fold) in a related technique.
- MEG. Magnetoencephalography measures the small changes in magnetic fields produced by brain activity. MEG is similar to EEG in that it measures changes in the electromagnetic field, but it doesn’t suffer from volume conduction and therefore has better spatial resolution. Portable MEG that doesn’t require refrigeration would be a game changer for noninvasive BCI. People are making progress with optically pumped magnetometers, and it is already possible to buy individual OPM sensors on the open market from manufacturers such as QuSpin.
In addition to these better known techniques, some dark horse technologies like digital holography, photo-acoustic tomography, and functional ultrasound could lead to rapid paradigm shifts in this space.
While consumer-grade non-invasive BCI is still in its infancy, a number of market pressures around AR use cases will make the pie larger. Indeed, a significant problem for AR is controlling the device: you don’t want to walk around holding a controller or muttering to your glasses if you can avoid it. Companies are quite serious about solving this problem, as evidenced by Facebook buying CTRL-labs in 2019, Snap acquiring NextMind, and Valve teaming up with OpenBCI. Thus, we’re likely to see low-dimensional BCIs developed rapidly; high-dimensional BCIs might follow the same trajectory if they find a killer app like AR. It’s possible that the kinds of neuroAI applications I advocate here are precisely the right use case for this technology.
If we can control the input to the eyes and ears as well as measure brain states precisely, we can deliver neuroAI-based therapies in a monitored way for maximum efficacy.
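At its core, this kind of closed-loop control is just feedback control with a noisy sensor. A toy sketch: the “brain” below is a stand-in linear system with a gain unknown to the controller, the noisy readout stands in for a noninvasive BCI, and every name and constant is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

TRUE_GAIN = 2.5  # how the simulated brain scales the stimulus; unknown to the controller

def measure_response(intensity):
    """Stand-in for a noninvasive BCI readout: noisy measurement of brain response."""
    return TRUE_GAIN * intensity + rng.normal(scale=0.1)

target = 1.0     # the brain response the intervention is aiming for
intensity = 0.1  # initial stimulus setting
for _ in range(200):
    error = target - measure_response(intensity)
    intensity += 0.05 * error  # proportional controller: adjust toward the target

print(f"settled intensity: {intensity:.2f} (ideal: {target / TRUE_GAIN:.2f})")
```

Despite never knowing the gain, the controller settles near the ideal stimulus setting; the same loop, with a richer brain model and readout, is what a monitored neuroAI therapy would run.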
What’s missing from the field
The core science behind NeuroAI applications is rapidly maturing, and there are a number of positive trends that will increase its general applicability. So what’s missing to bring neuroAI applications to the market?
- Tooling. Other subfields within AI have benefited tremendously from toolboxes that enable rapid progress and sharing of results. These include tensor algebra libraries such as TensorFlow and PyTorch, training environments like OpenAI Gym, and ecosystems for sharing data and models like 🤗 HuggingFace. A centralized repository of models and methods, along with evaluation suites, potentially leveraging abundant simulation data, would push the field forward. There’s already a strong community of open-source neuroscience organizations, and they could serve as natural hosts for these efforts.
- Talent. Research and development at the intersection of neuroscience and AI happens in a vanishingly small number of places. The Bay Area, with labs at Stanford and Berkeley, and the Boston metro area, with numerous labs at MIT and Harvard, will likely see most of the investment from the pre-existing venture capital ecosystem. A third likely hub is Montreal, Canada, lifted by massive neuroscience departments at McGill and Université de Montréal, combined with the pull of Mila, the artificial intelligence institute founded by AI pioneer Yoshua Bengio. Our field would benefit from specialized PhD programs and centers of excellence in neuroAI to kickstart commercialization.
- New funding and commercialization models for medical applications. Medical applications have a long road to commercialization, and protected intellectual property is usually a prerequisite to obtain funding to de-risk investment in the technology. AI-based innovations are notoriously difficult to patent, and software-as-a-medical-device (SaMD) is only starting to come to the market, making the road to commercialization uncertain. We’ll need funds which are focused on bringing together AI and medical technology expertise to nurture this nascent field.
Let’s build neuroAI
Scientists and philosophers have puzzled over how brains work from time immemorial. How does a thin sheet of tissue, a square foot in area, enable us to see, hear, feel and think? NeuroAI is helping us get a handle on these deep questions by building models of neurological systems in computers. By satisfying that fundamental thirst for knowledge – what does it mean to be human? – neuroscientists are also building tools that could help millions of people live richer lives.
Posted August 4, 2022