AI Is Turning Phones Into Smarter Accessibility Tools. And It's Just Getting Started

On a sunny day in July, Kevin Chao and I sit on a breezy rooftop in Oakland, California, overlooking the downtown skyline. He pulls out a pair of Meta’s Ray-Ban smart glasses, puts them on and asks, “Describe what you see.”

It takes a bit of troubleshooting (the glasses don’t initially want to cooperate), but after a few tries, an AI assistant uses the built-in camera to describe a patio, with a black table and chairs facing a skyline.

That description checks out. Within a few seconds, Chao, who is blind, can get an overview of his surroundings, thanks to the Meta AI function on the glasses. All it takes is pairing them to his iPhone via the Meta View app.  

“I do a lot of outdoor stuff like rock climbing and skiing,” Chao, an accessibility advocate, says. In addition to helping him get the lay of the land, his Meta Ray-Bans are handy for snapping photos. “I just kind of look in the general direction, as opposed to worrying about my phone and the exact positioning,” he notes. Along with the glasses, he uses apps like Aira and Be My Eyes on his phone to tap both AI and sighted volunteers for details about his surroundings.

Meta AI’s description of the rooftop isn’t particularly detailed or nuanced — it doesn’t note that the chairs are made of wicker, for instance, or that there’s lavender bordering the roof. But the current capabilities and future promises of this technology demonstrate its potential to make digital accessibility more intuitive and helpful for people with disabilities. It can also help democratize accessible tech by eliminating the need for pricey add-on equipment, instead enabling people to use powerful tools right from their phones — or a $300 pair of glasses (not pocket change, but not totally cost prohibitive, either).

In recent years, companies like Apple and Google have tapped into the generative AI arms race to bolster their mobile accessibility offerings. Apple’s Live Speech feature, for instance, lets someone type what they want to say and then have it spoken aloud across their Apple devices, while Eye Tracking allows people to control their iPhone and iPad with just their eyes. Google is using AI to power features like Guided Frame, which helps blind and low-vision Pixel users snap well-framed photos via audio and haptic cues, as well as Lookout, which can identify objects and generate detailed image descriptions. Both tech giants have rolled out real-time captioning tools that can help deaf or hard-of-hearing people access audio content. 

One of the most prominent players in the generative AI space, OpenAI, has also partnered with Be My Eyes to launch Be My AI, which gives real-time, detailed descriptions of someone’s surroundings via a human-like voice. If someone is hailing a cab, for instance, Be My AI can tell them if the taxi light is on and where the driver is pulling over. It’s a remarkable demonstration of how AI can make assistive tech more personalized and accessible. And all it takes is an app and a smartphone.

Filling in the gaps

Screen reader technology like VoiceOver on iPhones and TalkBack on Android phones has been a game changer for many people who are blind or low-vision. The capability, which has been available on smartphones for well over a decade, can read aloud content on a person’s device and let them navigate their touchscreen via custom gestures.

Google supercharged its TalkBack screen reader in May by incorporating its Gemini Nano AI model for smartphones. Now, TalkBack can offer more detailed descriptions of unlabeled images, like the style and cut of clothes while online shopping. 

Adding that kind of nuance and detail is where AI can really make a difference, says accessibility specialist Joel Isaac, who is blind.

“There’s a real cognitive layer between what you see, and what you understand you’re seeing,” Isaac says. “For me, that cognitive gap is missing.” Getting a straightforward description of his surroundings or what’s on his screen can be helpful, but it doesn’t always paint the full picture. 

“AI has the potential to bridge that gap,” Isaac says. “Getting descriptions out of something like a ChatGPT [powered] device or something that comes afterwards, that’s really something.”

Speech recognition technology is also benefiting from the AI boom. Google’s Gemini assistant and Apple’s Siri have both become more conversational and context-aware, so they can respond to follow-up questions and offer more comprehensive responses. An Android beta app called Project Relate can connect to Google Assistant so the assistant better understands people with nonstandard speech. And a new Siri feature called Listen for Atypical Speech uses on-device machine learning to recognize a wider range of speech patterns, so the voice assistant can better decipher them.

The Siri update is the product of an initiative called the Speech Accessibility Project, a collaboration between the University of Illinois at Urbana-Champaign and prominent companies including Apple, Amazon, Google, Meta and Microsoft. The goal is to improve speech recognition for people with a range of speech patterns and disabilities. 

“One of the groups that would benefit the most [from speech technology] are people who have physical disabilities of many different kinds,” Mark Hasegawa-Johnson, project leader and professor of electrical and computer engineering at UIUC, told CNET in a previous interview. “And too often, those are the people for whom the speech technology doesn’t work.” 

As smart assistants become more ubiquitous and powerful, it’s even more imperative to ensure they support nonstandard speech, says Mary Bellard, Microsoft’s principal architect on accessibility. The companies involved in the initiative quickly realized the most effective way to make this happen was through collaboration.

“We all needed very similar data,” Bellard says. “We wanted to make sure that a person with a disability, regardless of whatever technology they choose or need to use, is going to have a better experience for speech recognition.”

The power of personalization 

AI can give accessibility features a more personalized, authentic touch. Along with Apple’s Live Speech feature, which speaks typed-out phrases aloud, there’s also Personal Voice, which lets users who are at risk of speech loss create a synthesized voice that sounds like them. After someone trains the feature by reading a series of text prompts aloud, their iPhone or iPad can generate a nearly identical voice powered by on-device machine learning.

endever* corbin, who is semi-speaking autistic, says they use Personal Voice every day.

“I’m nonbinary, so it’s super hard to find a digitized voice that sounds like me,” corbin says. “Virtually every voice available is very high pitched, very low pitched, very cisgender coded or very young sounding.” 

Personal Voice offers a welcome alternative. “I want my voice to be midrange, trans-coded, adult sounding and roughly matching the accent I grew up around. … Prior to the release of Personal Voice, the only options for voice banking were quite expensive, and of course I had no idea whether I’d like the result. So Personal Voice is really a game changer.”

Google’s Project Relate app is designed to help people with nonstandard speech more easily communicate with others by transcribing what they say and restating it using a computerized voice. The Android app can also be custom-trained on people’s unique speech patterns.

Dimitri Kanevsky is a research scientist at Google DeepMind, the company’s AI research lab. He says he uses Project Relate in all his meetings and while giving presentations; in fact, he used it while chatting with me over Google Meet. On one half of the screen, I saw Kanevsky, and on the other, I saw the live transcriptions generated by Project Relate. It wasn’t always precise (sometimes he had to repeat himself), but in general, the app did a solid job of interpreting his speech. 

Kanevsky envisions a future where apps like Project Relate are built into glasses, for a more seamless and nuanced back and forth. 

“When you communicate with people, you’re still missing their expression if you’re looking at your phone,” he notes.

Andrea Peet, president and founder of the Team Drea Foundation, which raises money for ALS research, also uses Project Relate when giving presentations and to draft emails and texts. As someone with ALS, she says the app is much faster than using an eye gaze computer, which lets people control a device with eye movements, rather than a keyboard or mouse. 

She connects Project Relate to her Google Home, so the smart speaker can better understand her speech and let her carry out commands like turning on the lights and adjusting the thermostat.

“It is so much easier and more efficient than having to get up for every little thing, and I can save my time and energy for movement for tasks that are more meaningful,” Peet says. “And it helps me preserve my independence and safety since I don’t have to rely on others for so much help.”

It’s just one example of how AI advancements can make tasks both big and small easier for people with disabilities.

“For most people, I can see how the current AI tools seem like a shiny new way to cut corners and increase productivity,” Peet says. “But I genuinely hope that AI will enable people with disabilities to participate more fully in society.”

Tech for everyone’s benefit

One of the best side effects of accessible tech is that it ends up benefiting everyone, not just people with disabilities. Take closed captions, for example: many people have become reliant on them, especially as movie and TV dialogue becomes harder to decipher. Other features like dark mode, which can improve readability for some people, and text-to-speech (now a staple on social media apps like TikTok) have also become commonplace. 

“We all use accessibility features every day. We may not realize it, but we do — be it color contrast, pinch and zoom on a phone [or] increasing the size of the text on your screen,” says Eamon McErlean, vice president and global head of accessibility at ServiceNow. “Companies are realizing: ‘By default, if we focus on accessibility, it helps all users.'”

AI features that weren’t created solely for accessibility can still prove beneficial in that area. For instance, Google’s AI Overviews feature, which summarizes search results in a short blurb, has been a huge help to people like Sean Dougherty, director of Accessible User Experience at San Francisco’s LightHouse for the Blind and Visually Impaired. Instead of sifting through pages of Google Search results with a screen reader, Dougherty, who is low-vision, can consult the AI Overview at the top of the page to get a brief summary of what he’s looking for.

“That does close the gap for finding information, which is useful for everybody,” he says, “but when you’re someone with a disability, it makes things a lot easier and more efficient.” 

And now, with the rollout of Google’s Gemini Live, people can also interact with an AI model using just their voice, meaning they don’t have to rely solely on text input to get what they need. That can make for a more intuitive and streamlined experience. (ChatGPT has a similar feature called Advanced Voice Mode, which also lets users converse with AI.)

Self-driving cars can also offer people greater autonomy when hailing a ride, especially as companies incorporate more digital accessibility features. 

As an early tester for Waymo, the self-driving arm of Google’s parent company Alphabet, Dougherty notes one of the challenges was pinning down an arriving car without a human driver to communicate with. Now, the Waymo One app includes a directional GPS capability that can help guide someone to their ride. By using VoiceOver on iOS or TalkBack on Android, blind and low-vision users can get real-time directional feedback spoken aloud to help them track the vehicle’s location and distance. Once they’re in the car, they can use their phone to play music or enable turn-by-turn GPS to get more details on where they’re going.

AI tools can also afford users more privacy if they need assistance with sensitive documents or information. Someone may not want to share tax or financial information with another person, for instance, but might feel more comfortable using a secure and encrypted AI-powered app to scan and read those documents. 

That’s not to say there aren’t times when a human connection is still valuable. After all, Be My AI continues to give users the option to reach out to human volunteers too. 

“Sometimes there is value in connecting directly with another human,” Dougherty says. “Because there are so many accessibility barriers that still exist — whether they’re physical barriers or barriers in digital spaces that aren’t optimized — [people in the disability community] are used to reaching out to individuals around us that can help and support.” 

Ultimately, what’s important is ensuring technologies can work in different ways for different people, says Joe Devon, a web accessibility advocate and co-founder of Global Accessibility Awareness Day.

“It might be a lot easier to build something that works for the average person,” Devon says, “but if you go after all the edge cases and you get the edge cases to work, then that’s when new technology is on another level.”

Building a more accessible future

Much of the current work to improve digital accessibility starts with remedying a long-standing problem: people with disabilities are too often overlooked. And when it comes to AI, accessibility advocates want to ensure history doesn’t repeat itself, especially as the technology rapidly evolves.

“AI models are trained based on data, and that data includes all kinds of biases that can impact marginalized communities, especially people with disabilities,” says Ed Summers, head of accessibility at GitHub, who is blind. “We have a real challenge to identify the ground truth for accessibility and the training data that we use to build models.”

To achieve that, it’s important that companies heed a mantra frequently invoked in the disability community: “Nothing about us without us.” That means employing people with disabilities and consulting them when building products and tools or writing code.

“If there are decisions that are going to be made about us,” Summers says, “we need a seat at that table.”

In addition to all the common anxieties related to AI misinformation and data privacy, there’s an underlying fear that rapid advancements will lead to some groups being overlooked — again.

“As with all things AI, I know there is potential for misuse of data,” Peet says. “But I’m actually much more worried that the technological leaps will happen so fast that no one will be paying attention to who is being left behind because there just isn’t the bandwidth to focus on accessibility.”

AI’s ability to quickly generate code is just one example. Because the models pull from code that’s already out there, and much of that source material isn’t accessible to start with, the code they generate often isn’t, either.

“Garbage in, garbage out,” says accessibility specialist Taylor Arndt, who is blind. She notes it’s imperative for coders to learn to write accessible code and check their work, and for educational institutions to prioritize accessibility in their curricula.
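To make that concrete, here is a minimal sketch, in Python using only the standard library, of the kind of automated check Arndt describes: it flags two common problems that AI-generated markup often inherits from inaccessible training data, namely images with no alt text and buttons with no accessible name. The class and rules below are hypothetical illustrations, not the implementation of any real auditing tool or of Arndt’s custom GPTs; production audits rely on far larger rule sets.

```python
# A toy accessibility linter: scans HTML for two issues that screen readers
# like VoiceOver and TalkBack can't work around on their own.
# Hypothetical example code, not taken from any real auditing tool.
from html.parser import HTMLParser


class AccessibilityChecker(HTMLParser):
    """Flags <img> tags with no alt attribute and <button> tags with no accessible name."""

    def __init__(self) -> None:
        super().__init__()
        self.issues: list[str] = []
        self._in_button = False
        self._button_named = False

    def handle_starttag(self, tag, attrs):
        attr_dict = dict(attrs)
        if tag == "img" and "alt" not in attr_dict:
            # alt="" is allowed for purely decorative images; a missing alt
            # attribute leaves a screen reader with nothing to announce.
            self.issues.append(f"<img src={attr_dict.get('src', '?')!r}> is missing alt text")
        if tag == "button":
            self._in_button = True
            # An aria-label gives the button a name even without visible text.
            self._button_named = bool(attr_dict.get("aria-label"))

    def handle_data(self, data):
        if self._in_button and data.strip():
            self._button_named = True  # visible text labels the button

    def handle_endtag(self, tag):
        if tag == "button":
            if not self._button_named:
                self.issues.append("<button> has no accessible name")
            self._in_button = False


checker = AccessibilityChecker()
checker.feed('<img src="chart.png"> <button><span class="icon"></span></button>')
print(checker.issues)
# ["<img src='chart.png'> is missing alt text", '<button> has no accessible name']
```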

Organizations like Teach Access, a collaboration between industry partners and university faculty, offer free programs for students and educators to learn about accessibility. Arndt and others have also created custom GPTs in OpenAI’s GPT Store that can check code for accessibility and fix any issues. She predicts the rise of AI will lead to a shift in how we scrutinize code.

“In 10 years, we’re not going to be critiquing humans for accessibility auditing anymore,” Arndt says. “We’re going to be critiquing AI development.”

As for the future of AI and what’s possible, Summers has high hopes. He dreams of a world where AI robots can do more complex tasks like ordering food from the grocery store, preparing a meal and then cleaning up afterward. That would be helpful not just to someone who’s disabled, but to anyone who wants a hand with everyday tasks, he notes. 

Peet envisions a world where household devices understand nonstandard speech, so she can preheat the oven, check what’s in the fridge or answer her Ring doorbell as seamlessly as people with standard speech can with modern-day tech. 

“It’s a novelty and convenience for most people, but it would be an absolute game changer for people who struggle with mobility and speech,” she says.

In the meantime, today’s rapid AI developments are already opening up a world of possibilities for people like Chao, the accessibility advocate who demoed the Meta Ray-Bans for me on the Oakland rooftop. He now shows me how he uses apps like Aira and Microsoft’s Seeing AI, which use audio output to describe the phone and sunglasses sitting on the table.

Using Seeing AI, Chao snaps a picture of me, and the app proceeds to describe me as a “26-year-old woman wearing a hat looking happy.” As someone who is actually 30 years old and wears a hijab, I’m delighted by this response. In a way, it offers a snapshot of what AI is capable of, its limits and how long the road ahead is to more accurate, nuanced responses. 

Chao also has grand visions for what AI can someday offer. Perhaps it’ll eventually serve as more of a scout that can describe hiking terrains and rock climbing routes, “so I’m not limited or dependent on volunteers or paid professionals to help adapt sports for me.”

It may only be a matter of time.


