Smart Glasses Use Cases Master List

The Big List of Smart Glasses Use Cases

Author: Cayden Pierce


This is a list of use cases for all types of smart glasses. There are many, many more possibilities than just what I list here, but this is a start to get one thinking about the possibilities of smart glasses immediately in our daily lives. I’m pretty certain that in the next 5-10 years, we’ll see a startup show up for every single smart glasses use case (i.e. app) on this list.

This is a work in progress. There are some errors, please report anything inaccurate.

General

WearableDisplay

Basically, people want to consume the things they’d do on their phone/laptop, but on the smart glasses display.

This is very similar to the WearableWebBrowserAndReader, but many people are talking about watching streams/movies/videos. The WearableWebBrowserAndReader is a bit more focused on the interface and interaction for reading content.

Assistive

BlindVision

The following tools can be combined to restore many visual skills to the blind:

WearableTextReadingOCR


WearableRecognizerIdentify


WearableRecognizerFind

WearableTextReadingOCR

For the blind, read text out loud to increase independence and understanding of the world around oneself.

Tools

LifeTimer_SmartGlasses

Time everything that you do, every single day, with simple annotation voice commands that start a timer for whatever event you have queued up

Turn routine, necessary actions into algorithms, and show the algorithm steps live on the smart glasses beside a live timer

If we turn life into an algorithm and get very good at following algorithms as recommended by the smart glasses system, then we can make our very lives programmatic – literally Turing-complete capability to modify any part of one’s day-to-day life. One could greatly optimize how they live their life.

WearableWebBrowserAndReader

This can essentially be thought of as a wearable web browser.

Read while you walk, run, do errands, commute, etc.

Not only read, but have direct input to the computer, so you could read a static source (e.g. book) or just as easily be actively searching the web.

It’s for use at times when it’s not realistic to get a lot of human output, past where one would use their phone, i.e. a time when you can read, listen, and observe, but not actively create and write.

A wearable web browser would really be the best: always-available, any-time Google (the only really working interface) right in front of you.

Text-to-speech should read aloud what you select – see Thirst for Google Glass –

https://mashable.com/archive/thirst-google-glass-app

Make it easy to save content, bookmark content, send it to your computer, and share it with friends

Natural language information query

Ask a straightforward question. Note that all of our functions will eventually be natural language queries, but this is a specific, broad class of information-based questions.

Image Search

Search for images based on a text query. Learn what something looks like that you don’t know.

HUD videoconferencing

Ability to video call people while you are in a scenic, calm location, handsfree.

Basically just this but done well in a comfortable, long lasting pair of glasses:

http://vcai.mpi-inf.mpg.de/projects/EgoChat/

This might be just a great way to impress people – imagine calling a VC or CEO while walking through the woods with this tech.

Shopping price checker

When in the store, whenever I look at a product, I want to be able to see the price of that product at many other stores. This will tell me if the product is outrageously overpriced and I should buy it somewhere else, or perhaps it’s a good deal and I should get 2 while it’s on sale.

The search could be automated and the glasses could read the price from other stores, find the average, compute the “bargain score” of every item, and overlay that score on the grocery items one sees.
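As a rough sketch of how the “bargain score” could be computed (the price data source and the normalization below are assumptions, just to illustrate the idea):

# Sketch: score how good a deal an item is, given its in-store price
# and prices found at other stores. Higher score = better deal.
def bargain_score(local_price: float, other_prices: list[float]) -> float:
    if not other_prices:
        return 0.0  # no comparison data, treat as neutral
    avg = sum(other_prices) / len(other_prices)
    # positive when below the average of other stores, negative when above
    return (avg - local_price) / avg

# Example: item costs $4.99 here; $5.49, $6.29, $5.99 elsewhere
print(f"bargain score: {bargain_score(4.99, [5.49, 6.29, 5.99]):+.2f}")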

Multi-media notes

This is simpler than MXT cache, tag bins, etc. This is just a single list of notes you can save easily with a voice command

When it comes down to it, these are all the same thing, and it’s all about the UI that gives it back to you

https://zackfreedman.com/2013/03/08/the-five-lowest-hanging-fruit-of-google-glass/

This is sort of done well on smartphones already. However, the egocentric, always-on capability of smart glasses makes it much faster and less obtrusive.

Add calendar entry with voice

simple voice ui to specify a title and time – never again forget to enter information in a calendar, do it mid-convo

Could be mid-conversation, during a run, when you remember you need to do something tomorrow and don’t want to change tasks, etc.

HUD Cooking

Read recipes aloud

Display the recipe algorithm steps overlaid on your vision

Voice commands to move through the recipe

IoT connection to your cookware plus an AR display to create an AR NUI for understanding those cooking instruments

Sample UI:

https://www.linkedin.com/posts/laurencason_ar-mr-xr-activity-6876256868193316866-4PBA

DefineWordsWhileReadExplainWhatYouRead

AR overlay which defines words you don’t know when you read or hear them.

The system would have a vocabulary model of all the words that you know (which you’ve used or been exposed to many times) and a model of what words are more rare, to score words based on how likely it is that you don’t know them.

Then, words you don’t know can be automatically defined when one is exposed to them, with no need to request a definition from the computer.

An early application of wearable BCI could watch for ERPs that indicate novelty/surprise (e.g. P300) which could signal that a word heard was not understood.
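A minimal sketch of the vocabulary model described above, assuming the wordfreq package for a general rarity estimate and a personal exposure count built from the user’s own transcripts (the thresholds are illustrative):

# Sketch: decide whether a word is likely unknown to the wearer by
# combining general rarity with how often the wearer has used/heard it.
from wordfreq import zipf_frequency  # assumed dependency

personal_exposure = {"glasses": 52, "latency": 9}  # word -> times seen/said

def likely_unknown(word: str, lang: str = "en") -> bool:
    w = word.lower()
    rarity = zipf_frequency(w, lang)  # ~7 = very common, <3 = rare
    # illustrative thresholds: rare word, rarely (or never) encountered before
    return rarity < 3.5 and personal_exposure.get(w, 0) < 3

if likely_unknown("sesquipedalian"):
    print("show definition overlay")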

Science

ARDataVisualization

Visualizing scientific and business data in 3D, 4D, and higher-dimensional spaces. Smart glasses will allow one to go past a computer monitor, adding an extra dimension to one’s sensory perception. This has already been shown effective by numerous studies.

Navigation

HUD Navigation

A map overlaid on your vision with heads-up directions. Voice command or the mobile phone could be used to enter the destination, which is then displayed on the glasses.

A smart watch pairs well with this, e.g. with a heart rate overlay.

I personally get super mad when biking and having to constantly stop to check my phone. Or worse, when running and having to stop the run, take off gloves, pull the phone out of my pocket, etc.

Further, this would help you stay in the moment, as the nav could disappear and only reappear if you go off course

HUD indoor navigation

When inside a building, one wants to be able to ask where a specific location within that building is, and be guided on how to get there.

For example, being in a grocery store, one might ask where the peanut butter is, and the glasses could tell you where to go.

Better: put in a shopping list before you start, the system defines a route, the route is overlaid on your glasses, and when you get to the product you want to buy, it is highlighted and pointed at in AR (a simple route-ordering sketch follows below).

Or in a mall, find the store you need.

Or find the nearest washroom
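A minimal sketch of the shopping-list routing mentioned above, using a nearest-neighbour ordering over hypothetical item coordinates (a real system would use the store’s indoor map):

# Sketch: order a shopping list into a walking route.
import math

aisle_locations = {  # item -> (x, y) position in the store, in metres (made up)
    "peanut butter": (12, 3),
    "bread": (2, 8),
    "milk": (20, 15),
}

def plan_route(items: list[str], start: tuple[float, float] = (0, 0)) -> list[str]:
    remaining, pos, route = set(items), start, []
    while remaining:
        nearest = min(remaining, key=lambda i: math.dist(pos, aisle_locations[i]))
        route.append(nearest)
        pos = aisle_locations[nearest]
        remaining.remove(nearest)
    return route

print(plan_route(["milk", "peanut butter", "bread"]))
# -> ['bread', 'peanut butter', 'milk']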

Places this would work well:

Sensing

POV define what you see

I am having a hard time finding a good api for defining whatever I am looking at.

Google Lens is the prime example – it’s AMAZING and it can identify ANYTHING you throw at it, almost uncannily accurate, and it can pull up the exact make and model of anything you scan given a good picture, good lighting, and proper cropping.

I can’t find a Google API or otherwise that can do this… I’m now asking around.

One thing I know – we need the human in the loop, because even Google Lens (the best one I’ve found) will have half of the results not be what I am searching for… we need to do what is called a “visual search”, then present the user with the results, let them pick the top 1-3 results, and then provide links/info/names/object-rec for those picked.

NOTES:

Overlay of satellite overhead view of current location

See your exact location, looking down from above, in a satellite view.

Extend your experience and awareness of the 3D environment to an expanded area and scope.

Help improve your spatial awareness

Appreciate the beauty and the size of the place where you are

Could explore drone/camera, topo, weather, map, etc. different views to help you understand your environment

BrainAndBodySensingSmartGlasses

Smart glasses are the ideal location for biosensing because they allow for both body and brain sensing, capturing almost all of the same body signals that a wrist-worn sensor can capture.

Wearable sensing of the body has already proven itself extremely important in terms of physical health and performance. Wearable brain sensing is now doing the same thing for mental health.

Self-sensing

what is my speaking rate?

Use speaker diarization, facial recognition, and ASR to recognize when you are in a conversation and figure out your WPM speaking rate
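A minimal sketch of the WPM calculation, assuming an upstream ASR + diarization step has already produced speaker-tagged segments with timestamps:

# Sketch: estimate the wearer's speaking rate from diarized ASR segments.
def speaking_rate_wpm(segments: list[dict], speaker: str = "me") -> float:
    words, seconds = 0, 0.0
    for seg in segments:
        if seg["speaker"] != speaker:
            continue
        words += len(seg["text"].split())
        seconds += seg["end"] - seg["start"]
    return 60.0 * words / seconds if seconds else 0.0

segments = [
    {"speaker": "me", "text": "so the demo went really well", "start": 0.0, "end": 2.4},
    {"speaker": "other", "text": "nice what changed", "start": 2.4, "end": 3.9},
]
print(f"{speaking_rate_wpm(segments):.0f} WPM")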

where do I spend the most time

detailed map-based view of your location as a time-density plot – understand where you spend the most time, when you spend your time there, daily, weekly, monthly, yearly patterns, etc.
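One way to build the time-density view, sketched under the assumption of a fixed GPS logging interval (cell size and interval are illustrative):

# Sketch: bin location fixes into a coarse grid of time-spent-per-cell.
from collections import defaultdict

CELL = 0.001         # grid cell size in degrees (~100 m at mid latitudes)
SAMPLE_SECONDS = 60  # assumed fixed logging interval

def time_density(fixes: list[tuple[float, float]]) -> dict:
    grid = defaultdict(float)
    for lat, lon in fixes:
        cell = (round(lat / CELL), round(lon / CELL))
        grid[cell] += SAMPLE_SECONDS
    return grid  # cell -> seconds spent there

density = time_density([(45.4215, -75.6972), (45.4216, -75.6971)])
print("most-visited cell:", max(density, key=density.get))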

Exercise+fitness mode

Fitness/sports assistant providing you with live fitness and sport related information.

All physically demanding sports could feature a live overlay of physiological fitness metrics:

Sports involving movement (running, biking, etc.) could feature live HUD Navigation with satellite overview, distance tracking, and speed tracking on the HUD

Sport specific metrics could include:

Control the media you are consuming while you exercise;

Heart overlay

Biosignals overlaid on your vision, immediately available at any time that you request them.

Alerts if your heart rate is abnormal – for example goes very high when you aren’t exercising.

This is like Exercise+fitness mode, but you would access the biosignals at times when you aren’t exercising
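A minimal sketch of the abnormal-heart-rate alert, assuming some activity classifier (e.g. IMU-based) supplies the is_exercising flag; the threshold is illustrative:

# Sketch: flag a high heart rate while the wearer is not exercising.
RESTING_ALERT_BPM = 120  # illustrative threshold

def heart_alert(bpm: float, is_exercising: bool) -> str | None:
    if not is_exercising and bpm > RESTING_ALERT_BPM:
        return f"Heart rate {bpm:.0f} bpm while at rest"
    return None

msg = heart_alert(bpm=132, is_exercising=False)
if msg:
    print(msg)  # would be shown on the HUD instead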

Communication / Conversational

Define a word live while it’s being said

During a conversation, when someone says a word I don’t know, I want to be able to define it immediately.

This may be a highly abstract word, requiring a semantic definition, maybe it’s a thing that is better shown (image), or maybe it’s somewhere in between. The wearable should decide and display the correct thing.

The system should know what I know (know my vocabulary) and thus should be able to guess with high certainty if I will need a word defined.

Live language translation

Two people who speak different languages can each wear a pair of smart glasses and use live language translation to have a full conversation.

Live language translation is a computational pipeline that runs:

(microphone) -> (Foreign language ASR) -> (translate to native language) -> display output (or TTS audio)
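The same pipeline as a code skeleton; the three stage functions are stubs standing in for whatever ASR, machine translation, and HUD/TTS components are actually used (all hypothetical):

# Sketch: live translation loop over microphone audio chunks.
def transcribe_foreign(chunk: bytes, lang: str) -> str:
    return "bonjour tout le monde"  # stub: real ASR goes here

def translate(text: str, src: str, dst: str) -> str:
    return "hello everyone"  # stub: real machine translation goes here

def show_caption(text: str) -> None:
    print("[HUD]", text)  # stub: draw on the glasses display or speak via TTS

def run_pipeline(mic_chunks, src: str = "fr", dst: str = "en") -> None:
    for chunk in mic_chunks:                        # (microphone)
        text = transcribe_foreign(chunk, src)       # (foreign language ASR)
        show_caption(translate(text, src, dst))     # (translate) -> display/TTS

run_pipeline([b"\x00" * 3200])  # one fake audio chunk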

Live fact checker

Constantly run a fact checker on the transcript stream, figure out what is true and what is false, and if anything comes up as REALLY wrong – raise a red flag and pull up information that refutes it

Presentation

WearableTeleprompter

It is what it says. This is the WearablePresenter, but with just a teleprompter for when you want to read exactly what’s on the screen

Use ASR so the text automatically scrolls to follow the person as they read.
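A minimal sketch of ASR-driven scrolling: match the latest recognized words against the script and advance the highlight position (the small look-ahead window is an illustrative choice):

# Sketch: advance the teleprompter to follow what the speaker has said.
def next_script_position(script_words: list[str],
                         spoken_words: list[str],
                         current: int) -> int:
    pos = current
    for word in spoken_words:
        # look a few words ahead so small misreads don't stall the scroll
        window = [w.lower().strip(".,") for w in script_words[pos:pos + 5]]
        if word.lower() in window:
            pos += window.index(word.lower()) + 1
    return pos

script = "Thank you all for coming to the demo today".split()
print(next_script_position(script, ["thank", "you", "all"], current=0))  # -> 3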

Use cases:

WearablePresenter

This could be possible to use anywhere if one has a WearableVisualProjector

This has been started at

https://github.com/caydenpierce/semanticwebspeech

Social

AffectiveComputingEQSmartGlasses

Live in a human-machine feedback loop where the computer co-processor provides you with insights into the body language being displayed around you.

This functionality is useful to anyone who wishes to improve their emotional intelligence by better understanding the affective state of others around oneself. Combining body language, facial expressions, tone of voice, language sentiment, and more, one can generate an affective summary of others in the room. Then, the system can feed the user specific useful information, such as acute stress responses, confidence, etc.

This use case is especially useful for autistic people who may have a hard time understanding affective communication, as it makes quantitative what is otherwise qualitative information.

[1]: Picard –

https://mitpress.mit.edu/books/affective-computing

Wearable Face Recognizer

A feature of the Personal Person Database.

Recognize faces and tell the user the name of the person they’re looking at.

Could pull up information about that person such as:

Option to add all the people you’ve seen before but weren’t recognized

option to add people that you’ve never seen before

Should run on ALL faces in the image at once – in case you don’t see someone you know in a crowd

Never forget names.

Who do I know that can help?

This ties into the Personal Person Database

When I am stuck on a problem, or trying to figure out how to do something, or operating in an area that I don’t know a lot about, the best thing to do is to “go fishing” and ask my network for help + guidance

If I had constantly-on audio, continuously embedded transcripts/conversations, and constant facial recognition, I could put together a database that creates profiles of the expertise, knowledge, and skills of everyone I know. I could then make queries to this knowledge/skills human network representation and ask things like:

This should obviously tie into LinkedIn, personal websites, resumes, etc. as more information about each person from which to deduce their skill areas

What is this person like?

This fits in with the Social Graph and Personal Person Database

I want to be able to pull up a profile of someone I know and see an overview of everything I know about them.

Like a generative LinkedIn, but also with affective information, personal/relationship information, historical information, etc.

Could tell me what we usually talk about

A timeline of every time I’ve seen them

How they react to me (affective information)

Their personality profile as extracted from online, offline, and personal interactions, etc.

Eventually would be able to predict what they would like, dislike, how they would respond to things, etc.

HumanTimers

Set a human timer – any time you think of something you want to discuss with a specific person, make a short note of it and set it as a human timer. Then, next time you see that person, the note will popup under “Human timers” – a reminder of the things you wanted to discuss.

A feature of the Wearable Face Recognizer, probably

WearableConversationBank

Show a list of all the conversations you’ve ever had.

Conversations can be sorted by time of course, but they can also be sorted by semantic information (who you were with, what you were talking about, where you were, what the weather was like, the rough time of day)

Opening a conversation will show the raw transcript, and a list of topics.

POVStreamerGlogMultimodalWearable

Live stream to Twitch or Youtube

See the live chat overlaid on your glasses

Glogging – stream multimodal sensor streams to others.

Twitch would probably be the best low-hanging fruit – send Twitch an RTMP/RTSP stream.

Lots of people want this, it seems.

Semantic Memory + Knowledge

Contact + phone number memory

Input a phone number into your device hands free and add a new contact using voice command on smart glasses.

Remembrance Agent

Listen to (transcribe) everything that’s being said, and pull up relevant info from my own personal knowledge database (eventually a personal knowledge graph) that is closely related to what is being discussed. This would remind me of things I know that are closely related, including:

WearableConversationSummarizer

After a conversation, I want to be able to review a summary of that conversation or have it read out loud.

I should be able to quickly/easily search through past convos (on phone or glasses), pick one, and get the summary/highlight reel/most important points.

(MODIFIED QUOTE) EXCERPT FROM RELATED NOTE Meeting minutes wearable:


Example: “we were at Bria’s house and I was chatting with Britnney and Matt about 2 weeks ago. “

The user need only filter the conversations down to an acceptably sized list, and can then use a smart glasses UI to pick the correct conversation from that list, using the thumbnail and metadata to identify it.

When a past convo has been selected, the system automatically displays and/or reads out loud a summary of the conversation, with a user setting defining the length of the summary (longer is more detailed, shorter is more abstract).

This could be a memory upgrade of the most educational and important time we spend – communicating with other people.

The metadata by which to remember conversations:

Meeting minutes wearable

Similar to WearableConversationSummarizer, but this would be for industry.

People try to take meeting minutes. But the most valuable exchanges often aren’t in meetings per se, but in the many, many microinteractions in a day (when one is working inside of a company). Imagine a wearable that could create notes, subconsciously, of every one of those conversations. Whenever the user wants to remember a conversation, they can pull it up on their glasses with a natural language memory query.

Example: “we were at Acme’s headquarters and I was talking with Matt Smith about 2 weeks ago.”

The user need only filter the conversations down to an acceptably sized list, and can then use a smart glasses UI to pick the correct conversation from that list, using the thumbnail and metadata to identify it.

When a past convo has been selected, the system automatically displays and/or reads out loud a summary of the conversation, with a user setting defining the length of the summary (longer is more detailed, shorter is more abstract).

This could be a memory upgrade of the most educational and important time we spend – communicating with other people.

The metadata by which to remember conversations:

MXTCache_UseCase

A running cache which you can fill with all of your ideas related to the thing that you are currently thinking about.

One would keep making new caches all the time; when you are done thinking through an idea (e.g. at the end of a walk), you close the previous cache with a voice command.

Next time you want to remember something, you should be able to view a quick highlight reel / automatically generated summary of whichever previous cache you want to pull up

PKMS_SmartGlasses

A personal knowledge management system is used to record information in a way that leads to better ideas, memory, knowledge connection/cohesion, etc. Current systems are largely laptop-based, with smartphones serving as an input interface while on the go. By bringing a PKMS onto your face with smart glasses, one can manage all of their knowledge, because smart glasses can see what you see and hear what you hear.

HUD TODO list

Voice commands to add items to the TODO list

simple voice command to pull up list, auto scroll, use voice command to scroll, or scroll on glasses touchpad

delete items with command

Upload the TODO list to your computer / PKMS

One of the real benefits and groundbreaking new HCIs here will be the contextual TODO – what if the TODO list always sensed what you were doing, where you were, the urgency/priority of your TODOs, etc., and combined all of that info to figure out, contextually, when the best time to give you your TODO list is?
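A minimal sketch of that contextual scoring, with made-up weights and tags just to illustrate combining urgency, available time, and the current context:

# Sketch: score TODO items against the current context and surface the best one.
def score(item: dict, context: dict) -> float:
    s = item["urgency"]                                        # 0..1
    s += 0.5 * len(set(item["tags"]) & set(context["tags"]))   # context match
    if context["free_minutes"] >= item["minutes_needed"]:      # enough time now?
        s += 0.5
    return s

todos = [
    {"title": "email landlord", "urgency": 0.8, "tags": ["phone"], "minutes_needed": 5},
    {"title": "buy batteries", "urgency": 0.4, "tags": ["errand", "store"], "minutes_needed": 10},
]
context = {"tags": ["store"], "free_minutes": 15}  # e.g. walking past a store

best = max(todos, key=lambda t: score(t, context))
print("surface on HUD:", best["title"])  # -> buy batteries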

Sometimes, things enter your head, and you need to just deal with them later. They’re too important to trust to memory alone, and too important to just be another note in a list. But you don’t want to or can’t spend the time to think about them right now. So, there should be a way to throw off items into a todo list with a rough deadline, and when you have some free time soon, pull them up to deal with those things.

An example use case is to “throw off” all of the things you really need to do asap but can’t do today, and then load this note while you plan tomorrow’s schedule, and you’ll know exactly what you have to fit in.

Episodic Memory

Where is my x? Did I do x?

where are my keys?

Where did I park my car?

Where is my wallet?

Did I eat breakfast this morning?

What did I eat for dinner last night?

What restaurant did we go to last time we met?

Smart glasses can answer these questions by running object, scene, person, etc. recognition on one’s POV at all times. Then, when a question about the past is requested, a semantic search/QA NLP system can be used to return the answer.
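A minimal sketch of the retrieval step for the “where is my x?” questions, assuming an upstream recognizer already logs (timestamp, object, place) events; a real system would use semantic search/QA rather than an exact lookup:

# Sketch: answer "where are my keys?" from a log of POV recognition events.
from datetime import datetime

event_log = [
    {"t": datetime(2023, 5, 1, 8, 12), "object": "keys", "place": "kitchen counter"},
    {"t": datetime(2023, 5, 1, 9, 3), "object": "wallet", "place": "hallway table"},
    {"t": datetime(2023, 5, 1, 9, 40), "object": "keys", "place": "jacket pocket"},
]

def last_seen(obj: str) -> str:
    sightings = [e for e in event_log if e["object"] == obj]
    if not sightings:
        return f"I haven't seen your {obj}."
    latest = max(sightings, key=lambda e: e["t"])
    return f"Last seen: {latest['place']} at {latest['t']:%H:%M}."

print(last_seen("keys"))  # -> Last seen: jacket pocket at 09:40.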

Art

MultimediaContentGenerationWearable

Egocentric audio, video, and biosensing all come together to form a stream of information that can capture the unique view of the user.

Simple use case is just taking a picture when a voice command is given.

Voice commands to take a picture, start/stop audio recording, start/stop video recording, show a viewfinder, review a video that was taken, etc.

This is a content generation device for artists, youtubers, industry making training videos, etc.

Guitar generative tab smart glasses overlay

Put on a backing track and start playing guitar with smart glasses on. The glasses will listen to the music, generate multiple possible guitar parts to accompany it, convert them to tab, and overlay the tab on your glasses.

This would create a human-computer-instrument system where sometimes the human would improvise, and sometimes the human could interface/work with the computer on what to play, by choosing the best generated tab that the smart glasses are presenting, live, for the current piece.

Industry

WearableQuickReaderInformationStream_UseCase

This is WearableWebBrowserAndReader and WearableDisplay, but with a focus on content that is industry-specific.

A doctor working in emergency wards needs to view patient information live as she interacts with patients.

There are many professions that could improve by having instant access to industry-specific information:

Picking

Barcode-scan any item, live HUD showing which items you need to get, AR directions to exactly where the item is, voice commands to annotate when you’ve completed an action – so you can continue to work and enter all information completely hands-free.
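A minimal sketch of the hands-free picking loop: each barcode scan is checked against the pick list and the remaining count is updated (the SKUs and the HUD/voice hooks are hypothetical):

# Sketch: mark pick-list lines complete as barcodes are scanned.
pick_list = [
    {"sku": "0123456789012", "name": "widget A", "qty": 3, "done": False},
    {"sku": "0987654321098", "name": "widget B", "qty": 1, "done": False},
]

def on_barcode_scanned(sku: str) -> str:
    for line in pick_list:
        if line["sku"] == sku and not line["done"]:
            line["done"] = True  # a voice command could also confirm the quantity
            remaining = sum(1 for l in pick_list if not l["done"])
            return f"Picked {line['name']} x{line['qty']}. {remaining} lines left."
    return "Item not on the pick list - check the SKU."

print(on_barcode_scanned("0123456789012"))  # shown or read out on the glasses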

Checklists

Voice command check lists – see checklists overlaid on your vision, gradually check things off and scroll down through the list until everything is done.

Training video

Basically MultimediaContentGenerationWearable, but with the specific purpose of generating training content for trainees in your company

Someone with tons of experience films themselves doing something, explains what they are doing, etc. Then trainees watch that video and learn how to do it hands on

Hardware digital twins intelligence

Imagine a specific commercial hardware unit, like a furnace. Every time it’s worked on, imagine the GPS location was used to track which unit the tech was working on. Immediately upon walking up to the unit, information about the unit is overlaid on their vision, with the ability to go deep into the maintenance log to understand more about that specific unit. They could use voice commands to record any novel information about the system, notes about what they changed, what they noticed, etc.

All the data the field worker records could directly update tables, values, and notes that comprise the digital twin of that hardware/machinery

Remote Expert + Telemaintenance + remote support

The single biggest use case for smart glasses in industry today.

Have a remote call with an expert where they can see your POV video and audio. The remote expert can walk you through a number of steps/actions, and can draw on your AR overlay to highlight parts, etc.

http://files.vuzix.com/Content/pdfs/VuzixTelemaintenanceWhitePaper-042016.pdf