I’m a big fan of ChatGPT, but it’s tedious to type things out on mobile. Much nicer would be an AI assistant that I can talk to and have it talk back, in long conversations without fiddling with a screen. I haven’t seen something like this, so I figured I’d do it myself!
Here’s the feature set I had in mind:
- Siri-like voice interface, totally hands free
- Continuous conversations, not just one-off questions
- Ability to start new conversations, and go back to old ones
- Fast to use, easy to customize
After some googling around, I found that some people were using Shortcuts to do a single basic query to the OpenAI API. Seemed like a simple enough concept, so starting with that I took a Sunday afternoon and got exactly the hands-free, full-conversation AI working that I wanted. Check out the demo video below!
- Install the free Shortcut on a mac or iPhone from here: https://www.icloud.com/shortcuts/d05ca89d2d674c9fa3846b96a55359f8
- Set up an OpenAI API key, which you can get from: https://platform.openai.com/account/api-keys. This isn’t free, however it’s perhaps a penny for every question and is well worth it
- Edit the shortcut, and paste in the API key where indicated
- To run it, you can show the shortcut on your homescreen, or activate Siri and say “Hey GPT”
- Say “New chat” to finish setting it up
- Ask ChatGPT anything, and it will respond back to you
- Once it finishes speaking, it will auto-prompt you for another question
- To stop it prompting you for questions, end your phrase with “stop”. When you use it again, it will pick up the conversation where you left off
- To start a new conversation, say “New chat”
- All conversation history is saved to the Notes app in the “Notes” folder. You can view or edit those past conversations, and since the Shortcut uses the last modified note to grab conversation history, make a small edit like adding a space at the end to make that the active conversation
- If you get the error message “”, then either you have a bad API key, or your most recent conversation history is corrupted
Since this uses built-in Shortcuts and isn’t an app, you can edit it and see exactly what it’s doing!
- If you don’t like the auto-prompting, can change the number of repeats from 100 to 1
- Change the voice and how fast it speaks
- Change the model you want to use. Currently gpt-3.5-turbo is much faster, but gpt-4 is more accurate.
- Store and read from a different notes folder than the default “Notes”
When in doubt, making sure your API key is right and saying “new chat” should clear most errors.
|In “‘, no value was found for dictionary key ‘choices’||Invalid API key||Sign up for an OpenAI API key and make sure you paste the whole thing into the right box in the shortcut. This requires paying separately from ChatGPT Pro. Also make sure your key is not expired|
|In “‘, no value was found for dictionary key ‘choices’||No funds||If you had free OpenAI API credits they may have expired and you’ll have to add payment info. Or you are over your usage limits and will have to raise them|
|In “‘, no value was found for dictionary key ‘choices’||Conversation log is invalid json||Say “new chat” to get a new conversation log going. You can’t have quotation marks or new lines in your questions|
|In “‘, no value was found for dictionary key ‘choices’||No gpt-4 access||Some people have reported that gpt-4 has not worked for them, but gpt-3.5-turbo does. Change the model in the shortcut|
|In “‘, no value was found for dictionary key ‘choices’||No connection to OpenAI server||Make sure you are connected to internet, and that ChatGPT is available in your country|
|The operation couldn’t be completed. (AppIntents.EntityPropertyQueryError error 2.)||It can’t find the right notes folder||Make sure both the code blocks to read and write to the note are pointing to a valid folder. If it’s grayed out and says “Folder”, you’ll need to click on it and select one|
|Numerical argument out of domain. the range you specified was invalid (you asked for items 2 through 1).||Conversation log contains no json||Say “new chat” to get a new conversation log going|
|There was a problem running the shortcut “Hey GPT”||?||Try running it again|
What’s it like?
On my drives to work this week, I’ve been chatting with GPT just learning about stuff. It’s awesome. Today I was having it teach me about the solar cycle (in honor of the Aurora Borealis reaching Colorado this week), the finer points of running a kickstarter campaign, suggestions for an email I had to write, and the history of Graz, Austria (which I had seen a pretty picture of over lunch). You can’t trust it 100% of the time since it will sometimes confidently make things up, but I don’t think it’s actually much worse than a human in that regard. And a dash of skepticism doesn’t make it any less fun! Once Apple gets Siri up to this level, it’s going to totally change the way people interact with their phones. I’m happy to get that little slice of the future a little early.
The groundwork for my interest here was laid a long time ago. I think I was in 7th grade in 2004 when I read Time’s Eye by Arthur C. Clarke. I don’t remember much of the plot, but I do remember that one of the main characters was a teenage girl from the future who had a phone with a built-in AI that served as a companion, fount of the world’s knowledge, and essentially played the role of a miniature demigod. At the time it struck me as ludicrous that the author thought that phones would be where our future computers came alive – at home ours was plugged into the wall and it could hardly remember a dozen numbers. Two decades later, it seems almost inevitable that this is where the miniature supercomputers in our pockets will soon go.