The South African hacker who built a completely offline AI-powered talking duck
At Devconf this year, software developer Dale Nunns presented his experience of creating an offline conversation engine and building it into the body of a 3D-printed duck.
Nunns, who is the technical lead at DealX by day, explained that the idea to build Miss Duckworth riffed off the concept of rubber duck debugging or “rubberducking”.
When you’re stuck on a programming problem, rubberducking is the practice of debugging your code by explaining it line by line to an inanimate object.
However, the concept can also be more broadly applied to other technical problems.
The idea is that explaining the problem and your attempted solutions out loud helps the brain process it differently than when thinking about it silently.
Nunns joked that he wanted to see if having a rubber duck that could talk back would improve the debugging process.
In addition to wanting Miss Duckworth to run offline, Nunns said he didn’t want to break the bank building her. He therefore opted against using expensive graphics processing hardware to run the large language model and tried to use gear he already owned.
For that reason, the conversation engine is powered by a Raspberry Pi 5 with several expansion boards. In Raspberry Pi nomenclature, expansion boards are called HATs, a backronym for Hardware Attached on Top.
For audio input and output, Nunns used the WM8960 Hi-Fi Sound Card HAT, which he had lying around.
Unfortunately, he discovered that it does not work well with the Raspberry Pi 5 and its new operating system.
Rather than abandoning the HAT and buying a USB sound card, Nunns said he spent three weeks hacking at the sound card’s drivers, out of the three months he had to build Miss Duckworth before Devconf.
“When you’re in the moment, all those things disappear,” he said of considering a USB sound card as an option. “And anyway, I was having fun.”
Nunns also used a 16-channel hobby servo driver HAT that he already owned to operate the motors that open and close the duck’s beak.
He said Miss Duckworth’s head also moves, but he disabled this functionality due to some problems he couldn’t iron out before demo day.
In hindsight, Nunns said using the servo driver HAT was a mistake, and he should have used a separate microcontroller.
He also said he made some mistakes when selecting the servos he would use.
“I thought I was being smart and went one up from the cheapest,” he said.
“I should have just gone for the most expensive servos I could find because I destroyed a few of these in the process — including at about 10 pm on the Sunday before this talk.”
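For readers curious about what driving the beak looks like in code, 16-channel servo HATs for the Raspberry Pi are typically PCA9685-based boards controlled over I²C. The sketch below is a rough illustration using the Adafruit ServoKit library; the library choice, channel number and angles are assumptions for illustration, not details Nunns shared.

```python
# Minimal sketch of flapping a beak servo on a 16-channel PWM servo HAT.
# Assumes a PCA9685-based board and the adafruit-circuitpython-servokit
# library; the channel and angles are illustrative, not Nunns' values.
import time
from adafruit_servokit import ServoKit

kit = ServoKit(channels=16)
BEAK_CHANNEL = 0  # hypothetical channel the beak servo is plugged into

def open_beak(angle=35):
    kit.servo[BEAK_CHANNEL].angle = angle

def close_beak():
    kit.servo[BEAK_CHANNEL].angle = 0

# Flap the beak roughly in time with speech playback.
for _ in range(5):
    open_beak()
    time.sleep(0.15)
    close_beak()
    time.sleep(0.15)
```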
In addition to audio hardware and servos, Nunns also gave Miss Duckworth a pair of LCD eyes and a camera that allowed her to track the face of the person speaking to her.
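The talk did not go into which vision library handles the face tracking. A common approach on a Raspberry Pi is OpenCV with one of its bundled Haar cascade detectors, as in the hypothetical loop below; the real build would feed the detected position to the eyes and head servos rather than printing it.

```python
# Hypothetical face-tracking loop using OpenCV's bundled Haar cascade.
# This only illustrates the general idea of locating a face for the
# duck's eyes to follow; it is not Nunns' implementation.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cam = cv2.VideoCapture(0)

while True:
    ok, frame = cam.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) > 0:
        x, y, w, h = faces[0]
        face_centre = (x + w // 2, y + h // 2)
        # In the real build, this position would steer the LCD eyes and head.
        print("face at", face_centre)
```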
Her body and head were designed in Blender, which Nunns said works very well with curved shapes. The internal mechanical components were designed in Onshape.
These were all printed on a Creality K1 3D printer using PLA filament.
Nunns said it took at least 150 hours of printing to get the correct size and shapes for all the components.
Regarding the underlying software, Nunns said he used OpenAI’s Whisper speech recognition library to convert speech directed at Miss Duckworth into text, which is then fed into the large language model (LLM).
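Whisper’s Python package makes this step short. The sketch below shows the basic call; the model size (“tiny”) is a guess suited to running on a Raspberry Pi 5, not necessarily the one Nunns used, and the file name is made up.

```python
# Speech-to-text with OpenAI's Whisper, as named in the talk.
# Model size and audio file name are assumptions for illustration.
import whisper

model = whisper.load_model("tiny")

def transcribe(wav_path: str) -> str:
    result = model.transcribe(wav_path)
    return result["text"].strip()

print(transcribe("question_for_the_duck.wav"))
```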
For the LLM itself, Nunns said he used a tool called Ollama to help him get up and running. It also allowed him to change LLMs more easily when he needed to.
Nunns said he wanted a large language model that wouldn’t arbitrarily refuse to answer certain questions or respond to certain topics, so he started with Meta’s Llama 2.
However, he found it to be very slow on the Raspberry Pi processor.
“When you’re choosing a language model for a duck, you don’t want accuracy. Accuracy doesn’t matter,” said Nunns.
“You also don’t want a model that tells you that it can’t tell you the answer.”
Nunns said if he asks about the colour of the sky or how the duck feels that day, he doesn’t want it to run into guard rails and give back an error message.
“I don’t want to have the duck go, ‘As an AI model, I don’t understand because I don’t have a body’, or, ‘As an AI model, I don’t have an opinion.’ That’s boring,” said Nunns.
“I want models that just freak out, hallucinate, and dump a whole lot of rubbish there because that’s entertaining and fun.”
When Llama 2 didn’t work out, Nunns used Google’s Gemma 2b model for a while. However, he abandoned it after a recent update that introduced guard rails, which Nunns said made it less fun.
Nunns said he switched to TinyLlama for the Devconf demo.
“It’s not that good. It doesn’t know a lot of stuff — but it hallucinates fantastically, so that’s good enough for me,” he said.
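Part of what makes Ollama convenient for this kind of model-hopping is that it exposes a local REST API, so swapping Llama 2 for Gemma or TinyLlama is a one-string change. The sketch below uses that API with the TinyLlama model mentioned in the talk; the prompt wording is purely illustrative.

```python
# Generating a reply through Ollama's local REST API.
# The model name matches the TinyLlama choice mentioned in the talk;
# the prompt is illustrative only.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"

def ask_duck(prompt: str, model: str = "tinyllama") -> str:
    response = requests.post(OLLAMA_URL, json={
        "model": model,
        "prompt": prompt,
        "stream": False,
    }, timeout=120)
    response.raise_for_status()
    return response.json()["response"]

print(ask_duck("What colour is the sky today, Miss Duckworth?"))
```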
The output of the LLM is converted into speech by Piper, a text-to-speech library that is an Open Home Foundation project and part of the Rhasspy offline voice assistant.
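Piper is commonly run as a command-line tool that reads text on standard input and writes a WAV file, which a script can then play through the sound card HAT. The snippet below shows that pattern; the voice model file name is an example, and any downloaded Piper voice would work.

```python
# Text-to-speech with the Piper command-line tool.
# The voice model file name is an assumption; substitute any Piper voice.
import subprocess

def speak(text: str, wav_path: str = "reply.wav") -> None:
    subprocess.run(
        ["piper", "--model", "en_US-lessac-medium.onnx",
         "--output_file", wav_path],
        input=text.encode("utf-8"),
        check=True,
    )

speak("Have you tried explaining the bug to me line by line?")
```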
The results of Nunns’ work can be seen in the short demo video embedded above. Nunns’ full talk from Devconf 2024 is included below.