Make Yourself into an Anime Figurine with AI Image Generation

If you’ve ever wanted to see a figurine of yourself but you have no artistic talent, AI image generation can make that dream come true, and you can try it for free. I jump right into the “how to” and save my boring commentary for the end of this post, so you can skip it.

Brett, in real life

I’m using DALL-E for my image generation, which requires a paid subscription, but you can get free access to it through Microsoft Bing Image Creator (requires a free Microsoft account). Once you have signed in, look for the text input field next to the two buttons, “Create” and “Surprise Me”. The text field is where you describe the image you want the AI to generate; click “Create” and a few seconds (or minutes) later, up to four images will be displayed. This process is called “prompting”, and it is the main way to guide AI toward the output you want. But getting AI to do exactly what you want is a little like herding drunk cats, so crafting the prompt can take some effort and some understanding of how things work under the hood. We’ll skip that for now and just start making fun things…

Anime figurine Brett with laptop and margarita

The structure for the prompt is “Anime figurine of <my description, skin tone, eye color, hairstyle, outfit>. The figurine is displayed inside a box with <text on box> and logo for the box, allowing visibility of the figure, typography, 3D render”. To make something that looks sort of like me, I used “Anime figurine of a shaved head, bald on top, nerd, white skin tone, dark gray hair, blue eye color, brown short beard, brown eyebrows, black shirt, jeans, Converse high tops, wearing blue rimmed glasses, wearing a watch, holding a laptop and a margarita. The figurine is displayed inside a box with Brett and logo for the box, allowing visibility of the figure, typography, 3D render”.

AI reminding Brett of what he lost

Once you’ve tried this for yourself, you’ll probably notice a few things… Most obviously, the AI somehow didn’t do what you thought you told it. For example, while I prompted “bald on top”, one of my images clearly had hair, which might be the AI getting confused by the conflicting “dark gray hair” in the prompt. I have found replicating hairstyles, even bald hairstyles (if… that’s a hairstyle?), can be challenging; I’ve yet to get any consistency with hair only on the sides and back of the head. The other thing you will probably notice is the wild things that can show up in the image, especially when it comes to text generation, where AI tends to get… creative. Some of the words you use in your prompt may show up in the image, and misspelling is not uncommon.

Cheers!

There is considerable variation in the images, some looking more like the giant-headed Funko Pop figurines, and others having pretty realistic proportions. Prompting for another common outfit I wear, “Anime figurine of a shaved head, bald on top, nerd, white skin tone, dark gray hair, blue eye color, brown short beard, brown eyebrows, black shirt, tan pants, brown leather boots, wearing blue rimmed glasses, wearing a watch, holding a laptop and a pint of beer. The figurine is displayed inside a box with Brett and logo for the box, allowing visibility of the figure, typography, 3D render” created something a little more proportional.

Funko Pop Brett

So play around a little and see what you get… if anime isn’t your thing and you really love the Funko Pop style, try swapping out the prompt: “Funko style figurine of a shaved head, bald on top, nerd, white skin tone, dark gray hair, blue eye color, brown short beard, brown eyebrows, black shirt, jeans, Converse high tops, wearing blue rimmed glasses, wearing a watch, holding a laptop and a margarita. The figurine is displayed inside a box with Brett and logo for the box, allowing visibility of the figure, typography, 3D render”.

This gallery contains more examples:

Boring Commentary

A little over a year ago I wrote Robots Building Robots: AI Image Generation, where I used my laptop for AI image generation, meaning I had to use substantially less powerful AI models than are available in the cloud, where processing power and memory can be massive. The less powerful model was fine for the specific application I had in mind (a cartoon-like sketch of a robot for a sticker), but a few people commented that the quality of the AI images was average, and some were skeptical about AI’s capability.

In that same post, I mentioned Midjourney; at the time, version 4 was just coming out and already looking pretty amazing. In the 14 months since then, the quality and capability have continued to improve at an astonishing pace. For a detailed look at Midjourney specifically, check out this post from Yubin Ma at AiTuts. In less than two years, this model has gone from distorted human faces (some almost unrecognizable) to photo realism.

Female knight generated by Midjourney, V1 (Feb 2022), V4 (Nov 2022), V6 (Dec 2023), images from AiTuts
Vintage photo of girl smoking generated by Midjourney, V1 (Feb 2022), V4 (Nov 2022), V6 (Dec 2023), images from AiTuts

I have been surprised by the rate at which both the quality and the versatility of AI-generated images have increased, with the anime figurines being one of the more recent (and delightful) examples of something AI can create unexpectedly well. I’m limiting this post to still image generation, but the same is happening for music, video, and even writing code (my last three hobby programming projects were largely created by AI). It’s reasonable to assume that AI will make substantial improvements in generating 3D model files, so soon you’ll be able to 3D print your cool little anime figurine.

There are, of course, significant implications to having computers provide a practical alternative to work that used to require humans. Much like the disappearance of travel agents once the Internet democratized access to booking travel, we should expect to see a dramatic reduction in demand for human labor, and this will be disruptive and upsetting… some professions will be nearly eliminated. I don’t want to be dismissive about the human impact of more powerful automation.

At the same time, AI can empower people and create entirely new opportunities. Large language models (LLMs) create the opportunity for customized learning, where eventually individuals all across the planet can have a dialog with an AI teacher, navigating millions of human-years of knowledge. More and more, people will not be limited by their resources, only by their ideas… The average person will be able to build a website or a phone app by describing what they want, and someone who considers themselves “not artistic” will be able to create songs, artwork, or even movies that will eventually be box office quality. AI will also likely play a significant role in things like medical advances and energy efficiency, things we generally consider good for humans.

Did you enjoy making yourself into an anime figurine? Did you come up with a prompt that made a super cool image? Did you figure out how to get my male pattern baldness accurate on the figurine? Think my hot take on being optimistic about AI is horrible? Leave a comment, below!

Very Basic SunPower and Home Assistant, no HACS

In my continuing adventures trying to add better monitoring to my SunPower solar farm, I took a slightly different approach than what seems to be the common path on the Home Assistant Options for Sunpower solar integration? thread, and instead used Apache for the reverse proxy and skipped HACS altogether, using the RESTful integration. If you haven’t read it already, you may want to start with Getting Administrator Access to SunPower PVS6 with no Ethernet Port, which covers my earlier failure.

Networking

The first thing was getting access to the SunPower PVS6 administrative interface. Since I didn’t have easy cabling access, I used a $7 ethernet adapter and a TP-Link AC750 Wireless Portable Nano Travel Router (TL-WR902AC). There is a cheaper model of the TP-Link that would have worked just fine, but even at $39 it was less expensive than most of the lowest-end Raspberry Pis at their crazy-ass prices right now. Power for the TP-Link comes from the LAN4 port on the PVS6, and the ethernet connects to USB2/LAN. The TP-Link is configured in “Router Mode”, where it connects by wired ethernet to the PVS6 and creates a separate network for all devices that connect by wifi. If you do this, you will want to configure the TP-Link to use a network different from your home network (e.g. if your home network is 192.168.0.0/24, use something like 192.168.2.0/24).

TP-Link and ethernet dongle crammed in the SunPower PVS6

At this point you should be able to connect to the TP-Link wifi and test access to the administrative interface at http://172.27.152.1.

Of course, the problem now is that we need to connect the home network to the SunPower network, but there is some nuance… we only want the web traffic. Very specifically, we do not want the TP-Link to start handing out new IP addresses on our home network, which is also why you don’t just plug the ethernet from the PVS6 into your home network.

I happen to have a home file / everything else server that runs on a Raspberry Pi, and already has Apache running. That server connects to my home network via an ethernet cable, so its wifi was unused and available. I connected to the SunPower wifi (SSID “sunpowernet”):

sudo nmcli d wifi connect sunpowernet password "5ekr1tp@$$"

Finally, I needed to let the server know that when the destination is the PVS6 network, it should use the wifi connection, not the ethernet connection:

sudo ip route add 172.27.152.0/24 via 192.168.2.1 

This is a great time to mention that it would be good hygiene to set up firewall rules on your server blocking incoming traffic from the TP-Link, other than DHCP and established connections, in case the PVS6 is ever compromised.
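
For what it’s worth, here is a minimal iptables sketch of that idea, assuming the SunPower-facing interface is wlan0 (adjust the interface name, or translate to nftables/ufw if that’s what you run):

# allow replies to connections the server initiates toward the PVS6 / TP-Link
sudo iptables -A INPUT -i wlan0 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# allow DHCP replies from the TP-Link (server port 67 to client port 68)
sudo iptables -A INPUT -i wlan0 -p udp --sport 67 --dport 68 -j ACCEPT
# drop everything else arriving on the SunPower-facing interface
sudo iptables -A INPUT -i wlan0 -j DROP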

Reverse Proxy

While HAProxy is super awesome and you should absolutely use it if starting from scratch, I happen to have a home server that gets 5 requests per month and was already running Apache, so I wanted to do as little extra work as possible. Fortunately, Apache has a reverse proxy, and that makes this pretty easy. I setup a virtual host with the following sunpowerproxy.conf config:

<VirtualHost *:80>
    ServerName sunpowerproxy
    ServerAlias sunpowerproxy.mypersonaldomain.net
    ServerAdmin [email protected]

    ProxyPreserveHost On

    ProxyPass / http://172.27.152.1/
    ProxyPassReverse / http://172.27.152.1/

    ErrorLog /var/log/apache2/sunpowerproxy-error.log
    LogLevel warn
    CustomLog /var/log/apache2/sunpowerproxy-access.log combined
</VirtualHost>

The virtual host expects the HTTP request to come to a server named “sunpowerproxy” (or whatever you name it), so you’ll need to add a DNS entry for that name pointing to your server’s ethernet address, not its wifi address.
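
If you don’t run local DNS, a hosts entry on whatever machine will be calling the proxy does the same job; the address below is an example, use your server’s ethernet IP:

echo "192.168.0.10  sunpowerproxy sunpowerproxy.mypersonaldomain.net" | sudo tee -a /etc/hosts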

If you’ve done everything correctly (modules installed, site enabled) you should be able to test everything by calling the PVS6 API from a web browser, by pointing to http://sunpowerproxy.mypersonaldomain.net/cgi-bin/dl_cgi?Command=DeviceList

After a few seconds you should get a JSON blob listing all of your devices.
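
In case “modules installed, site enabled” sounds hand-wavy, this is roughly what I mean on a Debian-style Apache install, plus the same test from the command line:

# enable the proxy modules and the new virtual host, then reload Apache
sudo a2enmod proxy proxy_http
sudo a2ensite sunpowerproxy
sudo systemctl reload apache2
# same test as the browser, pretty-printed
curl -s 'http://sunpowerproxy.mypersonaldomain.net/cgi-bin/dl_cgi?Command=DeviceList' | python3 -m json.tool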

Home Assistant Configuration

Finally, we need Home Assistant to be able to pull the values from the proxy. The RESTful integration provides a pretty easy way to do this… here is a basic configuration to get the current power usage and overall energy, although a lot more information, including details for each individual panel, is available:

rest:
  - scan_interval: 60
    resource: http://sunpowerproxy.mypersonaldomain.net/cgi-bin/dl_cgi?Command=DeviceList
    sensor:
      - name: "Sunpower Production Power"
        unique_id: "SPPP"
        json_attributes_path: "$.[1]"
        device_class: power
        unit_of_measurement: "kW"
        state_class: "measurement"
        json_attributes:
          - "p_3phsum_kw"
        value_template: "{{ state_attr('sensor.sunpower_production_power', 'p_3phsum_kw') }}"

      - name: "Sunpower Consumption Power"
        unique_id: "SPCP"
        json_attributes_path: "$.[2]"
        device_class: power
        unit_of_measurement: "kW"
        state_class: "measurement"
        json_attributes:
          - "p_3phsum_kw"
        value_template: "{{ state_attr('sensor.sunpower_consumption_power', 'p_3phsum_kw') }}"

      - name: "Sunpower Production Energy"
        unique_id: "SPPE"
        json_attributes_path: "$.[1]"
        device_class: energy
        unit_of_measurement: "kWh"
        state_class: "total"
        json_attributes:
          - "net_ltea_3phsum_kwh"
        value_template: "{{ state_attr('sensor.sunpower_production_energy', 'net_ltea_3phsum_kwh') }}"

      - name: "Sunpower Consumption Energy"
        unique_id: "SPCE"
        json_attributes_path: "$.[2]"
        device_class: energy
        unit_of_measurement: "kWh"
        state_class: "total"
        json_attributes:
          - "net_ltea_3phsum_kwh"
        value_template: "{{ state_attr('sensor.sunpower_consumption_energy', 'net_ltea_3phsum_kwh') }}"

Now you should have the ability to add the SunPower sensors, and configure the Energy dashboard!

The Energy dashboard in Home Assistant

Now that I have this working, I will probably realize that hass-sunpower via HACS is a way better solution, but only the RESTful integration would need to change; all of the network and proxy configuration would carry over.

Finally, if you’ve made it this far, you probably realize that it would be way better if SunPower offered a reasonable API for home integrations, instead of making people take these ridiculous steps… please let your SunPower contact know!

What’s your SunPower and Home Assistant experience? If you’re following in my footsteps (yikes), how did it go? Please leave a comment, below!

Getting Administrator Access to SunPower PVS6 with no Ethernet Port

Well, if you landed on this post you either have a need to cure your insomnia or you have a very specific problem. I recently decided to become a sun farmer, and went with SunPower, which is great, but they don’t offer integrations beyond their decent but limited web and mobile apps. In particular, I wanted to integrate with Home Assistant, because… well, just because.

The main solar interface from SunPower is the PVS6 (successor to the PVS5), and by connecting to an administrative interface it is possible to pull some detailed data like specific energy output and health for each panel. The good news is the PVS6 comes with two ethernet ports, one for a WAN to connect to their servers and one for a LAN that will allow access to the administrative UI, and all one needs to do is connect to said port and then… hey, WTF? My PVS6 doesn’t have either of these ethernet ports! So, yeah… evidently there is a new version of the PVS6 that does not have ethernet ports, and the primary WAN connection is via wifi.

A blurry photo of the ethernet-port-less PVS6

After digging around teh webz, it seems that the PVS6 USB ports will work with a USB to ethernet adapter, but several people reported that some adapters didn’t work. I’m unsure whether the magical requirement is that the adapter be USB 2.0, but I found a $7 adapter on Amazon, and it just worked. I connected my laptop to the USB2/LAN port, the PVS6 assigned an address to my laptop, and browsing to http://sunpowerconsole.com/ provided a web administration interface. However, the PVS6 is not within convenient ethernet wiring distance, so I dug around some more and found Dolf Starreveld’s page, which included an amazingly comprehensive doc, Monitoring a solar installation by tapping into a SunPower PVS5 or PVS6. This doc starts with the assumption that you have a PVS* with an ethernet connection and want to get to wifi, and with my USB to ethernet dongle, that’s what I had, so all I needed to do was mount a Raspberry Pi in the PVS6 to act as a router / bridge to my network. But while reading his doc, I noticed a mention of a hotspot interface available for a limited time after PVS6 power-up, and a link to a SunPower doc on commissioning the PVS6 via wifi… this sounded promising.

Sure enough, when I scanned for wifi connections, I found a SunPower SSID that matched my system. And since my system had been on for days, it didn’t appear that the 4-hour window applied, so great news! The formula for the SSID is “SunPower” followed immediately by characters five and six of the PVS6 serial number, then the last three digits. The password follows a similar formula: characters three through six of the serial number, followed by the last four digits. Once connected, I had the exact same access I had when directly connected via ethernet.
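
If counting characters by hand isn’t your thing, here’s a small bash sketch of that formula (the serial number below is a made-up placeholder; double check the result against the label on your PVS6):

SERIAL="ZT123456789012345"                 # placeholder, use your PVS6 serial number
SSID="SunPower${SERIAL:4:2}${SERIAL: -3}"  # characters 5-6, then the last 3
PASS="${SERIAL:2:4}${SERIAL: -4}"          # characters 3-6, then the last 4
echo "SSID: $SSID  password: $PASS"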

But the cool stuff isn’t really in the web UI; you need to call the API directly. For example:

http://sunpowerconsole.com/cgi-bin/dl_cgi?Command=DeviceList

will show all devices and panels, with a ton of data on each. Dolf Starreveld’s document covers the details.

Since I don’t plan to run this from my laptop, I still need to bridge the networks… several people have written about using a dedicated device like a Raspberry Pi, including Scott Gruby’s Monitoring a SunPower Solar System, where he uses a very lightweight Raspberry Pi Zero W and a simple haproxy setup. However, I’d like to avoid another device (especially with the current prices for Raspberry devices – holy crap), and my Raspberry Pi 4 file server connects via ethernet, so I’ll likely use its wifi to connect to the PVS6 and run the proxy from there. After that I’ll configure Home Assistant and likely bore you with another posting.

And, no sooner do I get to the end of writing a post than I realize that the wifi network has vanished, so I either need to find a way around that problem or else I’m adding a router to my PVS6.

Are you doing anything interesting and hacky with your SunPower system? Do you have cool integrations with Home Assistant? Did you stay awake through this whole post?… please leave a comment, below!

ESPHome Temperature and Humidity with OLED Display

For the 3 people that have been reading my posts, you know my journey from hacking off-the-shelf ESP8266 smart switches to creating custom devices, most of which I connect to Home Assistant so that my smart devices stay secure and don’t rely on the existence of any particular vendor. The next step in this journey is using ESPHome to simplify the creation of custom software for the ESP8266/ESP32. For my first project I created ESPHome Temperature and Humidity with OLED Display.

Why ESPHome?

Why use ESPHome instead of something like the Arduino IDE? Simply put, it’s simple but powerful. As an example, I made custom software for a temperature & humidity reader for my bathroom in 8 custom lines of code (which wasn’t really code, but configuration). YAML configuration files provide access to a ton of sensors, and if you need to dig deeper into more complicated functionality, ESPHome provides lambdas that allow you to break into custom C++ at any time.

One of the other cool things about ESPHome is, while it integrates seamlessly with Home Assistant, the devices are meant to run independently, so if a device is out of range or has no network connection, it still performs its functions.

And I wanted to learn something new, so…

Why This Sensor?

I have a few temperature sensors around the house and I also keep one in the camper van, mostly out of curiosity so I can compare it with the in-house temperatures using graphs in Home Assistant. However, I realized that when I went camping I wanted access to the temperature, but couldn’t see it without a connection back to home. The old version of the van thermometer was based on an ESP8266 garage door opener (I never shared that posting, so sorry, or you’re welcome), and I didn’t want to update that project with a display screen, as I really don’t need one for a garage door (yes, I realize something not being needed usually doesn’t stop me from building it). I decided I might as well take the opportunity to use ESPHome, since it was simple and worked offline.

It’s Time to Build

I won’t go into the details of setting up Home Assistant, but if you are into home automation at all, it is super awesome. Installing the ESPHome add-on enables flashing and managing ESP8266/ESP32 boards, and they automatically connect to Home Assistant at that point.
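
If you’d rather skip the add-on, ESPHome also ships a standalone CLI that does the same compile-and-flash job; a minimal sketch for recent ESPHome versions, with a placeholder config filename:

pip3 install esphome
esphome run van-sensor.yaml   # placeholder filename; compiles the firmware and flashes over USB or OTA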

For the lazy, ESPHome is generally a few clicks and a little copy pasta. For this project, it’s no different, with the step-by-step details available on my Github project page.

Wiring complete on DHT11 and ESP8266 (ESP-12F) before gluing case together

For the hardware, I used the very tiny ESP-12F, DHT11 sensors, this OLED display, and a few miscellaneous components. All of these fit pretty nicely in the case, although it is a pretty snug fit, so I highly recommend a very thin wire, and this 30 Gauge silicone wire was super great for the job. Exact components, wiring diagram, and the STL files to 3D print the case are on my Github project page.

When assembling, removing the pins from the DHT11 board and soldering the wires directly to the board will allow it to sit flush against the case, giving better exposure to the sensor and a better fit. The DHT11 is also separated from the ESP8266 board, as I was hoping to insulate it from any board heat that might impact the temperature reading (not sure that worked). There is a hole between the chambers to thread the wires, and I suggest a bit of hot glue to block air flow once wired.

As you can see in the photo, I used generous dollops of hot glue to hold the components in place… it isn’t pretty, but nobody will ever see it (well, unless you photograph it and blog about it, in which case still, probably nobody will see it). I sealed the case with Zap-A-Gap glue, as I found original Super Glue was way less effective.

Instructions

Plug it in.

Okay, well… it’s a little more than that. The screen will alternate between showing the temperature and humidity in text format and a 24-hour graph that is a bit useless for now, since I am not sure how to get the historical values to label the graph. The button will toggle screen saver mode on / off, and the screen saver activates automatically after a short amount of time.

If you want to get a little fancier, I have this sensor in the bathroom and it will turn the vent fan on (via Home Assistant) when the humidity reaches a certain level… it’s useful for anything where you want to control devices based on temperature or humidity, and even more useful if someone might want to see the temperature.

Are you doing home automation with ESPHome? Do you have suggestions or requests? Did you actually read this blog post for some reason? I want to know… please leave a comment, below!

BART Train Monitor with ESP8266 and SSD1306 display

I frequently commute on BART and wanted a convenient way to tell when I should start walking to the train station, ideally something that is always accessible at a glance. A gloomy Saturday morning provided some time to hack together something…

ESP8266 to SSD1306 wiring diagram

I had already purchased a ~1 inch SSD1306 display and had extra ESP8266 boards laying around, so I figured I would start there. Wiring the screen to the board is super simple, just 4 wires (see diagram).

From there it was pretty simple to pull the real-time train data using the BART Legacy API. BART also offers modern GTFS Schedules, which is the preferred way to access the data, but from what I could tell, this would make the project significantly more difficult given some of the limitations of the ESP8266. So, I went the lazy route.
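
For context, a hedged sketch of the kind of legacy call involved (EMBR is the Embarcadero station code, and the key shown is the public demo key from BART’s docs as I remember it, so treat both as assumptions and check their documentation):

curl -s 'https://api.bart.gov/api/etd.aspx?cmd=etd&orig=EMBR&key=MW9S-E7SL-26DU-VV8V&json=y'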

Coding was pretty simple; most of the time was spent rearranging the elements on the screen. Well, actually most of the time was spent switching from the original JSON and display libraries I chose, as I wasn’t happy with them.

BART-watcher-ESP8266
(early layout)

There’s a lot to fit into a 1-inch display, but I got what I needed. The top line shows the date / time the station information was collected, and the departing station. The following lines are trains that are usable for the destination station, with the preferred (direct) lines in large type and the less-preferred (transfer needed) lines in small type. Finally, if a train is within the sweet spot where it isn’t departing too soon to make it to the station but I also won’t be waiting at the station too long, the departure time is highlighted. Numbers are minutes until departure, “L” means leaving now, and “X” means the train was cancelled (somewhat frequent these days).

In the example above, these are trains leaving Embarcadero for the North Berkeley station (not shown); the Antioch and Pittsburg/Bay Point lines require a transfer, while the Richmond line is direct.

At some point I would like to use a smaller version of the ESP8266 board and 3D print a case to make it a little nicer sitting on a desk, but then again, there’s almost zero chance I’ll get around to it. If anyone is into 3D printing design and wants to contribute, I’ll give you all of the parts needed.

The code / project details are available on my GitHub page at BART-watcher-ESP8266, feel free to snoop, contribute, or steal whatever is useful to you.

BART Watcher on ESP8266 and SSD1306 OLED display
The full build of BART Watcher on ESP8266 and SSD1306 OLED display

Do you have suggestions for features you think would be cool, want to remind me that I waste a lot of time, or maybe you even want one of these things? Please leave a comment, below!

Robots Building Robots: AI Image Generation

I’ve been incredibly impressed with AI image generation, both in how far it has come and how quickly advances are being made. It is one of those things that, 5 years ago, I would have been pretty confident wasn’t possible, and now there seem to be significant breakthroughs almost weekly. If you don’t know much about AI image generation, hopefully this post will help you understand why it is so impressive and likely to cause huge disruptions in design.

Midjourney AI-generated image, “beautiful woman, frolicking in water, gorgeous blonde woman, beach, detailed eyes”

As a quick background, AI image generation takes a written description and turns it into an image. The AI doesn’t find an image, it creates one, having been trained on millions of images and effectively learned patterns that fit the description. The results can be anywhere from photo-realistic to fantasy artwork or classical painting styles… almost anything. There are several projects available for pretty much anyone to try for free, including Midjourney, Stable Diffusion, and DALL-E 2. I am using DiffusionBee, which runs on my MacBook even without a network connection (i.e. it isn’t cheating and pulling images off of the Internet). Oh, and image generation is fast… from a few seconds to about a minute.

If it isn’t obvious why this is amazing and likely to be massively disruptive, imagine you need a movie poster, blog images, magazine photos, book illustrations, a logo, or anything else that used to require a human to design. The computer is getting pretty good and can generate thousands of options in the time it takes a human to generate one. It can actually be faster to generate a brand new, totally unique image rather than search the web or look through stock photography options. For example, the robot image featured at the top of this posting was generated on Midjourney with about 5 minutes of effort.

As a practical example, recently Sticker Mule had one of their great 50 stickers for $19 deals and I wanted to create a version of the Banksy robot I use for my blog header, but I wanted a new style, something unique to me. However, I am not an artist so coming up with anything that would look good was unlikely. Then I remembered my new friends the robot overlords, and thought I would see if they could help me.

Quickly cleaned-up source image inspiring a robot
The Banksy robot cleaned up just a little

One of the cool things about AI image generation is you can seed it with an image to help give some structure to the end result. The first thing I did was take the original Banksy robot, remove the background and spray paint from the hand, and fix the feet a bit. This didn’t need to be perfect or even good, as it is just used by the AI for inspiration, for lack of a better word.

I loaded up DiffusionBee, included my robot image and simply asked it to render hundreds of images with the prompt “robot sticker”. And then I ate dinner. When I came back to my computer, I had a large gallery of images that were what I wanted… inspired by the original but all very different. Importantly, they looked like stickers!

If you look through the gallery, above, you can see the robot stickers have similarity in structure to the original, but there is huge variation on many dimensions. In some cases legs are segmented, sometimes solid metal. Heads can be square, round, or even… I’m not sure what shape it is. The drawing style varies from child-like art to abstract. And again, these are all new, unique images created by AI.

The winning robot, headed to Sticker Mule

The biggest challenge I had was trying to pick just one… so many were very cool. I finally decided and it took about another 5 minutes to clean up the image a little to prepare it for stickerification.

When I looked at my contact sheet, my army of robots, it reminded me of my early days developing video games… if I had wanted to make a game where each player gets a unique avatar, it would have taken months to have somebody create these images. Today it takes hours.

I mentioned that things are progressing quickly… in the last few months we’ve gone from decent images to beautiful images to generating HD video from text prompts. It isn’t hard to imagine that, in the not-too-distant future, it will be possible to create a full length feature film from the comfort of your living room, and that the quality will be comparable to modern Hollywood blockbusters. The next Steven Spielberg or Quentin Tarantino won’t be gated by needing $100 million in backing to make their film, the barriers will be significantly smaller. AI has the potential to eliminate some creative professions, but it also has the ability to unlock opportunities for many others.

What are your thoughts? Is AI image generation an empowering technology that will democratize creative expression, a horrible development that will put designers out of work, or do you just welcome our new robot overlords? Leave a comment, below!

Hacking Smart Light Switches and Other IoT Devices

Gosund SW1 circuitboard with test hook clips on serial connections

If you’ve ever had a free weekend, a desire to create a more secure smart home, and questionable judgment, you’ve come to the right place. In this post I’ll talk about how to take common IoT (Internet of Things) devices and put your own software on them.

Disclaimer: depending on the device, this exercise can range from pretty easy to drink bourbon and slam your head against the desk difficult. Oh, and there is some risk of electrocuting yourself or setting your house on fire. So everything after this point is for entertainment purposes only…

Why Hack Your IoT Devices?

Most people creating a smart home take the easy path… pick out some cheap and popular devices on Amazon, install the smartphone app to configure them, and they are good to go. Why would anyone want to go through the extra effort to hack the device? There are a few good reasons:

  1. Security: With few exceptions, most smart devices require installing an app on your phone, oftentimes from an unknown vendor and with questionable device permissions required. The devices themselves are tiny, wifi-connected computers, and their software is updated by connecting to a server in some country and installing new software on a device that sits on your home network. Having a cheap device on your home network that requires full access to the Internet to work is bad, but it is worse when that software can be changed at any time, to do whatever the person changing it wants it to do. This could turn your light switch into part of a botnet, or worse, be exploited to attack other devices on your home network. By replacing the software, you create a device that works properly without ever needing access to the Internet, lowering the security risk. You can also see (and change) exactly what software the device is using.
  2. Sustainability: Since the devices require communicating with an external company for configuration and updates, when that company stops supporting the device or, worse, goes out of business and turns off their servers, your device becomes useless or stuck in its current configuration forever. By replacing the software, you are able to support the device even if the company ceases to exist. And by using open source software with a robust community, you will likely have very long term support.
  3. Because I Can (mu ha ha ha): Okay, this is more of a fun reason, but worth mentioning. I’ve generally been much happier with the hacked versions of my products, whether it be my Tivo, Wii, or car dashboard. Smart light switches are a relatively low-risk hack, as they are inexpensive, and I’m assuming the risk is turning it into a brick, not causing an electrical fire (I’ll update the blog if I have an update on that).

Getting Started

My adventure started with the spontaneous purchase of a Gosund Smart Light Switch. Like a gazillion IoT devices sold by name brand and random manufacturers, this switch is controlled by an ESP8266. Most of these ESP8266 devices use a turnkey software solution made by Tuya, a Chinese company powering thousands of brands from Philips to complete randos.

For security and sustainability reasons, I decided I didn’t want this switch connected to my home network, and even if I wrote complex network firewall rules to limit its access, it would need to connect to the open Internet and other devices in my house to work properly.

I did some research and found Tasmota, an open source project that replaces the software on ESP8266 or ESP8285 devices, eliminating the need for Internet access and enabling functionality that makes them easier to connect to controllers like Amazon’s Alexa. The older examples required disassembling the device and soldering to hack it, which is exactly not what I wanted to do. However, more recently an OTA (over the air) solution appeared that didn’t require opening the device at all, and did all of the hacking over wifi… that sounded great.

Tasmota Wifi Installation

When I tinker I like to use a computer that I can reset easily so that I don’t have to worry about an odd configuration causing problems later. I have an extra Raspberry Pi that is handy for this, and put a clean install of the Raspberry Pi Desktop on an extra Micro SD card.

I installed TUYA-CONVERT, which basically creates a new wifi network and forges the DNS (how computers translate a name like tuya.com into numbers that identify a server) to resolve to itself rather than the Tuya servers, so that when the device goes to get a software update from the mothership, it gets the Tasmota software installed instead – hacking complete.
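
The install itself is basically cloning the repo and running two scripts; this is from memory of the project’s README, so check it for current prerequisites:

git clone https://github.com/ct-Open-Source/tuya-convert
cd tuya-convert
./install_prereq.sh
./start_flash.sh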

Gosund Light Switch In Dangerous Setting
An example of poor judgment; however, the red load wire is capped, as that is not a good wire to touch when the switch is powered.

I started running the tuya-convert script on my Raspberry Pi and, rather than go through the full process of installing the switch in the wall, I found a standard PC power cable (C13) was the perfect size to hold the wires in place and allow testing on my desk. DO NOT DO THIS – I am showing you only as an example of what a person of questionable judgment might do. The switch powered up, and on the tuya-convert console I could see it connecting and trying to get the new software! I love it when things just work.

But then, it didn’t work. While there was a lot of exciting communication happening between the Raspberry Pi and the switch, ultimately the install failed. Looking at the logs, I was getting the message “could not establish sslpsk socket”, and found this open issue, New PSK format #483. Apparently, newer versions of the Tuya software require a secret key from the server to do a software update, and without the key (only known by Tuya), no new software will be accepted. So, damn… these newer devices can’t use the simple OTA update. Also, if you have older devices, do not configure them with the vendor app if you plan on hacking them, as that will update them from the OTA-friendly version to one requiring the secret key.

Tasmota Serial Cable Installation

I realized I was too far down the rabbit hole to give up, so it was onto the disassembly and soldering option. The Tasmota site has a pretty good overview of how to do this, although I thought a no-solder solution would be possible, and tried to find the path that requires the least effort (yay laziness).

Gosund Light Switch Circuitboard
Gosund light switch SW5-V1.2 circuitboard, pen for scale. The connection points are the six dots towards the top, running down the right side (zoom in for labels).

Opening the switch required a Torx T5 screwdriver (tiny, star-shaped tool), and I happened to have one laying around from when I replaced my MacBook Pro battery. Looking at the circuit board, I realized that very tiny labels and contact points, combined with my declining eyesight, made this a challenge. I took a quick photo with my Pixel 4a and zoomed in to see what I needed… the serial connections on the side of the board (look for the tiny RX, TX, GND, and 3.3 labels… no, really, look). While soldering would be the most reliable connection, I was hoping test hook clips would do the job.

Since I was already using a Raspberry Pi, I didn’t need a USB serial adapter, as I could connect the Pi’s GPIO directly to the switch. Again, the Tasmota project has a page giving an example of connecting directly to the Pi. Whatever method you use, it is critical that you connect with 3.3V, not 5V, as the higher voltage will likely fry the ESP8266. If you have a meter handy, check and double check the voltage. And, if you’re using Raspbian, you may find /dev/ttyS0 is disabled… you will need to add enable_uart=1 to your /boot/config.txt file and reboot.
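
For reference, enabling the UART and checking that the device exists looks something like this (assuming the stock /boot/config.txt location; you may also need to turn off the serial login console via raspi-config):

echo "enable_uart=1" | sudo tee -a /boot/config.txt
sudo reboot
# after the reboot, the serial device should be present
ls -l /dev/ttyS0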

I connected the switch directly to the Raspberry Pi. There were several annoying things about this, starting with the fact that each time the switch is connected to the 3.3V supply, it reboots the Pi. And since almost every command to the switch requires resetting its programming mode through a power cycle, that means rebooting the Pi frequently (fortunately it is a fast boot process).

Test hook clips connecting the Raspberry Pi to the Gosund switch worked surprisingly well.

The good news is, the test hook clips worked, which was a bit of a surprise. I added a connection from Pi ground to the switch’s 00 pad (green wire in the photo), as that forces the switch to enter programming mode at boot (it is okay to leave that connected during the hacking process, or you can detach it once it is in programming mode). I made sure everything was precariously balanced to add excitement and more opportunities for failure into the process. I was able to confirm that I had entered programming mode and had access to the switch using esptool, a command line utility for accessing ESP82xx devices. Success! 🎉
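
If you’re following along, the esptool commands for that kind of sanity check look something like this (the port and the 1MB flash size are assumptions for my setup; adjust for yours):

# read basic chip info to confirm the serial connection works
esptool.py --port /dev/ttyS0 chip_id
esptool.py --port /dev/ttyS0 read_mac
# back up the factory firmware before flashing anything new (1MB flash assumed)
esptool.py --port /dev/ttyS0 read_flash 0x00000 0x100000 gosund-backup.bin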

The bad news is, other than being able to read the very basics from the switch, like the chip type, frequency, and MAC address, pretty much everything else failed. And, each successful access only worked once and then required a reboot. I was unable to upload new software to the switch. After researching a bit, the best clue I had was problems with voltage drops on homemade serial devices, and wiring directly to the Pi circuitboard seemed like it might apply. At this point I needed a drink, and went with a nice IPA.

But hey, once you’re this far down the rabbit hole, why stop? I decided to try a more traditional serial connection, using a CH340G USB to serial board.

Serial Killer Part Two

Apparently there was an issue with using the Raspberry Pi directly for the serial communication, because the USB to serial adapter worked perfectly. I validated the connection using esptool and then used the tasmotizer GUI, which makes it easy to back up, flash, and install new software on the switch. Many steps require rebooting the switch before proceeding, but that is as simple as unplugging the USB cable and plugging it back in (even better, it doesn’t trigger a reboot of the Raspberry Pi each time).
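
Tasmotizer is distributed as a Python package; as best I recall, install and launch is about this simple (verify against its README before trusting my memory):

pip3 install tasmotizer
tasmotizer.py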

Tasmotizer and the default web interface to configure your newly-hacked switch

Once the new software is installed, there is one final reboot of the switch (don’t forget to disconnect the ground to 00, or else it boots back into programming mode). At this point the switch sets up a wifi network named tasmota[mac], where [mac] is part of the MAC address. Connect to this network, point your browser to http://192.168.4.1, and you are able to configure your device. Set AP1 SSId and AP1 Password to your home wifi, click “save”, and a few seconds later your switch will be accessible from your home network.

I’ll provide the details of configuration in a follow-up post, but I used the Gosund SW1 Switch template following these instructions to import it, and turned on “Belkin WeMo” emulation to make the switch automatically discoverable by Alexa, without the need to install special apps on my phone or skills on Alexa. The configuration process and connecting to Alexa was incredibly easy and took less than 5 minutes.

Update January 2, 2020: I added a post on hacking the Gosund Smart Dimmer Switch (SW2) and the Youngzuth 2-in-1 Switch, each of which required a different technique.

If you’re curious about attempting this yourself, have questions about my sanity, or have other experiences hacking your smart devices, I’d love to hear from you – please leave a reply below!

Google I/O 2019, Some Exciting Bits that Were not Obviously Exciting

Over the last couple of days I’ve been looking at the various product announcements that came out of Google I/O 2019, and there were a couple of themes that got me pretty excited about where Google can go and how that can make a pretty positive impact on millions of people.

Creating Opportunities for People… All People

I loved the Google Lens announcements from Aparna Chennapragada, because the application of the technology can make such a huge difference in people’s lives, and not just for the people I typically see in Silicon Valley wearing fleece vests and sipping cold brew coffee. What was most compelling to me was the transcribing / Google Translate integration that was demonstrated, especially when combined with the processing being done on device (not in the cloud) and being accessible to extremely low-end ($35) devices. Visual translation was always a very cool feature and, when I was trying to figure out menus in Paris, I was happy to have the privilege of a high-end phone and data plan. Making this technology widely accessible breaks down barriers created by illiteracy, assists the visually impaired, and helps human interactions across language borders.

Google also announced Live Caption, where pretty much every form of video (including third party apps and live chat) can have real-time subtitles. This is also done on-device, and works offline, so it can be applied to live events, like watching a speaker at a conference. A shoutout to my friend and former colleague KR Liu for her work with Google on this project, which makes the world far more accessible to people with hearing challenges.

Also notable, Google’s Project Euphonia is making speech recognition more accessible to people with impaired speech.

Movement Towards Device vs. Cloud

The “on device” and “offline” features I mentioned (and were part of other announcements like Google Assistant improvements) are important because of the implications they have in making the technology available to everyone, and also because of the personal privacy that capability will enable.

Of course, my data, Google’s access to it, and personal privacy is a much larger, complicated conversation… for now I am going to focus on possibilities, not challenges.

For years there has been a move for all aspects of people’s lives to be captured and collected in the cloud. There are many reasons this may have been necessary, from correlating data to make it useful, to raw processing power requirements, to over-reaching policies and business models that require owning all the things to win. Once in the cloud, personal information can be used for purposes never imagined by the consumer, including detailed profiling, sharing with third parties, accidentally leaking to malicious parties, revealing personal content, and various other exploitations that can negatively impact the consumer.

When the processing stays on your device and data never has to leave it, products can still provide incredible benefits while also being respectful of customer privacy. This is exciting, as there are product opportunities in areas like personal health (physical and mental) that will likely require deep trust and protection of consumer information to gain wide acceptance and benefit the most people.

Personal Assistant of My Dreams

And something I am more selfishly excited about…

For several years I have wished that all of the products in Google would integrate with each other and eliminate almost every manual step I have in organizing my day. I am going to side-step the discussion about how much data a company has about an individual and say that I intentionally choose to trust my information to two companies (Google being one) because of the value I get from them. I use Google to organize most aspects of my life, from email communication to coordinating my kid’s schedules, video conferencing, travel planning, finding my way around anywhere, and almost every form of document. As a result, all the parts of Google know a lot about me. But still, when I send an email to set up a meeting, I usually need to manually add that to my calendar and then also add in the travel details (I frequently take trains instead of driving)… it’s a couple of extra minutes that I could be spending on better things, or just looking at pictures of cats on the Internet.

With the progress of Google Assistant and Google Duplex, I am seeing a path where administrivia is eliminated, where email, text messages, phone calls and video conferencing can also provide inputs that guide this assistant into organizing my life behind the scenes… Action items discussed in a Hangout can automatically result in a summary document, a coordinated follow-up lunch, optimal travel details, and a task list.

There is an obvious contradiction between my excitement for the announcements that emphasize better human outcomes and my “let Google know all the things” excitement over a personal assistant, but again, this is about my personal, intentional choice to share data vs. products that mandate supplying personal data, often far in excess of what is necessary to deliver the product or service.

There were some other “that’s cool” announcements, and I’ll probably be buying a Pixel 3a, which seems like a great deal for the feature set, but overall I’m more excited about the direction than the specific products showcased.

Hinder, Don’t Halt: Griefing Content Thieves for Fun and Profit

The art of deterring content theft is an ongoing game of cat and mouse – generally any barrier you create to prevent theft is temporary, as thieves continue to find new ways to steal the content, so long as the value of the content exceeds the effort necessary to steal it. For this reason, it can often be more effective to hinder thieves instead of trying to stop them.

I encounter this “hinder, don’t halt” pattern with others who run large services, and you can see it reflected in solutions like shadow banning. One of the most common themes I hear is the satisfaction that comes from solutions that cause frustration for bad actors, so I’m sharing one from my personal experience…

At IMVU, customers called Creators make content that they sell to other IMVU customers. The content they create is 3D items like avatar clothing, items to decorate an environment, and ways to customize an avatar. This content creates real value for other IMVU customers, who spend real money to purchase it from the catalog of over 10 million items. While many Creators create content just for the enjoyment of creating, some do it as a business, with a few making over $100K US annually. Whether creating for pleasure or business, all Creators hated having their work stolen. And, since there is real money from the sales of content, there is real incentive for thieves to try to steal it.

At one point we discovered a site that was selling a service that would allow people to steal Creator content without paying for it. It was pretty easy to detect the service and the initial response was blocking them, which immediately broke their service completely and, not surprisingly, made the thieves quickly respond by finding a new way around the block. The block lasted less than a day and the thieves were back in business.

The next response was more fun… rather than blocking the thieves, we made their service not work… sometimes… and inconsistently. Code was added to detect thieves accessing content, and some of the content being accessed would be randomly, mildly corrupted. The corruption could be configured to occur at certain rates, on certain items, at certain times of day, and to be disabled based on what appeared to be testing for the corruption. As a result, customers of the thieves started getting inconsistent results that would sometimes lead to content failing to load and even crashes. If you are an engineer reading this, you understand why this is a nightmare scenario to debug and fix… customers are reporting different failure cases with no consistent way of reproducing the problem to understand the cause. And, since your code is working fine, the bug isn’t going to be found… you eventually have to discover that you are being served different content than is being served to legitimate customers.

The result of hindering was much more effective than blocking… it took many weeks for the thieves to understand what was happening and, during this time, we could see them getting bashed by the people who paid them because the stolen content was ruining their experience. By the time the thieves had found another solution, they had such a bad reputation that people were less willing to give them money.

If you have dealt with content thieves I would be interested in hearing your stories, successful or not. Please leave a reply, below!

Credits
Cat and mouse chase image by Jeroen Moes
Dungeons & Dragons dice by Lydia

Scaling Continuous Delivery: Happiness as a Metric

A few days ago Jeff Atwood (Coding Horror) suggested that a good measure of a tech company’s health is the time it takes for a simple change to become available to customers.

And while there are numerous metrics that determine the health of a tech company (see Jez Humble’s book, Accelerate, for an amazingly comprehensive overview), Continuous Delivery strongly correlates to successful outcomes.

I support Jeff’s assertion – I witnessed the value created by Continuous Delivery at IMVU, where we pioneered some of the crazy processes that would be followed by more sane practitioners. From day one, IMVU placed value on the speed of product iteration and “designed” build systems accordingly. In 2006, development was done in Windows using Reactor Server to provide the LAMP-ish stack, and the deploy process looked something like this:

 svn-server$ rcp website/* production:/var/www/

If you’re wondering why I omitted the test framework, I didn’t. Code went from a local Windows sandbox environment, to source control, to live on a Linux environment running a different version of PHP than the local sandbox. Fun! While there were numerous problems with this system, iteration velocity was amazing (those 1-word copy changes could ship in less than 5 minutes, and so could full features). This development velocity was a key component enabling IMVU to build a large, successful business in a space where failed companies outnumber survivors 50 to 1.

And to be clear, time to get a change live to customers doesn’t in itself indicate healthy tech, but a lot of tech health comes from the corresponding systems necessary to make rapid deployment work.

Fast forward a few years to 2008, when I transitioned from leading the operations team to leading the engineering organization. The build and deploy systems had matured, with reasonable test coverage, automated deployment, and automated rollbacks when something unfortunate made it into production. It was pretty cool, even though publicly the process was mostly received with the sentiment, “that will never work, and certainly won’t scale”.

Scaling Problems

One of my first challenges as the new engineering leader was a team unhappy about their ability to get work done because it was taking hours for a commit to become live to customers. It’s astonishing when you think about it – every engineer in the company had come from companies where the commit to live process was typically measured in months, but once they experienced the value of Continuous Delivery, anything more than minutes seemed unbearable.

Digging into the problem, I came to understand that the problem was not slow builds (although that was part of it); the most significant issue was that shared responsibility for the build systems, combined with the desire to deliver features to customers, created a tragedy of the commons. When an engineer had a failure in the build system, the optimal solution for that engineer was to fix the problem in place, blocking the build system for anybody else in the queue, which meant the number of commits in the next build increased, which meant the chance of a failure in that build increased, ad infinitum. The result was that pushing to production could be blocked for hours, sometimes most of the work day.

Solving for Happiness

I thought the best solution was to formalize a project, have a clear success outcome, and have a single person with the responsibility for (and therefore authority over) the build / deploy systems. The first problem was determining clear success criteria… any time metric I chose would be somewhat arbitrary, so instead I chose engineering happiness as the success criteria, or more specifically, that pushing to production was no longer causing unhappiness. While I generally hate subjective success criteria, there were ways to assess progress through 1:1 conversations and Likert scale surveys. We also had great (highly objective) data around commit to deploy times, so we could see the correlation to the more subjective happiness index.

There was some pretty straightforward work to improve the actual test and deploy speeds, including simple things like adding more hardware, the slightly less simple sorting of tests to run by speed (a surprisingly large performance gain), and fixing the slowest of the tests. But some of the most important gains came from the human parts of the deployment system… engineers were required to immediately revert code and fix the issue in their sandbox rather than blocking the build system. This was not a popular policy change, as engineers immediately experienced the direct impact of a failed commit but didn’t immediately see any gains to the overall system. But after a few weeks the improvements were clear in the average commit to deploy time. And giving credit where it is due, Eric Prestemon was the “Buildbot Sheriff” who identified so many of the opportunities for improvement and delivered the results… many people helped, but Eric had the burden of hearing a lot of critical feedback about unpopular policy changes (eventually outweighed by the praise for the results he produced).

Eventually the build system frustration ceased being a common topic in 1:1 meetings, and it faded away as a meaningful problem in engineering surveys. 12 minutes. When the commit to live time is 12 minutes, the system is operating well. That became the new value for alerting – under 12 minutes, all is good; after that, we need to actively drive improvements. In practice, deploy time was usually around 11 minutes: 8 for parallel test builds/runs and 3 minutes for rollout checks (thanks for the reminder, @jwatte).

Diminishing Returns

I have been asked why we didn’t try to make the build and deploy systems as fast as possible… why not 2 minutes? We constantly worked on optimizing these systems, adding separate hypothesis builds, automatically isolating build servers to allow diagnosing and fixing without blocking, etc. And sometimes deployment would take less than 9 minutes.

However, much like the difference between 99.99% and 99.999% uptime for a service, the difference to the customer can be negligible while the resources necessary to deliver that improvement can be extraordinary. When business requirements are being met and engineering is happy with deploy times, the resources necessary to dramatically improve were better spent delivering value to customers.

Key Takeaways

  1. Working in a (well functioning) Continuous Delivery environment is empowering, naturally encourages other strong technical practices, and is hard to retreat from once experienced.
  2. Certain problems fall into what I call the “roommates and dishes” category, where “it’s everybody’s responsibility” sounds good, but in practice actually means “it’s nobody’s responsibility”. In these cases it is better to find a results-driven person and ensure they have responsibility and corresponding authority.
  3. Hire Eric Prestemon or somebody like him.

 

Have you worked in a Continuous Delivery environment and experienced non-obvious scaling challenges? I’d like to hear about your experience – please leave a comment!