Automated Crop Protection System

Engineered a distributed IoT system on ESP32 and Raspberry Pi, utilizing a YOLO-based computer vision pipeline for autonomous pest identification.

Media

Design Decisions

For my senior design engineering project, I worked in a team of 8 to build a pest detection and deterrence system for Duke Campus Farm. Farms lose tens of thousands of dollars in crops every year to pests like groundhogs, squirrels, and raccoons, but certified-organic farms are restricted from using chemical deterrents, and for ethical reasons farmers don't always want to shoot or trap the animals. Additionally, as the engineers solving this problem, we had restrictions of our own; we had to design the system under the following constraints:

- Power-constrained: relying only on solar and batteries
- Windproof and waterproof
- Subsystems had to communicate over a local mesh network, isolated from the internet
- Material design had to avoid chemicals that could leach into the soil

These numerous constraints made this a thorny problem, so our team brainstormed a variety of solutions. After two rounds of Pugh matrices and 18 candidate solutions, we settled on one. For detection, we would use a scaffolded approach: a low-power, low-certainty detection method would trigger a second stage that was higher-power and more certain. If the second stage was still uncertain, we would escalate to a final third stage, which used a camera feed to act as the "final judge" on whether an animal was present or not. For deterrence, we decided to use a combination of light, sound, and motion: a floodlight, a loudspeaker, and a motorized scarecrow.

With the scope narrowed down, we divided into a detection subteam of 4 and a deterrence subteam of 4. My part in the project was owning the camera pipeline and setup of the OrangePi (essentially a Raspberry Pi with an onboard NPU that can be used for ML inference).
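The scaffolded detection described above is essentially an escalation loop: each stage costs more power but gives more certainty. A minimal sketch of that control flow (the stage callables, threshold, and names here are hypothetical illustrations, not our actual firmware):

```python
def scaffolded_detect(stage1, stage2, stage3, threshold=0.8):
    """Escalate through detection stages, each more power-hungry but more certain.

    stage1() -> bool   cheap, low-certainty trigger (e.g. a motion sensor)
    stage2() -> float  mid-power confidence estimate in [0, 1]
    stage3() -> bool   camera-based "final judge"
    Names and the 0.8 threshold are illustrative assumptions.
    """
    if not stage1():
        return False        # nothing moving: stay in low-power mode
    confidence = stage2()
    if confidence >= threshold:
        return True         # confident enough without waking the camera
    if confidence <= 1 - threshold:
        return False        # confident nothing is there
    return stage3()         # still uncertain: escalate to the camera
```

The point of the structure is that the camera (the most power-hungry stage) only wakes up for genuinely ambiguous events.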
My plan was to build and train a custom CNN pipeline that would take in an image and use multi-class classification to determine whether the frame included a specific species (binary classification was sufficient for the purposes of the project, but our teacher said she'd grant extra credit for species ID). To this end, I gathered thousands of images on Hudson quad and a few other locations by taking videos of a stuffed animal as a proxy, at varied heights, rotations, and distances. I wanted the training data to be as messy as possible, including shadows, lighting artifacts, and people and bikes in the background.

I followed standard practice with a train-val-test split, and the initial training results were quite good. Suspiciously good, actually, and I determined that data leakage was occurring. Since I had taken most of these videos myself, I realized that treating frames from one video as "distinct" samples let the model "cheat" on the val set, because it could see adjacent, nearly identical frames in the train set. After fixing the split so that no video appeared in both train and val sets, the results were quite bad (around 30% val accuracy, if I remember right), but I used a few approaches to improve this. First, I used data augmentation to vary the training set: color jitter (tint/brightness), rotations, and added noise to mimic blur and rain. After a few hours, I ended up with a very clean confusion matrix, a high val accuracy of 98% (see image above), and a more sensible loss curve.

The last step was to integrate this model for inference on the OrangePi's NPU. Around this time, however, our other subteam unfortunately ran into trouble. Two of their teammates withdrew, which left them quite short-handed; as things stood, none of the lights, sound, or motion were working.
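The leakage fix above boils down to splitting the dataset at the video level rather than the frame level, so near-duplicate adjacent frames can never straddle the train/val boundary. A minimal sketch of that idea (the function name and data layout are hypothetical, not my actual training code):

```python
import random
from collections import defaultdict

def split_by_video(frames, val_fraction=0.2, seed=0):
    """Split frames into train/val at the *video* level.

    frames: list of (video_id, frame_path) pairs. Assigning whole
    videos to one side prevents adjacent, nearly identical frames
    from leaking across the train/val boundary.
    """
    by_video = defaultdict(list)
    for video_id, path in frames:
        by_video[video_id].append(path)

    video_ids = sorted(by_video)
    random.Random(seed).shuffle(video_ids)     # reproducible shuffle
    n_val = max(1, int(len(video_ids) * val_fraction))
    val_ids = set(video_ids[:n_val])

    train = [p for vid in video_ids if vid not in val_ids for p in by_video[vid]]
    val = [p for vid in val_ids for p in by_video[vid]]
    return train, val
```

The same grouping principle applies to the test set; augmentation (jitter, rotation, noise) is then applied only to the training side.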
The deterrence subsystem was in a bad place, so to make the final deadline, two of my friends from the detect team and I decided to redirect our efforts to help complete it. Doing a deep dive into the wiring, I diagnosed and corrected several issues: incorrect voltage sources, missing insulation causing shorts, a wire that somehow had its ends soldered red-to-black and black-to-red (a cruel joke! it killed our buck converter), and some software bugs. My teammates took over the inter-subsystem communication side of things, and we defined a protocol over UDP that broadcast and searched for the deter subsystem, with a heartbeat for reconnection in case snow or rain disrupted the signal.

Back on the image classification side, we decided to implement a much simpler YOLO-based binary classification pipeline instead, which had a lower val accuracy of 80% but satisfied our teacher's requirements. Moving to deployment, we drove together to Duke Campus Farm to install our final system. The nights were long and cold by that time in November, below freezing, but ultimately we stuck it through and got a working demo in time for our final. We later took the system back to the lab for a second installation pass, reinforcing the solder connections across both subsystems and neatening the wiring onto protoboards.
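The discovery-plus-heartbeat idea above can be sketched in a few lines: one side broadcasts UDP datagrams on the isolated LAN, and the other side tracks when it last heard a heartbeat to decide whether the peer is still reachable. This is a hypothetical illustration; the port number, payload string, and timeout below are assumptions, not our exact protocol.

```python
import socket
import time

PORT = 5005                  # hypothetical broadcast port
HEARTBEAT = b"DETER_ALIVE"   # hypothetical heartbeat payload
TIMEOUT = 6.0                # seconds of silence before falling back to discovery

class PeerTracker:
    """Decide whether the deter subsystem is still reachable.

    Every received datagram is fed into `on_message`; `is_alive`
    tells the detect side whether to keep sending commands or to
    start broadcasting discovery packets again.
    """
    def __init__(self, timeout=TIMEOUT, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock       # injectable clock makes this testable
        self.last_seen = None

    def on_message(self, payload):
        if payload == HEARTBEAT:
            self.last_seen = self.clock()

    def is_alive(self):
        return (self.last_seen is not None
                and self.clock() - self.last_seen < self.timeout)

def make_broadcast_socket():
    """UDP socket configured for LAN broadcast (the mesh is internet-isolated)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock
```

Keeping the liveness logic separate from the socket code is what makes the "reconnect after snow or rain" behavior easy to reason about: the sender loop just calls `is_alive()` each cycle.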

Key Learnings & Takeaways

Despite our setbacks, working on a project this big was a great experience; I even made a few new friends in the group, and I introduced some of them to my favorite sandwich shop. As one of the last classes I took at Duke, this was an excellent project to unite the various concepts I learned across my academic career: it nicely tied together IoT, embedded systems, signal processing, machine learning, and electronics. Seeing our project succeed in the real world, for real people, was an impactful moment for me.