Unfortunately, it looks like the SDS011 might not be suitable for longer-term comparisons.
My assumption at the beginning of the experiment was that I could calibrate the sensors by adjusting for measurement differences; by running the machines all in one place, I would get a single number that I could add (or subtract) from each machine to bring it into alignment with the others.
This might not be true.
When, for instance, I compare Pi1 to Pi0, and subtract the adjustment number for Pi1, the results start off fine–but toward the end, they dip into the negative! Air pollution, obviously, can’t be negative. The trouble is that adjustment number. It is larger than the measured air pollution, so subtracting it leads to a number less than 0.
I’m pretty sure that this means that Pi1’s measurements are drifting over time–and that they’re drifting downward.
To find out, I subtracted Pi1’s measurements (near the tracks) from Pi0’s (far from the tracks). And, indeed, the difference between them grows, if slowly, over time. This implies that they are drifting out of alignment. Pi1’s measurements are drifting slowly downward. (Because of double negatives, (Pi0–[–Pi1]), the line has a positive slope.)
The results for Pi2 are similar, but less pronounced. Pi2 is drifting upward, but slower than Pi1 is drifting.
Of course, we must assume that Pi0 is also drifting.
The question now is whether I can use an equation of the form m+bx (a starting offset m, plus a drift of b for each unit of time x) to adjust each Pi instead of a flat m. Of course, this won’t work for this experiment: the sensors have already been placed, so I can’t determine m and b from the data I have. But, perhaps in the future, I can use an equation instead of a fixed number.
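Here is a minimal sketch of what an equation-based adjustment could look like, assuming the sensors are first run side by side for long enough to see the drift. The function names and the sample numbers are made up for illustration, and ordinary least squares is just one assumed way of finding the line; none of this comes from the experiment itself.

```python
def fit_drift(hours, differences):
    """Least-squares fit of: difference = m + b * hours. Returns (m, b)."""
    n = len(hours)
    mean_x = sum(hours) / n
    mean_y = sum(differences) / n
    sxx = sum((x - mean_x) ** 2 for x in hours)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, differences))
    b = sxy / sxx              # drift per hour
    m = mean_y - b * mean_x    # offset at hour 0
    return m, b


def adjust(reading, hour, m, b):
    """Remove the fitted offset and drift from a single reading."""
    return reading - (m + b * hour)


# Hypothetical side-by-side data: (this Pi's reading minus the reference Pi's)
# sampled once a day. These numbers are invented for the example.
hours = [0, 24, 48, 72, 96]
differences = [1.0, 1.5, 2.0, 2.5, 3.0]

m, b = fit_drift(hours, differences)
corrected = adjust(13.0, 48, m, b)  # a reading of 13.0 taken at hour 48
```

With a flat adjustment, only m would be subtracted; here the amount subtracted grows with time, which is the point of using an equation.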
The results are in, and drum roll please…. There’s reason to investigate further.
The air quality beside the Weston—Mount Dennis tracks is, if my calculations are correct (and there are lots of ifs), ever-so-slightly worse than the air away from the tracks.
Proving so was tricky business. A recap:
First, I calibrated the sensors by running them all at my house. I found that, indeed, they are all measuring well, and that roughly 95% of the variability in the sensors can be attributed to variability in the atmosphere.
Each sensor, though, measures ever-so-slightly differently. So while Pi0 might show an average over 24 hours of 10.2 μg/m³, Pi1 would measure 11.4, and Pi2 would measure 12.1. Over time, however, these differences are more-or-less constant.
In Figure 1, Pi1 (green) consistently measures higher than Pi0 (blue), which tends to measure lower than the average of all four. But, importantly, all four Pis move in sync–so we can be reasonably sure that they’re actually measuring something.
The problem with using Pi0 and Pi1 to measure the air quality beside the tracks is obvious: if I put Pi0 there, the air would seem better than it is. If I put Pi1 there, it would seem worse. The trick, then, is to adjust the Pis.
Unfortunately, there is no calibration screw on the sensors, so we need to calibrate them mathematically.
The difference between each Pi’s measurement and the average of all Pis can be used as a constant to adjust each individual Pi. Pi0 can be adjusted up (or Pi1 down) by adjusting by the difference between its average readings over time, and the average of all the average readings over time.
To calibrate the machines, I took 100 measurements over several days outside my house. The results were as follows:
[Tables, one each for PM2.5 and PM10. Columns: this Pi’s average; average of Pis 0-3; adjustment to average; adjustment to Pi0]
So, in order to adjust, say, the Pi1 so that it reads the same as Pi0, we subtract 1.64 from its PM2.5 readings, and 1.43 from its PM10 readings. (Pi0 is a mathematically arbitrary reference, but it’s the one farthest from the tracks, at my house, so it makes sense as a comparison. And we could just as easily adjust to the average of the Pis instead of a particular Pi; again, it is just an arbitrary choice.)
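The flat adjustment described above can be sketched in a few lines of Python. The readings below are invented for the example (they are not the calibration data), and the function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def offsets_to_reference(calibration, reference="Pi0"):
    """For each sensor, the amount to subtract so its average matches the reference."""
    ref_mean = mean(calibration[reference])
    return {name: mean(vals) - ref_mean for name, vals in calibration.items()}


def adjust(reading, sensor, offsets):
    """Apply the flat adjustment to one reading from the named sensor."""
    return reading - offsets[sensor]


# Hypothetical PM2.5 readings taken with the sensors sitting side by side.
calibration = {
    "Pi0": [10.0, 12.0, 11.0],
    "Pi1": [11.5, 13.5, 12.5],
}

offsets = offsets_to_reference(calibration)
adjusted = adjust(14.0, "Pi1", offsets)  # a later Pi1 reading, brought in line with Pi0
```

Adjusting to the average of all the Pis instead would only change what `ref_mean` is computed from.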
Knowing, then, how to make a comparison, I did so.
Locations next to the tracks appear to have slightly higher concentrations of PM2.5. My house is approximately 425m from the tracks, and I have an average concentration of 2.9 μg/m³. The house nearest the tracks (Pi2) has 6% more, and the house slightly farther away (Pi1) has 8.4% more.
[Table, PM2.5. Columns: adjusted average relative to Pi0; measured difference to Pi0 (abs)]
For PM10, the results are similar. My house (Pi0) has 5.3 μg/m³. The house nearest the tracks has a 15% higher concentration. The house slightly farther away from the tracks has a 16% higher concentration.
[Table, PM10. Columns: adjusted average (ref Pi0); measured difference to Pi0 (abs)]
So, there’s good news and bad news–and much news in between.
The bad news first.
It does look like the homes next to the tracks have higher pollution, though there are many caveats. First of all, this is a short-term reading (one week). Second, the sensors, though I’ve done my best, are cheap. Third, and most importantly, I’m very new to this. I might have totally screwed it up.
Third, and most importantly, I did screw this up. For PM2.5 the statistical certainty of my findings is quite low: p=.36 (for Pi0:Pi1 PM2.5) and p=.28 (for Pi0:Pi2 PM2.5). That means that, even if there were no physical difference between the locations, random variation alone would produce a gap at least this large roughly 30% of the time.
For PM10, my results are much more certain. It seems like the PM10 pollution near the tracks is about 20% worse, with a p<.03.
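For readers wondering where a p-value like these comes from, here is a minimal sketch of one way to get one. It uses a permutation test, which is an assumption on my part for illustration; it is not necessarily the test behind the numbers above. It also treats every reading as independent, which hourly air readings probably are not, so take it as a rough guide only. The function name and defaults are hypothetical.

```python
import random


def permutation_p_value(a, b, trials=10000, seed=0):
    """Two-sided permutation test for a difference in means between a and b."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(trials):
        # Shuffle the labels: if location made no difference, any split is as likely.
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(left) / len(left) - sum(right) / len(right))
        if diff >= observed:
            hits += 1
    return hits / trials
```

The result is the share of random re-labellings that produce a gap at least as big as the one actually measured. A small share means the measured gap would be unusual if the two locations were really the same.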
The good news:
The sensors appear to work. We can (and will) continue to monitor the pollution levels, and we can do so cheaply.
The in-between news:
It is very far from clear that these differences in concentration are meaningful in terms of human health. The air quality, even when it is slightly more polluted, appears to be very good. The differences are small.
It is not at all certain that the tracks are the source of the pollution. Mount Dennis is a few kilometers away from my house (and closer to downtown). The differences could be attributable to that distance, or something else entirely.
Indeed, it is possible that the tracks are not the source of pollution. The house 72m from the tracks had slightly higher levels than the house 39m away. This defies my predictions.
We should study the pollution more; the PM2.5 results could be due to chance.
On July 5, 2018, I installed the first remote air-quality sensor at a home approximately 72m from the centre of the Georgetown GO/UPX line (with the owners’ permission, of course!).
The installation went very well, and took a matter of minutes; it’s only really necessary to configure the Pi’s wifi. Unfortunately, RealVNC didn’t work, for reasons I can’t determine, but since the Pi reports to the server perfectly well (and restarts the measurement program on restart), it isn’t really necessary.
I was very worried that the sensors wouldn’t be very good. After all, they’re cheap, poorly documented, and come from a virtually unknown manufacturer.
Happily, there are statistical tests to tell how good the sensors are, and the TL;DR? They’re not bad at all!
I used a statistical test called the intraclass correlation coefficient (ICC) to measure whether the four different Pis I built agree with each other. And they do! When one Pi reports bad air quality, the other Pis tend to do so, too.
Of course, that’s the short version. Herewith, the longish version:
In a perfect world, the air quality monitors would all report exactly the same number, like in Figure 1. They would give us results that are perfectly correlated.
Of course, it’s not a perfect world, and these are cheap sensors. They vary. The issues, then, are how much they vary, and whether that is an acceptable amount.
Figure 2 shows another imaginary example. The AQMs make wild swings, and, worse, the swings are uncorrelated. When one AQM reports an increase, another reports a decrease. When one goes up a lot, another goes up a little. It’s a mess.
These are uncorrelated results, and if we received them, we would know our sensors are random number generators.
There are middle grounds between perfect correlation and no correlation at all.
In Figure 3, the results are loosely correlated. Generally, when one sensor reports a change, the others do, too. However, in each case, the sensors measure different sizes of change.
And in Figure 4, we have an excellent result, and one that somewhat approximates the results I found with the four Pis I used: each sensor reports a consistent change. If Pi1 reports, say, a concentration of x, then Pi2 reports a concentration of x+2.
Of course, it would be best if there was no variation between the sensors, and if they reported exactly the same results. But, if they’re going to vary, this is just the kind of variance we want, because it’s easily corrected for. And, more or less, that’s what we got.
Our Pis varied, but they varied by a reasonably consistent amount. Pi1 was always a little higher than Pi2, which was usually a little higher than Pi3 and Pi0.
This is great, because it means we will be able to make meaningful comparisons between different parts of the neighbourhood. We can take the results from one AQM, adjust them by the constant, and compare it to the other adjusted AQMs. Thus, we will be able to see whether local pollution conditions are better (or worse) than other locations in the area.
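The agreement check described above is usually computed as an intraclass correlation coefficient (ICC). Here is a minimal sketch of one variant, the "consistency" form (often labelled ICC(3,1)), which ignores constant offsets between sensors. Which variant was actually used for the Pis is not stated, so the choice here is an assumption, and the function name is hypothetical.

```python
def icc_consistency(table):
    """ICC(3,1). Each row is one measurement time; each column is one sensor."""
    n = len(table)       # number of measurement times
    k = len(table[0])    # number of sensors
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one reading per sensor per time).
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # the air changing
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # constant sensor offsets
    ss_error = ss_total - ss_rows - ss_cols                  # disagreement

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Two made-up sensors that always differ by exactly 2, as in the Figure 4 case.
figure4_like = [[10, 12], [11, 13], [15, 17]]
score = icc_consistency(figure4_like)
```

A score near 1 means the sensors rise and fall together; a score near 0 (or below) means they do not.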
I can hear you in the back. Correlation does not equal causation. Not quite right, but I catch your drift.
Correlated results aren’t necessarily good results. For instance, our air quality monitors could, unbeknownst to us, be measuring humidity, not pollution. As long as they are all measuring humidity consistently, we would never know, because they are all correlated.
Quite so. Still, I can’t think of any way to check every possible, non-particulate, cause. It could be that they are all measuring humidity, temperature, sunshine, the radio waves, sunspots, or the Blue Jays’ score. At some point, we have to just have faith that the sensors are doing what they say they are doing and look like they are doing.
When your Pi boots, it should start recording air quality data. It won’t flash or bing or do anything science-y sounding. Your only chance to notice it will be once an hour, when the fan starts spinning to suck air into the monitor. (If you really want to check, you can log into the Pi over VNC and see if it’s working by searching through running processes.)
As it measures the air quality, the Pi is recording data to a spreadsheet in the AQM folder called “alldata.csv”. It is also trying to send data to my webserver, because I haven’t got around to fixing that yet—not to worry, though; no data is being sent because your Pi is not able to log into my server (and I’m not able to log into your Pi).
The Pi saves a lot of data to alldata.csv. It saves 40 measurements an hour (20 for each of PM2.5 and PM10). There’s no good reason for this, and I should make it save only an average, but it has proven useful¹, and there’s no discernible harm (after 15000 measurements, alldata.csv is still less than a megabyte in size).
The number of measurements does make drawing inferences a little difficult. The trick is to use the moving average function on your spreadsheet software of choice. Chart 1 shows the AQM data for a week in May, 2018, in my backyard. I’ve drawn two moving averages, one for each of PM2.5 and PM10.
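For anyone who would rather use a script than a spreadsheet, a moving average is only a few lines of Python. This is a minimal sketch with a hypothetical function name; it works on a plain list of readings and makes no assumption about how the columns in alldata.csv are laid out.

```python
def moving_average(values, window):
    """Trailing moving average: one output value per full window of readings."""
    if window < 1 or window > len(values):
        return []
    total = sum(values[:window])
    averages = [total / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest reading, drop the oldest.
        total += values[i] - values[i - window]
        averages.append(total / window)
    return averages


smoothed = moving_average([1, 2, 3, 4, 5], 3)
```

A wider window gives a smoother line but hides short spikes.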
¹ The Pi takes 20 measurements, once an hour. Weirdly, the first measurements are always lower than the others. I’m glad I kept all the data (over Mohammad’s objection) because we were able to find this flaw. It doesn’t make a lot of difference to the results because we are making relative comparisons and the error is consistent. But it’s there.
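If the Pi were changed to save one average per hour, the low first readings could be left out at the same time. A minimal sketch, assuming each hourly burst is available as a list of numbers; the function name and the choice to skip three readings are placeholders, since how many of the first readings run low isn’t pinned down here.

```python
def burst_average(readings, skip=3):
    """Average one hourly burst, ignoring the first `skip` readings."""
    kept = readings[skip:]
    if not kept:
        raise ValueError("burst is too short for the requested skip")
    return sum(kept) / len(kept)


# Made-up burst where the first two readings run low.
hourly = burst_average([1.0, 2.0, 6.0, 6.0], skip=2)
```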
We wanted our air quality monitors (AQMs) to be weatherproof, so I used cheap, dollar-store tupperware enclosures with snap-tight lids. Each was $1.50 CAD.
The SDS011 must be connected to a short hose if it’s going to be enclosed. I used some hose I had around the house from making beer.
The tupperware needs three holes: one for the intake, one for the power cable, and one for the air outlet—which I, stupidly, forgot at first.
I used a hot glue gun to melt holes in the sides of the cases, pushed the tubes and cable through, and then sealed them up with hot glue. (I did try using caulk; hot glue worked better.) The power cord was rather larger, so I also sealed it with Gorilla Tape, just to be extra sure.
Create a startup script that will start the software every time the Pi boots
Step 1: Downloading the package from adamnorman.com
This should be easy. I’ll leave you to it. Download this.
Step 2: Copying the files to the desktop
RealVNC allows you to copy files between your server (the Pi) and your viewer (your computer). As of June, 2018, the process for doing so (on a Mac) is as follows:
Open RealVNC on the Pi by clicking on the black and blue VNC icon in the top right of the Pi’s menu bar. Open “File Transfer” from the hamburger menu.
Select the desktop for the location your files will be saved to under the “Fetch Files to” option box, and close that dialog.
Place your cursor on the menu bar of the RealVNC client window, and click on the two-way arrows.
Click on “Send Files” to send the files you unzipped on your computer to the desktop of the Pi.
Your files should now appear as a folder on the Pi’s desktop. Important: the program will only work if it’s installed on the desktop.
Step 3: Creating a boot process
You will now force the Pi to start the measurement software every time it boots up. Unfortunately, this step requires using the Terminal, which is a pain. Not to worry, though; you only need to type, and you won’t need to understand what you’re typing.
Open the Terminal by clicking on the raspberry in the top left, then “Accessories”, then “Terminal”.
Now type: “sudo nano /etc/rc.local”. This will open a very ugly, very tiny version of Microsoft Word right in the Terminal window. You’ll use this word processor to edit one of the files the computer reads when it starts (the file is called rc.local).
Move the cursor (with your keyboard arrow keys) to the line that says “fi” in green. Press Enter or Return on your keyboard to make a new line.
Type (or copy) the following words into the document, on the blank line below the word “fi”: “sudo python3 /home/pi/Desktop/AQM/main.py &”
Press Ctrl-X, and save your work (press Y when asked whether to save, then Enter to keep the same file name).
The computer will read rc.local every time it boots, and will start the program main.py.
We still need to build the sensor, but the programming part is done!
To set up the Pi, hook it up to an HDMI-compatible TV, a keyboard, and a mouse. The keyboard and mouse must be attached with a USB hub and an OTG USB cable–an adapter that converts the full-size USB cable to a micro-USB male end.
Configuring the operating system is straightforward, with a series of dialogs to help users configure the settings.
Only four small customizations are required:
Setting the time to local time
Changing the system password
Enabling VNC
Getting the wifi running
Setting the local time is required to get the Pi to report pollution data accurately, and VNC is used to control the Pi from a remote computer. It is software that ‘projects’ the Pi’s desktop onto your desktop, and allows you to control it as if it were in front of you, with a keyboard and mouse.¹
Both VNC and the time are set within the System Preference dialog, which is under the Raspberry icon at the top left.
You can set the Pi’s preferences in the Raspberry menu in the top left.
Change the password while you’re at it.
Enable VNC, which will let you connect to your Pi over the internet using a keyboard and mouse.
Set the timezone in the same panel.
Next, you will need to connect to your wifi, which is very straightforward, though confusingly named. When the Pi asks for your “shared key”, enter your wifi password, if any.
After you’ve enabled VNC on the Pi, it’s easy to connect to it with RealVNC. You will need ‘client’ and ‘server’ software, on your home computer and the Pi respectively, but it’s no harder to use than Gmail.
Finally, once RealVNC is up and running, you may want to allow your Pi to be remotely administered over the internet (and not just your local network). If so, enable cloud connections under the RealVNC options menu.
¹ There are other ways to do this, using the terminal and SSH. They are agonizing.
The air sensor connects via USB to a computer. For indoor use, hooking the sensor up to any computer would be good enough for spot readings, but our plan was to put the computers outside. We decided to use a tiny and very cheap computer called the Raspberry Pi Zero W.
Raspberry Pis are bare-bones computers that cost between $10 and $50, but do not include screens, storage, or any peripherals—not even an electrical cable to power them with. They do, however, run a full operating system, which makes them quite easy to use compared to the alternatives we considered for this device (notably Arduino).
We settled on the second-cheapest Raspberry Pi because we have very simple computing requirements. Though the Pi Zero W is very slow compared to any other modern computer, it is certainly capable of doing the computations we need. We did not use the cheapest Raspberry Pi, the Pi Zero (without a W), because we wanted our computers to be able to report wirelessly.
We were forced to purchase the Pi Zero Ws in kits, which included a case, a power supply, and a MicroSD card, because the computers alone are rationed out at one per customer. Unfortunately, this drove the price of each Pi up from an advertised price of $13 to $65 (CAD, including tax and shipping).
The Pi Zero W (which I’m going to call “the Pi” from now on) runs Raspbian, a free operating system, which can be installed using a utility called NOOBS. Raspbian comes with the free programming language Python preinstalled. We used Python to collect and manipulate the measurements from the SDS011 air-quality sensor.
Setting up the Pi is fairly straightforward if you have the cables and peripherals—but to do it over a graphical interface, you’ll need quite a few of those, including:
A MicroSD card writer
A USB hub
A USB keyboard and mouse
A mini-HDMI to HDMI adapter
An HDMI cable
A USB OTG cable
A somewhat beefy USB power adapter
In addition, you’ll need a MicroSD card. These peripherals add considerably to the cost of the $13 Raspberry Pi if you don’t have them in a drawer somewhere.
There is excellent help available on the internet, particularly Reddit, to get you to the point where you can boot the operating system.
After looking over many options for an air-quality sensor, I settled on the SDS011. It costs about $25, and can be ordered on many Chinese retail sites, such as AliExpress.
The SDS011 is well reviewed, and “developed by inovafit, a spin-off from the university of Jinan”[sic]. It reports the concentration of fine (2.5 micron and smaller) and coarse (10 micron and smaller) airborne particles in μg/m³, which are standard measures called PM2.5 and PM10, respectively.
There are several other sensors, but the SDS011 had the benefit of being well reviewed and capable of being connected over USB. I found that soldering joints and using breadboards were very difficult and unreliable.
The sensor has problems, though. The documentation is sparse, and it does not come with a program to make it function and record the data. These have to be written (or downloaded). The specifications are also written in poor English.
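To give a sense of what such a program has to do, here is a minimal sketch of decoding one reading from the sensor. It is not the program used on the Pis. The 10-byte frame layout (header 0xAA, command 0xC0, two bytes each for PM2.5 and PM10, two ID bytes, a checksum, tail 0xAB) is my reading of the SDS011 datasheet and should be checked against your own copy; the function name is hypothetical.

```python
def parse_sds011_frame(frame):
    """Return (pm25, pm10) in μg/m³ from one 10-byte SDS011 data frame."""
    if len(frame) != 10:
        raise ValueError("expected exactly 10 bytes")
    if frame[0] != 0xAA or frame[1] != 0xC0 or frame[9] != 0xAB:
        raise ValueError("bad header, command, or tail byte")
    # Assumed checksum rule: low byte of the sum of the six data bytes.
    if (sum(frame[2:8]) & 0xFF) != frame[8]:
        raise ValueError("checksum mismatch")
    # Values arrive as low byte then high byte, in tenths of a μg/m³.
    pm25 = (frame[2] | (frame[3] << 8)) / 10
    pm10 = (frame[4] | (frame[5] << 8)) / 10
    return pm25, pm10


# A hand-built example frame (not captured from a real sensor).
example = bytes([0xAA, 0xC0, 29, 0, 53, 0, 0x12, 0x34, 152, 0xAB])
pm25, pm10 = parse_sds011_frame(example)
```

In a real program the ten bytes would be read from the sensor’s USB serial port rather than typed in by hand.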
Finally, I found it hard to believe that a $25 sensor would do a good job—that it would be accurate, reliable, and consistent with other sensors. I was glad to be mistaken about these concerns.