Tile Farm, our map rendering server, has undergone a quiet overhaul during the past three months to deal with wild swings in demand. In March, we released the worldwide Watercolor maps to bring the number of visual designs at maps.stamen.com to three. With great press and kind words comes unprecedented demand, so I’ve been working with Tethr’s Aaron Huslage to change the way we use geographic data to handle the load. We’ve learned a few things along the way that will be important features of Stamen’s future mapping work.
The general theme of our optimization work has been to speed up response times and shrink problems. Tiles were being rendered reliably most of the time, but the overall experience was sluggish and unresponsive. Sometimes, the watercolor tiles would fail altogether. System load was weirdly high, and we didn’t have a reliable way to understand what we were seeing.
Repair #1: Postgres and Imposm
Looking into the system, Aaron noticed that several common map feature queries were taking an unusually long time for Postgres to process, even after basic database tuning to use memory and cache more effectively. Our first significant move was to shrink the cost of those queries, typically for large features at low zoom levels.
We found queries for lakes, forests, riverbanks and similar features that introduced massive overhead due to the number and complexity of their shapes. After doing a bit of research into simplifying and filtering geometries, I remembered Oliver Tonnhofer’s Imposm, mentioned by AJ Ashton in his 2011 State Of The Map presentation on TileMill. Imposm is an OpenStreetMap data importer and an alternative to the older Osm2pgsql. In addition to meeting our need for simpler, fewer shapes through its GeneralizedTable feature, it also answered a bunch of questions we didn’t know we had yet.
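To make "simpler, fewer shapes" concrete, here is a minimal sketch of the two ideas involved: drop shapes below a minimum area, then thin out the vertices of the ones that remain. This is not Imposm's code or its GeneralizedTable API; the function names, the Douglas-Peucker simplifier and the thresholds are all assumptions chosen for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment, clamped to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: drop vertices that deviate less than tolerance."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > farthest:
            index, farthest = i, d
    if farthest <= tolerance:
        return [first, last]
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


def ring_area(ring):
    """Shoelace area of a closed ring (first point equals last point)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def generalize(rings, tolerance, min_area):
    """Keep rings of at least min_area, then simplify each survivor."""
    return [simplify(r, tolerance) for r in rings if ring_area(r) >= min_area]


# Hypothetical data: a large lake with one redundant vertex, and a tiny pond.
lake = [(0, 0), (5, 0.1), (10, 0), (10, 10), (0, 10), (0, 0)]
pond = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
low_zoom_shapes = generalize([lake, pond], tolerance=0.5, min_area=50)
```

The pond is filtered out entirely and the lake loses its nearly-collinear vertex, which is the kind of saving that matters at low zoom levels. A production version would also guard against rings collapsing when the tolerance is larger than the shape itself.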
The first unexpected bonus from using Imposm was that we could create custom tables containing exactly the data needed for a particular purpose. Instead of the resource-sucking Postgres views onto generic point/line/polygon tables offered by High Road, Imposm allows us to create a larger number of rendering-specific tables with no more than the data each one needs.
The second unexpected bonus is that Imposm is written completely in Python, so adding new post-processing steps to the data as it’s imported into the database is trivial. For example, we’re starting to get rid of long, confusing regular expressions for abbreviating street names and replacing them with procedural code that can be more easily customized and shared.
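As an illustration of what procedural abbreviation might look like, here is a small sketch. The lookup tables and the rules are hypothetical and much shorter than anything a real import would need; they are not the code we run.

```python
# Hypothetical lookup tables; a real import would carry far more entries.
SUFFIXES = {
    "Street": "St.",
    "Avenue": "Ave.",
    "Boulevard": "Blvd.",
    "Road": "Rd.",
    "Drive": "Dr.",
}
DIRECTIONS = {"North": "N", "South": "S", "East": "E", "West": "W"}


def abbreviate_street(name):
    """Shorten a street name for labeling, one readable rule at a time."""
    words = name.split()
    if len(words) < 2:
        # Single-word names such as "Broadway" are left alone.
        return name
    if words[-1] in SUFFIXES:
        words[-1] = SUFFIXES[words[-1]]
    # Only shorten a leading direction when a proper name follows it,
    # so that "North Street" does not become "N St.".
    if len(words) > 2 and words[0] in DIRECTIONS:
        words[0] = DIRECTIONS[words[0]]
    return " ".join(words)
```

Each rule is a plain `if` statement that can be read, tested and changed on its own, which is the main advantage over one long regular expression.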
The most pleasant surprise from Imposm is that it’s actually much faster to run than Osm2pgsql, thanks to a concurrent design that divides the import into parallel processes.
Repair #2: More, smaller Gunicorns
Tile Farm uses a WSGI server called Gunicorn to host instances of TileStache. A herd of Gunicorn processes is hidden behind the webserver Nginx, which protects them and delivers requests. This is a common arrangement, and in our first design for Tile Farm we had a single Gunicorn configuration with all of our TileStache settings. With a rush of visitors to the site upon launch and the unique design of the Watercolor layer, we had to perform a number of fast, targeted interventions to improve performance and keep up with sudden demand. Watercolor in particular presented some debugging challenges, and we found that it was important to test changes to it in isolation from the other styles.
After working with a separate Gunicorn configuration for just Watercolor on a different port, it was clearly going to be easier to extend the same setup to the remaining Toner and Terrain styles, each of which is actually built up from a number of sub-styles composited together after rendering in Mapnik. The TileStache configuration is now split, with one for Toner, one for Terrain, one for Watercolor and a few others. Instead of interfering with the entire service when performing maintenance, we can now modify individual styles and reduce the considerable cost of restarts. Unlike Apache processes, Gunicorns can often be hard to kill and they don’t go down easy, so it’s important to be able to target them more narrowly. The primary drawback to this approach is complexity: each Gunicorn/TileStache setup runs on a separate port, governed by a separate startup script with its own logfile and other reporting. Another drawback is hundreds of persistent connections from Mapnik to PostGIS left open at all times, which Aaron assures me is weird and wrong. Still, the ability to isolate problems and even permit limited crashiness in construction areas has been liberating.
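A rough sketch of the one-server-per-style arrangement follows. The ports, file paths and worker count are made up for illustration and are not our real settings; the Gunicorn flags (`--bind`, `--workers`, `--name`, `--pid`, `--error-logfile`) and the `TileStache:WSGITileServer` entry point are the standard ones.

```python
# Hypothetical style-to-port assignments; Nginx would proxy to each of these.
STYLES = {"toner": 8001, "terrain": 8002, "watercolor": 8003}


def gunicorn_command(style, port, workers=4):
    """Build the command line for one style's own Gunicorn server.

    Every path below is an assumption made for this sketch.
    """
    config = "/etc/tilestache/%s.cfg" % style
    return [
        "gunicorn",
        "--bind", "127.0.0.1:%d" % port,
        "--workers", str(workers),
        "--name", "tilestache-%s" % style,
        "--pid", "/var/run/tilestache-%s.pid" % style,
        "--error-logfile", "/var/log/tilestache-%s.log" % style,
        "TileStache:WSGITileServer('%s')" % config,
    ]


# One command per style, each with its own port, pidfile and logfile.
COMMANDS = {style: gunicorn_command(style, port) for style, port in STYLES.items()}
```

Because every style gets its own pidfile and logfile, restarting Watercolor means touching only the Watercolor command. The price is the extra bookkeeping described above.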
Repair #3: The Gunicorn Slayer
I mentioned above that Watercolor has its own special snowflake problems. Processes rendering these maps would eventually become unresponsive, not replying to requests and not giving up their slots to be replaced by their parent Gunicorn server. We looked at a number of possible causes and fixed each: watercolor texture bitmaps are now loaded globally instead of per request, and we no longer use metatiles since the CPU cost of Watercolor is linear with bitmap size. Debugging this was confusing, then frustrating, then finally a waste of time. We settled on a more drastic solution and built the Gunicorn Slayer.
The Gunicorn Slayer is inspired by the Netflix Chaos Monkey, “a wild monkey with a weapon in your data center to randomly shoot down instances and chew through cables.” Admittedly less chaotic, ours borrows a strategy from Logan’s Run and retires any Gunicorn process older than a few hours.
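The core of the idea fits in a few lines. This is a sketch under assumptions, not the Slayer itself: the three-hour limit, the function names and the way worker start times are collected (here simply passed in as a dictionary) are all hypothetical.

```python
import os
import signal
import time

# "A few hours" in the text; the exact limit here is an assumption.
MAX_AGE_SECONDS = 3 * 60 * 60


def expired(workers, now, max_age=MAX_AGE_SECONDS):
    """Return the PIDs of workers older than max_age.

    workers maps PID -> start time in seconds since the epoch.
    """
    return sorted(pid for pid, started in workers.items() if now - started > max_age)


def slay(workers, now=None, kill=os.kill):
    """Send SIGTERM to every expired worker and return the PIDs signalled.

    kill is injectable so the selection logic can be exercised
    without touching real processes.
    """
    now = time.time() if now is None else now
    victims = expired(workers, now)
    for pid in victims:
        try:
            kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # The process exited on its own before we got to it.
    return victims
```

Run from a scheduled job, something like this retires old workers and leaves the parent Gunicorn server to start replacements. A worker that is truly wedged may ignore SIGTERM, in which case a follow-up SIGKILL would be needed.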
This last fix still feels somewhat dirty since we were unable to determine the root cause of the problem, though if we were to revisit it all again an excellent post from Pinterest’s engineering blog gives some great hints on repurposing process titles and POSIX signals to gain some visibility into a running process.
Then And Now
Maintaining reliability for a project like maps.stamen.com is always going to be a long game of catch-up. Our goal has been to make the project slightly easier to ignore every day, by increasing reliability and finding and fixing trouble spots. The fix with the single largest impact has been a move from generic to bespoke OSM data through Imposm, followed closely by moving to a more-and-smaller approach with the rendering servers.
One of the most common enquiries we’re reading about Watercolor maps is about how to make prints. Given such demand, we’re thinking about a new tool to help you select and print an area of a map. Stay tuned for news about that.
I answered a bug report from Michael in Switzerland, who’d found the “underwater” bug in an area of the map he wanted to use on a poster about OpenStreetMap for a conference called GEOSummit, to be held in Bern June 19-21. As you can see below, he worked around the bug by blending Watercolor with the default OpenStreetMap tiles, and the fabulous OpenCycleMap, including a view on the Transport layer, by Andy Allan.
If only I was in Switzerland!
Thanks to our handy bug reporting form, and perhaps spending a little too much time surfing around the world in watercolor at the studio, we’ve isolated a couple of bugs which we’d like to update you on.
The background is, even though the watercolor map has been online for some weeks now, viewers haven’t yet looked at every place at every zoom level. Since it would take aaaages to make all the map tiles for all the zoom levels available, we’ve been creating new tile areas based on people’s viewing activity, and then working to cache tiles for popular areas. (See Jeff’s Log Maps post on content.stamen.com for more information about this.)
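For the curious, here is a simplified sketch of how viewing activity can drive caching: count tile requests in the server logs and treat the most-requested tiles as candidates for pre-rendering. The log lines and URL pattern are assumptions made for illustration, not our actual log format.

```python
import re
from collections import Counter

# Assumed tile URL layout: /style/zoom/column/row.ext
TILE_PATH = re.compile(r"/(?P<style>[a-z-]+)/(?P<z>\d+)/(?P<x>\d+)/(?P<y>\d+)\.(?:png|jpg)")


def popular_tiles(log_lines, limit=10):
    """Return the most requested (style, z, x, y) tiles with their counts."""
    counts = Counter()
    for line in log_lines:
        match = TILE_PATH.search(line)
        if match:
            tile = (
                match.group("style"),
                int(match.group("z")),
                int(match.group("x")),
                int(match.group("y")),
            )
            counts[tile] += 1
    return counts.most_common(limit)


# Hypothetical log excerpt.
SAMPLE_LOG = [
    "GET /watercolor/12/654/1583.jpg HTTP/1.1",
    "GET /watercolor/12/654/1583.jpg HTTP/1.1",
    "GET /toner/3/1/2.png HTTP/1.1",
    "GET /favicon.ico HTTP/1.1",
]
```

Requests that are not tiles are simply skipped, and the top of the resulting list tells you where people are actually looking.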
This means there are parts of the world that haven’t had their watercolor map made. What we’re finding, too, is that some of the tiles that have already been made were generated incorrectly, and will have to be re-made.
As you click around watercolor world, you might come across maps that look like this:
We’ve been calling this the “Underwater” bug. It seems like it happens in the tile-making process if the machine that constructs the tiles is running too hot. It freaks out at coastlines, and ends up literally flooding the land area with the water texture.
You may also have seen a preponderance of grey while you browse around too, like:
Thanks to our superstar efficiency buddy, Aaron Huslage, we think we’ve tracked down the overall issue to machine I/O, the servers’ ability to process inputs, and issue outputs. If the I/O is flooded, the software to generate tiles on the fly baulks, and gets more and more underwater. So, to try to reduce that chance of flooding, we need to reduce and simplify the inputs we’re sending through to create new tiles. Step 1 is to try to “simplify the world.”
The theory is that we’re sending a more complicated Make request than we need to. Our CTO, Mike Migurski, likens this to killing a whole chicken in order to make a McNugget™. We’re experimenting with ways to reduce the size of OpenStreetMap data ahead of time, for the whole world, because Watercolor in particular is such simplified cartography that it doesn’t need the whole chicken. If we just give Cascadenik only what it needs (instead of the whole chicken), that might reduce the machine I/O. Then, we’ll see what happens next…
We’ll post an update to let you know if that worked, or not. Any advice that springs to mind, feel free to post a comment!
Well, it’s been about a month since we announced maps.stamen.com, and what a month it’s been! The whole team has had tremendous fun listening and watching people enjoying the new maps: Watercolor, Toner and Terrain. Here are a few snippets of reporting in the news, and a sample of fantastic re-use of the new maps…
In The News
“A new Creative Commons tile set adds a human, organic touch to cold digital maps. Now if only there were more projects like this.”
“Beautiful visualisation tool transforms maps into works of art”
“…a re-imagined view of cities…”
And we were super excited to spot this little tweet from @pajbam in Paris:
Go, Gamers Go
Duncan integrated Watercolor into MapsTD (Maps Tower Defense), a game he built on top of the Google Maps API. Looks pretty great as I defend the Taj Mahal…
Toner popped up on the fabulous WordPress plugin, Vérité Timeline, which lets you insert gorgeous timelines into your WordPress site, using data from a Google spreadsheet, or a JSON file. Great to see some interplay between Knight News funding recipients. (More please!)
Jonas Häggqvist built osm.rasher.dk, a site that shows data visualization and QA for Danish OpenStreetMap edits over Toner (as one of a few map style options). In this screenshot, I turned on a few different overlays: Power, Named Places, Traces (on OSM), and Street Centers.
It’s really important and exciting to see these maps help to build out tools in the same ecosystem. To use maps generated on top of OSM data to see OSM data more clearly must be some sort of VIRTUOUS CIRCLE, surely!
Pins, Pins, Pins
Paul Mison discovered a nice easter egg on Pinterest when you try to pin a map – you see all the composite tiles shuffled about a bit. (I feel a game coming on!)
As I’ve spent the last while getting up to speed on the whole Citytracking environs, something Eric wrote about the project way back before the Data & Cities conference has really stuck with me, and should probably be on a t-shirt:
The project is: Here’s some work, grab the code, the license is cool, don’t worry about it, use it, go ahead and publish your stuff.
To see that actually coming true — even on cupcakes — and also to see people using the Creative Commons Attribution 3.0 license correctly is huge! We’re actively collecting examples of our maps in the wild, so if you’re using them, please either leave us a comment here, or tweet @stamen with a link!
April 2-4, 2012 saw the Where 2.0 Conference come to San Francisco. Most of the studio went to at least one morning or afternoon, to watch many of our friends present on what they’re up to, and of course, to watch Eric talk about maps.stamen.com, and all the different parts we worked out to get the system launch-ready. Here are Eric’s slides (13MB PDF):
It’s been absolutely brilliant to watch the web’s response to the new maps – I’ll post more on that later, including links and examples of the maps being used and reused in other sites, one of the key goals for CityTracking.
We also prepared a surprise to accompany Eric’s talk on Wednesday: newspapers from Newspaper Club that contained watercolor maps of 20 cities around the world (Thank you, thank you, to Anne, Russell, Ben and Tom from Newspaper Club for helping us get the papers ready in time!)
Nathaniel is in Washington, DC at the moment, and also presented about how we created the Watercolor maps to a super-smart geo-tech audience today at Free and Open Source Software for Geospatial (FOSS4G) North America. “Reminiscent of hand drawn maps, Stamen’s new watercolor tiles apply area washes and organic edges over a paper texture to add warm pop to any map. We’ve added more Photoshop-style raster effects to TileStache to render OSM tiles like you’ve never seen them before.”
In case you missed them, we posted a week-long series of blog posts to accompany the launch of maps.stamen.com from various people at Stamen:
- Eric announcing the launch,
- Zach explaining how a watercolor map is generated,
- Geraldine talking hand-painted textures, including this CC-licensed set you can use on Flickr,
- Mike on the Terrain map tiles, and
- Jeff on how we made maps to watch people looking at maps
If only I’d gotten this blog online in time (!!!).
I happened to be in the audience, and was thrilled to hear a gasp or two when Eric revealed our work on the Watercolor maps in public for the first time, about 15 minutes into his presentation. That was a couple of months before we launched them on March 22, 2012.
As a provocation, Rodenbeck closed with the statement that “there’s no such thing as raw data”—it is always scrubbed, filtered, and interpreted. The challenge—and great opportunity—is to analyze and interpret the data in service of the public good, and to communicate these insights to mass audiences in accessible ways that resonate and inspire action.