Finding a Dataset
Typically, when I give talks about PostgreSQL, Stado or Postgres-XC, I like to use examples as much as possible to get my points across to the audience. This usually presents a problem in deciding what data to use in the examples. The old stand-by is to use pgbench to generate a bunch of sample data, but the values are meaningless to the audience, and you're a bit limited with what you can show. I've also used the sample databases from PG Foundry but also found them lacking.
Last week, I attended the State of the Map conference out in Portland and the answer became blatantly obvious, use the Open Street Map data set. The data size can either be as big or as small as you'd like. You can load in just a small subset of the data covering your city or state or load in the whole planet. This let's me target the example queries to the location that I'm giving the talk making it much more meaningful for the audience. Another great attribute of the data is that it contains both PostGIS data and hstore data so we can show off some of the coolest PostgreSQL features that developers love.
Take for example, the following query that I would use when presenting in New Jersey. It will find the schools in Monmouth County that are near by the Garden State Parkway.
SELECT distinct p.name, ST_AsText(ST_Transform(p.way,4326)) AS pt_lonlattext
FROM planet_osm_point p, planet_osm_roads r
WHERE p.tags @> 'gnis:state_id=>34'::hstore
AND p.tags @> 'gnis:county_id=>025'::hstore
AND p.amenity = 'school'
AND r.name = 'Garden State Parkway'
AND ST_Distance(p.way, r.way) < 400;
If you'd like to start use the Open Street Map data, loading the data into a PostgreSQL is really straight forward thanks to the guide provided by Boston GIS.