I work at a county public law library that has transitioned from having physical branches to working with partner libraries. Soon after I arrived here, we closed our last physical branch law library and moved our resources to a different city within the county. This decision re-emerged recently as our governance Board was discussing our annual report. How did we choose? In particular, there was some discussion about whether we should retreat from our original decision and redeploy resources to the former location. It was useful to be able to fall back on the data we used when we made the original decision.

One thing that is common in law libraries is the lack of data. We do the best we can but the legal profession and the court system do not engage in a lot of data gathering about themselves. Legal publishers lag behind other publishers—look at the advantages of COUNTER and SUSHI as one example—in providing useful usage data. What should be standard becomes part of a license negotiation, if you’re big enough to warrant it from the legal publisher’s perspective.

It is not surprising, then, that a discussion around a decision may be based on anecdata and supposition. The best guess becomes the best practice. It was useful, then, that in the decision about locations, we had a lot of demographic data that could inform some of our choices.

Gather Ye Data While Ye May

This is not a historical post, though. The decision was made nearly 2 years ago but the recent discussion made me wonder about presentation of that data again. I still had the raw data. Now I wanted to know if I could do a better job of creating a visualization.

I’ve posted about visualizations before. I think they can be particularly meaningful when dealing with people with legal backgrounds who are not accustomed to working with data as much as they are with words. A picture can be literally worth a thousand words if you can shorten a discussion with a clear presentation of information.

They’re also not complicated. You can make complicated ones and you can invest a lot of time in creative ones. But you don’t have to do either. I’ve had some of my most successful visualizations come from Excel charts.

An early attempt I made looked at the locations of lawyers and courthouse law libraries in Ontario, Canada. I leaned towards maps for discussions related to location (where to put libraries, where we need to deliver remote access databases, or might want to engage in greater interlibrary loan with local distribution points) because, well, it seems obvious.

People understand maps and overlaying data on them can speak more clearly than a chart with columns and lines. Also, there are real density issues that matter. Using a number to describe lawyer density may gloss over nuances about where those lawyers are.

A screenshot of a Carto map showing lawyer concentrations by post code and locations of courts and courthouse law libraries in Ontario, Canada

I had grabbed all of the information about lawyers in San Diego County from the State Bar of California. I’m not sure all bar associations or regulators would have this data, although I expect they do. The State Bar’s site allows for a search by county but is limited to 500 retrieved records. I ended up contacting the State Bar and asking for a count of lawyers in San Diego County. The list I got back distinguished active and inactive lawyers, separate from judges, and included a zip code for each record. This allowed me to compile density counts by zip code.
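For anyone who wants to do that tally outside a spreadsheet, here is a minimal sketch in Python. The field names (`status`, `zip`), the status values, and the sample records are all my own assumptions for illustration. They are not the State Bar's actual export format.

```python
from collections import Counter

# Hypothetical records: field names and values are assumptions,
# not the format of the list the State Bar actually supplies.
records = [
    {"status": "active", "zip": "00001"},
    {"status": "active", "zip": "00001"},
    {"status": "inactive", "zip": "00001"},
    {"status": "active", "zip": "00002"},
    {"status": "judge", "zip": "00001"},
]


def lawyers_by_zip(rows, statuses=("active",)):
    """Count records per zip code, keeping only rows with the given statuses."""
    return Counter(row["zip"] for row in rows if row["status"] in statuses)


# Active lawyers only; judges and inactive lawyers are left out.
counts = lawyers_by_zip(records)
for zip_code, total in sorted(counts.items()):
    print(zip_code, total)
```

Keeping the zip code as text rather than a number avoids losing leading zeros, and passing a different `statuses` tuple lets you decide whether inactive lawyers belong in the count.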

That was the hard part. It is easy to find population data, and I was able to grab data from San Diego County on people over the age of 19, by zip code. I selected this smaller group, excluding children, on the assumption that most children would not be trying to access a law library on their own. Our County has an open data portal, which made this easier.

The more localized the data, the harder this would be. I was imagining how I would justify a repurposing of a room within our law library. We have foot traffic data, showing how many people go in and out of the law library on a given day. But that’s not enough to understand an internal location’s usage. Also, I think foot traffic is an incredibly soft data point. You have no idea how many of those feet are repeats or what value they found at the library. Foot fall is not nothing but I’m not sure how much of something it is either.

Let’s say I wanted to convert a room back to stacks because the learning commons or training room inhabiting the space was underutilized. I might send someone around to do a headcount each hour, and continue that for a few weeks. That’s not a great method and I would want to do it for as many weeks as I could to account for external influences (bar exam? summer associates? bad weather?) that wouldn’t be noticed if it were only done for a week or two.
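If I did collect those hourly headcounts, a rough way to summarize them might look like the sketch below. The observations are invented and the `(week, hour, headcount)` layout is just an assumption about how a tally sheet could be recorded. Looking at the minimum and maximum for each hour alongside the average is one way to spot a week that was thrown off by something external.

```python
from collections import defaultdict
from statistics import mean

# Invented tallies as (week, hour of day, headcount); not real library data.
observations = [
    (1, 10, 4), (1, 14, 9),
    (2, 10, 6), (2, 14, 1),
    (3, 10, 5), (3, 14, 8),
]


def summarize_by_hour(obs):
    """Group headcounts by hour and report the mean, min, and max across weeks."""
    by_hour = defaultdict(list)
    for _week, hour, count in obs:
        by_hour[hour].append(count)
    return {
        hour: {"mean": mean(counts), "min": min(counts), "max": max(counts)}
        for hour, counts in sorted(by_hour.items())
    }


summary = summarize_by_hour(observations)
for hour, stats in summary.items():
    print(hour, stats)
```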

It’s ironic, really, that the closer the space or resources are to ourselves (print books or law library space v. licensed databases or our neighborhood), the harder it may be to know how it’s utilized.

MapMaker, Make Me a Map

Now to pull the data together. My first attempt, and the one I presented to the Board, used bare zip codes and overlaid the two populations: how many lawyers and how many residents. I used CartoDB (n/k/a Carto) to create a geographic representation because I thought a discussion about locations would be clearer with a map.

CartoDB was free at the time I used it but it is now a paid service. I used three data sources: my list of lawyers by zip code, county demographic data, and a simple sheet showing where courthouses are (you can see one in the map below represented with a blue hexagon).

Here’s the result (you can see the full map and move around it here):

A screenshot of a Carto map of north San Diego County showing dots with numbers for lawyers and resident population.

The small number on each dot is the number of lawyers, if any, in a zip code. The dot is centered on the zip code. The larger number is the number of residents. We have more than 3 million people in the county and nearly 22,000 lawyers. As you can see, a lot of the dots were pretty large with residents so the goal was to find a location that was relatively close to large clusters of lawyers. Once we knew that general area, we could look for a local public library that might want to partner with us.

A map is a particularly good visualization for a location like San Diego County. You wouldn’t know from just looking at a flat map that the County is defined by its canyons and mountains. The east side is isolated from the west side. Our physical branch had been at the top of what was essentially a figure eight, so you had to go up and over the top to reach it. The lack of east/west access was a key issue to consider.

Another challenge was that there was a potential partner library in the same city where our physical branch had been. And, as was suggested during the governance board meeting, we could still partner with them. We discussed the substantial added costs of deploying a staff person and collection to a fifth location (we have three partners and are about to activate a fourth) while also revisiting this map data.

Better Map Visuals

One drawback to the Carto map was that it pinpointed the zip code. So the heat map that I think is more visually common and easier to read was not an option. I returned to Carto this second go round but was still not able to break through this challenge.

I did learn something new, though. Always a good thing when you’re experimenting. The US government offers shapefiles for free. These files hold the boundary information that fills out your map shapes, so that a zip code or address can be connected to a shape on the map. I downloaded the Zip Code shapefile. It comes in a compressed zip file that you leave compressed.

As I say, I tried again with Carto with no success. I tried with a second set of data from another project I am playing around with and also ran into a steep learning curve. My list of counties in one state was translated into cities in states around the U.S., and I wasn’t able to find a way to get Carto to understand that they were counties within a single state.

I then moved on to PolicyMap. It is also a paid option but the State Library of California has licensed it for libraries, including public law libraries, to use. I thought I would see what I could come up with since it was designed as a geographic mapping tool. I have enjoyed using the pre-built data sets but this was my first time adding my own.

Unfortunately, the data I had was not robust enough for the PolicyMap data needs. Like most of these data tools, it attempts to create geolocated information from your data. While some data applications are able to work solely with zip codes, PolicyMap wasn’t, and I wasn’t able to get past this error with any of my data files.

A screenshot of a PolicyMap import screen showing menus and geocoding options. At the top center is a message saying “Insufficient columns to establish site location.”

PolicyMap remains a tool that I would consider for a data-based initiative in the future. Its strengths may be in using data that is already within the system. Our law library does not collect more detailed data and I would be uncomfortable buying a list of lawyers that included complete addresses. I don’t need it and don’t want to store or be responsible for that level of data detail. For now, though, it wasn’t a good choice for me.

Tableau Rasa

I ended up using the Tableau desktop tool. Tableau was something I had attempted before, when I was doing some similar mapping in Canada. Tableau still has a free Community product option but I started with a trial version of the full product.

As you can see in that post, I had to do a lot of data manipulation to get things to be properly geolocated. All of that was built into the system this time. As I uploaded my data files, Tableau recognized the zip code columns as geographic data and placed them properly on a map. I was able to easily join my two tables on their zip codes and start to play around with the map.
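Tableau handled that join for me, but the idea is simple enough to sketch in Python for anyone preparing the data by hand first. The zip codes and counts below are made up, and `join_on_zip` is a hypothetical helper of my own, not anything Tableau exposes. The sketch keeps every zip code that appears in either table, so a zip code with residents but no lawyers still shows up with a zero.

```python
def join_on_zip(residents, lawyers):
    """Combine two zip-keyed dicts into one row per zip code.

    Zip codes missing from either table get a count of 0 rather than
    being dropped, so areas with no lawyers still appear on the map.
    """
    rows = []
    for zip_code in sorted(set(residents) | set(lawyers)):
        rows.append({
            "zip": zip_code,
            "residents": residents.get(zip_code, 0),
            "lawyers": lawyers.get(zip_code, 0),
        })
    return rows


# Made-up zip codes and counts, for illustration only.
residents = {"00001": 30000, "00002": 40000, "00003": 15000}
lawyers = {"00001": 4000, "00002": 300}

joined = join_on_zip(residents, lawyers)
for row in joined:
    print(row)
```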

At first, I had the same outcome as I did with Carto. There were dots for the zip codes but no real heat map. The red circles reflected areas with no lawyers. Blue circles had the largest residential populations in the zip code, and the number of lawyers whose addresses reflected that zip code was labeled below the population data label.

A map generated with Tableau showing San Diego County with dots reflecting relative sizes of populations, with some resident and lawyer populations labeled with text.

The trick was to import the shapefile for counties. Once that was imported, I could re-run my analysis. In Tableau:

  • switch from your Data Sources tab at the bottom left to the Sheet 1 tab next to it
  • move your data elements into place then, near the top, switch to the Analysis tab
  • double click Cluster (this was the only analysis option I had) and it retrieved the shapefile for my zip codes.
A map of San Diego County showing population in colored blocks with the number of lawyers in that zip code overlaid on top of the map.

If you look at both maps, you can compare the lawyer numbers and orient yourself to the map. As you can see pretty clearly in the bottom map, there is a large grouping of lawyers in downtown San Diego (4,772) but, directly north, there is a bulge with 1,300 and large groupings to the west of that. Much of the north-south access (highways, express transit) goes through that dark blue patch inland from the coast.

We ended up finding a partner in the zip code that has 151 lawyers but that, with the overlay of the highways you can see in the earlier image, was easily accessible to these large lawyer concentrations. At the same time, it was also easily accessible to large groups of residents by highway and public transit.

One thing I liked about Carto was the ability to share a link to the data, so that people can navigate around the map and zoom in and out in ways you can’t with a flat image. I switched over to Tableau Public, which offered all the features I needed for this project and was free. It also allowed me to publish the map to the Tableau Public cloud. You can then create a link to share the file or you can embed it with JavaScript.

A screenshot of the Tableau Public interface showing the saved map with the Share dialog open with the available link and embed code.

Once it is uploaded to the Tableau Public site, you can go in and edit the file there as well. You can toggle labels on for populations and make other changes to the appearance of the map.

In future, I am going to start with Tableau Public. It seems to have improved significantly (or I have more understanding of what I’m doing) since the last time I hazarded this map. Also, knowing about shapefiles will make future geographic maps easier to make visually interesting. And if our Board wants an updated map, it will be easy for me to help create a visual for our decision-making.

In some ways, law libraries are somewhat insulated and so maps may not always be an obvious tool. We’re serving students, faculty, lawyers who are employees: audiences within our physical space. But when we start to consider remote access, or outreach, or marketing, and want to think about where to do these things, a map can be useful. The real trick is to get the legal-specific data that will make your project valuable.