
Heather VanMouwerik is a PhD candidate in Russian History at the University of California, Riverside and the Congressman George Brown Graduate Intern for the Special Collections at Rivera Library. Find her on Twitter or through her website.

The first time I ever touched a computer was in the second grade. Our elementary school had built a state-of-the-art computer lab, which was attached to our library. One afternoon, we filed into the windowless room and nervously sat in front of our computers. I had absolutely no idea what these boxes were even supposed to do, but, after an hour, I had learned how to turn my Mac II on, insert a floppy disk, and play Number Munchers. Many of my classmates were unimpressed, but I was enthralled. From then on, I did everything I could to get back into that lab. Yes, I liked learning DOS commands and typing, but mostly I just wanted to play Oregon Trail or Where in the World is Carmen Sandiego?

These early experiences, supervised by librarians and teachers, established a strong connection for me between computers and information. Because of this, research has always felt like a game. I study history, for example, like I would play a game of Clue, strategically working my way backwards to understand what happened, when, and why.

After I passed my qualifying exams and started writing my dissertation, though, research became an overwhelmingly demanding job. Books and microfilm reels and citations and journal articles and notebooks and dictionaries and random scraps of paper started piling up quicker than I could process them. Every time I entered my office, I heard Venkman say, “No human being would stack books this way.”

It was time to get my act together and corral my collection; it was time to build a dissertation database.

A database, also known as a document management system (DMS), is a collection of materials that are organized to allow for easy access and use. Although they can exist in a physical sense--all archives and libraries, for example, are databases--for the purposes of this post, I want to focus in on digital databases, or collections that exist solely on a computer.  

Over the next few months, I experimented widely with all different schemes. In short order, I adopted and discarded a lot of software, and I started and abandoned several organizational plans. Along the way, however, I learned three lessons. First, don’t put the cart before the horse. Taking the time to identify your needs before you start buying software can save you a lot of money in the long run. Second, remember that you are the beating heart of your research. Any database you build needs to work for you, and you alone. Finally, start with an organized computer. This way you are starting your database with a clean slate.

One of the reasons I love playing computer games is that they all--platformers and shooters, text-based and roguelike, RPGs and life simulators--have rules that guide the player through their gameplay. Good games do not dictate player action, but they establish the framework or parameters within which the player can act. This, too, is the goal of a good database. It should have enough uniformity to ensure access, but it should also be flexible enough to support creativity without being overly time consuming.

To help you identify your database needs, I have put together a little Choose Your Own Adventure-style “game.” As you go through the questions below, note your answers. At the end, these answers will help guide you to an appropriate database system.

Identify your goals.

1. What is the purpose of your research? Don’t worry! I am not asking you that dreaded question about the significance of your work. Instead, I want you to think about what would constitute a successful conclusion to your project. Is it simply a completed dissertation? An article- or book-length work? Or something bigger? Knowing your intentions early will help you quickly narrow in on the particular needs of your database.

2. What is the lifespan of your research? Although we all like to think that the time, tears, and energy we pour into our dissertations will result in an entire career’s worth of data to mine for new insights, be honest with yourself: How much are you actually going to use your database in the future? Yes, you will need to refer to it to put together your book, but after that? If you need continuous access to your information, then you are going to want a database program that is completely under your control. If the active part of your project will only last a few years, then you might consider a system that offers more flexibility and is easy to archive.

Identify your needs.

1. What type of data do you need to organize? My research is largely based on 18th-century printed books and journals, which are written in Russian, French, and English. This means I do not have to worry about capturing numbers or handwriting; however, many of the Russian items are printed in a typeface that OCR (Optical Character Recognition) software cannot recognize. On one hand, if you have a lot of text, like me, or have mostly one type of information (all numbers, all photographs, etc.), then your needs are pretty basic: one system to house one file type. On the other, if you have a lot of different types of media, then you are going to need to find a system capable of storing all of that material.

2. How much data do you need to store? Can it fit on a laptop computer? Or are you going to need to invest in a desktop system? A server?

3. Who needs access to the information? In my discipline, no one will probably ever see my database. That means I am free to organize it in whatever ridiculous way I see fit and house it in a place only I have access to. If you have collaborators or if you need an external body to corroborate your findings, then you are going to need to take other people into account.

4. What sort of security does your information require? I study the public writings of people who are long dead, so I am not particularly concerned about securing my database beyond the normal computer-related protection. But not all research is benign. If there are security issues, then you are going to need to consult with your advisor about best practices and find the most secure option available.

Identify your limitations.

1. How computer savvy are you? No, really. Learning and employing a new piece of software is time consuming. Even for me, a digitally literate software addict, there is always a learning curve. If, for example, computers make you nervous or uncomfortable, then a complex program that uses a lot of tech-speak is not going to do you any good. Remember, you are the beating heart of your research. Any database system you build must work for you.

2. How much time do you have to spend on putting together a database? My dissertation has been in process for the last two years, and I still have two more to go. Four years is a long time, and I can afford to invest several months into making my database run like a dream--all the bells and whistles, please. History programs move slowly; however, other fields, like Political Science and the STEM disciplines, have much faster timetables for writing a dissertation. You have to decide how much time you are willing to invest in what can be a time-consuming process.

3. How good is your computer? The more information you plan on downloading onto your computer, the more storage space you are going to need. I house my research on a relatively new MacBook Air, which is light and portable but also lacks storage space. This means I have to keep my computer scrubbed of any unnecessary items. If your computer is old, unreliable, or slow, then you are going to have to build your database in the cloud, meaning through external storage, or buy a new computer. Conversely, if you are running on a new, safe, and efficient machine, then it might be best to store everything in one accessible place. Before you waste any time and energy, take stock of the material limitations of your project.

4. How much money do you have? I know that I am the last person who needs to tell you this, but your dissertation is going to cost money. Yes, there are some free programs available that work well for particular types of data and for building small databases; however, you are going to need to pay a not-insignificant amount for a more robust and effective program. Take into account, too, whether or not you want to buy a program up front or pay a smaller, though recurring, fee.
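If you want a rough number for the storage questions above (how much data you have, and whether your computer can hold it), the following is a minimal sketch in Python. It assumes your research files already live under a single folder; the function names `folder_size_bytes` and `storage_report` are made up for illustration and are not part of any particular database program.

```python
import os
import shutil
from pathlib import Path


def folder_size_bytes(folder):
    """Add up the sizes of all regular files under folder (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            file_path = Path(dirpath) / name
            if file_path.is_file() and not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def storage_report(folder):
    """Compare the folder's size with the free space on the disk that holds it."""
    return {
        "research_bytes": folder_size_bytes(folder),
        "free_bytes": shutil.disk_usage(folder).free,
    }
```

Calling `storage_report` on your own research folder gives you two numbers to compare before you decide between a laptop, external storage, or the cloud.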

Database Adventure Options!

1. Embedded Folders. For this, all you need to do is organize your information into folders that you nest together on your computer. This option, though not the sexiest of databases, is cheap and good for people who are not tech savvy or have little time to invest. In addition, it has longevity, is easily secured, and can accommodate a lot of information. It is a good system, too, for people who have a lot of different file types they need to accommodate.

2. Reference Management Systems, like Zotero and EndNote, are usually used to organize citations; however, in the last five years they have become much more robust. If your information comes from published sources and you have a lot of notes to organize, this might be the database system for you. You can add subject headings and tags for searchability, and they make building a bibliography easy. This is a good option, too, for people who are not tech savvy, are planning to use the information for other projects, and don’t have a lot of money to spend. There are a lot of options with a wide array of prices, so shop around.

3. Cloud-Based DMS, or databases that are stored off-site and accessed through an internet connection, are great for people who need to collaborate or want to access the database from multiple computers. Basic access is often free, and unlimited storage is often affordable. They do require a strong internet connection, and although there haven’t been any major breaches lately, they may not meet security standards for sensitive information. In addition, longevity could be an issue since access is dependent on recurring payments and the continued existence of the DMS provider. The two biggest contenders in this category are Evernote and Dropbox, both of which I highly recommend.

4. Computer-Based DMS, or databases that are stored on your computer or server, tend to be the most expensive, time-consuming, and difficult to learn option. That being said, they are also the most robust in accommodating a wide variety and large amount of information. Because they are housed on a single computer, they work best for individual projects that require long-term access. There are several different software systems available; however, several GradHackers and I use (and love!) DEVONThink.
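For the Embedded Folders option, here is a minimal sketch in Python of what setting up nested folders might look like. The folder names in `LAYOUT` and the function `build_folders` are hypothetical examples, not a recommended scheme; rename them to match your own sources and habits.

```python
from pathlib import Path

# Hypothetical layout: top-level folders mapped to the folders nested inside them.
LAYOUT = {
    "sources": ["books", "journals", "microfilm"],
    "notes": ["reading", "archive-visits"],
    "drafts": [],
}


def build_folders(root, layout=LAYOUT):
    """Create the nested folders under root and return the paths that now exist."""
    root = Path(root)
    created = []
    for top, children in layout.items():
        # The top-level folder is created even when it has no children.
        for path in [root / top] + [root / top / child for child in children]:
            path.mkdir(parents=True, exist_ok=True)  # safe to run more than once
            created.append(path)
    return created
```

Running `build_folders(Path.home() / "dissertation")` would create the whole tree in one step, and because existing folders are left alone, you can add a new entry to `LAYOUT` later and run it again.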

Alright, choosing your database isn’t really a fun adventure game; however, I hope you see that choosing the appropriate system for your database, much like the rules to your favorite computer game, will provide structure to your information and the freedom to explore it fully.

How have you set up your dissertation database? Do you have any recommendations for systems? We’d love to hear about them in the comments!

[Image from Flickr user abstrkt.ch, and used under Creative Commons License]
