It turns out that human beings are creating a LOT of data. 180 exabytes in 2006 to be exact. And we’re creating data faster than we can keep up with the storage media needed to store it, much less the people needed to maintain and protect it.
Which makes me wonder about libraries, librarians, and librarianship.
70% of the world’s data is created by individuals. 70%! I thought this number was awfully high until I reflected on the proliferation of inexpensive, high-quality computers, cameras, and video recorders. Just think about two things: cameras and movies.
When I was a kid and we wanted to take photographs, we got out the little camera (my family could only afford a 110 camera until I saved up and got my first real camera: a Pentax K1000) and took maybe 4-5 pictures. Then we’d put the camera away until the next event (4 months later), when we’d bust it back out (complete with the odd little square flash!), take 4 more pics, put it away, and so on until the whole roll was used. Then we’d take the roll to Walgreens and it would be like Christmas when the photos were ready a week later. You never remembered what you took pictures of, so opening that package in the parking lot was a crapshoot. Hey! That’s my birthday! I remember that! Hey! Oops! Put those pictures away! The point is that it took several months to shoot 24 pictures. And typically we never reprinted them, and if we wanted to share them with Grandma we either had duplicates made or we bored everyone at a dinner party.
Now? People probably shoot thousands of pictures in a year: on their cell phones, on their digital cameras, on their laptops. And sure, most of it’s just memories or “look at what Johnny was doing when he was drunk!” But the point is that everyone is taking so many more pictures now, and storing them, emailing them to their friends, or just deleting them, that it’s pretty easy to see the insane proliferation of data creation compared with just 10 years ago.
And what about movies? Again, even 10 years ago, if you wanted to record a movie or own a copy of a film, you had to get out the VHS tape and duplicate it. The average consumer was not making a lot of recordings and storing them in their library. My family was a little odd in that whenever pops sprung for HBO (usually once a year when they made some kind of special deal like buy 2 months get 2 months free) my mom would record all the movies she saw. ALL OF THEM. Growing up, we had probably the most eclectic collection of VHS tapes known to man: everything from Wolfen 2 (Electric Boogaloo) to One Crazy Summer. But even still, she could only amass maybe 100 movies. And we had (I still have it) exactly ONE home movie — a strange tape that has the only video of my late brother.
Now? We have TiVo and other DVRs complementing our DVD collections, YouTube junkies who record every aspect of their lives, minute-long snippets from people’s cell phones, and on and on. Video is quickly becoming the next step in user-generated content and user-generated storage.
So, what does all this mean for an academic library? Weeeeeellllll, first of all, I think there’s an important component here: the content is user-generated. There’s no expert who is mediating this content creation. The user creates the content and has to figure out how to store it and send it himself or herself. The middle man is cut out. I don’t need someone to develop my photographs for me anymore. The same thing is happening with digital music and (importantly for us) with academic publications.
As all this content creation has grown and the middle man has been cut out of the content creation component, technologies like Google came along to help with the second part in the life cycle of data: search. Google didn’t organize the data. They just created the bots and server farms that facilitate search for the data. And they are wildly popular. But note that Google is a middle man, just as libraries are the search middle man for a lot of data.
Then there is a third component here: management of the data. There are libraries, there are Web 2.0 technologies, and there are other content management systems (heck, Gmail combines all three services in one!).
I think (and here comes the controversy) the user is going to find a way to cut out the middle men when it comes to search and management. And I think it’s going to happen with end-user technologies, like Flickr, combined with production technologies, like Lightroom. Something is going to emerge that will do all of that: produce, package, market, and index data.
And when it does, I wonder what the library will look like?