Sunday, July 15, 2012

Managing PDF/Documents using Benubird PDF

Having books, presentations, and articles in PDF is the best way to read and use them across platforms. However, once you have thousands of them, they become difficult to manage. Here is my account of my efforts to find a way on my own, and how I stumbled upon a good product to manage them.

When I started using PDF books a few years ago, I had only a few files, and organizing them into folders was easy. Then Google Desktop came along and made life easier. But it had other issues, such as re-indexing whenever folders were moved or the system changed, and of course the security vulnerabilities. So I decided to go for a primitive solution: keep the name, title, author, description, and location of each PDF in a spreadsheet. I did some tinkering so that I could open a PDF from the spreadsheet. This too has issues. If I send a file to someone else, they have to do the same thing all over again.
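
For illustration, here is a minimal sketch of that spreadsheet approach, assuming the sheet is saved as a CSV file with the five columns mentioned above. The function names and the column spellings are my own choices for this example, not what I originally used.

```python
import csv
import webbrowser
from pathlib import Path

# Column names are assumptions that mirror the spreadsheet described above.
FIELDS = ["name", "title", "author", "description", "location"]

def save_catalog(csv_path, rows):
    """Write the catalog rows (a list of dicts) to a CSV file."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)

def load_catalog(csv_path):
    """Read the catalog back into a list of dicts, one per PDF."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))

def find_by_title(catalog, text):
    """Return the rows whose title contains `text`, ignoring case."""
    needle = text.lower()
    return [row for row in catalog if needle in row["title"].lower()]

def open_pdf(row):
    """Hand the file to the default viewer (behaviour depends on the OS)."""
    path = Path(row["location"])
    if not path.is_file():
        # The weak spot of this approach: the path breaks when files move.
        raise FileNotFoundError(path)
    webbrowser.open(path.resolve().as_uri())
```

The catalog lives outside the PDFs, so every `location` goes stale as soon as a file is moved or sent to someone else.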

The ideal solution is to store those properties in the PDF itself, so that extracting, sorting, etc. would be easier. Alas, it took me a long time to realize this. So, how to do that? First, I needed to know how many of my files already have those properties.
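
A rough sketch of such a survey is below. It assumes the third-party pypdf package (the successor of pyPDF), and the helper names and the choice of three properties are only for this example.

```python
from collections import Counter
from pathlib import Path

# Properties to check; an assumption for this example.
PROPERTIES = ("title", "author", "subject")

def read_properties(pdf_path):
    """Return the chosen document properties of one PDF as a dict."""
    # Assumes pypdf is installed; imported here so that the counting
    # helper below can be used without it.
    from pypdf import PdfReader
    info = PdfReader(str(pdf_path)).metadata  # None if there are no properties
    return {name: (getattr(info, name, None) if info else None)
            for name in PROPERTIES}

def count_filled(records):
    """Count, per property, how many records have a non-empty value."""
    counts = Counter()
    for record in records:
        for name in PROPERTIES:
            if (record.get(name) or "").strip():
                counts[name] += 1
    return counts

def survey(folder):
    """Return (number of PDFs, per-property counts) for a folder tree."""
    records = [read_properties(p) for p in sorted(Path(folder).rglob("*.pdf"))]
    return len(records), count_filled(records)
```

Calling `survey` on the top folder returns the total number of PDFs together with how many of them carry a title, an author, and a subject.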

I started to search for Python PDF packages and found that pyPDF is good at extracting information from PDFs. But if the PDF is larger than 15 MB, it hangs. Apart from that, editing a property of a PDF is only possible by creating a new PDF with those properties and the original content. This means I would end up with duplicate files and would need to check them manually before deleting the originals. I tried other tools too, but it is the same story. Even the famous ExifTool creates a new file.
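
One way to live with the "new file" limitation is to write the new PDF to a temporary file in the same folder and then swap it over the original, so no duplicate is left behind. This is only a sketch under assumptions: it uses the pypdf package, the function names are made up for this example, and copying page by page can drop things such as bookmarks, so it is worth trying on copies first.

```python
import os
import tempfile

def replace_file(path, write_to):
    """Write new content to a temp file beside `path`, then swap it in."""
    folder = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=folder, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            write_to(handle)
        # Replaces the original in one step when both files are on the
        # same filesystem; the original survives if writing fails.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

def set_properties(pdf_path, title, author):
    """Rewrite a PDF with new Title and Author properties."""
    # Assumes pypdf is installed; imported here so replace_file works alone.
    from pypdf import PdfReader, PdfWriter
    reader = PdfReader(pdf_path)
    writer = PdfWriter()
    for page in reader.pages:  # copies pages only, not bookmarks
        writer.add_page(page)
    writer.add_metadata({"/Title": title, "/Author": author})
    replace_file(pdf_path, writer.write)
```

The manual check does not go away completely, because the rewritten file is still a new PDF, but at least the folder never holds two copies of the same book.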

The next thing was to search for a PDF manager. The first hit was Mendeley. It is really meant for managing research papers and making citations easy, not for other purposes. So it doesn't have an option to edit the title inside the PDF. You can edit it only within Mendeley, not outside it. That is not what I was looking for, but it came close.

Next, I searched for alternatives to Mendeley. After reviewing a few apps, I found Benubird PDF. To my surprise, it offers easy editing of PDF properties, and it writes them into the PDF itself. But don't try this with a PDF that is open in Acrobat: Benubird will overwrite the PDF, leaving only one page. Ideally it should throw a warning, but it is okay as of now. It has other nice features such as watched folders, smart lists, lists, etc. The UI shows filename, title, author, subject, tags, size, type, path, etc.

I have transferred all of my spreadsheet data to the PDFs via Benubird. I hope it will work fine.

Other links, if you want to check:
PDFMiner
pdfrw
pdflib
