Archives Scanning Project

    Starting years ago, and continuing still, countless volunteer hours have been spent setting up and organizing our archives.  The result is a facility that beautifully fulfills the first purpose in our bylaws: "to preserve the historical materials of railway transportation ... and especially those of the New York, Ontario & Western Railway".  The archives is a wealth of such information, and holds answers to a myriad of questions -- past, present, and future. 

    The archives facility has good security (recently improved) and climate control but, like almost any building, is not fully protected from fire, flood, or building damage.  Although such events are unlikely, they could result in a devastating loss of our archives contents.  At the other end of the risk-versus-damage spectrum, the necessary handling of documents as part of research and making copies is definitely (though gradually) wearing them.  In addition, finding, viewing, copying, and returning the original documents takes considerable time (and workspace, which is very limited).  Because not everybody has many opportunities to visit the archives, we have been selling custom-made copies of documents, large and small, for years.  However, the document handling time limits how many such copies we can make each day, so we have not published extensive lists of the archives contents.  Most people do not realize what information we have, so they don't know what to ask for.

    The solution to the above is the scanning project.  By digitally scanning documents and photos into computer files, we can store sets of copies of the scan files at multiple locations to guard against sudden information loss, and can both view the scan files and make copies from them, which completely eliminates handling the original documents.  By recording information about the documents as we scan them, we’ll be able to find scan files for desired documents quickly, easily, and reliably via computer search tools, and then view them (or make copies from them) with a few keystrokes and mouse clicks.  In addition, just having the scanning capability encourages people to lend us documents so we can scan them, keep and use the scan files, and return the originals to their owners.  It will also encourage donations of documents (with the backup and reduced handling providing improved protection for the donated material), and will actually give us more workspace (eventually) by eliminating the need for a large table reserved for handling large original documents.

    Planning for the project has involved many decisions.  One of the first was to classify the archives contents into large documents, small documents, and photos (including slides and negatives) -- based primarily on the hardware required to scan them.  We decided to start scanning the large documents first, as they have the most content per page, are a finite group, suffer most from handling, and are slowest to access for viewing or copying. 

    That requires unique equipment, and after careful research we settled on a TDS-450 system from Océ-Technologies (see the photos).  It is a fully integrated system, which avoids possible compatibility issues, and is fast, easy to use, flexible, and robust.  It is roughly equivalent to the all-in-one printer/scanners that many of us have at home, only a whole lot bigger -- and a whole lot more expensive.  (We were able to purchase a demo system at a significant discount, although it still put a large dent in our bank account.)  It can handle documents up to 36" wide by virtually any length.  It can scan in monochrome (simple black and white), gray scale (needed for photos, but handy for documents, too), and color (surprisingly, many of the O&W documents contain some color, usually very limited, to set off property lines, building walls, etc.).  The printer holds two paper rolls (different widths), selects the proper roll automatically, and cuts the copies to length.  It uses toner (similar to a photocopier or laser printer), which makes fast, low-cost, yet stable (generally waterproof) copies.  Because it uses toner, the prints are black and white only, but because the color in the drawings is usually of little significance, that should be perfectly acceptable.  The system also has its own controller unit, similar to but definitely different from a PC (it runs Windows XP, but embedded within the Océ software program).

    Of course, we didn't forget the smaller documents and photos in the archives, and have been raising the needed funds through member donations to purchase the appropriate equipment.

    This is a big project (the large documents alone number around 15,000), so a lot of planning has been done (and is still being adjusted).  As the details of the project were being developed, a few requirements became apparent.  They are simple, yet profound, and have many implications. 

    The "prime directive" is to not lose information (I guess that term came to mind after a friend quipped that the Océ equipment has that early Star Trek TV show look).  Losing information can range from losing, damaging, or misplacing an original document to losing a scan file of it, or even losing track of a scan file so it won't be found by a normal computer search.  Losing information also includes not capturing all the information on a document (including light pencil marks), or losing any of it through "clean up" of stains, smudges, or discolored areas of the scan.  Avoiding this requires careful document handling procedures, a good system for recording and tracking the scanning process, defined and consistent methods, file backup methods, etc.

    The "sub-prime directive" (gee, that sounds somehow familiar, too) is to not let errors accumulate on the project so that things become unreliable, unworkable, and unfixable (which can eventually lead to the project simply being abandoned).  This also requires the creation of procedures and methods, and special computer tools to catch errors as soon as possible while they are still easy to correct. 

    The information about the documents is typed into an Excel spreadsheet during the scanning process.  We record in it the file name of the scan file (which includes the O&W document number and size), document title, subject (or subjects) covered, location(s), any other railroad(s), drawing and revision dates, material and condition of the original, notes, where the original lives, and the date scanned.  If it is a borrowed document, we record the lender, loan date, etc. 

    The subjects, locations, and other RRs are the most useful when searching for documents.  If any of those entries is misspelled, the document may not be found when someone searches for something specific (that they spell correctly).  That would make document searching unreliable, which would violate the "prime directive".  To eliminate that risk, the entries in those areas are "validated" -- meaning they are only accepted if they exactly match items on special lists.  Many documents depict multiple subjects, multiple locations, or multiple other RRs, so provision has been made to handle that.  Although it is simple to enter things such as multiple subjects, it takes a little extra time.  However, that makes it easy (and even fun) to search for documents, so it's well worth it.
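
    As a rough illustration of the validation described above, here is a small Python sketch that accepts an entry only if it exactly matches an item on a list.  The list contents, field names, and function name are made up for the example; the project itself does this work inside the Excel spreadsheet, not with this code.

```python
# Illustrative only: these lists and names are assumptions, not the project's real lists.
VALID_LISTS = {
    "subjects": {"Station", "Bridge", "Track Map"},
    "locations": {"Middletown", "Cadosia", "Norwich"},
    "other_rrs": {"Erie", "Lehigh Valley"},
}


def find_invalid_entries(record):
    """Return {field: [rejected entries]} for entries not exactly on the lists."""
    problems = {}
    for field, allowed in VALID_LISTS.items():
        # Each field holds a list, since a document may cover several items.
        rejected = [entry for entry in record.get(field, []) if entry not in allowed]
        if rejected:
            problems[field] = rejected
    return problems


record = {
    "subjects": ["Station", "Brige"],  # "Brige" is a deliberate misspelling
    "locations": ["Middletown"],
    "other_rrs": ["Erie"],
}
print(find_invalid_entries(record))  # {'subjects': ['Brige']}
```

    Because the match is exact, a misspelling is rejected at entry time instead of quietly hiding the document from later searches.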

    As we finish scanning various boxes of documents, we add them to the list of scans that’s published on this website (the link to the list is below).  The list has already gotten so long that we can’t show the whole list at once, and the only way to find documents within the list is to use the search and filter tools that we built into it.  The list page contains a link at the top that explains how to use those tools.  When you find a particular document on the list, the title and description tell you a lot about it.  To get an idea of what the document looks like, you can also view a small preview image of it.

    We've set up a workflow for two people to work efficiently -- one to enter the information, and the other to handle the original documents (scan them, check the results on the Océ controller, rescan if needed, name the scan files, and move the documents to the completed area).  The two people will need to share information about each file, so we set up a PC right next to the Océ controller for entering data into the spreadsheet.  We connected that PC, the Océ system, and the main archives PC (in the office/sales room) together via a computer network so data and files can be moved among them, and one PC can be used to back up files on another. 

    Periodically, such as at the end of each day, the work will move to the PC in the office/sales room where the scan files are accumulated.  Any scan files that need it will be rotated for proper viewing, and they and the files not needing change will be added to the folder with the other final scan files.  Using a special procedure, a list of the files in that folder will be generated and copied onto another page of the Excel spreadsheet, which compares the list with the information that was entered during the scanning.  The spreadsheet will flag any missing file description entries, any missing files, and any file names that don't match the names entered.  This will go a long way to satisfy the "sub-prime directive" to keep things straight.  Finding problems early means they can be fixed while the information is still fresh and the number of problems is small.
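
    The comparison described above can be sketched in a few lines of Python.  This is only an illustration of the idea, not the project's actual procedure (which runs in the Excel spreadsheet); the function names, the file names, and the ".tif" extension are all assumptions made for the example.

```python
from pathlib import Path


def list_scan_files(folder):
    """Return the names of the files in the final scan folder."""
    return sorted(p.name for p in Path(folder).iterdir() if p.is_file())


def reconcile(entered_names, folder_names):
    """Compare file names typed during scanning with the files actually present."""
    entered = set(entered_names)
    present = set(folder_names)
    return {
        # A file exists but nobody entered a description for it.
        "files_without_entry": sorted(present - entered),
        # A description was entered but no file carries that name.
        "entries_without_file": sorted(entered - present),
    }


# Made-up file names for illustration; they do not follow the project's naming scheme.
entered = ["doc-0001.tif", "doc-0002.tif", "doc-0003.tif"]
in_folder = ["doc-0001.tif", "doc-0002.tif", "doc-003.tif"]
print(reconcile(entered, in_folder))
# {'files_without_entry': ['doc-003.tif'], 'entries_without_file': ['doc-0003.tif']}
```

    Note that a mistyped file name shows up on both lists at once (a file with no entry and an entry with no file), which is how a name mismatch can be told apart from a file or entry that is simply missing.  In practice, `list_scan_files` would supply the second argument from the real folder.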

We need your help. 

    The actual process of scanning takes a few minutes per document, as does the entry of the data about the document.  If two people work together, doing the two tasks simultaneously, the job gets done in half the time.  With 15,000 large documents, that means saving months of time. It is easy, straightforward work (and you even get to learn a lot about the O&W in the process).  The equipment is easy to use, and there aren't too many variations to learn.  If you can type at all, and can operate a mouse, you can help.  Please call me, Jeff Otto, at (845) 343-2467 or email me at jeffotto@aol.com so we can coordinate days and times.  Hopefully, we'll get multiple volunteers so the project can be worked fairly steadily without any one person spending too much time.  I can be there pretty much any day and time to work with you, and we can work as few or as many hours as desired.  If we all pitch in, we can complete this project (well, the large document part, to start) in a matter of months.  That will allow the Society and all its members to realize the project's many great benefits.

 The list can be viewed and searched (and even copies ordered) HERE.