Archive for the ‘guardian’ Tag
Using OCR to help the Guardian convert PDFs to text
What do you do if someone shares a PDF with you and the PDF contains scanned images of text? How do you get at that text if you want to copy and paste it, search it, or even edit it? In short, how do you liberate it?
The Guardian Newspaper’s unsearchable PDFs …
Last Friday (22nd February 2013) the Guardian newspaper found itself in just such an unenviable position after the BBC released a slew of PDF files (relating to an independent review of the BBC’s handling of the Jimmy Savile scandal) containing scanned, unsearchable text. Not exactly the most helpful format for journalists looking to make use of the files in a hurry!
Optical Character Recognition & Zamzar to the rescue
Fortunately, Zamzar was able to step in and help: we used specialist OCR (Optical Character Recognition) technology to analyse the 30+ PDF files and produce readable, searchable text.
You can read more about our assistance in a story by the Guardian’s technology editor Charles Arthur: “BBC Pollard inquiry: why is it so hard to search the documents?”
OCR technology has a reputation for being costly and difficult to use, so we are pleased to say that we’re currently working hard to make it available on the main Zamzar site so that we can help to liberate more documents! Do let us know if this might be of interest to you.
Zamzar named in Guardian’s “top 100 sites” for 2009
The Guardian has just announced their list of the Top 100 websites for 2009 and we’re honoured that Zamzar has been recognised in the “Create/collaborate” section.
Zamzar is one of only two sites in that section that remain from the Guardian’s 2008 list (the other being NetVibes).
We’ve worked hard this year to improve the site with dozens of new conversion types (including CAD, Visio and MS Project) and the ability to convert documents into speech.
We’re excited about what 2010 has in store, but we wouldn’t be anywhere without you.
So a big thanks to all of you who use the service, tell your friends and colleagues about it, and keep us busy with suggestions on how to improve it.
Zamzar in the Guardian
Ever wanted to find out more about the guys behind Zamzar? Well, you can now read an interview with the Zamzar founders in the Guardian online, where they talk about the highs and lows of running your favourite online file converter.
Thanks,
The Zamzar Team.