Working with documents (you found on the internet)

In the connected world, with a multitude of options for consuming and publishing information, we’ve become accustomed to reusing content produced by others to create or supplement our own documents. This content may originate in Word documents, PDFs or web pages, and we may want to publish or distribute our output in any of those formats. Most of us approach these tasks in a haphazard fashion, but there is a better way.

In the beginning was Word*

Most of us use Microsoft Word all the time to create and edit documents. It’s bloated and not ideal for many specialist applications, but it’s the standard. If you don’t use Word, good luck; you must know what you’re doing. (We’ll be covering Google Docs later.)

But if Word is the only application you use for creating and editing documents, then you are seriously ill-equipped and likely to be wasting copious amounts of valuable time. To be efficient and productive you need to invest in a few more tools.

*Apologies to older readers who know life began with WordPerfect (or maybe even with WordStar).

Essential tools for your arsenal

Adobe Acrobat

Adobe are responsible for the ubiquitous PDF (essentially an electronic print format). You can view PDF documents with the free Acrobat Reader, but you can’t do anything to them. You can print to PDF as a free option from Word and many other applications, but you can’t do more. There are also many free and low cost applications and online tools that will enable you to edit PDFs, but the only full-featured PDF editor you should consider is Adobe’s Acrobat. It will cost you several hundred pounds, but that will be repaid in spades over the years you use it.

With Adobe Acrobat you can actually edit PDFs: extract, insert and delete pages, save pages as images, edit content (after a fashion), protect documents and much more.

A good text editor

For all the times when you want to clear up the mess in your Word document, or to see what you’re doing without all that formatting in the way, or to seriously hack a bunch of documents, you need a good (plain) text editor.

So you use Notepad?! There are many far better text editors, but I would recommend UltraEdit, favoured by programmers for good reason. It has a host of features, some used only by geeks, but many that you’ll soon find invaluable. And even if you’re using it at a fairly basic level, it’s far superior to Notepad. And all for the cost of a good meal out.

The features I use all the time are its powerful Find and Replace and Find in Files, and its Macros, Compare Files and Sort functions.

A web page editor

Whatever you do, don’t use Word’s Save As Web Page; use a proper XHTML (web page) editor, which will give you clean and consistent code under the bonnet. Unless you’re a web designer, a simple one will do just fine: you should be able to get Author access to the editor built into your firm’s content management system (eg WordPress).

How to do things properly

Convert a PDF to Word

There are numerous free PDF to Word converters. But, for the reasons given above, use Adobe Acrobat and Save as Word instead. This does a remarkably good job, considering it is looking at the format of a page and inferring the structure from that. It will do a good job on straightforward documents, but don’t expect miracles for complex documents like court forms which have boxed text etc – it will produce a document that looks the part but is not the same as a fillable form.

Use Word like a pro

You are a pro, so learn to produce documents like a pro. In particular, don’t hit return twice for a new para and don’t use explicit formatting for headings: learn to use Word styles.

Use proper curly quotes and proper dashes. Word’s autocorrect will do these for you as you type, but you may need to fix portions pasted in from elsewhere. To fix quote marks, use Find and Replace to find a straight quote and replace it with a straight quote; provided the smart quotes option is switched on, Word will convert each one to the correct curly quote.
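If you have the text in your text editor rather than in Word, the same tidy-up can be roughly sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name smarten is made up for the example, a quote counts as an opening quote when it starts the text or follows a space or an opening bracket, and a spaced hyphen is taken to mean a spaced en dash.

```python
import re

def smarten(text: str) -> str:
    """Replace straight quotes and spaced hyphens with typographic ones."""
    # A quote at the start of the text, or after whitespace or an opening
    # bracket, is treated as an opening quote.
    text = re.sub(r'(?<![^\s(\[])"', "\u201c", text)
    # Any double quote still left must be a closing quote.
    text = text.replace('"', "\u201d")
    # The same rule for single quotes; what is left covers closing quotes
    # and apostrophes inside words such as "it's".
    text = re.sub(r"(?<![^\s(\[])'", "\u2018", text)
    text = text.replace("'", "\u2019")
    # A hyphen with a space on each side becomes a spaced en dash.
    text = text.replace(" - ", " \u2013 ")
    return text

print(smarten("\"It's fine,\" she said - 'really'."))
```

A rule this simple will get some cases wrong, notably apostrophes at the start of a word (as in ’tis), so check the result by eye before pasting it back.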

Use Find and Replace intelligently to fix inconsistencies, including finding multiple spaces, para breaks (^p) and tabs (^t) where appropriate.
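For text you have copied out to a text editor, the equivalent clean-up can be sketched in Python. This is a minimal example, not a recipe: the function name tidy_text and the particular rules (tabs become spaces, repeated spaces collapse to one, repeated blank lines collapse to a single blank line) are assumptions chosen for illustration, so adjust them to what your document actually needs.

```python
import re

def tidy_text(text: str) -> str:
    """Apply a few common clean-ups to plain text copied out of a document."""
    # Tabs become single spaces (skip this if your tabs are meaningful).
    text = text.replace("\t", " ")
    # Runs of two or more spaces collapse to one.
    text = re.sub(r" {2,}", " ", text)
    # Spaces left stranded at the end of a line are removed.
    text = re.sub(r" +\n", "\n", text)
    # Three or more line breaks in a row collapse to a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()

sample = "Dear  client,\t thank you.  \n\n\n\nYours   faithfully"
print(tidy_text(sample))
```

The order matters: tabs are converted first so that a tab next to a space is caught by the multiple-space rule that follows.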

Fix things with your text editor

You’ll tear your hair out sometimes trying to fix some things in Word. It’s usually far quicker to cut and paste the relevant portion to your text editor, fix it, copy and paste it back and reapply styles. Really.

Create web pages

Web page editors will have both a visual and a text mode. In visual mode you can drop formatted text straight in from a Word doc or other web pages. This will strip out most extraneous code, but if you need to fix things, switch to text mode and tinker around.

Word Heading and Bullet styles will map directly to the corresponding HTML elements, so get the Word doc right first and you’re sorted.
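To see what that mapping amounts to, here is a small Python sketch. It is an illustration only, not what any particular editor does internally: the function to_html and the table STYLE_TO_TAG are hypothetical, and the input is assumed to be a simple list of (style name, text) pairs using Word’s built-in style names.

```python
from html import escape

# Assumed mapping from Word's built-in style names to HTML tags.
STYLE_TO_TAG = {
    "Heading 1": "h1",
    "Heading 2": "h2",
    "Heading 3": "h3",
    "Normal": "p",
    "List Bullet": "li",
}

def to_html(paragraphs):
    """Turn (style, text) pairs into HTML, wrapping bullet runs in <ul>."""
    out = []
    in_list = False
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style, "p")  # unknown styles fall back to <p>
        if tag == "li" and not in_list:
            out.append("<ul>")
            in_list = True
        elif tag != "li" and in_list:
            out.append("</ul>")
            in_list = False
        out.append(f"<{tag}>{escape(text)}</{tag}>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)

doc = [
    ("Heading 1", "Fees"),
    ("Normal", "Costs & charges"),
    ("List Bullet", "Fixed"),
    ("List Bullet", "Hourly"),
]
print(to_html(doc))
```

Text formatted by hand (bold and a bigger font instead of a Heading style) would arrive here as "Normal" and come out as an ordinary paragraph, which is why getting the styles right in Word pays off.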

And finally

Just because you “found it on the internet”, doesn’t mean you are free to reuse it however you wish. In fact, almost certainly you are not. (You know that, you’re a legal bod.) Here are my guidelines:

  • always check provenance and act accordingly;
  • if a direct quote is not appropriate, don’t be lazy; make the words your own by rewriting;
  • always acknowledge and attribute significant sources (preferably with a linked title); and
  • be especially careful with images; you will be bitten.

Nick Holmes is Editor of the Newsletter. Twitter: @nickholmes.