Working with data

Egyptian spreadsheet?

Back in 1979 Apple published what is regarded as the first “killer app”, the spreadsheet program VisiCalc (a contraction of “visible calculator”), which turned the microcomputer from a hobby for computer enthusiasts into a serious business tool and prompted IBM to launch their PC.

VisiCalc’s mantle was soon wrested from it by the superior Lotus 1-2-3 which reigned supreme on the PC until the advent of Windows when Microsoft’s Excel started to outsell it. Today there can be no organisation which does not rely heavily on Excel or an equivalent spreadsheet application for its planning, budgeting and forecasting, accounting … and many other functions.

What are spreadsheets for?

Silly question! They’re for crunching numbers and presenting financial data, right? If you work with numbers, then, yes, spreadsheets were designed specifically for you: they were initially developed as computerised simulations of paper accounting worksheets. But if you regard spreadsheets as number crunchers only, then you overlook their immense value for organising just about any data.

We all keep lists. Some of these are simple, but most include items that benefit from categorisation, annotation and ordering in various ways.  Whether we recognise it or not, we are daily organising our tasks with tables of data; best to do it efficiently with spreadsheets. Spreadsheets are now used extensively in any context where tabular lists are built, sorted and shared.

By way of example, following is a list of the spreadsheets that I use on a regular basis (in deliberately random order):

  • profit and loss projections*
  • address lists
  • legislation amendment lists
  • publication contents planners
  • email lists
  • subscriber lists
  • production schedules
  • sales reports*
  • personal tax calculations*
  • publication costings*
  • case data

You’ll note that only those marked * are based on financial data.

As a publisher I obviously have some specific needs that are met with some of these, but most of my other needs are generic and the range of applications for which you might find spreadsheets useful will be similar.

Doing stuff with your data

If your spreadsheet is of financial information then probably the spreadsheet is the end product. You plug in the variable information, everything is calculated and laid out in neat rows and columns and you have your desired result. Maybe you need to tweak a few things to produce a pretty print version; or maybe you’ll use a graphing function to produce a nice bar- or pi-chart.

Mail merge and more

For most other applications we use spreadsheets as a convenient data store and need to do something more to get our result.

A very common application, built into word processors, is mail merge which is designed to read data from your address file data source (spreadsheet or other database), select items matching the filters you set, sort them in a particular order and merge them into your word processor templates for mailings, envelopes or labels.

What’s worth noting here is that you can use “mail” merge to merge any data with any template, so, for example, you can produce a staff directory, or record cards for your library, or whatever.

Working with data on the web

But isn’t this talk of mail merge rather dated? Typically we are nowadays concerned more with email rather than with snail mail communications and with transferring data to and from web-based applications.

Most good web applications will accept data directly from Excel or other native spreadsheet format. You can happily copy and paste from your spreadsheets to some web application editors when in wysiwyg mode. You can also upload data directly from your spreadsheets to online databases and download therefrom in spreadsheet format (typically Excel’s xls format). So, for example, you can download Google AdWords reports for further manipulation and analysis.

Data interchange formats

Working directly with spreadsheet data is often not optimal. What are the other file format options?

csv (comma separated values) is a long-established format for data interchange. Spreadsheet applications will save as csv and import from csv or open a csv file directly; but in csv, non-numeric values need to be enclosed in “quote marks” which makes it an awkward format to work on directly if needed.

tsv (tab separated values) has the huge advantage that you can copy and paste directly between a tsv text file and your spreadsheet and is much easier to edit.

You’ll need a good (plain) text editor to work efficiently (for comments and suggestions, see my article “Working with Documents” in the January issue).

A simple worked example – email lists

Let’s say you have an email list as two columns in a spreadsheet in the form Firstname Lastname, Email; you want to change this to the three-column format Email, Firstname, Lastname.

1) Swap the two columns in the spreadsheet.

2) Copy the Firstname Lastname column data to your text editor.

3) Using Find and Replace, change the spaces to tabs (^t).

4) Copy the block of data back to the spreadsheet.


Of course you’ll have to adjust for spaces in names like da Silva etc, but you get the principle.

Nick Holmes is Editor of the Newsletter. Email Twitter @nickholmes.

Image: By Karen Green on Flickr.