Importing Albums into SCDdata

Introduction

Here is a quick and dirty guide (well, not so quick anymore but still quite dirty) on how to import album information into the SCDdata database. (»Albums«, these days, are usually CDs, but this guide also applies to other types of media such as LPs or MCs if anyone has any of these that aren't in the database yet.)

An »album«, in SCDdata, is a collection of »recordings« together with some additional information such as the type of medium it is available on, or the artist/band who recorded the album (which is usually the same as that of the recordings that make up the album, except for »sampler« albums). SCDdata can also store a scanned copy of the album cover. A »recording« specifies a named unbroken piece of sound (such as the music suitable for a certain dance) that usually (but not always) belongs to a musical genre such as »strathspey« or »jig« and can be subdivided into chunks of ever so many bars, repeated however often (such as an »8 by 32-bar strathspey«). Recordings can be declared suitable for specific dances, and they also usually consist of renderings of certain tunes, so SCDdata recordings need to be connected to dances and tunes. A recording usually contains four different tunes but the number can go up to eight or even more.

The problem with entering album information is that the usual WWW forms aren't exactly the most efficient type of interface. To add a single album we need to add a dozen or more recordings, each of which contains any four (or so) of 10,000-odd tunes in the database (or even previously-unrecorded tunes, which need to be added to the database as well). Many tunes have named composers, and a new tune may have a previously-unseen composer who must be added, too. Similarly, a recording may refer to any of more than 13,000 dances or even a previously-unseen dance. The approach we're taking instead is to prepare a text file (using any text editor such as Emacs or Notepad) that contains a description of the album contents. This text file can be uploaded to the database, using a web browser, and the database then returns a result page stating any inconsistencies or items (such as tunes, composers or dances) that need to be added. The text file can then be refined until all the issues have been fixed and its contents are ready to be committed to the database.

The remainder of this document explains how to prepare and upload a text file for a CD. The CD in question will be the recently-issued RSCDS CD for RSCDS book 11.

Starting an album

To prepare a file with album import data, first launch a text editor of your choice with a new empty text file. (The name of the text file doesn't really matter.) The first lines of the text file specify some data pertaining to the album as a whole, like so:

ALBUM
NAME Music for 12 Scottish Country Dances (Book 11)
SHORTNAME Book 11
PUBLISHER RSCDS
YEAR 2008
ARTIST Renton, John and his Scottish Dance Band
MEDIUM CD RSCDS_CD064 AVAILABLE

The ALBUM keyword means that this file contains album data (in the future we will import, e.g., data on dance books in a similar fashion). NAME gives the name of the album as it appears on the CD booklet, while SHORTNAME is an abbreviated version of the name (which might be used by player programs). The PUBLISHER is the entity who arranged for the CD to be brought out – the database doesn't actually use this yet but it makes sense to collect the data even so. The YEAR is the year (AD) the album was published.

The ARTIST is the individual or band who recorded the album as a whole – for sampler-type albums, the correct artist is »Various Artists«. The import process tries to locate whatever we enter here as a »person« in the database, so it makes sense to check the database first to see how it refers to the person or band.

The MEDIUM line tells the database in which form the album was published, what its number is in the publisher's catalogue, and whether the album is currently available commercially (i.e., new from a purveyor of records, rather than used from eBay). The first word after MEDIUM is (at this point) either »CD«, »LP«, or »MC«, the second (the catalog number) is fairly arbitrary except that it isn't supposed to contain spaces (if in doubt, use an underscore), and the third is either AVAILABLE or UNAVAILABLE. For an album that is available on different types of media at the same time, use several MEDIUM lines.
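The rules for a MEDIUM line are simple enough to check before uploading. The following Python sketch is not part of the importer (whose code isn't shown in this guide); the function name `check_medium_line` is made up for illustration, and the checks are only the ones stated above.

```python
# Hypothetical pre-upload check; the real importer may behave differently.
VALID_MEDIA = {"CD", "LP", "MC"}
VALID_STATUS = {"AVAILABLE", "UNAVAILABLE"}

def check_medium_line(line):
    """Return a list of problems found in one MEDIUM line (empty if none)."""
    words = line.split()
    if not words or words[0] != "MEDIUM":
        return ["line does not start with MEDIUM"]
    # A catalogue number containing a space shows up here as an extra word.
    if len(words) != 4:
        return ["expected: MEDIUM <type> <catalogue-number> <status>"]
    _, medium, _catalogue, status = words
    problems = []
    if medium not in VALID_MEDIA:
        problems.append(f"unknown medium type {medium!r}")
    if status not in VALID_STATUS:
        problems.append(f"status must be AVAILABLE or UNAVAILABLE, got {status!r}")
    return problems

print(check_medium_line("MEDIUM CD RSCDS_CD064 AVAILABLE"))
print(check_medium_line("MEDIUM CD RSCDS CD064 AVAILABLE"))
```

The first call reports no problems; the second trips over the space in the catalogue number, which is exactly the case where an underscore should be used instead.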

Adding Tracks

The next thing to do is add the various tracks that make up the album.

TRACK
NAME Knit the Pocky
TYPE Reel 32 8
TIME 5:02
TUNE Knit the Pocky [Bremner, Robert]
TUNE The Fife Hunt [Gow, William]
TUNE Perth Assembly [Duncan, Samson]
TUNE Loch Rynach [T]

The first four lines are fairly intuitive – TRACK, like ALBUM above, specifies that data for a track follows; TYPE gives the type of recording, the number of bars, and the number of repetitions (we will go into some of the subtleties later), and TIME the running time of the track in minutes and seconds, separated by a colon. (In this case a running time of »5:02« for an 8x32 reel should raise a warning signal because the time should really be on the order of »4:35«. Overruns such as these have several likely causes: Live CDs often append a generous helping of frenetic applause to every recording, which throws the usual heuristics off, but this CD isn't live. The timing given in the booklet may just plain be wrong, in which case it makes sense to look at the CD using a computer to find what the CD actually says the timing is (it is usually more correct than the booklet). Or the CD may be an RSCDS production, which as we all know are notorious for their very slow tempos, so this is the most likely explanation.)
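To see why »5:02« looks slow, it helps to work out the time per bar. The Python sketch below does the arithmetic for the simple three-word form of TYPE shown above. It is an illustration only: the function names are invented, and it ignores anything that isn't dance music (introductory chord, applause), so the figures are rough.

```python
def parse_time(text):
    """Convert 'M:SS' into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)

def seconds_per_bar(type_line, time_line):
    """Rough seconds per bar from one TYPE line and one TIME line."""
    # TYPE <kind> <bars> <repetitions>, e.g. "TYPE Reel 32 8"
    _, _kind, bars, reps = type_line.split()
    # TIME <minutes>:<seconds>, e.g. "TIME 5:02"
    _, running_time = time_line.split()
    return parse_time(running_time) / (int(bars) * int(reps))

actual = seconds_per_bar("TYPE Reel 32 8", "TIME 5:02")    # 302 s over 256 bars
expected = seconds_per_bar("TYPE Reel 32 8", "TIME 4:35")  # 275 s over 256 bars
print(f"{actual:.2f} s/bar vs. {expected:.2f} s/bar")
```

This prints »1.18 s/bar vs. 1.07 s/bar«, i.e. the track runs roughly a tenth slower than the 4:35 one would expect for an 8x32 reel.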

The TUNE lines are much more interesting. They list the tunes involved in the recording in the order the CD booklet gives them (which may or may not be the order in which they are actually played), with the composer appended in brackets. If a tune is »traditional«, put »[Traditional]« or »[Trad]« or even »[T]«. If a tune is by the same composer as the previous tune on that track (including »traditional«), you can simply put »[]« for convenience.
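The composer shorthand can be spelled out as a small Python sketch. This is not the importer's code; `expand_composers` is a made-up name, and two details are my assumptions: that the three »traditional« spellings are folded into the single label »Traditional«, and that »[]« on the first tune of a track is treated as an error.

```python
TRADITIONAL = {"Traditional", "Trad", "T"}

def expand_composers(bracketed):
    """Expand the '[...]' composer fields of one track's tunes, in order."""
    result = []
    previous = None
    for field in bracketed:
        name = field.strip("[]").strip()
        if name == "":
            # "[]" means: same composer as the previous tune on this track.
            if previous is None:
                raise ValueError("'[]' used on the first tune of a track")
            name = previous
        elif name in TRADITIONAL:
            name = "Traditional"  # assumed canonical label
        result.append(name)
        previous = name
    return result

print(expand_composers(["[Gow, William]", "[]", "[T]", "[]"]))
```

Here the second tune inherits »Gow, William« and the fourth inherits »Traditional« from the third.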

It turns out that many CD producers are really sloppy about composer names and will say »Traditional« when they really mean »This is a tune that I don't think I should have to pay copyright royalties for, and I can't be bothered to check who actually wrote it, if we know that at all«, which from the point of musicology is clearly sub-optimal (especially since CD producers are very often mistaken about such things). So if a CD says a tune is »traditional« and the database already contains a record for a tune of the same name with a named composer, there are several possible explanations:

  • The database is right and the CD producer is sloppy. This is the most likely explanation.

  • The recording on the CD refers to a different tune than the one that is already in the database. The proper way of treating this would be to check if the tune in the database has been recorded on another album that you happen to have around, and to compare the two. This of course is a lot of work, and usually the titles of tunes are idiosyncratic enough that the first explanation suffices.

  • The tune is claimed by several composers and it isn't clear who's right. Especially in the old times, many composers – even big names such as Niel Gow – weren't above pinching good tunes from their colleagues and publishing them under their own name, and this practice can carry over even to this day and age. For example, the official tune for the dance The Ladies of Dunse (from RSCDS book 26) is »The Duchess of Buccleuch«, which the RSCDS book insists is a Niel Gow tune, but in fact the tune is by William Marshall.

In cases of doubt it is usually best to assume that the database is right unless you know that it isn't (chances are that the entry in the database has had some research poured into it already). If you can't decide what to do then by all means post a query on the dancedata-friends mailing list; we may have an idea or else will escalate the issue to people who are likely to have a good answer.

Sometimes a recording will use a tune under a name that differs from the name it is usually known by. For example, the tune for The White Heather Jig is »The Six-Twenty Two Step« by Jimmy Shand (named after the BBC »White Heather Club« TV show, which used to air at 6:20pm) but this occasionally occurs on CDs as »The White Heather Jig«. The correct way of entering this is:

TUNE The White Heather Jig [Shand, Jimmy Senior] {Six-Twenty Two Step}

(Don't worry if you think you're not enough of an SCD music anorak to be able to spot these things. We can always correct them after the fact.)

The same mechanism can be used to fix up variant titles (also known as »aliases«) as in

TUNE Mrs Dogsbody's Reel [Trad] {The Honorable Mrs Eulalia Dogsbody's Favourite Reel}

The idea is that the database will correctly reflect what the CD booklet says while also making all the interesting connections that would otherwise go unnoticed. This is a bit of a judgement call – for example, I wouldn't use the alias mechanism just because the CD booklet adds or drops a »The« or an »A« (our users are presumably smart enough to figure that out for themselves), but the Mrs Dogsbody case would in my opinion be worth doing right. Again, if there is doubt chances are that the database is correct, especially if the tune occurs in several other places – but as I said before, we can always switch things around later.
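For the curious, here is how a TUNE line with an optional alias could be taken apart. This is a Python sketch under my own assumptions about the syntax (title, then composer in brackets, then optionally the database title in braces); the importer's actual parser isn't shown in this guide and may be more lenient or more strict. The names `TUNE_RE` and `parse_tune_line` are invented.

```python
import re

# TUNE <title as printed> [<composer>] {<title in the database>}
# The part in braces is optional.
TUNE_RE = re.compile(
    r"^TUNE\s+(?P<title>.+?)\s*"
    r"\[(?P<composer>[^\]]*)\]\s*"
    r"(?:\{(?P<alias>[^}]*)\})?\s*$"
)

def parse_tune_line(line):
    """Split one TUNE line into printed title, composer and lookup title."""
    match = TUNE_RE.match(line)
    if match is None:
        raise ValueError(f"not a TUNE line: {line!r}")
    title = match.group("title")
    alias = match.group("alias")
    return {
        "printed_title": title,
        "composer": match.group("composer").strip(),
        # Without braces, the printed title is also the one to look up.
        "lookup_title": alias.strip() if alias is not None else title,
    }

line = "TUNE The White Heather Jig [Shand, Jimmy Senior] {Six-Twenty Two Step}"
print(parse_tune_line(line))
```

For the line above, the printed title stays »The White Heather Jig« while the title to look up becomes »Six-Twenty Two Step«, which is the whole point of the braces.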

We will have to repeat what we did for »Knit the Pocky« for the eleven other tracks on the CD, and in the interest of brevity this is omitted here. You can find the complete file for reference here.

Uploading the File

Once the file for the album in question is complete, the next step is to upload the file to the database and see what happens. Chances are that the database will find a few things to complain about, but that's not a problem – we will look at this in due course.

[Screenshot: Import Form]
Go to http://my.strathspey.org/dd/import/. This will present you with a form similar to the one in the picture shown here. (If you aren't logged in as a my.strathspey user, the system will make you take a quick detour to the login page first – we don't take updates from just anybody. In fact, you will have to be authorised to perform imports; send mail to Anselm if you aren't but would like to help.)

Enter the name of the file (on the disk in your computer) into the »File name« field or use the »Browse …« button (called »Durchsuchen …« in the screen shot since my Firefox is German). Leave all the other choices (»Object Type« and »Action«) as they are and select »Submit«. It will take a little while to upload the file, and then the system will ponder your input and figure out how to react. Then it will display the import form again, followed by an annotated copy of the file you submitted, as in the following picture.

[Screenshot: Output (first helping)]
Generally, annotations with a green background are informational – they repeat what the database thought your input meant, and are supposed to serve as insurance that the database understood you correctly. Annotations with a yellow background point out things that may be suspect, while annotations with a red background indicate graver problems. What we shall have to do next is to go through all the yellow (and red, if any) annotations and clear up all the doubts.

Sleuthing

[Screenshot: Output (second helping)]
The first two yellow boxes show up in the »Knit the Pocky« track we entered above. The issue with »Knit the Pocky« is that the composer given in our file is »Bremner, Robert« while the one associated with the tune in the database is »Traditional«. Generally it makes sense to be more specific; however in this particular case the tune is credited in Book 11 to »Bremner's Collection 1761«. Robert Bremner, who lived from the early 1700s (the actual year isn't known) until 1789, was noted more as a music seller than as a composer, so in the absence of more evidence to the contrary we should assume the database has it right and put »Traditional« as the composer in our album file for the next upload.

The next issue concerns the first name of the composer, »Duncan, S«, which our CD liner notes give as »Samson«. Generally it is great to be able to add more detail to the database, so the way to fix this is to change the name in that composer's record rather than the Book 11 album file. However, it would be good to find independent corroboration. The resource most worth checking for this (unless you have a really good music library) is Andrew Kuntz's Fiddler's Companion, and for the tune, »The Perth Assembly«, this mentions that the tune was indeed

Composed by Samson Duncan (1767‑1837), born at Kinclaven. He was an excellent fiddler and played with some of the most famous fiddlers and bands of the time‑‑Niel, Nathaniel and John Gow.

If you wanted to be really diligent, you could check the database entry for »Duncan, S« to find whether there are any other tunes listed there – there might be a danger that there are more S. Duncans about and all their tunes were lumped together under one entry. There is indeed another tune in the database by the name of »Garey Cottage«, which Kuntz also credits to »S. Duncan« – the hyper-correct way of fixing this would be to add a »Duncan, Samson« entry to the database and assign »Perth Assembly« to it in the absence of evidence that »Garey Cottage« was also written by Samson Duncan (the Gows' buddy) – but it would probably be OK to just change the »S« to »Samson« until more information about »Garey Cottage« comes to light.

Here's a quick summary of what we have done so far:

  • »Bremner, Robert« as the composer of »Knit the Pocky« is probably not quite right, so we replaced it by »Trad« in the album file.

  • »Duncan, Samson« is, in fact, the composer of »Perth Assembly«, so we changed the database to reflect this observation in either of the two ways outlined above.

On to the next issue, which we find in the next track, »Monymusk« …

[Screenshot: Output (second helping)]
The first problem here is easily cleared up by referring to Book 11, which in a footnote to the music page for »Monymusk« says: »Composed by Daniel Dow and called by him "Sir Archibald Grant of Monemusk's Reel"«. So the reference to »The Reids O' Monymusk« is a red herring. The Dow tune is in DanceData as »Sir Archibald Grant of Monymusk« (with a Y), and the correct way to enter this is using

TUNE Monymusk [Dow, Daniel] {Sir Archibald Grant of Monymusk}

i.e., by giving the »database title« of the tune in braces.

The next issue (a red one, no less) concerns the tune, »Mrs Muir McKenzie«, which doesn't appear to be in the database at all. A good bet in this case is to search for tunes with names containing things like »Muir« to see if there are any close mismatches, and in fact it seems that the lady (or any of her close relations) was quite popular with composers in her time – the database lists three tunes which might be contenders (»Miss Muir MacKenzie« – a traditional tune –, and »Mrs Muir McKenzie's Delight« and »Mrs Muir McKenzie's Favourite«, both written by Charles Sharpe). It is not completely unlikely that the Wm. Gow tune on the Book 11 recording is actually the same as the »traditional« tune called »Miss Muir MacKenzie« (Miss, Mrs – who cares? It would probably be Ms Muir MacKenzie today, anyway), so to be 100% sure one would have to dig out one's copy of Bobby Brown's album, »Ready…And!« and listen to »The Bonnie Breist Knots« in comparison to »Monymusk«. On the other hand, we want to be done in time for dinner, so for the moment it is OK to note this as a possible issue and add the tune as a new tune (and eventually the database will provide a way to document this). Actually our importer can do the adding for us if we replace

TUNE Mrs Muir McKenzie [Gow, William]

by

TUNE+ Mrs Muir McKenzie [Gow, William]

(note the »+« after TUNE).
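Since a »TUNE+« line creates a new tune in the database, it is worth listing all of them in the album file before committing anything, to make sure each one really is new. A small Python sketch (my own helper, not a feature of the importer; the name `tunes_marked_for_adding` is invented):

```python
def tunes_marked_for_adding(lines):
    """Return the printed titles of all TUNE+ lines in an album file."""
    titles = []
    for line in lines:
        keyword, _, rest = line.strip().partition(" ")
        if keyword == "TUNE+":
            # Keep only the title: drop "[composer]" and anything after it.
            titles.append(rest.split("[", 1)[0].strip())
    return titles

album_lines = [
    "TUNE Monymusk [Dow, Daniel] {Sir Archibald Grant of Monymusk}",
    "TUNE+ Mrs Muir McKenzie [Gow, William]",
]
print(tunes_marked_for_adding(album_lines))
```

For the two lines above this prints a list containing only »Mrs Muir McKenzie«.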

The next tune, »The Auld Brig o' Ayr«, is identified by the database as an alias of »The Miller o' Dervil«, and according to the Fiddler's Companion this makes actual sense (Phew!). The yellow here comes from the fact that the other tune actually has a named composer, so it would probably be best to put

TUNE The Auld Brig o' Ayr [Trad] {Miller o' Dervil, The}

Finally, another red one – »Lady Hamilton Dalrymple« is a fairly popular tune and indeed it is in the database already as »Lady Hamilton Dalrymple's Strathspey« (as a quick search for »Dalrymple« shows), so

TUNE Lady Hamilton Dalrymple [Trad] {Lady Hamilton Dalrymple's Strathspey}

ought to clear this up.

Here is a quick run-down of the other issues our trial run has unearthed (a copy of the complete log is here):

  • The composer of the tune, »Johnny McGill«, is in the database as »MacGill, John« rather than »McGill, John« (according to the CD booklet). The MacGill spelling goes back to the RSCDS book, and the software is too stupid to figure out that »Mc« is the same as »Mac«. This should be fixed in the album file.

  • »Hon. Miss Jessie Ruthven's Favourite« isn't in the database (check for »Ruthven«), so we'll have the importer add it.

  • »Miss Williamson of Oldfield's Jigg« is credited to »unknown« in the database and »traditional« in the CD booklet. This doesn't make a lot of difference in practice (it is one of those cases where CD producers put »traditional« as an easy cop-out – read up on the tune, The Recruiting Officer, in the Fiddler's Companion for the sordid story), so I'd change the album file to read

    TUNE Miss Williamson of Oldfield's Jigg [(unknown)]

    (with the parentheses).

  • »Mrs. Lieut. Morison's Fancy« is another bona-fide addition to the database (it would make sense to check »Morrison« tunes here, too).

  • »Brig of Perth« is another one by Mr. Dow, who apparently couldn't make up his mind whether he wanted to be called Daniel or Donald. This should be fixed in the album file.

  • »Mrs George Stewart's Strathspey« is credited to Charles Grant in the CD info, where the database has only »(traditional)«. I'd go with the CD info here – the Fiddler's Companion doesn't appear to have heard of the tune – and change the database.

  • The composer of »Betty Washington« is James Scott Skinner, but the »Scott« was his middle name, not part of his last name (one never stops learning), so this should be fixed in our album file.

  • »Sleepy Maggie« is another Bremner case.

  • »Thomson's Got A Dirk« is one where you need to get out your rad Gaelic skillz – this tune is often called »Biodag Air Mac Thomais« (which according to The Skye Collection means »Thomas' Son Wears A Dirk«; close enough!) – and superior knowledge of SCD recordings to remember that the same tune occurs on Muriel Johnstone's »Dancing Live« CD as »Boidag Ain McOmish« [sic!]. For the moment we shall put

    TUNE Thomson's Got A Dirk [Trad] {Boidag Ain McOmish}

    and sort it out in the database later.

  • »Dainty Davie« is another Bremner-like case. I'd go with »Trad«.

  • »Lady Caroline Montague« is given to Nathaniel Gow by the Fiddler's Companion, and I'd go with that (and change the database).

  • According to the Fiddler's Companion, the composer of »Mrs Adie«, William Logan (as per the CD info), was indeed an army officer (as per the database), so again the database should be fixed here.

  • »Patrick George Moncrieff Esq.« by Alex Leburn isn't in the database but a tune »Patrick George Moncrieff of Reedie« (composer unknown) is. This should be cleared up by comparing the two recordings (anyone have Muriel Johnstone's »SCD for children« tape around?), but the title is unlikely enough to proceed on the assumption that the two tunes (both being jigs) are the same. I'd credit the tune in the database to Mr Leburn and put it as the actual tune (in braces) in the album file, pending further investigation.

  • »Fiona MacDonald's Jig« is probably a new tune by our esteemed band leader so I'd just add it as new.

  • The same applies to »Sweet Nell of Glencarse« by Angus Fitchet. Note that the annotation makes it convenient to check for other works (with possibly similar titles) by the composer in question.

  • »The Chase«, credited in the CD info to Joshua Campbell, is probably another Bremner case.

  • »Maghie's Matchless« is probably a misspelling of »Magnie's Matchless«, which is in the database already.

  • »Luggin' The Box« by Neil Barron is in the database as »Lugging the Box«. (For our non-English speakers, this means »moving the accordion«.)

  • »Whisky for Breakfast« is probably a new (to the database) tune.

  • »Robin Anderson - Orkney« by Donald Ridley is in the database as »Robin Anderson (Orkney)«. This is a case where I would change the album file.

  • »Belladrum House« is another (Scott) Skinner surname confusion.

  • »Lady Louisa Gordon's Strathspey« is credited to Robert Mackintosh on the CD, and this is, in fact, correct – even though the database claims it for William Marshall (who wrote a tune called »Lady Louisa Gordon's Reel«, as well as many other tunes for the Gordon family, who he happened to work for as a factor).

  • »Invercauld's Reel« is in the database as »Invercauld«; I'd go with that pending future resolution.

  • »The Banks o' Spey« is in the database as »The Banks of Spey«.

  • »Mrs Rait's Strathspey« appears to be another unseen tune.

  • »Gertie Gibb« is another (Scott) Skinner.

  • The two »Fellowes« tunes in »Rakes of Glasgow« are presumably more new material by John Renton himself.
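Many of the items above are near misses in spelling (»Mc« versus »Mac«, »Maghie's« versus »Magnie's«, »Luggin'« versus »Lugging«). If you keep a list of titles or names copied from the database, Python's standard difflib module can point these out before you upload. This is a sketch of something the importer does not do, as noted above; the function names, the sample titles and the 0.8 cutoff are my own choices.

```python
import difflib
import re

def normalise(name):
    """Fold case and treat 'Mc' at the start of a word like 'Mac'."""
    folded = name.casefold()
    return re.sub(r"\bmc(?=\w)", "mac", folded)

def close_titles(title, known_titles, cutoff=0.8):
    """Return known titles that look like spelling variants of `title`."""
    by_key = {normalise(known): known for known in known_titles}
    keys = difflib.get_close_matches(
        normalise(title), list(by_key), n=5, cutoff=cutoff
    )
    return [by_key[key] for key in keys]

# Sample titles taken from the list above, not a real database dump.
known = ["Magnie's Matchless", "Lugging the Box", "The Banks of Spey"]
print(close_titles("Maghie's Matchless", known))
print(close_titles("Luggin' The Box", known))
print(normalise("McGill, John") == normalise("MacGill, John"))
```

A hit only tells you where to look. Whether to change the album file, use an alias in braces, or fix the database remains a judgement call, as in the cases above.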

If you are still reading this you are probably driven only by morbid curiosity as to what other chores will be inflicted upon you. Congratulations on your persistence. It is an unfortunate fact that our database does seem to require a fair amount of detective work, but the good news is that the more we work now to improve it, the less work is going to be necessary in the future! The main tedium appears to be with the tunes so don't worry if you are not enough of a music nerd to really dive into this – there are people around who are, and they will eventually get around to picking up the pieces.

Once we have made all the changes outlined above – either to the album file or else the database – we can re-upload the album file to see what happens next.

The Second Trial Run

[Screenshot: Result Overview]
The next trial run will hopefully result in output that contains only green annotations. After the annotations, there is a new section summarising the items that will be added to the database. This is mostly useful when debugging the importer, but may be worth checking over (see the table specifications) even so. If you are happy with the output, the next step is to actually commit the new additions to the database.

Committing the Data

Do this by going back to the top and selecting »Commit data to the database« from the »Action« pull-down menu, then resubmitting the album file.

If this goes through (the output looks substantially like that of the previous step), the data will have been added to the database. Feel free to browse the database to see whether the additions show up in all the right places.

Here's a copy of the final file used for the successful import.
