Tuesday, October 08, 2013

Naming your computer files for easy referencing


Housekeeping on computers - File naming.

If you produce a lot of documents you need a consistent naming style, one that lets you understand what a file contains without having to open it and that sorts easily should you make multiple editions of a similar file.

One of the environments that I work in produces audio files; some on a weekly basis, some monthly, some yearly.
Once determined, your naming system can apply to most situations and meet both of the conditions above, ie. you know what's IN the file and you can easily see the different versions/editions.


The text that stays the same from file to file should come before the text that varies,
eg. dates or sequence numbers ("1", "2" etc).
If the number of files is large and the sequence numbers go beyond single digits then the earlier numbers need to be "padded" with zeros, eg. "01", "02" ... "10" ... "99"
or "001", "002" ... "999".
This is because "12" would sort before "2" in a list (because the "1" and the "2" are in the same column), but "02" would sort before "12".
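
A minimal Python sketch of the padding effect, assuming a plain character-by-character sort. Some file managers apply a "natural" numeric sort and would already list 2 before 12, but padded names sort correctly either way. The file names here are made up for illustration:

```python
# Hypothetical file names, built only to illustrate the sorting behaviour.
numbers = (1, 2, 10, 12)
unpadded = [f"Promotional AD 2013 {n}.wav" for n in numbers]
padded = [f"Promotional AD 2013 {n:02d}.wav" for n in numbers]

# sorted() compares strings one character at a time, left to right.
unpadded_sorted = sorted(unpadded)
padded_sorted = sorted(padded)

print("\n".join(unpadded_sorted))  # "2" ends up last, after "10" and "12"
print()
print("\n".join(padded_sorted))    # "01", "02", "10", "12" in numeric order
```

The `{n:02d}` format pads to two digits; use `{n:03d}` if the sequence might pass 99.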


For our first example here are two audio files:

Original names:

Promotional AD 1 2013.wav
Promotional AD 2.wav

New names:

Promotional AD 2013 1.wav
Promotional AD 2013 2.wav

A simple rename in this case: add the year to the second filename and put the sequence number at the end of both.

The dates on the files above are simple, this being a once-a-year item, but if months [and days] are required then always use reverse order (year first) for nice chronological sorting in your File Manager (eg. Windows Explorer),
eg. YYYY-MM-DD

So, why do this?
Below is a list of filenames, derived from daily logs. Please look through them and find the most recent file:

01-10-2013.txt
02-10-2013.txt
03-09-2013.txt
03-10-2013.txt
04-09-2013.txt
04-10-2013.txt
06-09-2013.txt
07-09-2013.txt
08-10-2013.txt
09-09-2013.txt
10-09-2013.txt
11-09-2013.txt
12-07-2013.txt
12-09-2013.txt
13-09-2013.txt
14-09-2013.txt
15-09-2013.txt
17-09-2013.txt
18-09-2013.txt
26-09-2013.txt
27-09-2013.txt

Now here are some similar filenames but in the YY-MM-DD format. Find the most recent one:
13-07-12.txt
13-09-03.txt
13-09-04.txt
13-09-05.txt
13-09-06.txt
13-09-07.txt
13-09-09.txt
13-09-10.txt
13-09-11.txt
13-09-12.txt
13-09-13.txt
13-09-14.txt
13-09-15.txt
13-09-16.txt
13-09-17.txt
13-09-18.txt
13-09-19.txt
13-09-20.txt
13-09-24.txt
13-09-25.txt
13-09-26.txt
13-09-27.txt
13-09-28.txt
13-09-29.txt
13-10-01.txt
13-10-02.txt
13-10-03.txt
13-10-04.txt
13-10-06.txt
13-10-08.txt

Even though this list does not write the year in full (YYYY), the chronological order is still readily apparent in a long list like this. But if we had fewer files using the same naming convention it could be harder:
eg.
09-07-12.txt
09-09-03.txt
10-09-04.txt
10-09-05.txt
11-09-06.txt
11-09-07.txt
12-09-09.txt
12-09-10.txt
12-09-11.txt
12-09-12.txt

In the list above which digits are the years and which are the days and which are the months? It's perfectly possible that the first digits are the days meaning this list is sorted first by the value in the days then months then years. You can see, if that were the case, that we have files from 2012 at the top and at the bottom of this 'sorted' list. Not very convenient.
We might deduce that a column doesn't contain months if there were digits above 12 but we also need those digits to be above 14 (as of the writing of this post in 2014) to show that the digits do not represent years. Should we presume the years are the first digits or the last? Even if we presumed they were last, because many people (of the Australian clients I've worked with) DO write DD-MM-YY, how do you *know* for sure? Add to that the different formats other countries use:

For example, Australia and America differ in how they write the date (DDMMYY vs MMDDYY), so there is potential for confusion between an Australian 03/06 (3rd of June) and an American 03/06 (March 6th).
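
To see that ambiguity concretely, here is a small Python sketch that reads the same characters both ways. The year "13" is my own addition so the example parses as a complete date; it is an assumption, not part of the 03/06 example:

```python
from datetime import datetime

text = "03/06/13"  # same characters, two possible meanings (year is assumed)

# Day first, as many Australians would write it.
as_australian = datetime.strptime(text, "%d/%m/%y")
# Month first, as Americans would write it.
as_american = datetime.strptime(text, "%m/%d/%y")

print(as_australian.date().isoformat())  # 2013-06-03
print(as_american.date().isoformat())    # 2013-03-06
```

Both calls succeed without complaint, which is the problem: nothing in the text itself says which reading is right.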

So writing the full year first removes that ambiguity: a four-digit year at the front implies the reverse order style, and that order is only written one way, YYYYMMDD. There is no 'other' version (YYYYDDMM), at least not in my experience; if it is done, it's done very rarely.
Using the reverse date order (YYYYMMDD) the above list would become:

2009-07-12.txt
2009-09-03.txt
2010-09-04.txt
2010-09-05.txt
2011-09-06.txt
2011-09-07.txt
2012-09-09.txt
2012-09-10.txt
2012-09-11.txt
2012-09-12.txt

Adding the hyphens makes the date(s) easier to read than without but it is still relatively easy to discern which digits represent years in the list below:


20090712.txt
20090903.txt
20100904.txt
20100905.txt
20110906.txt
20110907.txt
20120909.txt
20120910.txt
20120911.txt
20120912.txt
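
If you already have a folder of day-first names like the daily logs earlier, the new names can be worked out in a few lines of Python. This is only a sketch: `to_reverse_date` is a hypothetical helper of mine, it assumes every name is exactly DD-MM-YYYY plus an extension, and it only computes the new names rather than renaming anything on disk:

```python
from datetime import datetime
from pathlib import PurePath


def to_reverse_date(name: str) -> str:
    """Turn 'DD-MM-YYYY.ext' into 'YYYY-MM-DD.ext' (hypothetical helper)."""
    path = PurePath(name)
    # strptime raises ValueError if the name is not a valid DD-MM-YYYY date.
    date = datetime.strptime(path.stem, "%d-%m-%Y")
    return date.strftime("%Y-%m-%d") + path.suffix


# A handful of the daily log names from the first list.
old_names = [
    "01-10-2013.txt",
    "03-09-2013.txt",
    "08-10-2013.txt",
    "12-07-2013.txt",
    "27-09-2013.txt",
]

new_names = sorted(to_reverse_date(name) for name in old_names)
print("\n".join(new_names))
print("Most recent:", new_names[-1])
```

Once renamed, the last name in a plain sort is always the most recent file.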


Now, why do we want the 'regularly occurring' text before the date or sequence number?
Let's say you produced some documents and the parts were labelled:
1 history of tech.doc
2 history of tech.doc
3 history of tech.doc
4 history of tech.doc

This naming looks reasonable enough at first glance, and kept on their own in a separate folder they will not be confused but should you place them into the same location as another group of documents, similarly labelled:

1 dawn of technology.doc
2 dawn of technology.doc
3 dawn of technology.doc
4 dawn of technology.doc

the end result, when viewed in a file manager (eg. Windows Explorer) would be:

1 dawn of technology.doc
1 history of tech.doc
2 dawn of technology.doc
2 history of tech.doc
3 dawn of technology.doc
3 history of tech.doc
4 dawn of technology.doc
4 history of tech.doc

Now when you want to drag ONE set of documents away from the other it's not as simple as drawing a selection box around the files of one of those sets.
This list only puts two different 'groups' of files together; imagine this list if a large number of files, from a variety of 'groups' and named this way, were all dropped into the same folder.

On the other hand, if the sequence numbers, or dates, were at the end (a suffix rather than a prefix), the similarly 'titled' files of each group would 'glob' together and the sequence numbers or dates would sort them:


dawn of technology 1.doc
dawn of technology 2.doc
dawn of technology 3.doc
dawn of technology 4.doc
history of tech 1.doc
history of tech 2.doc
history of tech 3.doc
history of tech 4.doc

Also remember that file and folder names can be quite long (around 255 characters on most systems), so unless you have software that reduces the available viewing size of a filename (or somehow squashes it, making it hard to read) you should feel free to write descriptive names; don't limit filenames to one or two words if it creates ambiguity with other files with similar names. If you are restricting the length of the filename for ease of viewing in other software then at least make your folder names more informative.

eg.
in a folder named "dawn of technology" I place files named:

d.o.t-1.doc
d.o.t-2.doc
d.o.t-3.doc
d.o.t-4.doc



There, hopefully that is of some use to someone out there on the webs. Email any questions and if need be I'll edit this post to reflect that which I've [probably] neglected.  =)

D