Sunday, 24 January 2010

OCD and filenames, as pertaining to TV series

I like all my file names to follow the same pattern, and this applies to both my mp3s and my TV series. For mp3s I use the wonderful foobar2k to tag and rename files, but for series I ended up coding my own formatter. All of this assumes a Linux system (like my home server, flatline).
First of all, to extract the video files from the rars:

find -name "*rar" -type f -exec rar e {} \;
This extracts every rar found under the current directory (subdirectories included) into the working directory.
If the files already come uncompressed, but are in subdirectories:
find -name "*mkv" -type f -exec mv {} . \;
Replace mkv with avi if that's what you've got.
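If you would rather do this flattening step from Python, here is a minimal sketch of the same idea. The function name flatten and its extensions parameter are my own inventions for illustration. Unlike the mv one-liner, it skips a file instead of overwriting when the name is already taken in the target directory.

```python
import shutil
from pathlib import Path


def flatten(root, extensions=(".mkv", ".avi")):
    """Move video files found in subdirectories of root into root itself."""
    root = Path(root)
    moved = []
    # sorted() materialises the listing before anything is moved.
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        if path.parent == root:
            continue  # already at the top level
        target = root / path.name
        if target.exists():
            print("skipping %s: %s already exists" % (path, target))
            continue
        shutil.move(str(path), str(target))
        moved.append(target.name)
    return moved
```

Calling flatten(".") from the download directory should have roughly the same effect as the find command above, for both extensions at once.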
Now, sometimes the files come with different naming schemes. Some use [season][episode], others [season]x[episode], and I'd like them all to be in the same [title].S[season]E[episode] format.
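To make the target format concrete, here is a small sketch that turns a single file name into its [title].S[season]E[episode] prefix without touching any files. The patterns are deliberately simplified and the names PATTERNS and episode_tag are made up for this example, so treat it as an illustration of the idea rather than a full renamer.

```python
import re

# One simplified pattern per naming style: S01E02, 1x02 and 102.
PATTERNS = [
    re.compile(r"(?P<title>.+?)[. _-]+[sS](?P<season>\d+)[eE](?P<episode>\d+)"),
    re.compile(r"(?P<title>.+?)[. _-]+(?P<season>\d+)x(?P<episode>\d{2})"),
    re.compile(r"(?P<title>.+?)[. _-]+(?P<season>\d+)(?P<episode>\d{2})\."),
]


def episode_tag(name):
    """Return 'Title.SxxEyy' for a recognised file name, or None."""
    for pattern in PATTERNS:
        m = pattern.match(name)
        if not m:
            continue
        # Capitalise each dot-separated word of the title.
        title = ".".join(w.capitalize() for w in m.group("title").split("."))
        return "%s.S%02dE%02d" % (title, int(m.group("season")), int(m.group("episode")))
    return None


for example in ["lost.s01e02.hdtv.avi", "lost.102.hdtv.avi", "lost.1x02.hdtv.avi"]:
    print(example, "->", episode_tag(example))  # each one maps to Lost.S01E02
```

The order of the patterns matters: the explicit S01E02 style is tried first, so a bare number such as a year in the title does not get mistaken for season and episode when a proper marker is present.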
To do that for a whole directory, all you need is the following Python script:

#!/usr/bin/env python3
import os
import re
import sys

# One pattern per naming style: S01E02, 102 and 1x02.
expressions = [
    r"(?P<title>[a-zA-Z0-9\-.]+)\.[sS ](?P<season>[0-9]+)[eE ](?P<episode>[0-9]+)\.(?P<rest>[a-zA-Z0-9.()\-\[\] ]*)\.avi",
    r"(?P<title>[a-zA-Z0-9\-.]+)\.(?P<season>[0-9]+)(?P<episode>[0-9]{2})\.(?P<rest>[a-zA-Z0-9.()\-\[\] ]*)\.avi",
    r"(?P<title>[a-zA-Z0-9\-.]+)[ _]*[-. ][_ ]*(?P<season>[0-9]+)x(?P<episode>[0-9]{2})[ ]*[-.][ ]*(?P<rest>[a-zA-Z0-9.()\-\[\] ]*)\.avi",
]
# The same patterns again for subtitles and mkv files.
expressions = expressions + [e.replace("avi", "srt") for e in expressions] + [e.replace("avi", "mkv") for e in expressions]
compiledExpressions = [re.compile(e) for e in expressions]

if len(sys.argv) > 1:
    directory = sys.argv[1]
else:
    directory = "lost"
title = None
if len(sys.argv) > 2:
    title = sys.argv[2]
print("using %s as directory" % directory, end="")
if title:
    print(" and %s as title" % title)
else:
    print()

fileFormat = "%(title)s.S%(season).2dE%(episode).2d%(rest)s%(extension)s"
rename = True  # set to False to only print what would be renamed
for file in [f for f in os.listdir(directory) if f.endswith(("avi", "srt", "mkv"))]:
    print(file)
    for exp in compiledExpressions:
        match = exp.match(file)
        if not match:
            continue
        match = match.groupdict()
        # Season and episode become ints so they can be zero-padded.
        for k in ["episode", "season"]:
            try:
                match[k] = int(match[k])
            except ValueError:
                pass
        match["extension"] = os.path.splitext(file)[-1]
        match["title"] = ".".join([a.capitalize() for a in match["title"].split(".")])
        if not match["rest"].startswith(".") and match["rest"] != "":
            match["rest"] = "." + match["rest"]
        if "season" in match and "episode" in match and "title" in match:
            if title is not None:
                match["title"] = title
            newFileName = os.path.join(directory, (fileFormat % match))
            print("Renaming to %s " % newFileName)
            if rename:
                os.rename(os.path.join(directory, file), newFileName)
        # The first pattern that matches wins.
        break

It takes up to two arguments: the directory containing the files (it defaults to "lost"), and an optional title that overrides the one found in the file names. Hope this is helpful for someone else :)
