Wednesday 6 March 2013

Regular expression in Python

What is a regular expression, and how do you use it in Python?
  • A regular expression is a pattern that describes a set of strings sharing a certain format.
  • Regular expressions combined with Linux commands can make life easy.
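
As a small illustration of the first point, the sketch below uses a pattern that describes every string made of four digits, a dash, and two digits. The pattern and the sample strings are made up for this example and have nothing to do with the ebook script.

```python
import re

# Hypothetical pattern: four digits, a dash, two digits (a year-month stamp)
pattern = re.compile(r'^\d{4}-\d{2}$')

samples = ['2013-03', '2013-3', 'March 2013']
# Keep only the strings that belong to the set the pattern describes
matches = [s for s in samples if pattern.search(s)]
print(matches)
```

Only the first sample has exactly the required shape, so it is the only one kept.
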
I have many Amazon ebooks whose files are named in a regular way but are not sorted yet. Two plans initially came to mind:
  1. Browse books by author name.
  2. Browse books by category.
Below, I show how to sort the books by author name. In the next post, I will show you how to use the Google Ajax web search API to classify them by category.

#!/usr/bin/python
# Filename: SortLibrary.py
# 2013/03/06
# Version: 1.0

"""
   A script to automatically sort books into folders by author name.

   Usage: SortLibrary.py sorting_directory_here
"""

import sys
import os
import re
import shutil

def sortLib(directory):
    '''
    Sort the books by author name, using a Python regular expression.
    '''
    for filename in os.listdir(directory):
        # Capture the author: the text between the first ' - ' and '.mobi'
        match = re.search(r' - (.+)\.mobi$', filename)
        if match:
            # One folder per author, with spaces replaced by underscores
            path = os.path.join(directory, match.group(1).replace(' ', '_'))
            if not os.path.exists(path):
                os.makedirs(path)

            # shutil.move bypasses the shell, so spaces or quotes in names are safe
            shutil.move(os.path.join(directory, filename), path)
            
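def main():
    # Print the usage text from the module docstring if no directory is given
    if len(sys.argv) != 2:
        sys.exit(__doc__)
    sortLib(sys.argv[1])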
    

if __name__ == '__main__':
    main()
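
To see what the regular expression captures without touching any files, here is a minimal sketch that applies a pattern like the one in sortLib to a few filenames. The helper name author_folder and the sample filenames are hypothetical and only serve as illustration.

```python
import re

# Pattern like the one in sortLib: text between the first ' - ' and '.mobi'.
# '$' anchors the extension to the end of the filename.
AUTHOR_PATTERN = re.compile(r' - (.+)\.mobi$')

def author_folder(filename):
    """Return the author folder name for a book file, or None if it does not match."""
    match = AUTHOR_PATTERN.search(filename)
    if match is None:
        return None
    # Same normalisation as the script: spaces become underscores
    return match.group(1).replace(' ', '_')

# Hypothetical filenames, for illustration only
print(author_folder('Some Title - Jane Doe.mobi'))
print(author_folder('notes.txt'))
```

The first call gives Jane_Doe and the second gives None. Note that re.search stops at the leftmost ' - ', so a title that itself contains ' - ' would end up as part of the folder name.
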


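Outcome: every matching .mobi file is moved into a folder named after its author, with spaces in the name replaced by underscores.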


