Legacy Records Import

Objective

To facilitate a back-end upload of legacy data into Learnexa. A simple interactive task will upload legacy data such as User, Course and Enrolment records into Learnexa. Uploaded records will automatically be activated or qualified upon creation.

Assumptions

Purpose

A back-end task to upload legacy User/Course/Enrolment records into Learnexa. The legacy CSV data file is used as the source list for the import process. Uploaded records are validated against the existing business rules and activated automatically. Learnexa will not send the usual activation email notification to the user.

The status of each imported row is recorded in an output file for reference. The output status will be Completed or Failed.

Input File Format

Each input file must be a valid CSV file with the column headers as its first row.
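As a rough illustration of these two requirements, the sketch below uses Ruby's standard csv library to check that the text parses as CSV and that the first row supplies the expected column headers. The helper name check_input and the header list passed to it are assumptions made for this example; the real header list depends on the import category.

```ruby
require 'csv'

# Hypothetical helper: returns a list of problems found in the CSV text.
# `expected_headers` is an assumed list; the real one depends on the category.
def check_input(csv_text, expected_headers)
  table   = CSV.parse(csv_text, headers: true)   # first row is read as headers
  missing = expected_headers - table.headers
  missing.empty? ? [] : ["Missing column headers: #{missing.join(', ')}"]
rescue CSV::MalformedCSVError => e
  ["Not a valid CSV file: #{e.message}"]
end
```

An empty list means the file passed both checks; anything else can be reported before the import starts.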

User fields:

Valid literals for Member Status are Member or NonMember.

Course fields:

Valid literals for Type are Course, Certification, LiveEvent, InPersonEvent.

Enrolment fields:

Valid literals for Type are Course, Certification, LiveEvent, InPersonEvent.
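The literal checks listed above could be sketched as a small lookup table. The column names Member Status and Type come from this document; the helper name literal_errors and the wording of the error message are assumptions for illustration only.

```ruby
# Allowed literals per column, taken from the lists above.
ALLOWED_LITERALS = {
  'Member Status' => %w[Member NonMember],
  'Type'          => %w[Course Certification LiveEvent InPersonEvent]
}.freeze

# Hypothetical helper: returns error messages for one row (a Hash of
# column name => value). Columns absent from the row are not checked.
def literal_errors(row)
  ALLOWED_LITERALS.each_with_object([]) do |(column, allowed), errors|
    next unless row.key?(column)
    next if allowed.include?(row[column])
    errors << "#{column} should be #{allowed.join(' or ')} only."
  end
end
```

Keeping the literals in one table means the User, Course and Enrolment checks share a single code path.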

Output File Format

The output file is also a CSV file and indicates the status of each row. Each row shows all fields exactly as they appear in the input file, plus two additional fields: Output Status and Error Description.

Valid literals for Output Status are Completed or Failed. Sensitive data such as Password is omitted from the output file.

Sample User output file:

Email Address,First Name,Last Name,Member Status,Output Status,Error Description
tom@mu.com,Tom,Brady,Member,Completed,
,Ram,Kumar,NonMember,Failed,Email address is missing
tom@mu.com,Tom,Hawk,Member,Failed,Email address has already been taken
ganesh@mu.com,Ganesh,M,NonMember,Completed,
,Amit,Raj,Nmember,Failed,Email address is missing. Members status should be Member or NonMember only.

Like the User output file, the Course and Enrolment output files carry the two additional fields, Output Status and Error Description.
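A minimal sketch of how one output row might be assembled, assuming each input row is available as a Hash and that the sensitive column is literally named Password. The helper names output_row and to_output_csv are hypothetical and not part of the design.

```ruby
require 'csv'

SENSITIVE_FIELDS = ['Password'].freeze   # assumed column name

# Hypothetical helper: builds one output row from an input row (Hash)
# and the list of error messages collected for that row.
def output_row(input_row, errors)
  kept = input_row.reject { |column, _| SENSITIVE_FIELDS.include?(column) }
  kept.merge(
    'Output Status'     => errors.empty? ? 'Completed' : 'Failed',
    'Error Description' => errors.empty? ? nil : errors.join(' ')
  )
end

# Hypothetical helper: renders output rows (Hashes sharing the same keys)
# as CSV text, with the column headers as the first row.
def to_output_csv(rows)
  return '' if rows.empty?
  CSV.generate do |csv|
    csv << rows.first.keys
    rows.each { |row| csv << row.values }
  end
end
```

The input row is left untouched; the sensitive column is dropped only from the copy that goes to the output file.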

Processing

Common:

User:

Pseudocode

A factory is implemented with a superclass that specifies all standard, generic behavior and delegates the creation details to subclasses supplied by the client.

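require 'csv'
require 'csvlint'

# A factory implementation
class Uploader

  def initialize(source, cid)
    @cid    = cid
    @source = source
    # Output CSV that records the status of every processed row
    @output = CSV.open("output.csv", "w")
  end

  # Validates the source CSV. Takes the validation schema as an argument; defaults to the User schema.
  # Returns true when the file is valid, so that callers can guard the upload on it.
  def validate(schema = 'User')
    @csv = Csvlint::Validator.new(File.new(@source), nil, schema)

    # invoke the validation
    @csv.validate

    unless @csv.valid?
      # array of errors, each is a Csvlint::ErrorMessage object
      @csv.errors.each { |err| @output << [err.row, 'Failed', err.type] }
      @output.close
    end

    @csv.valid?
  end

  # Loads every data row; a failure in one row does not stop the rest
  def upload
    CSV.foreach(@source, headers: true) do |row|
      begin
        load(row)
      rescue StandardError => err
        output(row, 'Failed', err.message)
      end
    end
  ensure
    @output.close
  end

  # Appends the row to the output file with its status and error description
  def output(row, status = 'Completed', error = nil)
    @output << (row.fields + [status, error])
  end
end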

The client is totally decoupled from the implementation details of derived classes. Polymorphic creation is now possible.

# Named UserLoader so that it does not clash with the User model class
module UserLoader
  def load(row)
    user = User.new
    user.firstname  = row['firstname']
    user.lastname   = row['lastname']
    user.email      = row['email']
    user.status     = User::ACTIVE
    user.action     = User::ACTIONS[:activated]
    user.company_id = @cid
    user.skip_making_activation_code = true

    # To avoid all post user creation actions. set_status is assumed to be
    # registered as an after_create callback on the User model.
    User.skip_callback(:create, :after, :set_status)
    begin
      user.save!   # raises if the record is invalid
      output(row)
    ensure
      User.set_callback(:create, :after, :set_status)
    end
  end
end

Factory Method enforces that encapsulation and allows an object to be requested without inextricable coupling to the act of creation.

module CourseLoader
  def validate
    super 'Course'
  end

  def load(row)
    crse = Course.new
    # Course field mapping is not specified here
    crse.save!
    output(row)
  end
end
module EnrolmentLoader
  def validate
    super 'Enrolment'
  end

  def load(row)
    enrl = Enrolment.new
    # Enrolment field mapping is not specified here
    enrl.save!
    output(row)
  end
end

An interactive Rake task to direct the execution flow.

namespace :legacy_data do

  desc "Legacy data import"
  task :import => :environment do

    def crash_exit
      puts "Invalid entry"
      exit 1
    end

    # Stub: looks up the client name for the given ID
    def get_client(cid)
      "Madras University"
    end

    # STDIN.gets is used because Kernel#gets would try to read from the
    # files named in ARGV (here, the task name)
    puts "Choose import category:"
    puts "1. User"
    puts "2. Course"
    puts "3. Enrolment"
    category = STDIN.gets.chomp

    puts "Enter CSV file source:"
    src_file = STDIN.gets.chomp

    while true do
      puts "Enter Client ID:"
      cid = STDIN.gets.chomp

      puts "Confirm Client: " + get_client(cid) + " (y/n)"
      confirm = STDIN.gets.chomp
      break if confirm.downcase == 'y'
    end

    # Polymorphic creation
    import = case category
      when '1'
        Uploader.new(src_file, cid).extend(UserLoader)
      when '2'
        Uploader.new(src_file, cid).extend(CourseLoader)
      when '3'
        Uploader.new(src_file, cid).extend(EnrolmentLoader)
      else
        crash_exit
    end

    import.upload if import.validate
  end
end
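The extend-based dispatch used by the task can be seen in isolation in the sketch below. It runs without Rails or csvlint, and every class and module name in it (DemoUploader, DemoUserLoader, DemoCourseLoader) is a stand-in invented for this illustration rather than part of the design.

```ruby
# Stand-in for Uploader: the generic flow lives here, `load` comes from a module.
class DemoUploader
  attr_reader :log

  def initialize(rows)
    @rows = rows
    @log  = []
  end

  def validate(schema = 'User')
    @log << "validated against #{schema} schema"
    true
  end

  def upload
    @rows.each { |row| load(row) }
  end
end

module DemoUserLoader
  def load(row)
    @log << "user #{row}"
  end
end

module DemoCourseLoader
  def validate
    super('Course')   # reaches DemoUploader#validate with another schema
  end

  def load(row)
    @log << "course #{row}"
  end
end

# extend mixes the module into this one object only
importer = DemoUploader.new(%w[a b]).extend(DemoCourseLoader)
importer.upload if importer.validate
puts importer.log
```

Because extend affects a single instance, the same base class can serve every category, and the caller only needs to choose which module to mix in.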

Reference:
CSV Lint: a Ruby gem that validates the syntax and contents of CSV files. The library supports validating data against a schema. A schema configuration can be provided as a Hash or parsed from JSON. https://github.com/theodi/csvlint.rb

Entity Relationship

The following entities are used as part of the implementation:

class Company < ActiveRecord::Base
  has_many :users, :dependent => :destroy
end
class User < ActiveRecord::Base
  belongs_to :company
end