Friday, July 31, 2015

Normalizing raw text in different formats to create objects in Ruby

I have three text files with the exact same type of information but with different delimiters. One is a CSV, one uses spaces as delimiters, and the last one uses | (pipe) as the delimiter. The delimiters are different, but each row in all of the files has exactly the same format. So in the pipe-delimited file, the format is FirstName | LastName | DOB | City | State | ZIP (there is a space before and after each pipe). The other two files use the exact same order but with the other delimiters. All rows are unique. The files do not have headers.

I want to go through all of these files and create an instance of my Person object for each row. The class looks like this:

class Person
  attr_reader :first_name, :last_name, :d_o_b, :city, :state, :zip

  def initialize(first_name, last_name, ...)
    @first_name = first_name
    @last_name = last_name
    ...
  end

  ...

  etc.

end

I want to parse this data and create the objects in the cleanest and most readable way -- performance/scaling/etc. are unimportant here. What approach would be best for doing this? My initial idea is to convert all of the files to CSV somehow (perhaps with a gsub), then make a nested array from this data, and then iterate over the array to create the objects, but I am looking for any possible better/cleaner ideas.
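One way to sketch the idea above is to skip the intermediate CSV conversion and choose the split pattern per row, since the field order is identical in all three files. This is only a sketch under several assumptions: the constructor arguments elided in the question follow the attr_reader order, no field contains a pipe, a comma, or (in the space-delimited file) an embedded space, and the CSV has no quoted fields (Ruby's standard CSV library would be the safer choice if it does). The helper names split_row and people_from are hypothetical.

```ruby
# Assumed full version of the Person class from the question; the elided
# constructor arguments are taken to match the attr_reader order.
class Person
  attr_reader :first_name, :last_name, :d_o_b, :city, :state, :zip

  def initialize(first_name, last_name, d_o_b, city, state, zip)
    @first_name = first_name
    @last_name = last_name
    @d_o_b = d_o_b
    @city = city
    @state = state
    @zip = zip
  end
end

# Hypothetical helper: splits one row into its fields, choosing the
# delimiter from the row's own content (pipe first, then comma, then spaces).
def split_row(line)
  line = line.strip
  if line.include?("|")
    line.split(/\s*\|\s*/)
  elsif line.include?(",")
    line.split(/\s*,\s*/)
  else
    line.split(/\s+/)
  end
end

# Hypothetical helper: builds one Person per non-empty line of text.
def people_from(text)
  text.each_line
      .map(&:strip)
      .reject(&:empty?)
      .map { |line| Person.new(*split_row(line)) }
end
```

With this in place, each file could be loaded with something like people_from(File.read(path)) and the three resulting arrays concatenated. Person.new raises an ArgumentError when a row does not split into exactly six fields, which surfaces malformed rows early instead of silently creating incomplete objects.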
