patternpythonMinor
Universal Spreadsheet reader/writer
Viewed 0 times
writeruniversalreaderspreadsheet
Problem
I have a project that reads spreadsheets as a rectangular list of lists (matrix) transforms the matrix and then writes it to another spreadsheet. I want to be able to read from and write to multiple formats. Consequentially, the program should double as spreadsheet conversion tool.
Here is an example of just copying one sheet to another:
Depending on what type of object
The design I have implemented is an odd factory pattern. It's primarily why I am asking for advice.
Abstract base class
This is pretty basic and would essentially be the interface in a proper factory pattern:
Concrete Class
The only concrete base class I have implemented so far. Note that it reads and writes sub-vectors as rows. I am specifying that all Concrete classes operate that way.
Also note that I am allowing optional arguments, specifying the range to write them to.
```
class ExcelSheet(ISheetIO):
def read_sheet(self, sheet, col_start=0, row_start=0, col_cnt=-1, row_cnt=-1):
# TODO: Raise an error instead. Research which error to raise
assert not sheet.ragged_rows
if row_cnt < 0:
row_cnt += sheet.nrows + 1
if col_cnt < 0:
col_cnt += sheet.ncols + 1
return [sheet.row_values(r, col_start, row_start)
for r in xrange(row_start, row_cnt + 1)]
def write_sheet(self, data, sheet, col_
Here is an example of just copying one sheet to another:
def ConvertSheet(in_sheet, out_sheet)
sheet_io = SheetIO()
data_matrix = sheet_io.read_sheet(in_sheet)
# different functions would operate on data_matrix before writing.
sheet_io.write_sheet(out_sheet, modified_data)Depending on what type of object
in_sheet and out_sheet are determine how SheetIO writes it but as far as the client cares, that's SheetIO's problem.The design I have implemented is an odd factory pattern. It's primarily why I am asking for advice.
Abstract base class
This is pretty basic and would essentially be the interface in a proper factory pattern:
class ISheetIO(object):
""" Abstract base class...
"""
def read_sheet(self):
raise NotImplementedError("read_sheet not implemented")
def write_sheet(self):
raise NotImplementedError("write_sheet not implemented")Concrete Class
The only concrete base class I have implemented so far. Note that it reads and writes sub-vectors as rows. I am specifying that all Concrete classes operate that way.
Also note that I am allowing optional arguments, specifying the range to write them to.
```
class ExcelSheet(ISheetIO):
def read_sheet(self, sheet, col_start=0, row_start=0, col_cnt=-1, row_cnt=-1):
# TODO: Raise an error instead. Research which error to raise
assert not sheet.ragged_rows
if row_cnt < 0:
row_cnt += sheet.nrows + 1
if col_cnt < 0:
col_cnt += sheet.ncols + 1
return [sheet.row_values(r, col_start, row_start)
for r in xrange(row_start, row_cnt + 1)]
def write_sheet(self, data, sheet, col_
Solution
Indeed, this factory pattern is not right. A factory should have the single responsibility of creating instances. You violated that the moment you made it a subclass of
The first step to make this better is to change the factory to be more factory-like:
The methods that remain,
This class should not extend
Why? You would rightfully ask:
it has
It even uses the same kind of parameters (uhm, more or less).
But a crucial (and somewhat subtle) difference is that the methods in the helper class and the methods in
That is, the methods in
In strongly typed languages,
these two interfaces wouldn't be compatible, the code wouldn't compile.
In other words,
the helper and the concrete classes don't follow the same interface.
You could create a dedicated parent class for
That might sound like overkill,
since you only plan to have just one implementation,
it might be worth it,
to make the distinction from
Or you could be lazy and just make it extend
And when you use the helper,
you should add "helper" in the variable name to avoid confusion:
Actually,
since you probably won't ever work with the concrete classes directly,
it would be better to rename the interface of the concrete classes instead.
So in the end, have this class structure:
In the end, the names are quite different from what I used in the beginning.
I hope you see the evolution that lead us here, and hopefully it's not too confusing.
If this is not clear,
let me know, I can restructure your original if it would help.
Code review
In this code:
When I see
These are very easy to answer, but I still have to think of them.
It would be better if I didn't have to think.
It seems to me that this simpler variant would be just as good,
and it won't make me think:
In this code:
It's a minor thing,
but since each
I prefer this writing style:
ISheetIO, which is an entirely different responsibility.The first step to make this better is to change the factory to be more factory-like:
- Rename it to
SheetIOFactory
- Extend
object(stop extendingISheetIO)
- Rename the
_factorymethod tocreate_sheetmethod: this will become the official way to create sheets
The methods that remain,
read_sheet, write_sheet, _orient_data, _transpose belong in another class. Let's call it SheetIOHelper.This class should not extend
ISheetIO.Why? You would rightfully ask:
it has
read_sheet and write_sheet methods.It even uses the same kind of parameters (uhm, more or less).
But a crucial (and somewhat subtle) difference is that the methods in the helper class and the methods in
ExcelSheet have completely different contracts:- The methods in
ExcelSheetare designed to work with Excel sheets
- The methods in
SheetIOHelperare designed to work with whatever kind of sheet you throw at them
That is, the methods in
SheetIOHelper are expected to detect appropriate source / target sheet type, the methods in ExcelSheet are just not designed that way, they work with only a specific type of sheet.In strongly typed languages,
these two interfaces wouldn't be compatible, the code wouldn't compile.
In other words,
the helper and the concrete classes don't follow the same interface.
You could create a dedicated parent class for
SheetIOHelper.That might sound like overkill,
since you only plan to have just one implementation,
it might be worth it,
to make the distinction from
ISheetIO loud and clear.Or you could be lazy and just make it extend
object.And when you use the helper,
you should add "helper" in the variable name to avoid confusion:
def ConvertSheet(in_sheet, out_sheet)
sheet_io_helper = SheetIOHelper()
data_matrix = sheet_io_helper.read_sheet(in_sheet)
# different functions would operate on data_matrix before writing.
sheet_io_helper.write_sheet(out_sheet, modified_data)Actually,
since you probably won't ever work with the concrete classes directly,
it would be better to rename the interface of the concrete classes instead.
So in the end, have this class structure:
SheetIOHelper: interface of the helper, that can read / write any types of sheet
SheetIO: implementation of the helper
ConcreteSheetIO: interface of concrete implementations that can only read / write sheets of typeT
ExcelSheetIO: concrete implementation to read / write Excel sheets
CsvSheetIO: concrete implementation to read / write CSV sheets
SheetIOFactory: implementation of the factory, createsConcreteSheetIOinstances and does nothing else
In the end, the names are quite different from what I used in the beginning.
I hope you see the evolution that lead us here, and hopefully it's not too confusing.
If this is not clear,
let me know, I can restructure your original if it would help.
Code review
In this code:
def read_sheet(self, sheet, col_start=0, row_start=0, col_cnt=-1, row_cnt=-1):
# ...
if row_cnt < 0:
row_cnt += sheet.nrows + 1
if col_cnt < 0:
col_cnt += sheet.ncols + 1When I see
row_cnt += sheet.nrows + 1 I have 2 thoughts:- What is the value of
row_cntbefore the+=?
- Why the +1 in
sheet.nrows + 1?
These are very easy to answer, but I still have to think of them.
It would be better if I didn't have to think.
It seems to me that this simpler variant would be just as good,
and it won't make me think:
if row_cnt < 0:
row_cnt = sheet.nrows
if col_cnt < 0:
col_cnt = sheet.ncolsIn this code:
if sheet_type is xlrd.sheet or sheet_type is xlwt.WorkSheet:
return ExcelSheet()
# elif CSV
# return CSVSheet()
# elif OpenXML
# return OpenXMLSheet()
else:
raise TypeError("Unknown Sheet type or not a sheet")It's a minor thing,
but since each
elif branch returns,I prefer this writing style:
if sheet_type is xlrd.sheet or sheet_type is xlwt.WorkSheet:
return ExcelSheet()
# if CSV
# return CSVSheet()
# if OpenXML
# return OpenXMLSheet()
raise TypeError("Unknown Sheet type or not a sheet")Code Snippets
def ConvertSheet(in_sheet, out_sheet)
sheet_io_helper = SheetIOHelper()
data_matrix = sheet_io_helper.read_sheet(in_sheet)
# different functions would operate on data_matrix before writing.
sheet_io_helper.write_sheet(out_sheet, modified_data)def read_sheet(self, sheet, col_start=0, row_start=0, col_cnt=-1, row_cnt=-1):
# ...
if row_cnt < 0:
row_cnt += sheet.nrows + 1
if col_cnt < 0:
col_cnt += sheet.ncols + 1if row_cnt < 0:
row_cnt = sheet.nrows
if col_cnt < 0:
col_cnt = sheet.ncolsif sheet_type is xlrd.sheet or sheet_type is xlwt.WorkSheet:
return ExcelSheet()
# elif CSV
# return CSVSheet()
# elif OpenXML
# return OpenXMLSheet()
else:
raise TypeError("Unknown Sheet type or not a sheet")if sheet_type is xlrd.sheet or sheet_type is xlwt.WorkSheet:
return ExcelSheet()
# if CSV
# return CSVSheet()
# if OpenXML
# return OpenXMLSheet()
raise TypeError("Unknown Sheet type or not a sheet")Context
StackExchange Code Review Q#63689, answer score: 3
Revisions (0)
No revisions yet.