Saturday, April 21, 2012

Movimentum - Analysis

I am in a hurry. So I do not want to write an all-encompassing, every-feature-included animation solution. Actually, I'd like to write a program with as little code as possible. Therefore, most of the work should be done by other programs. Here is my idea of a simple setup:

First, I need the parts of the machine that will be animated. They are simple GIFs or BMPs that can come from at least two sources:
  1. Output of CAD program
  2. Dressed-up cut-out of some picture.
Here are examples of each of these:
  • Drawing from a CAD program:
  • In the following picture, I am not yet done with cutting out a lever from a scanned photo and patching missing parts—but you get the idea:


When I have the parts, they need to be moved around according to my ideas, and the results are written into separate image files. Before that can be done, I have to create some description of the intended movements. And last, but not least, in a third step these separate images must be assembled into a complete animation. The following diagram shows these activities:


Now the question is: Which activities should be supported by which programs? There are many possibilities, e.g. using a program that does all the above (Synfig has been suggested for this); or using a program that does only the central "column" (define movements, create animation images, and assemble them into an animation). I opt for a radical decomposition: A separate program for each activity. This increases the reuse of existing programs, and minimizes work. For three of the activities, the programs are obvious:
  • Create images with CAD—use a CAD program;
  • Cut parts from images—use a photo editor;
  • Assemble animation—use a program like Windows Movie Maker or ImageMagick.
The program for creating the movement model depends on the interface to the central animation activity. What could that interface look like? Here are some obvious alternatives (that should be in the toolbox of every software architect):
  1. Writing some sort of code in a standard programming language.
  2. Writing text—i.e., we describe the model in some new modeling language.
  3. Writing serializable, text-based structures—nowadays, this mostly means "XML."
  4. Writing serializable binary structures.
  5. Writing database-based structures.
Which one to choose? Essentially, the question hinges on whether the interface should be of a form that can be read and written by humans, or not. If not, then we have immediately decided that we need a separate program to capture the model and write the "bytes" to the interface. This is legitimate, but that program costs up-front money. So I discard—in my context—4. and 5. This leaves us with 1., 2., and 3.
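To make alternative 1 concrete, here is a sketch of what a movement model captured through an API in a standard programming language might look like. Everything in it (the class names, the fluent chaining, the frame/offset parameters) is a hypothetical illustration invented for this post, not an actual Movimentum interface:

```python
# Hypothetical fluent API for describing a movement model (alternative 1).
# All names and parameters are invented for illustration.

class Part:
    def __init__(self, name, image):
        self.name = name
        self.image = image
        self.moves = []  # list of (start_frame, end_frame, dx, dy)

    def move(self, start, end, dx, dy):
        """Translate the part by (dx, dy) between two frames."""
        self.moves.append((start, end, dx, dy))
        return self  # returning self enables fluent chaining

class Model:
    def __init__(self):
        self.parts = {}

    def part(self, name, image):
        p = Part(name, image)
        self.parts[name] = p
        return p

model = Model()
model.part("lever", "lever.bmp").move(0, 25, 100, 0).move(25, 50, 0, -40)
```

Note how every design decision here (fluent chaining vs. statements, how frames are grouped, how offsets are passed) is dictated by the host language rather than by the problem.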

My experience tells me to go with alternative 2 as often as possible.
  • "But isn't that the hardest one? After all, for both 1. and 3., the parsers and the converters to internal models are ready-made, whereas 2. requires writing a new parser."
This reasoning is seriously flawed. Creating that parser is easy with the right tools (I use ANTLR, but there's GoldParser and others). Our main concern right now is to find a description that "fits the problem." And for this, I do not want too many restrictions.
  • XML is obviously a horrible language for humans—its "readability" is so bad that it can only be used for debugging purposes, or for very simple interfaces (people who do not see this are IMHO already "brainwashed" by "XML for everything" proponents).
  • With alternative 1., one gets distracted by the idiosyncrasies of the concrete programming language, and various design concepts (e.g. a fluent API design vs. a statement-level API). The limitations of an API modelling approach sometimes even lead people to design "string based APIs", i.e., functions with string parameters that are parsed to define the model. But this is obviously a poor man's version of alternative 2, with the disadvantages of 1 and 2 combined.
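To back up the claim that a small parser is not much work: the sketch below parses a few lines of an invented movement mini-language. This is not Movimentum's actual syntax, and a real project would likely use a generated parser (ANTLR etc.) instead of hand-rolled regular expressions; the point is only the small scale of the task:

```python
import re

# An invented movement mini-language (NOT Movimentum's actual syntax),
# parsed with a few lines of hand-rolled code. A generated parser
# would replace this; the point is how little work is needed.

SOURCE = """
# the lever swings up during the first two seconds
part lever from lever.bmp
move lever 0 25 by 100 0
move lever 25 50 by 0 -40
"""

PART = re.compile(r"part\s+(\w+)\s+from\s+(\S+)")
MOVE = re.compile(r"move\s+(\w+)\s+(\d+)\s+(\d+)\s+by\s+(-?\d+)\s+(-?\d+)")

def parse(text):
    parts, moves = {}, []
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.split("#", 1)[0].strip()  # comments cost nothing
        if not line:
            continue
        if m := PART.fullmatch(line):
            parts[m.group(1)] = m.group(2)
        elif m := MOVE.fullmatch(line):
            name, start, end, dx, dy = m.groups()
            moves.append((name, int(start), int(end), int(dx), int(dy)))
        else:
            raise SyntaxError(f"line {lineno}: cannot parse {line!r}")
    return parts, moves

parts, moves = parse(SOURCE)
```

Note also that the comment in SOURCE comes for free, which is one of the advantages of a text language discussed below.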
Alternative 2 is not without problems:
  • Language design needs considerable freedom of thinking; one must not fall into the trap of including existing language concepts, like "there must be an if-statement in every language."
  • Text-based languages are one- or at most two-dimensional—this can lead to awkward formulations of complex relationships (think of the contortions needed to describe a general graph in XML).
On the other hand,
  • thinking about and in a new language is the most liberating way to attack a problem (there is an old mathematical saying: the correct notation is half of the solution);
  • languages allow easy insertion of comments, so there is no high pressure to make everything easily usable in the first place;
  • and finally, any standard text editor can be used to write text in the new language—with modern editors, there's even easy highlighting and other such features included.
So the tool infrastructure follows the activity model above 1:1:


Before we accept this as a good solution, let's think a little more about the activities. If we are honest, the process above is far too idealized: when creating the various inputs, I'll certainly make mistakes, such as placing pictures in the wrong spot or defining unintended movements that are too fast or too slow. Moreover (maybe I should have mentioned this first), I want to work incrementally: get the objects for the initial part of the animation, have them move around, then add more objects and more movement, maybe add text at some place, etc. Therefore, the actual process is much more iterative; there is a flow of information from the various outputs back to the inputs:


This means that I need to use all these tools at the same time (which is, of course, the argument for "big integrated software solutions"). But that is not a problem per se; it just requires a little caution in the design of the tools (so that, e.g., a tool does not lock a file for its exclusive use). Then all work will flow freely.
