Attention:

This page compiles several essays and articles written by Phil Calçado between 2007 and 2008. Everything here should be considered unfinished drafts. The content here is incomplete and likely to be out-of-date, the links might be broken, grammar and writing are generally bad, and opinions and recommendations might have changed.

I am publishing this again because Wikipedia and several authors quoted some of this material. I am not actively working on this topic anymore, but if you need help with this content, I might be able to help.

Introduction
Language-Oriented Programming (LOP)
- Language Adaptation
- Layering Domains
Domain-Specific Languages (DSLs)
Domain-Specific Languages Patterns
- Monkey Tag
- Modeling Patterns
Bibliography

Introduction

In 2002 I got interested in the ORM problem. I read every single piece on the topic I could find and tried tons of solutions, from early TopLink and Hibernate builds to Prevayler. During the process, the problem itself stopped being that interesting. The Object-Orientation literature had so many cool topics that I started reading and researching about the whole paradigm, not just that specific bit. Six years later I think I’ve got a good set of information on the topic, or at least sufficient knowledge to develop decent OO software.

During that learning I came across many concepts that were new to me. In 2004 a book by Eric Raymond on how UNIX programs are designed gave me some basic information on Domain-Specific Languages, those little languages that we’ve been using to configure UNIX systems for a long while without even noticing.

I followed the few references I could find on the topic, but I didn’t get the big picture until I read Martin Fowler’s article on Language Workbenches. Since then I’ve been reading a lot about Language-Oriented Programming and Domain-Specific Languages, and trying them out.

These pages are about what I’m learning on the subject. I hope putting my notes and findings online will help people who are wondering how these topics will change the future. Or not. Don’t expect clear and final answers here; texts and pages will keep changing the more I learn about this whole mess.

Language-Oriented Programming (LOP)

I like to define Language-Oriented Programming (LOP) as a way to develop computer programs by creating a new language or modifying an existing one. The new language created during the process will be extremely problem-focused and efficient at solving the given problem.

Using LOP means that you will create a language. When the Hibernate team created HQL they did use LOP, but someone who merely uses Hibernate is not necessarily using that paradigm: you can’t say you’re using LOP just because you are using HQL.

LOP is a paradigm; it is a tool and not a goal in itself. When a developer builds a solution using Object-Orientation (another paradigm), the problem won’t be solved just by creating objects; objects are tools to be used while developing the solution. In the same way, when you do LOP you use the construction of a new language as a tool to solve your problem; the language is not a goal in itself. When an engineer is working on a new programming language, the goal is not solving a business problem but creating a new language, so that engineer is not applying LOP.

Domains and Models

Martin Fowler has a stricter definition of Language-Oriented Programming, tying the term to Domain-Specific Languages:

I use Language Oriented Programming to mean the general style of development which operates about the idea of building software around a set of domain specific languages.

Martin Fowler, Language Workbenches

I think that it depends on what you call a domain. I have seen dozens of examples where LOP was applied at more basic levels of abstraction than modeling business domains, so the statement above looks like telling someone that object-orientation is about modeling the real world, which, although not wrong, is a huge oversimplification.

Let’s take an example from a classic LOP platform: Common Lisp. This language doesn’t provide a built-in loop facility like the for or while constructs we normally find in other languages. The LOOP construct in Common Lisp is a macro, just like any other that could be written by a user or bought from a vendor. The LOOP macro defines its own language, for example:

(loop for x across "domain" collect x)

This would return a list like the one below:

(#\d #\o #\m #\a #\i #\n)

The “for x across [string]” part is not regular Common Lisp syntax; it is a language that is interpreted by the LOOP macro itself. All Lisp dialects are full of examples like this one. This is clearly a form of Language-Oriented Programming, but it is absolutely not related to the Domain Model of the application.

Fowler’s definition could be an unusual opinion from a single author, but many papers and articles treat LOP and DSLs as synonyms. Dealing with lists and iterating through them, which is what LOOP does, is a domain by itself. But since I’m familiar with Fowler’s vocabulary, I think he is using the same definition he uses in his Patterns of Enterprise Application Architecture book: “[domain is] Logic that is the real point of the system”, as a synonym for Business Logic.

When wondering about that I remembered one of my favorite basic object-orientation texts, Fundamentals of Object-Oriented Design in UML by Meilir Page-Jones, and its broader definition of domains. It divides classes among four major domains: Fundamental, Architecture, Business and Application.

The Fundamental Domain is the very basic aspect of a system. Classes at this level deal with concepts like iterating, dates, text strings and numbers. Those classes are generally useful in all kinds of systems since, no matter what your business is about, you’ll need them. I would also include here classes that, although more sophisticated, are still generic, like ORM tools or even very basic frameworks. The important thing about the Fundamental Domain’s components is that they can’t impose anything on how your application is structured.

The Architecture Domain comes right at this structure hole. Classes that deal with this domain will be concerned about the structure of your application. MVC frameworks are a good example.

Ideally, the Business Domain is where the application developer joins the game (we’re not supposed to be creating Fundamental or Architecture classes for each project). This is where the business concepts are modeled into classes, so you’ll find here the Domain Model of a system. The Business Domain can (and ideally will) be used by all applications in the same business developed by that group.

Then the Application Domain defines logic that is exclusive to a given application. As an example, a Product class can be used in different systems, like an electronic store and a logistics application. The concept of Product is the same for all applications, so its behavior and data would be defined in Business Domain components. The behavior of a store (the classes created to support it), however, is relative to that application, not to the whole business, and would be contained in the Application Domain.
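As a rough sketch of that split, here is how it could look in Ruby. The class names (Product, StoreCart, ShipmentPlan) and their attributes are made up for this illustration; they do not come from Page-Jones. The Business Domain class is shared, while each application wraps it with behavior that only makes sense there.

```ruby
# Business Domain: shared by every application in the same business.
class Product
  attr_reader :name, :price_cents, :weight_kg

  def initialize(name, price_cents, weight_kg)
    @name = name
    @price_cents = price_cents
    @weight_kg = weight_kg
  end
end

# Application Domain (electronic store): only meaningful inside the store.
class StoreCart
  def initialize
    @items = []
  end

  def add(product)
    @items << product
    self
  end

  def total_cents
    @items.sum(&:price_cents)
  end
end

# Application Domain (logistics): a different application, same Product.
class ShipmentPlan
  def initialize(products)
    @products = products
  end

  def total_weight_kg
    @products.sum(&:weight_kg)
  end
end

book = Product.new("Book", 2500, 0.5)
lamp = Product.new("Lamp", 4000, 2.0)

cart_total = StoreCart.new.add(book).add(lamp).total_cents
shipment_weight = ShipmentPlan.new([book, lamp]).total_weight_kg
```

Neither StoreCart nor ShipmentPlan would be of any use to the other application, but both depend on the same Product.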

So LOOP and the like are part of the Fundamental Domain, but Fowler’s Domain-Specific Languages concept would sit somewhere between the Business and Application domains.

Why LOP?

One of the main things about Page-Jones’ domains is that they draw a very clear line on software reusability. The Fundamental Domain is designed to be reused by others, while the Application Domain is probably useful only for a single scenario.

Language-Oriented Programming is not a new paradigm (maybe the name is new, but the concept is very old). I think it’s gaining momentum right now because we’re no longer applying it only in the Fundamental Domain. Experienced programmers in flexible languages like Lisp or Smalltalk are used to creating mini-languages in their daily work. UNIX administrators and users are used to the dozens of mini-languages provided by the environment. The point is that current improvements in programming environments make it easy to apply DSLs even in mainstream languages like Java or Ruby.

For years mainstream development used LOP only at the Fundamental Domain level, defining new query or markup languages for example, but right now we’re seeing a movement to bring LOP to the more valuable layers: the Business and even Application levels.

And that’s why big players and scientists are so interested in this technology. We are trying to use LOP to reduce the complexity of developing new systems, just like WYSIWYG editors such as Microsoft Excel helped reduce the complexity of creating new spreadsheets.

Language Adaptation

Every programming language has its way of performing tasks, its style. For example most Java developers follow the language’s style of creating classes to represent the relevant concepts and entities in a system and perform tasks by passing messages (calling methods).

The Java style works fine with almost all objects you could need to implement a system, from business-world concepts to JVM concepts like ClassLoaders. For example, you could deal with Strings by using only basic object-oriented syntax, like the snippet below:

final String WHITESPACE = new String(new char[] { ' ' });
String firstName = new String(new char[] { 'J', 'o', 'h', 'n' });
String secondName = new String(new char[] { 'D', 'o', 'e' });
StringBuffer buffer = new StringBuffer(firstName.length() + secondName.length() + WHITESPACE.length());
buffer.append(firstName);
buffer.append(WHITESPACE);
buffer.append(secondName);
String fullName= buffer.toString();

Strings are objects in Java and are fully supported by the language’s object-oriented syntax. Nevertheless, like almost all mainstream languages, Java has syntactic sugar to deal with Strings. String is the one class that gets operator overloading: the + operator can only add two numbers or concatenate Strings, and you can’t overload it in any other way. So the code above could be rewritten as this:

String firstName = "John";
String secondName = "Doe";
String fullName = firstName + " " + secondName;

If Java has a fairly effective syntax that could handle Strings just like it does any other object, why would such an exception to the main rule be needed? Because, just like everywhere else, there’s no one-size-fits-all here. It is hard to think of a real-world Java application that doesn’t have lots of Strings all over. Generally we have to perform lots of concatenations with Strings, and our programs are full of text constants. To make these operations easier and more pleasant for the developer, the language defines special syntax to deal with those objects.

The same thing can be done in almost all domains. As said above, we would generally follow the language style while developing software; in Java we will use objects, get/set syntax, java.util.Iterator, interfaces, etc. Sometimes, however, it is better to break this style to handle some concept.

When to Break Language Style?

A common system will deal with lots of concepts. Hibernate’s Criteria API, for example, uses method chaining:

List cats = sess.createCriteria(Cat.class)
    .add( Restrictions.like("name", "Fritz%") )
    .add( Restrictions.between("weight", minWeight, maxWeight) )
    .list();

This makes writing those query objects much more pleasant than the standard Java style, but it is not committed to the definition of a logical flow; it is much more syntactic sugar that helps productivity. A Criteria Fluent Interface could look like:

List cats = list(Cat.class).with("name", like("Fritz%") ).and().with("weight", between(minWeight, maxWeight));

This is a lot further from what is considered good object-oriented code, but it is clearer. A Fluent Interface uses that kind of chaining to make the intent of the code clearer. Eric Evans’ Time & Money library is a classic example of this style:

//(...)So as you read the tests, instead of seeing
Calendar calendar = Calendar.getInstance();
calendar.setTimeZone(TimeZone.getTimeZone("Universal"));
calendar.set(Calendar.DATE, 5);
calendar.set(Calendar.MONTH, 3 - 1);
calendar.set(Calendar.YEAR, 2004);
Date march5_2004 = calendar.getTime();
//(...)you will see
TimePoint march5_2004 = TimePoint.atMidnightGMT(2004, 03, 5);
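The chaining idea behind a Fluent Interface can be sketched in a few lines of Ruby. CatQuery and its methods are hypothetical names invented for this illustration; this is not Hibernate’s or Time & Money’s API. Each method records something and returns the receiver, so the calls read as one sentence.

```ruby
# Hypothetical query builder, used only to illustrate method chaining.
class CatQuery
  attr_reader :conditions

  def initialize
    @conditions = []
  end

  # Each method records a condition and returns self so calls can be chained.
  def with_name_like(pattern)
    @conditions << [:name, :like, pattern]
    self
  end

  def with_weight_between(min, max)
    @conditions << [:weight, :between, min..max]
    self
  end

  # Renders the recorded conditions as readable text.
  def to_s
    @conditions.map { |field, op, value| "#{field} #{op} #{value}" }.join(" and ")
  end
end

query = CatQuery.new.with_name_like("Fritz%").with_weight_between(2, 5)
puts query
```

Note that the objects and the message passing are still there; only the calling style changed.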

Internal Domain-Specific Languages

When you define a Fluent Interface you are still using the underlying language; you just break its style to get more clarity on your intent. An Internal DSL goes further, as in this JMock example:

allowing (bank).withdraw(with(any(Money.class)));
    will(throwException(new WithdrawalLimitReachedException()));

allowing (bank).withdraw(Money.ZERO);
    will(throwException(new IllegalArgumentException("you cannot withdraw nothing!")));

JMock’s DSL defines a way to program mock objects that will be used in test cases. That language was created to perform this action; you shouldn’t use it as a general programming language.

There is a thin line between a Fluent Interface and an Internal DSL. I think the main difference is that by using an Internal DSL you will be tied to a domain, and your language “keywords” and style will be deeply related to that domain. You will also be defining a little language to be used while programming in this domain. Internal DSLs will also rarely be inlined into plain host language code. Fluent Interfaces are not concerned with defining languages or being tied to a domain; it’s just a matter of breaking language style to get more clarity on what you are doing.

In the examples above, while using Fluent Interfaces you changed the way you call methods on objects, but you still used objects to map business concepts and still followed the same message-passing collaborative style of writing programs. When you use a DSL you don’t need to be concerned about things like that. It doesn’t matter if the throwException method returns an object, if it is static or anything else; these are concerns that you have when you’re writing programs in Java, and by using JMock’s language you are not using Java anymore.
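To make the contrast more concrete, here is a minimal Ruby sketch of an Internal DSL for the same kind of domain (stubbing behavior in tests). Everything in it (Expectations, allowing, will_raise, expectations) is invented for this illustration and is not JMock’s API. Inside the block the reader only sees the little language’s keywords, not objects and receivers.

```ruby
# Hypothetical mini-language for stubbing, used only as an illustration.
class Expectations
  attr_reader :rules

  def initialize
    @rules = {}
  end

  # "Keywords" of the little language.
  def allowing(method_name)
    @current = method_name
    self
  end

  def will_raise(error_class)
    @rules[@current] = error_class
    self
  end
end

# Evaluates the block with an Expectations instance as self, so the
# keywords can be written without any visible receiver.
def expectations(&block)
  spec = Expectations.new
  spec.instance_eval(&block)
  spec.rules
end

rules = expectations do
  allowing :withdraw
  will_raise ArgumentError
end
```

Whether allowing returns an object or how will_raise stores its rule is irrelevant to whoever writes the block, which is the point made above about throwException.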

Why not?

Keeping the language style while developing is generally a good practice: it makes it easier for other developers to understand what your code does and allows you to use every tool your platform provides. When you give up on the language style to create your own, you have to be aware that something seeming clear to you does not mean it will seem so to everyone; the new styles you create must be understandable by everyone working with that code.

Also frameworks and tools often rely on naming and syntactic conventions to do their job. In Java some frameworks will work only in classes that follow the get/set naming convention for methods, for example.

Modifying language style in general programming is rare. Generally this will be done while developing a library or framework and not the application logic itself. Nevertheless, modifying a language to create an Internal DSL is getting pretty common these days.

Layering Domains

We’ve previously established a parallel between language domains and Meilir Page-Jones’ class Domains. In this text I’ll try to adapt Page-Jones’ concepts to the development environment we currently have.

Page-Jones writes about how some classes have in common the fact that they would be more or less reused according to how they are related to a context defined by a single system. The closer a class is to that context, the less reusable it is. By that classification he states that classes like the representations of dates, integers and strings belong to the same group, and he calls those groups Domains. “Domain” is certainly an overloaded word in computer science; it can have different meanings depending on what we are talking about, but between Page-Jones’ Domains and the Domain concept in Domain-Specific Languages there is no big difference. The book is an introductory Object-Orientation guide, full of information on what makes an Object-Oriented project good or bad. Page-Jones groups classes into Domains much like the concept used by the software reusability community in the mid-1990s, when the industry was crazy about objects and reusability. That concept was summarized by van Deursen and Klint as:

…Following Simos (1995, 1996), two definitions of domain could be used. The first is generally used in the artificial intelligence and object-oriented communities. It lets a domain correspond to the “real world”, without direct relation to software systems it might be encoded in. The second definition comes from the systematic software reuse research community. It defines a domain as “a set of systems including common functionality in a specified area”.

Arie van Deursen/Paul Klint, Little Languages: Little Maintenance?

A Domain in DSL-speak is the context a piece of the software system works with. A system will have lots of Domains, and some of them will be key aspects of the system, like business rules processing or handling user input. Other Domains are simpler, like handling text and performing arithmetic.

If you think of DSL Domains you will see that almost always one DSL Domain will fit into only one of Page-Jones’ Domains. We can say that Page-Jones’ Domains are supersets of DSL Domains, grouped by the reusability they have. I think that the really useful metric we can get from Page-Jones’ classification is not about reusability at all; the reusability level is a side-effect. The really useful feature is that it tells us whether a Domain is tied to the concepts of a single application or is more generic, and therefore whether a Language aimed at that Domain would be specific to that application or business, or would be generic and usable in lots of other systems.

I think that those reusability-driven groups act like Layers in a system. Layers, as defined by Fowler, group components with related responsibilities in a system and act as a dependency control mechanism. Therefore I will use the term “the Layer of a Domain” to refer to the measure of how generic this Domain is, or how tied it is to a specific application.

Standard Layers

The Layers -or Domains by his nomenclature- defined by Meilir Page-Jones are very complete. I couldn’t think of a Domain that wouldn’t fall into one of those but certainly this is a possibility. I present the Layers defined by Page-Jones as the Standard Layers.

There are some interesting characteristics of those Layers:

They aren’t opaque: Languages at higher Layers can access any of the lower Layers or Object Domains, not only the one directly below them.
They are accessed top-down: A language at a lower Layer can’t access resources and concepts at upper Layers.
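These two rules can be written down as a tiny check. This is only a sketch: the LAYERS constant and the can_access? helper are names made up here, and allowing access within the same Layer is my own assumption, since the rules above only talk about lower and upper Layers.

```ruby
# Standard Layers ordered from the most generic to the most specific.
LAYERS = [:fundamental, :architecture, :business, :application].freeze

# A language in the Layer `from` may use concepts in the Layer `to` when
# `to` is the same Layer (assumption) or any lower one, not just the next.
def can_access?(from, to)
  LAYERS.index(from) >= LAYERS.index(to)
end

app_uses_fundamental = can_access?(:application, :fundamental) # skips two Layers
fundamental_uses_business = can_access?(:fundamental, :business) # looks upward
```

The first check succeeds because Layers are not opaque; the second fails because access is top-down only.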

Fundamental Layer

The concepts at this Layer are basic to any software system; they could be used in almost all architectures, all businesses and all kinds of applications. Common concepts at this Layer are database querying, regular expressions, context-free math, text manipulation, I/O and even more sophisticated matters like HTTP request handling or Object-Relational Mapping.

Architecture Layer

Page-Jones describes the Architecture Domain as the place where differences and idiosyncrasies between computer architectures are solved. At the time that book was written there were lots of different computer architectures around, and each one provided its own way of doing things like persisting data, inter-process communication and the like. At that time a system architecture was more about choosing what kind of storage it would use (relational, hierarchical, object-oriented…), IPC mechanisms (raw sockets, pipes, …) or even the window toolkit (AWT, Qt, GTK, MFC…) than about what we now consider architecture: the components of a system and how they relate to each other.

That said, I think it’s time to bring the Architecture Layer to the present. This Layer’s components define how the system is structured. The concepts at this Layer are related to a single architecture and are useful only in that context. A good example of this Layer is Ruby on Rails, a framework that allows applications to be built in a specific architectural style.

Business Layer

The components of this Layer are useful in a given business domain. This Layer will define concepts that would be useful in lots of different applications that share the same business. An example of what you could find at this level would be a class named Customer. This class can be used in a Customer Management system where you register customer data, the actual sales system where you report that a customer has bought something and a customer relationship management application where you control all your marketing efforts. Those three applications deal with different activities but they all share the concept of a customer. This concept is related to the business as a whole, not to a given application only.

Application Layer

At this level we have concepts and components that are specific to a given application. They will not be useful for any other application, even one from the same business. Examples of this Layer are components that handle certain events. For example, you could have a component that receives a signal that the user is going to add a customer and handles the information and the workflow for that action. This component would be useless in another application that wasn’t related to customer registration.

Domain Layers and Domain-Specific Languages

A Domain contains components. Components can be anything capable of performing the actions one would expect from a computer program. In Object-Oriented terms components would probably be classes or objects. In Language-Oriented Programming they could also be languages. As said above, a Layer will have different Domains. A Domain-Specific Language can handle one or more of those, providing a vivid representation of the concepts in that Domain.

Throughout the history of computer science we have built languages for lots of different Domains besides our General Purpose Languages. If you follow those you will see that many of the languages we have created are at the Fundamental level (SQL, HQL, RegExp and dozens of mathematical languages, for example). Early programming environments didn’t provide a productive way of performing even the most basic actions, so we started from the very basics. More recently we found out that our sophisticated architectures are too complex to be productive. We have flexibility and power, but it is very hard to put the right pieces to work, so we have come up with things like Spring, Castle or Ruby on Rails to make this work easier. Now it is time for Business and Application-level Domain-Specific Languages.

Of course, it is currently improbable to have an application that would use DSLs for all its Layers. Current technology provides us with languages to deal with lots of problems and to build up our own languages, but that doesn’t mean all tasks would be performed by a DSL. This can be a good thing, indeed, since we need a base language (generally a GPL) or we would have to know dozens of programming languages to write a simple application.

Domain-Specific Languages (DSLs)

Programmers today, he often says, are “unwitting cryptographers”: they gather requirements and knowledge from their clients and then, literally, hide that valuable information in a mountain […] of code. The catch is, once the code is written, the programmers have to make any additions or changes by modifying the code itself. That work is painful, slow, and prone to error. We shouldn’t be touching the code at all, Simonyi says.

– Quote from Anything You Can Do, I Can Do Meta

Domain-Specific Language Definitions

There are lots of different definitions for Domain-Specific Languages:

A DSL is a focused, processable language for describing a specific concern when building a system in a specific domain. The abstractions and notations used are tailored to the stakeholders who specify that particular concern.

– Markus Völter

A Domain-Specific Language is a custom language that targets a small problem domain, which it describes and validates in terms native to the domain.

– Domain-Specific Development with Visual Studio DSL Tools

In model-driven engineering, a Domain-Specific Language (DSL) is a specialized language, which, combined to a transformation function, serves to raise the abstraction level of software and ease software development. […] A DSL is a specialized and problem-oriented language [4]. Contrarily to a General Purpose Language (GPL) (e.g., UML, Java or C#), a DSL serves to accurately describe a domain of knowledge.

– DSL Classification, OOPSLA 2007

DSL Examples

Let’s look at some examples of Domain-Specific Languages found in the industry. The information here may be inaccurate or outdated; to get more information, follow the links.

Apache Camel DSL

Type: Internal

Host Language: Java

Domain Layer: Architecture

“Apache Camel is a powerful rule based routing and mediation engine which provides a POJO based implementation of the Enterprise Integration Patterns using an extremely powerful fluent API (or declarative Java Domain Specific Language) to configure routing and mediation rules. The Domain Specific Language means that Apache Camel can support type-safe smart completion of routing rules in your IDE using regular Java code without huge amounts of XML configuration files; though Xml Configuration inside Spring is also supported.”

Example:

// lets log messages from gold customers
intercept(xpath("/customer[@type='gold']")).to("log:customer");
from("seda:foo").to("seda:bar");

Regular Expressions

Type: Internal

Host Language: Any

Domain Layer: Fundamental

“In computing, a regular expression is a string that is used to describe or match a set of strings, according to certain syntax rules.”

Example:

.at ->matches any three-character string ending with "at", including "hat", "cat", and "bat".
[hc]at ->matches "hat" and "cat".
[^b]at ->matches all strings matched by .at except "bat".
^[hc]at ->matches "hat" and "cat", but only at the beginning of the string or line.
[hc]at$ ->matches "hat" and "cat", but only at the end of the string or line.
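The same patterns can be tried from a host language. The snippet below is a small Ruby sketch; the variable names and the sample strings are just for illustration.

```ruby
words = %w[hat cat bat]

any_at      = words.grep(/.at/)     # any character followed by "at"
hat_or_cat  = words.grep(/[hc]at/)  # only "h" or "c" before "at"
all_but_bat = words.grep(/[^b]at/)  # anything except "b" before "at"

# Anchors tie the match to the beginning or the end of the line.
starts_with_cat = "cat nap".match?(/^[hc]at/)
starts_with_hat = "that cat".match?(/^[hc]at/)
ends_with_cat   = "nap cat".match?(/[hc]at$/)
```

Here any_at keeps all three words, while hat_or_cat and all_but_bat both drop "bat". The string "that cat" does not match the anchored pattern because the line begins with "t".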

sed

Type: External

Domain Layer: Fundamental

“sed is a stream editor. A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline). While in some ways similar to an editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient. But it is sed’s ability to filter text in a pipeline which particularly distinguishes it from other types of editors.”

Example:

#!/usr/bin/sed -nf

     /^$/ {
       p
       b
     }

     # Same as cat -n from now
     x
     /^$/ s/^.*$/1/
     G
     h
     s/^/      /
     s/^ *\(......\)\n/\1  /p
     x
     s/\n.*$//
     /^9*$/ s/^/0/
     s/.9*$/x&/
     h
     s/^.*x//
     y/0123456789/1234567890/
     x
     s/x.*$//
     G
     s/\n//
     h

Structured Query Language (SQL)

Type: Internal

Host Language: Any

Domain Layer: Fundamental

“[…]is a computer language designed for the retrieval and management of data in relational database management systems, database schema creation and modification, and database object access control management.”

Example:

SELECT books.title, count(*) AS Authors
FROM books
JOIN book_authors
ON books.isbn = book_authors.isbn
GROUP BY books.title;

Domain-Specific Languages Patterns

Monkey Tag

Tag the objects that you patch so you don’t break the host language

When creating an Internal Domain-Specific Language inside a language like Ruby you generally need to modify objects from the host’s core library -arrays or strings for example. Adding new methods to those objects is pretty straightforward but when you have to change the behaviour of an existing method other parts of the language can break since they weren’t expecting the new behaviour.

To be able to patch the host language’s core objects when creating your internal language, while avoiding breaking the host language’s ecosystem, you should make your modifications apply only within your language. To identify whether a language construct is from your language or from the host language, you should mark your objects with a tag and look for it.

The name of this pattern is a reference to Monkey Patching.

How it works

How to actually mark an object (or any other construct you are using) depends on the host language. I’ve used this pattern in Ruby and Java (although you can’t extend most of Java’s core classes, you still have some extensibility).

In Ruby you will generally make the objects from your language define a method that identifies them. The method can return some value (true/false or a symbol, for example), or you can just check with respond_to? whether the object has that method.

In Java there are two ways to do this. The first is using old-school tagging interfaces: your object implements an interface from your DSL, and in the extension method you make an instanceof check. The more modern way, available from Java 5.0 on, is using an annotation on the classes and methods used by your DSL.
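The two Ruby styles of checking a tag can be sketched as follows. The method name _my_dsl? and the helper names are assumptions made up for this illustration; any name that is unlikely to clash with the host language would do.

```ruby
# Hypothetical objects standing in for values produced by a DSL and by plain code.
dsl_value = Object.new
plain_value = Object.new

# Tag the DSL object by giving it a singleton method.
def dsl_value._my_dsl?
  true
end

# Style 1: only ask whether the tagging method exists.
def tagged_by_presence?(object)
  object.respond_to?(:_my_dsl?)
end

# Style 2: call the method when it is present and use the value it returns.
def tagged_by_value?(object)
  object.respond_to?(:_my_dsl?) && object._my_dsl? == true
end

dsl_checks = [tagged_by_presence?(dsl_value), tagged_by_value?(dsl_value)]
plain_checks = [tagged_by_presence?(plain_value), tagged_by_value?(plain_value)]
```

The second style leaves room for a tag that carries information, such as a symbol naming which DSL the object belongs to.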

When to use it

When you modify a core method you should always be careful. Code from external libraries and from other (often obscure) parts of the host language’s core probably won’t be expecting the modification, and this could lead to unpredictable consequences. In dynamic languages this is even worse, since you have no real control over what calls the code you modified. Languages that provide macros are more flexible in their syntax; if you are using one of those you probably don’t need this pattern at all. Whenever you change core classes to build an Internal Domain-Specific Language, it is mandatory to limit the scope of the change. Monkey Tagging is a nice way of doing this.

Example: Defining a Log Literal

Registro’s Domain-Specific Language defines a log file line using several core constructs, like below.

['20080124', '01:20:00,018']  - (INFO "Something happened ")

In order to make this work we have to redefine Array#-, a Ruby core method. We really don’t care about its other uses; we only want to change its behaviour when it is called from a Registro DSL call. In this situation the Array#- method receives the return value of a DSL function, which in the example is the return value of the INFO method. We will check this argument for a tag. So we tag the returned object:

def self.method_missing(name, *args)
    # INFO looks like a log level (just like WARN or DEBUG)
    if looks_like_log_level? name
        # the object that will be returned
        r = { :level => name.to_s.downcase.to_sym, :text => args.first }
        # tag it by adding a singleton method
        def r._registro_dsl?
            true
        end
        r
    else
        # not a log level: fall back to the default method_missing
        super
    end
end

And check for the tag in the Array extension:

class Array
    alias :original_minus :-

    # A separator for parts in a log entry. Returns an array containing self and the argument.
    def -(arg)
        # if the argument is tagged, do the magic; otherwise use the old logic
        arg._registro_dsl? ? [self, arg] : original_minus(arg)
    end
end

Notice that this code would result in a NoMethodError for objects that are not part of our DSL. We could change this to use respond_to?, but instead we prefer to define this method on all objects, so we will define a default implementation in the Object class.

class Object
    def _registro_dsl?
        false
    end
end

So, by default, objects are not treated as part of the DSL; only the ones we tag are.
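For comparison, here is a self-contained sketch of the respond_to? alternative mentioned above. It is not Registro’s code: INFO is written as a plain method instead of going through method_missing, and the entry structure is a simplification made for this illustration. It shows the tagged call and the untouched core behaviour living side by side.

```ruby
# Hypothetical stand-in for the DSL function; returns a tagged Hash.
def INFO(text)
  entry = { level: :info, text: text }
  # Tag the returned object with a singleton method.
  def entry._registro_dsl?
    true
  end
  entry
end

class Array
  alias_method :original_minus, :-

  # Pairs the receiver with a tagged DSL value; otherwise keeps the core behaviour.
  def -(arg)
    if arg.respond_to?(:_registro_dsl?) && arg._registro_dsl?
      [self, arg]
    else
      original_minus(arg)
    end
  end
end

log_line = ['20080124', '01:20:00,018'] - INFO("Something happened")
difference = [1, 2, 3] - [2]
```

With this variant no default method has to be added to Object, at the cost of one extra respond_to? check on every call to Array#-.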

Modeling Patterns

A Domain-Specific Language is, as the concept name suggests, related to a Domain. Eric Evans defines that as:

Every software program relates to some activity or interest of its user. That subject area to which the user applies the program is the Domain of the software.

When using Domain-Specific Languages we have several options on how to Model the underlying Domain. Although all options share the final result of creating a specialized programming language as the main interaction channel between the programmer and the system, they have different strategies for Domain modeling.

I’ll use as an example a very simple Internal DSL that has operations on Cartesian coordinates as its Domain. This DSL was created to exemplify concepts, so this code shouldn’t be considered production-ready.

Language as Model

There is no Domain Model, the Language is the Model

In an Internal or External DSL, the language processor will often be a program written in a powerful programming language. Using Language as Model you will implement your Model inside the Language Processor itself. In an External DSL this means that its Domain concepts will be implemented within the code that parses the language. In an Internal DSL you will implement them in the same code that you use to adapt the host language to your syntax. Here is an example using Ruby:

class Array
    def +(other)
        modified = self.clone
        modified[other.axis] = modified[other.axis] + other
        modified
    end
end

class Fixnum
    attr_reader :axis

    def X
        @axis = 0
        self
    end

    def Y
        @axis = 1
        self
    end
end
# test
describe 'language as the Model' do
    it 'adds a value along the Y axis' do
        result = [1,3] + 2.Y
        result.should eql([1,5])
    end
end

Developing the Model inside the language is a simple solution that can be easier to implement and requires less Modeling effort than its alternatives. Although it works quite well for small, simple and stable languages, it won’t scale to the needs of real-world languages, which will always evolve. Change is too painful in this strategy, and even the simplest syntax modification can break your underlying logic. Integration may be a problem as well, especially for Internal DSLs. By not following the host language’s usual practices, such as using proper objects to represent concepts, your code is very likely to be hard to integrate with libraries, frameworks and even your own code from other modules. I’d recommend this approach for prototypes, frozen-syntax DSLs or simple External DSLs.
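The Language as Model listing in this section stores @axis on the integer itself, which current Ruby versions reject because Integers are frozen. A rough sketch of the same idea that should run on a recent Ruby follows. It assumes a hypothetical AxisOffset struct to carry the axis, and it falls back to the core Array#+ for ordinary arrays; neither is part of the original example.

```ruby
# Hypothetical value recording which axis an integer refers to.
AxisOffset = Struct.new(:axis, :amount)

class Integer
    def X
        AxisOffset.new(0, self)
    end

    def Y
        AxisOffset.new(1, self)
    end
end

class Array
    alias_method :original_plus, :+

    # The domain rule (add along one axis) still lives in the language
    # adapter itself; there is no separate domain model.
    def +(other)
        return original_plus(other) unless other.is_a?(AxisOffset)
        modified = clone
        modified[other.axis] = modified[other.axis] + other.amount
        modified
    end
end

result = [1, 3] + 2.Y
```

The struct is only a carrier for syntax. The arithmetic stays inside Array#+, so the trade-offs described above still apply.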

Language as Interface

The Language is an interface to the Domain Model

Using DSLs doesn’t really require anything to change in the way we Model Domains. We can still apply techniques like Domain Driven Design and use a rich object graph to represent the software. The language is a way to interact with this graph.

Using Language as Interface there is nothing that you do using the DSL that you can’t possibly do just by calling objects in your host language; the language has no business rule at all. The DSL is basically a set of Domain-oriented syntactic sugar constructs.

An example in Ruby:

# domain
class Coordinate
    attr_reader :x, :y

    def initialize(x, y)
        @x = x
        @y = y
    end

    def +(other)
        Coordinate.new(@x + other.x, @y + other.y)
    end
end

# language
class Array
    def +(other)
        coordinate_to_array(self.to_coordinate + other)
    end

    def to_coordinate
        Coordinate.new(self[0], self[1])
    end

    def coordinate_to_array(coordinate)
        [coordinate.x, coordinate.y]
    end
end

class Integer

    def Y
        Coordinate.new(0, self)
    end

    def X
        Coordinate.new(self, 0)
    end
end

# test
describe 'language as interface' do
    it 'adds a value along the Y axis' do
        result = [1,3] + 2.Y
        result.should eql([1,5])
    end
end

This is probably the most sensible approach for DSLs using the current mainstream technology. In this strategy you have the power of Domain-specific constructs while still keeping the actual object Model that can be used and reused by non-DSL clients.

It adds a new step to the development, though. It’s not just a matter of defining the Domain Model; you also have to create the language that interacts with it.
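To illustrate the point about non-DSL clients, here is a small sketch in which the same model is reached both through the DSL syntax and through plain object calls. The to_a helper and the fallback to the core Array#+ are assumptions added for this illustration.

```ruby
# Domain model: knows nothing about the DSL.
class Coordinate
    attr_reader :x, :y

    def initialize(x, y)
        @x = x
        @y = y
    end

    def +(other)
        Coordinate.new(@x + other.x, @y + other.y)
    end

    # Assumed helper for this sketch only.
    def to_a
        [x, y]
    end
end

# Language: thin sugar that only delegates to the model.
class Integer
    def Y
        Coordinate.new(0, self)
    end
end

class Array
    alias_method :plain_plus, :+

    def +(other)
        return plain_plus(other) unless other.is_a?(Coordinate)
        (Coordinate.new(self[0], self[1]) + other).to_a
    end
end

via_dsl   = [1, 3] + 2.Y
via_model = (Coordinate.new(1, 3) + Coordinate.new(0, 2)).to_a
```

Both expressions go through Coordinate#+, so a client that never loads the language layer gets the same behaviour from the model alone.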

Bibliography

These are books, pages and papers I’ve come across while studying Domain-Specific Languages. I’m slowly adding things here.

Software Design & Architecture

Abelson, Harold / Sussman, Gerald Jay - Structure and Interpretation of Computer Programs - 2nd Edition
Kay, Alan - The Early History of Smalltalk: Alan Kay on the motivations behind Smalltalk.
Fowler, Martin - Fluent Interface: Fowler describes a useful technique for chaining methods in order to get more readability. Especially useful for Internal DSLs in static languages such as Java.
Fowler, Martin - Humane Interface: Fowler contrasts the kind of interfaces provided by languages like Java with those provided by languages like Ruby. In Java-like languages you’ll find a small set of constructs that you can use to perform all desired operations; in Ruby-like languages you’ll find constructs that are more developer-friendly.
Evans, Eric - Domain-Driven Design: Eric Evans describes a technique to apply Model-Driven Design in object-oriented software by eliminating the gaps between actual software artifacts and the domain. Seminal book on OOP.
Page-Jones, Meilir - Fundamentals of Object-Oriented Design in UML: Seminal book on object-orientation principles and the very definition of the layered ‘domains’ in a software.
Raymond, Eric - The Art of Unix Programming: Eric Raymond describes the philosophy of how UNIX programs are developed. A whole chapter is dedicated to minilanguages like awk, sed and Sendmail and how to build them.
P. Klint/P. Olivier - The ToolBus coordination architecture - a demonstration
Fowler, Martin - Patterns of Enterprise Application Architecture: Fowler’s classic catalog of architectural patterns. Also contains very interesting texts on architecture concepts.
XLR: Extensible Language and Runtime - The art of turning ideas into code: Very interesting free software project. The document on its concepts has very nice definitions for semantic and syntactic noise.

DSLs & LOP

Fowler, Martin - Language Workbenches: The Killer-App for Domain Specific Languages?: Fowler works the concepts behind LOP by using examples. Great text.
Ford, Neal - Language Oriented Programming: Neal Ford’s keynote presentation on the history of computer programming languages and LOP.
M. P. Ward - Language Oriented Programming: Very interesting paper on LOP development compared to bottom-up and top-down approaches. Ward uses DSLs and LOP as near synonyms, but his definition of LOP doesn’t imply tying a language to a specific domain, only to a problem being solved.
Spinellis, Diomidis - Notable Design Patterns for Domain-Specific Languages: Nice compilation of generic design patterns for DSL implementation.
M. Mernik, J. Heering, A.M. Sloane - When and how to develop domain-specific languages: A follow-up on Spinellis’ work; it catalogs and briefly describes patterns that could be used in different stages of LOP development.
Arie van Deursen/Paul Klint/Joost Visser - Domain-Specific Languages: An Annotated Bibliography: Very useful condensation of concepts with pointers to where they are better explained.
Arie van Deursen/Paul Klint - Little Languages: Little Maintenance?: Nice paper on DSL analysis and design with real cases from the financial market.
Markus Völter - InfoQ - Architecture as Language: A story: Markus describes his language to model component-based software architectures. Although I think UML has enough concepts from this domain, it is a nice and easy text.
Rich Kilmer - InfoQ - Ruby and the Art of Domain Specific Languages: Besides some examples of Internal DSLs in Ruby, this talk tries to divide Internal DSLs into two kinds: implicit and explicit.
Lloyd H. Nakatani, Mark A. Ardis, Robert G. Olsen and Paul M. Pontrelli - Jargons for Domain Engineering: The benefits of Jargons instead of plain DSLs in domain engineering. The most interesting parts are the initial statement on the difficulty of creating a DSL without language design knowledge and the final “Avoiding DSL Pitfalls” section.
Paul Hudak - Building Domain-Specific Embedded Languages: Nice introduction on why DSLs are better at modeling than GPLs. Discusses the creation of languages inside languages, called Domain-Specific Embedded Languages, using Haskell.
Tim Menzies - Notes on Domain-Specific Languages: Very interesting short explanation of DSLs, including a comparison to the Newspeak of the novel 1984 (and indirectly to the Sapir-Whorf hypothesis).
Steve Cook, Gareth Jones, Stuart Kent, Alan Cameron Wills - Domain-Specific Development with Visual Studio DSL Tools: Book on Microsoft DSL Tools platform.
Benoit Langlois, Consuela-Elena Jitia and Eric Jouenne - DSL Classification: A catalog of DSL tooling characteristics and comparison between the currently available (2007) tools.
Sean McDirmid - Turing Completeness Considered Harmful: Component Programming with a Simple Language
Jacques Meekel/Thomas B. Horton/Robert B. France/Charlie Mellone/Sajid Dalvi - From Domain Models to Architecture Frameworks
Dimitrios S. Kolovos/Richard F. Paige/Tim Kelly/Fiona A.C. Polack - Requirements for Domain-Specific Languages
Bruce, David - What makes a good domain-specific language? (pages 17-35)
Gray, Jeff/Karsai, Gábor - An Examination of DSLs for Concisely Representing Model Traversals and Transformations
Yoder, Alan/Cohn, David - Domain-specific and general-purpose aspects of spreadsheet languages
Lloyd Nakatani/Mark Jones - Jargons and infocentrism
Markus Fromherz/Vineet Gupta/Vijay Saraswat - cc - A generic framework for domain-specific languages
V. R. Basili, L. C. Briand, W. M. Thomas - Domain Analysis for the Reuse of Software Development Experiences
Eric Van Wyk - Modular Domain-Specific Language Extensions
Gilles Dubochet - On Embedding Domain-specific Languages with User-friendly Syntax
Nitin Arora/Rupert Westenthaler/Wernher Behrendt/Aldo Gangemi - Information Object Design Pattern for Modeling Domain Specific Knowledge
Diomidis Spinellis - Reliable software implementation using domain-specific languages
Daan Leijen/Erik Meijer - Domain Specific Embedded Compilers
Conal Elliott/Sigbjorn Finne/Oege de Moor - Compiling Embedded Languages
Paul Hudak - Building Domain Specific Languages
Scott Rosenberg - Anything You Can Do, I Can Do Meta: Feature article on Charles Simonyi
