Prolog compilers are too permissive

Prolog compilers are too permissive, too forgiving… resulting in sloppy programming. Experienced programmers may shake off guilty feelings by convincing themselves that it's only run-once code that nobody is going to reuse, port, or maintain. Or that it's really not their fault. After all, if the Prolog compiler doesn't complain and the application appears to run, why care? Novice programmers never notice until they decide to switch Prolog compilers. Or until someone comes along and asks "what if" while trying to help them debug or port their applications.

Consider a simple example, the arg/3 built-in predicate. Some popular Prolog compilers have long interpreted a negative argument position as grounds for failure rather than as a programming error. When I complained (I used to do that a lot to Prolog implementers), the reply was something along the lines of "we agree, but our users would be upset if we broke compatibility with their applications". You can easily find similar examples; just dig into your memories. For example, do you remember when "logical update semantics" came along and all those applications relying on "immediate update semantics" suddenly broke?
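
As a concrete illustration, consider what the following query does; this is only a sketch, since the exact behavior (and the exact error term, when one is thrown at all) differs between compilers:

| ?- catch(arg(-1, f(a, b), _), error(Error, _), true).
% permissive compilers: the query above simply fails, silently
% stricter compilers: Error is bound to a type or domain error term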

Recently I ported, or tried to port, well-known Prolog libraries and interesting Prolog applications to Logtalk, both to allow running these libraries and applications in most Prolog compilers and to test the Logtalk automatic compilation of Prolog modules as Logtalk objects (oh, the pain I chose to inflict upon myself!). Common sins are missing discontiguous/1 and multifile/1 directives, undeclared dependencies (i.e. missing use_module/1 or use_module/2 directives), arbitrary goals used as directives (instead of disciplined use of the initialization/1 directive), duplicated predicate exports (especially when re-exporting predicates from imported modules), operator overdoses, term-expansion clauses defined in the same file that is going to be term-expanded (suddenly predicate definition order in a source file is important!), not to mention code relying on assumptions specific to a Prolog compiler or an operating system. Prolog compilers should warn users about these (and other) issues.
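
For reference, here is a minimal sketch of what a disciplined file header looks like; the module, predicate, and dependency names are hypothetical:

:- module(my_module, [solve/2]).

% declare dependencies explicitly
:- use_module(library(lists), [member/2]).

% declare clauses that are scattered through the file or spread over several files
:- discontiguous(solve/2).
:- multifile(hook/1).

% run startup goals via initialization/1 instead of arbitrary goals as directives
:- initialization(setup).

setup :-
    write('my_module loaded'), nl.

solve(X, Y) :-
    member(X-Y, [a-1, b-2]).

hook(my_module).

solve(X, X).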

Another problem is hidden dependencies on Prolog "modes". Some Prolog compilers support an "iso" mode (which usually is not the default) plus other modes for backwards compatibility. Guess what happens when users try to reuse libraries developed for the same Prolog compiler but under different modes. Or when users start up the Prolog compiler in the wrong mood… err, mode. If they are lucky, they will get a stream of compilation errors.
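
For example, the mode is often selected by a Prolog flag set in a personal initialization file, a setting that never travels with the library source code. The flags below are merely illustrative; names and values differ between compilers:

:- set_prolog_flag(iso, true).        % e.g. SWI-Prolog: stricter ISO mode
% :- set_prolog_flag(language, iso).  % e.g. YAP: select the "iso" language mode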

This is a vicious circle. Prolog implementers are afraid to make their compilers stricter because they don't want to break existing code or upset users. Users enjoy a permissive and flexible programming environment, even if it means they risk shooting themselves in the foot, unconsciously write non-portable code, and have trouble listing their own code dependencies without a cross-reference tool; they have little incentive and see no reward in writing more portable and robust code.

Portability is not the only victim here. It is hard to do static code analysis (you want your applications to run faster and use less memory, don't you?) on this mix of explicit and implicit programming assumptions. ISO standardization is another victim, as Prolog implementers and library developers get defensive, arguing that the required changes are only of interest to ISO nitpickers, failing to see the forest for their own tree.

This post is a bit harsh, I know. But wake-up calls sometimes need harsh voices to complement more carrot-like incentives and initiatives. Please don’t shoot the messenger.


ALP newsletter

Enrico Pontelli just published a new issue of the Association for Logic Programming Newsletter:

http://www.logicprogramming.org/newsletter/feb09

You can subscribe to the newsletter mailing list from the URL above. This newsletter is only possible thanks to the work of Enrico and to readers' submissions. Please consider contributing to the newsletter by sending news bits and articles to Enrico.


Using Logtalk to run Prolog module code in compilers without a module system

Logtalk can compile most Prolog modules as objects. This is accomplished by recognizing and parsing a common subset of Prolog module directives. For simple module code, it suffices to change the file name extensions from .pl to .lgt and compile the files as usual using the logtalk_load/1-2 built-in predicates. For modules that use proprietary predicates, directives, and syntax, some changes to the original code may be necessary.

As an example, assume that we want to use the SWI-Prolog modules lists, pairs, oset, and ordsets in GNU Prolog, a compiler that doesn't support a module system. After making a copy of the original files and renaming their extensions to .lgt, we will need to remove a few proprietary SWI-Prolog bits. First, we need to comment out all occurrences of the directive:

:- set_prolog_flag(generate_debug_info, false).

Second, we need to comment out, in lists.lgt, the calls to the must_be/2 predicate and the line:

:- use_module(error, [must_be/2]).

Still in lists.lgt, we will need to replace the call to succ/2 with:

M is N + 1,

Third, Logtalk doesn’t support the use_module/1 directive, requiring instead the use of the use_module/2 directive. Thus, we need to replace the directive:

:- use_module(library(oset)).

with the directive:

:- use_module(oset,
        [oset_int/3, oset_addel/3, oset_delel/3,
         oset_diff/3, oset_union/3]).

Finally, is_list/1 is a built-in predicate in SWI-Prolog but not in GNU Prolog. Logtalk provides its own portable versions of the is_list/1, succ/2, and must_be/2 predicates but let’s leave them out for now, for the sake of simplicity. After saving our changes, we are ready for a quick experiment:

$ gplgt
...
Logtalk 2.36.0
Copyright (c) 1998-2009 Paulo Moura
...
GNU Prolog 1.3.1
By Daniel Diaz
Copyright (C) 1999-2009 Daniel Diaz
| ?- {lists, pairs, oset, ordsets}.
...
yes
| ?- pairs::map_list_to_pairs(lists::length,[[1,2],[a,b,c],[X]],Pairs).
 
Pairs = [2-[1,2],3-[a,b,c],1-[X]]
yes

In SWI-Prolog the equivalent call would be:

$ swipl
...
Welcome to SWI-Prolog (Multi-threaded, 32 bits, Version 5.7.8)
...
?- pairs:map_list_to_pairs(length,[[1,2],[a,b,c],[X]],Pairs).
Pairs = [2-[1, 2], 3-[a, b, c], 1-[X]].

So, all done? Not quite. Different Prolog compilers provide different sets of built-in predicates, and module libraries rely on those built-in predicates. Thus, porting module code also requires checking which built-in predicates are used. Fortunately, that's quite easy in Logtalk: we simply compile the files with the portability compiler flag set to warning. In the case of our example, there is no problem for GNU Prolog, but if we try to load our files in e.g. B-Prolog (another compiler without a module system), we will find that the lists module calls a memberchk/2 predicate that is not a built-in in this compiler.
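
For example, using the files from the experiment above:

| ?- logtalk_load([lists, pairs, oset, ordsets], [portability(warning)]).

Calls to predicates that are not specified in the ISO standard, along with other non-portable constructs, are then reported as compile-time warnings.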

The porting process is not as straightforward as we may have hoped for, but nothing is really straightforward when dealing with Prolog portability issues. The code changes needed are usually simple to apply, as the example above illustrates. A small price to pay when porting module code. I'm sure you appreciate the irony of using Logtalk to run Prolog module code in module-less Prolog compilers. The subset of module directives recognized by Logtalk is enough for dealing with most Prolog module libraries. This subset could also provide a basis for a Prolog module standard based on current practice, but that's a topic for another post.

P.S. The predicate succ/2 is defined in the Logtalk library object integer. An equivalent predicate to is_list/1 is defined in the Logtalk library object list, which also defines a memberchk/2 predicate. Moreover, each Logtalk library object that defines a type contains a definition for a check/1 predicate that plays the same role as the SWI-Prolog library predicate must_be/2.
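
For illustration, assuming the corresponding library objects are loaded, the replacement calls would look something like this:

| ?- list::memberchk(b, [a, b, c]).
yes

| ?- integer::succ(2, S).
S = 3
yes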


Logtalk 2.36.0 released

I released Logtalk 2.36.0 yesterday. The biggest news is support for settings files. At startup, Logtalk looks for and loads a settings.lgt file found in the startup directory. If not found, Logtalk looks for and loads a settings.lgt file in the Logtalk user folder. These settings files allow users to customize Logtalk without forcing them to edit the back-end Prolog configuration files, which often change between releases. This allows users to easily update to a new Logtalk version without the tedious work of reapplying changes to the default values of compiler flags in the new versions of the configuration files. The updated scripts and installers will automatically take care, in future releases, of preserving any existing settings file found when updating the Logtalk user folder.
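
A settings file is just a regular Logtalk source file. A minimal sketch, assuming you simply want to change some default flag values (the specific flags below are merely illustrative), might look like this:

% settings.lgt
:- initialization((
    set_logtalk_flag(report, warnings),
    set_logtalk_flag(unknown_entities, warning),
    set_logtalk_flag(portability, warning)
)).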

Although the new settings files feature is easy to describe, a lot of work went into implementation and testing, thanks to the lack of Prolog standards for operating-system access. This includes basic functionality such as finding the startup directory, opening a file in the current directory, and accessing operating-system environment variables. Something that should have been implemented and tested in an afternoon took me a week of work. A week that would have been better spent working on, e.g., the Logtalk libraries. In the end, I got settings files working for most back-end Prolog compilers on POSIX systems and, for some of them, also on Windows. Not ideal, but it is hard to do better given the back-end Prolog compiler limitations.

Lack of feature parity between back-end Prolog compilers is always a problem. It turns what should be simple documentation into long lists of exceptions. Some users, completely oblivious to the nightmare that is writing portable Prolog code, end up blaming Logtalk for a limited feature set that pales when compared with recent programming languages. I'm quite tired of dealing with these portability problems. Therefore, future Logtalk releases will cut down on the number of compatible Prolog compilers. Hopefully, this will speed up future development and enable more straightforward implementations of new functionality.


ICLP’08 session on “Uniting the Prolog Community”: personal notes

Bart Demoen and Tom Schrijvers kindly invited me to share my experience in writing portable Prolog applications at the ICLP’08 session on “Uniting the Prolog Community”. My experience comes mostly from developing Logtalk and from my work with implementers and with the ISO committee towards Prolog standardization. What follows is an extended version of the main points in my short (10 minutes) presentation. Be sure to read the session manifesto in the ICLP’08 proceedings for the context of these notes.

Prolog portability

There are more than 20 Prolog compilers available today. Some of them are academic experiments, some are commercial products, and a healthy percentage are free, open-source compilers. Most of them are actively maintained. I have 12 of these compilers installed on my laptop. While doing Logtalk development, I rotate daily between a subset of them. This might sound crazy, and certainly would not be feasible for many applications, but it is generally possible in the case of Logtalk and it helps me find portability problems and bugs both in my own code and in the Prolog compilers. Just to give you a quick example, while developing the latest Logtalk stable release (2.35.0), I found two bugs in my code. One of them only manifested itself in B-Prolog. The other one was only apparent in ECLiPSe. In the past 10 years, this working routine has helped find, isolate, report, and sometimes fix, a few hundred bugs in Prolog compilers. Most of the trouble could be avoided, with obvious productivity gains, if Prolog compilers fully complied with the ISO Prolog Core standard (Part I), published in 1995 (i.e. 14 years ago!). The good news is that standard compliance has improved considerably in recent years. The bad news is that the missing bits can still cause a lot of grief when writing portable applications. One of the lingering problems is the exceptions thrown by built-in predicates: in error cases, some compilers fail instead of throwing an exception, while others throw an exception term that doesn't comply with the standard. Operators can also be problematic. Some compilers choke when parsing valid code that uses operators, and not all compilers make module-declared operators local. A third issue is meta-predicates, especially when closures are used. Compounded with flawed module systems, these and other problems make writing portable Prolog applications something that is avoided rather than being the norm.
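
As a small illustration of the exception problem, the standard fully specifies the error term for queries such as the one below, yet some compilers fail instead of throwing, and others throw a differently shaped term:

| ?- catch(atom_length(abc, foo), error(Error, _), true).
% per the ISO Core standard: Error = type_error(integer, foo)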

QA

Quality assurance (QA) is missing in action for too many Prolog compilers. Some may argue that QA is something one only has the right to demand from commercial Prolog compilers. I disagree. Embarrassing bugs are embarrassing bugs, no matter where you find them. With a significant amount of Prolog development occurring in the open-source realm, users, especially industrial users, need to trust our development processes. That means, among other things, unit testing applied in a systematic way.
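
That need not mean heavyweight infrastructure. As a sketch, even a small table of named test goals run by a plain Prolog driver is a start (the test names and goals below are hypothetical; forall/2 and format/2 are assumed to be available):

% table of named test goals
test(atom_length_error, catch(atom_length(abc, foo), error(type_error(integer, foo), _), true)).
test(arg_first_argument, arg(1, f(a, b), a)).

% run every test, reporting unexpected failures and exceptions
run_tests :-
    forall(test(Name, Goal),
           (   catch(Goal, _, fail)
           ->  format('passed: ~w~n', [Name])
           ;   format('FAILED: ~w~n', [Name])
           )).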

ISO Prolog standards

The “Uniting the Prolog Community” manifesto in the ICLP’08 proceedings contains the following comment regarding the ISO Prolog standardization process:

“Despite many (also recent) efforts, the ISO committee seems unable to make substantial progress on the core standard, let alone to impose the modules standard.”

I find this remark to be a good summary of the current status of ISO Prolog standardization. It's also a misleading remark. ISO Prolog standardization is an open process where anyone can participate. The ISO committee is you, me, and anyone willing to contribute. True, there are formal voting procedures, but that does not prevent in any way attending a meeting, presenting a case, and helping improve the current draft proposals. Criticizing the ISO committee for the sore state of Prolog standardization is easy but, most of the time, it is just plain blame shifting. Very few people actively participate. Case in point: most Prolog implementers at ICLP'08 didn't bother to attend the ISO meeting. The same happened at previous meetings, held during past ICLP events. Besides the yearly meetings, there is also a mailing list and standardization forums. Both are ignored most of the time. Nevertheless, despite the lack of human resources, progress is being made! Five draft documents cover Definite Clause Grammars, the Core revision, threads, the operating-system interface, and globals. Being drafts means they are still rough around the edges; more work is needed to improve their quality. The published standards are not perfect either. They have flaws and we must have the courage to correct them, despite their official status. Two examples are the limited way of dealing with different text encodings in Part 1 and most of Part 2 (the modules standard).

Libraries and code sharing

There is no systematic development of Prolog libraries. Even when considering only pure Prolog libraries, a lot of desirable functionality is incomplete or missing. Pure Prolog libraries are the perfect candidate for code sharing among systems. Or so it seems at first glance. The trouble begins when a library predicate needs to call a built-in predicate, say arg/3, whose behavior when called with a non-existing argument number is implementation dependent. Suddenly, applications using the library and working perfectly in Prolog X fail unexpectedly when run with Prolog Y. The library writer, faced with the prospect of writing once and testing everywhere, often chooses to make the library Prolog-specific.
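
A common defensive workaround is for the library to guard the call itself, trading a few extra inferences for predictable, portable behavior (the wrapper name below is hypothetical):

% fails, rather than erroring or behaving unpredictably, for out-of-range positions
safe_arg(N, Term, Arg) :-
    integer(N),
    N >= 1,
    compound(Term),
    functor(Term, _, Arity),
    N =< Arity,
    arg(N, Term, Arg).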

Reliable, portable libraries require Prolog compilers that comply with a core standard, include adequate documentation, and come with a set of unit tests. Thus, we need portable documenting tools and portable unit-test frameworks. One of the problems is that most of the work that needs to be done hardly warrants any publications. Commercial development of Prolog libraries is hardly feasible given the small size of our community. Everyone is pressed for time, so… we need code sharing at all levels, not only libraries but also at the compiler level. Developing, for example, a common source reader, a common foreign-language interface, or using common stream-handling code would strengthen collaboration and improve portability. The SWI-Prolog and YAP developers are already hard at work doing joint development. Hopefully, others will follow their lead.

During the ICLP session, while discussing how Prolog systems could collaborate, someone remarked that "the fittest may survive". For the fittest to survive we do not need any community at all. The fragmentation of our community, expressed by the large number of Prolog dialects, is indeed a problem. Most implementations, however, especially the smaller ones, contain unique and interesting features that we should consider keeping while looking for new ways to cooperate and unite our efforts.

Not our father’s Prolog!

Prolog compilers have come a long way since the first implementations at Marseille and Edinburgh. Nowadays, we have Unicode, tabling, constraints, threads, modules and objects, graphical source-level debuggers, profilers, and foreign-language interfaces, among other features. Most Prolog compilers provide a subset of these features, making them suitable for a broad range of application domains. We do, however, have a marketing problem! A big one. It is quite frequent to find people disdainful of Prolog, claiming that it's slow, quirky, a relic from old AI language wars, useless as a general-purpose programming language. Mention Prolog in a research grant request and you risk getting a short, dismissive sentence as a reward. Thus, improved collaboration among Prolog implementers only goes half way. We also need to show the rest of the world that Prolog implementations have come a long way and can be an effective tool for solving non-academic problems.


A PhD thesis you don’t want to miss

Jan Wielemaker, SWI-Prolog's main developer, kindly sent me a paper copy of his PhD thesis. You may grab a PDF version at the following URL:

http://www.swi-prolog.org/Publications.html

This is a PhD thesis you don’t want to miss. It describes Jan’s outstanding work developing SWI-Prolog as a top-notch Prolog system and applying logic programming to solve large-scale problems. Hats off to you Jan. Thanks for sharing your hard work with all of us.

P.S. I love the thesis cover.


Prolog modules madness

Try the following experiment in a Prolog compiler supporting modules such as SWI-Prolog or YAP. Define two modules that export the same predicate and try to load them. For example, assume a file m1.pl with content:

:- module(m1, [p/1]).
 
p(X) :- clause(X, true).
 
:- dynamic(a/1).
a(one). a(two). a(three).

and a file m2.pl with content:

:- module(m2, [p/1]).
 
p(X) :- clause(X, true).
 
:- dynamic(a/1).
a(1). a(2). a(3).

Let’s try to load these two files in SWI-Prolog:

$ swipl
Welcome to SWI-Prolog (Multi-threaded, 32 bits, Version 5.7.7)
Copyright (c) 1990-2008 University of Amsterdam.
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to redistribute it under certain conditions.
Please visit http://www.swi-prolog.org for details.
 
For help, use ?- help(Topic). or ?- apropos(Word).
 
?- [m1, m2].
% m1 compiled into m1 0.00 sec, 1,316 bytes
ERROR: Cannot import m2:p/1 into module user: already imported from m1
false.
 
[debug]  ?-

We get an error. It seems that loading a module automatically imports its exported predicates into the loading context (the module user in this case). How unfortunate. We don't get a chance to use explicitly qualified calls to choose which predicate to call. Let's try the same experiment with YAP:

$ yap
% Restoring file /usr/local/lib/Yap/startup
YAP version Yap-5.1.4
   ?- [m1, m2].
 % consulting /Users/pmoura/m1.pl...
 % consulted /Users/pmoura/m1.pl in module m1, 1 msec 1056 bytes
 % consulting /Users/pmoura/m2.pl...
 % consulted /Users/pmoura/m2.pl in module m2, 0 msec 648 bytes
NAME CLASH: m1:p/1 was already imported to module user;
            Do you want to import it from m2 ? [y or n]

In this case, we get a prompt asking what to do. That may be acceptable if we happen to be in an interactive session; otherwise…

Ciao has long claimed to provide a better module system, so let's try a similar experiment with this compiler. This time we explicitly say that we want to use both modules:

$ ciao
Welcome to the Ciao Prolog Development System!
 
Ciao-Prolog 1.10 #8: Wed Jan 31 21:54:06 WET 2007
?- use_module(m1), use_module(m2).
 
yes
?- p(a(X)).
 
X = one ? ;
X = two ? ;
X = three ? ;
no

In this case, there is no error and no prompt; the compiler just silently makes a choice on our behalf. How can the compiler know how to make the right choice? Should the order of the use_module/1 calls really matter? What if there are two exported predicates with the same name in both modules and we want one from the first module and the other from the second module?
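
The usual way to stay in control (e.g. to pick one predicate from the first module and another from the second) is to give up on use_module/1 and write explicit, possibly empty, import lists, calling the shadowed predicate with explicit qualification where the compiler supports it. A sketch:

:- use_module(m1, [p/1]).   % import p/1 from m1
:- use_module(m2, []).      % load m2 but import nothing from it

% the predicate from m2 is then called as m2:p(Goal), where supported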

What do these experiments tell us?

Prolog compilers automatically import, or try to import, the exported predicates of loaded modules. Therefore, whenever two modules export the same predicate, we either get an error (SWI-Prolog), a prompt for solving the conflict (YAP), or automatic conflict resolution (Ciao). So much for the often proclaimed de facto module standard.

Modules are supposed to provide namespaces. Except, it seems, for their exported predicates. In this case, a module developer needs to be aware of all the exported predicates in all modules that might be used by him, by her, or by third parties in order not to get into trouble. Or risk getting his or her users into trouble. This is the amazing foundation that many implementers and users alike feel should be used to build the predicate libraries that will support our Prolog applications. My biggest problem is, of course, how to hide this post from colleagues working in other programming languages. I don't want them to laugh themselves to death.


CLPSE 2009 workshop

I’m co-organizing, together with Ulrich Neumerkel, an ICLP’09 workshop on software engineering:

http://logtalk.org/workshops/clpse2009/

If you're working on any topic related to software engineering in the context of logic programming applications, or developing a sizable logic programming application, consider submitting your work to the 4th International Workshop on (Constraint) Logic Programming and Software Engineering. See the URL above for details.


Using both object proxies and regular objects

During the last three days, I was working with Artur Miguel Dias, the author of CxProlog, rewriting P-FLAT, a Prolog toolkit for teaching Formal Languages and Automata Theory, as a Logtalk application, tentatively named L-FLAT. We made great progress, and the final version of L-FLAT is going to be smaller, more portable, and both simpler to maintain and extend than the original plain Prolog application. One of our goals is to allow users to use both object proxies and regular objects when representing Turing Machines, Pushdown Automata, Context-Free Grammars, and other concepts and mechanisms supported by L-FLAT. This flexibility allows users to represent e.g. a simple Finite Automaton with a small number of states and transitions as an object proxy (i.e. as a Prolog compound term; see my previous post) that might be entered interactively at the command line:

fa(fa1,
    1,                                    % initial state
    [1/a/1, 1/a/2, 1/b/2, 2/b/2, 2/b/1],  % transitions
    [2]                                   % final states
).

It also allows users to represent e.g. a complex Turing Machine with dozens of transitions as a regular object, defined in a source file:

:- object(copyTM,
    instantiates(tm)).
 
    initial(q0).
 
    transitions([
        q0/'B'/'B'/'R'/q1,
        q1/a/'X'/'R'/q2,   q1/b/'Y'/'R'/q5,   q1/'B'/'B'/'L'/q7,
        q2/a/a/'R'/q2,     q2/b/b/'R'/q2,     q2/'B'/'B'/'R'/q3,
        q3/a/a/'R'/q3,     q3/b/b/'R'/q3,     q3/'B'/a/'L'/q4,
        q4/a/a/'L'/q4,     q4/b/b/'L'/q4,     q4/'B'/'B'/'L'/q4,
                           q4/'X'/'X'/'R'/q1, q4/'Y'/'Y'/'R'/q1,
        q5/a/a/'R'/q5,     q5/b/b/'R'/q5,     q5/'B'/'B'/'R'/q6,
        q6/a/a/'R'/q6,     q6/b/b/'R'/q6,     q6/'B'/b/'L'/q4,
        q7/'X'/a/'L'/q7,   q7/'Y'/b/'L'/q7
    ]).
 
    finals([]).
 
:- end_object.

Using both object proxies and regular objects requires that all predicates that expect e.g. a Regular Expression accept both an object identifier and an object proxy (a compound term) as argument. This might sound like an additional hurdle until you realize that an object proxy is also an instantiation of the identifier of a parametric object. As long as the parametric object defines the straightforward predicates that give access to its parameters, everything will work as expected. But how do we relate the regular objects and the parametric objects? Easy. If you want to represent e.g. some finite automata as instances of class fa and other finite automata as object proxies, simply define a parametric object as an instance of class fa. The parametric object parameters will be the properties that might be unique to a specific finite automaton. The parametric object defines the predicates that give access to these properties. These predicates are the same ones used in class fa in the definition of all predicates that need access to finite automaton properties:

:- object(fa(_Id, _Initial, _Transitions, _Finals),
    instantiates(fa)).
 
    initial(Initial) :-
        parameter(2, Initial).
 
    transitions(Transitions) :-
        parameter(3, Transitions).
 
    finals(Finals) :-
        parameter(4, Finals).
 
:- end_object.

Thus, using both object proxies and regular objects becomes trivial, fully transparent, and just a matter of defining the necessary parametric objects.
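
For example, assuming that the tm and fa classes declare the initial/1 predicate as public (and given the parametric object defined above), the same message works for a regular object and an object proxy alike:

| ?- copyTM::initial(State).
 
State = q0
yes

| ?- fa(fa1, 1, [1/a/1, 1/a/2, 1/b/2, 2/b/2, 2/b/1], [2])::initial(State).
 
State = 1
yes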


Efficient representation of data objects

Some Logtalk applications need to represent large numbers of immutable objects. These objects typically represent data that must be validated or used for data mining. This kind of data is often exported from databases and must be converted into Logtalk objects for further processing. The application logic is usually represented by a set of hierarchies with the data objects at the bottom. Although data conversion is most of the time easily scriptable, the resulting objects can take up significantly more space than a straightforward representation as Prolog facts. Logtalk object representation is simply not optimized for this kind of object. Fortunately, this is one of those cases where you can have your cake and eat it too. The solution is to represent data as compact Prolog facts and to interpret those facts as object proxies. An object proxy is simply a Prolog compound term that can be interpreted as a possible instantiation of the identifier of a parametric object.

For a toy example, assume that your data represents geometric circles with attributes such as the circle radius and the circle color:

% circle(Id, Radius, Color)
circle('#1', 1.23, blue).
circle('#2', 3.71, yellow).
circle('#3', 0.39, green).
circle('#4', 5.74, black).
circle('#5', 8.32, cyan).

We can define the following parametric object for representing circles:

:- object(circle(_Id, _Radius, _Color)).
 
    :- public([
        id/1, radius/1, color/1,
        area/1, perimeter/1
    ]).
 
    id(Id) :-
        parameter(1, Id).
 
    radius(Radius) :-
        parameter(2, Radius).
 
    color(Color) :-
        parameter(3, Color).
 
    area(Area) :-
        ::radius(Radius),
        Area is 3.1415927*Radius*Radius.
 
    perimeter(Perimeter) :-
        ::radius(Radius),
        Perimeter is 2*3.1415927*Radius.
 
:- end_object.

The circle/3 parametric object provides a simple solution for encapsulating a set of predicates associated with a given compound term. But how do we perform computations with the object proxies? We cannot send messages to Prolog facts! Or can we? Logtalk provides a handy notation for working with object proxies. For example, in order to construct a list with all circle areas we can write:

| ?- findall(Area, {circle(_, _, _)}::area(Area), Areas).
 
Areas = [4.75291, 43.2412, 0.477836, 103.508, 217.468]
yes

The message-sending control construct, ::/2, accepts the term {Proxy} as the receiving object. Logtalk proves Proxy as a Prolog goal and, if successful, sends the message to the resulting term. Backtracking over the Proxy goal is supported, allowing all matching object proxies to be processed by a simple failure-driven loop (as implicit in the findall/3 call above).
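
For example, a failure-driven loop that writes every circle identifier together with its area could be written as follows, which, with the facts above, prints something like:

| ?- {circle(Id, _, _)}::area(Area), write(Id-Area), nl, fail; true.
#1-4.75291
#2-43.2412
#3-0.477836
#4-103.508
#5-217.468
yes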

You can find the full source code of this example in the current Logtalk distribution. In real applications, data objects, represented as object proxies, are tied to large hierarchies representing both knowledge about data and how to reason with it.