Monthly Archives: October 2009

Working with data sets

A recurring question on the comp.lang.prolog newsgroup is how to work with different data sets, usually loading them from different files, without mixing the data in the plain Prolog database. Unfortunately, these questions often lack enough details for making an informed choice between several potential programming solutions. Two possible solutions are (1) load the data into suitable data structures instead of using the database and (2) use clauses to represent the data but encapsulate each data set in its own Prolog module or Logtalk object. Some combination of both solutions may also be possible. In this post, however, we’re going to sketch the second solution using Logtalk objects. For an alternative but also Logtalk-based solution please see this previous post.

Assuming all data sets are described using the same predicates, the first step is to declare these predicates. The predicate declarations can be encapsulated either in an object or in a protocol (interface). Using a protocol we could write:

:- protocol(data_set).
 
    :- public(datum_1/3).  % data set description predicates
    :- public(datum_2/5).
    ...
 
:- end_protocol.

We can now represent each data set using its own object (possibly stored in its own file). Each data set object implements the data_set protocol defined above. For example:

:- object(data_set_1,
    implements(data_set)).
 
    datum_1(a, b, c).
    ...
 
    datum_2(1, 2, 3, 4, 5).
    ...
 
:- end_object.

Assuming we have the required memory, we can load some or all of our data sets without mixing their data. But that’s not all. We can also encapsulate our data set processing code in its own object (or set of objects, or hierarchy of objects, or whatever is suitable to the complexity of our application). This object, let’s name it processor, will perform its magic by sending messages to the specific data set that we want to process. For example:

:- object(processor).
 
    :- public(compute/2).
    ...
 
    compute(DataSet, Computation) :-
        DataSet::datum_1(A, B, C),
        ...
 
:- end_object.
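
To make the pattern above concrete, here is a minimal, self-contained sketch that can be loaded as a single source file. The datum values, the second data set, and the total/2 predicate are hypothetical, invented only for illustration; the sketch also declares a single data set description predicate to keep it short.

```logtalk
% Illustrative sketch only: data values and total/2 are assumptions.
:- protocol(data_set).

    :- public(datum_1/3).   % data set description predicate

:- end_protocol.

:- object(data_set_1,
    implements(data_set)).

    datum_1(a, 1, 10).
    datum_1(b, 2, 20).

:- end_object.

:- object(data_set_2,
    implements(data_set)).

    datum_1(c, 3, 5).

:- end_object.

:- object(processor).

    :- public(total/2).

    % total(+DataSet, -Total): sums the third argument of
    % all datum_1/3 facts found in the given data set object
    total(DataSet, Total) :-
        findall(Value, DataSet::datum_1(_, _, Value), Values),
        sum(Values, 0, Total).

    % local (private) helper predicate
    sum([], Total, Total).
    sum([Value| Values], Total0, Total) :-
        Total1 is Total0 + Value,
        sum(Values, Total1, Total).

:- end_object.
```

With this file loaded, the goal processor::total(data_set_1, Total) should bind Total to 30 and the goal processor::total(data_set_2, Total) should bind Total to 5, with each answer computed only from the facts of the data set named in the message.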

If the computations we wish to perform make sense as questions sent to the data sets themselves, an alternative is to move the data set predicate declarations from the data_set protocol to the processor object and make the data set objects extend the resulting object, below renamed as data_set. For example:

:- object(data_set).
 
    :- public(datum_1/3).  % data set description predicates
    :- public(datum_2/5).
    :- public(datum_3/2).
    ...
    :- public(compute/1).  % computing predicates
    ...
 
    datum_3(abc, def).     % default value for datum_3/2
 
    compute(Computation) :-
        ::datum_1(A, B, C),
        ...
 
:- end_object.
 
:- object(data_set_1,
    extends(data_set)).
 
    datum_1(a, b, c).
    ...
 
    datum_2(1, 2, 3, 4, 5).
    ...
 
:- end_object.

An advantage of this solution is that the object data_set can contain default values for the data set description predicates. The ::/1 operator used above is the Logtalk operator for sending a message to self, i.e. to the data set object that received the message compute/1. If the information requested is not found in the data set object, then it will be looked for in its ancestor, where the default values are defined.
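
The default value lookup can be sketched as follows. This is a reduced, hypothetical variant of the code above: the describe/1 predicate, the data_set_2 object, and its datum_3/2 clause are assumptions made for the example. Note that datum_3/2 needs a scope declaration for the message to self to work.

```logtalk
% Illustrative sketch only: describe/1 and data_set_2 are assumptions.
:- object(data_set).

    :- public(datum_3/2).
    :- public(describe/1).

    datum_3(abc, def).      % default value for datum_3/2

    % the message is sent to self, i.e. to the object that
    % received the describe/1 message, not to data_set itself
    describe(Key-Value) :-
        ::datum_3(Key, Value).

:- end_object.

:- object(data_set_1,
    extends(data_set)).

    % no datum_3/2 clauses here: the default is inherited

:- end_object.

:- object(data_set_2,
    extends(data_set)).

    datum_3(ghi, jkl).      % overrides the inherited default

:- end_object.
```

The goal data_set_1::describe(D) should give D = abc-def, found in the ancestor, while data_set_2::describe(D) should give D = ghi-jkl, as the local definition overrides the inherited one.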

The best and most elegant solution will, of course, depend on the details of the data set processing application. For example, above we could have defined the object data_set as a class and the individual data sets as instances of this class (technically, the solution above uses prototypes).

Note that all code above is static. Individual data set description predicates may be declared dynamic (using the predicate directive dynamic/1) if we need to update them during processing of the data sets. If our application requires being able to delete data sets from memory, it is simply a question of declaring the data set objects dynamic using the Logtalk object directive dynamic/0 and using the Logtalk built-in predicate abolish_object/1 when a data set object is no longer needed.
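
As a rough sketch of both directives, consider the following object. The object name scratch_data and its facts are hypothetical; the scope and dynamic directives for datum_1/3 are kept together in the same object to keep the example self-contained.

```logtalk
% Illustrative sketch only: scratch_data and its facts are assumptions.
:- object(scratch_data).

    :- dynamic.                 % object directive: the object can be abolished

    :- public(datum_1/3).
    :- dynamic(datum_1/3).      % predicate directive: clauses can be updated

    datum_1(a, b, c).

:- end_object.
```

After loading this object, the goal scratch_data::assertz(datum_1(d, e, f)) adds a new fact to the data set and, once the data set is no longer needed, the goal abolish_object(scratch_data) removes the object from memory, after which current_object(scratch_data) fails.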

We have only scratched the surface of the Logtalk features that we could make use of in our implementation but, hopefully, it’s enough as a starting guide. Feel free to stop by the Logtalk discussion forums to further discuss this programming pattern.


Mandatory versus optional ISO Prolog standards

Currently, there are two approved ISO Prolog standards: ISO/IEC 13211-1: General core (first edition published on 1995-06-01; errata were published recently) and ISO/IEC 13211-2: Modules (first edition published on 2000-06-01). There are also five standardization proposals being discussed: Core Revision, Definite Clause Grammars (DCGs), Globals, Threads, and Portable Operating-System Interface (POSI).

While I was a member of the WG17 standardization group (see my previous post), I always stood for a mandatory core standard, making all other standards optional when talking about ISO compliance of a specific Prolog compiler. This means that a Prolog implementer would only need to comply with the core standard in order to claim conformance to the ISO Prolog specification. The Prolog implementer could also freely choose to implement e.g. DCGs and POSI, disregarding Modules, Globals, and Threads.

This view of mandatory and optional Prolog standards was shared by some but not all members of the WG17 standardization group. Some of them wanted to make the Module standard mandatory and pushed for making some of the other standardization proposals dependent on (or at least refer to) the approved Module standard. I find this a recipe for disaster and for the irrelevance of the ISO Prolog standards. A standard should stand on its own merits. Despite the hard work done on the Module standard, the proposed module system is (rightfully) ignored by most Prolog implementers. Other standard proposals should not be used as leverage for forcing implementers to implement a flawed standard.

Why is the current Module standard flawed?

First, it specifies a new module system instead of trying to standardize current practice. Instead of helping existing module implementations to converge, the standard chooses to specify a new, and therefore incompatible, module system. For example, the standard introduces a new concept of module interface (which can only be implemented by a single module!) that is not found elsewhere, even today.

Second, it specifies two different and incompatible ways of dealing with meta-predicates (the infamous colon_sets_calling_context flag). This means that two Prolog compilers can comply with this standard and still be incompatible!

Third, it specifies a meta-predicate directive that prevents the specification of the number of missing arguments when working with closures. One of the consequences is that only the Prolog implementer knows how to parse and make use of the meta-predicate directive for built-in predicates! But check what the inventor of the meta-predicate directive has to say about the flaws in the meta-predicate directive as specified in the Module standard.
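
For contrast, here is a sketch of a meta-predicate directive that does state the number of missing arguments of a closure. It is written in Logtalk syntax, not in the syntax of the Module standard, and assumes a Logtalk version where an integer in a meta_predicate/1 template marks a closure argument that is extended with that number of arguments. The meta_example object and the double/2 predicate are hypothetical.

```logtalk
% Illustrative sketch only: meta_example and double/2 are assumptions.
:- object(meta_example).

    :- public(map/3).
    % the first argument is a closure that will be
    % called with two additional arguments
    :- meta_predicate(map(2, *, *)).

    map(_, [], []).
    map(Closure, [X| Xs], [Y| Ys]) :-
        call(Closure, X, Y),
        map(Closure, Xs, Ys).

:- end_object.

% plain Prolog predicate used as a closure in the usage example
double(X, Y) :-
    Y is 2 * X.
```

The goal meta_example::map(double, [1, 2, 3], Ys) should give Ys = [2, 4, 6]. Because the template records that the closure is missing two arguments, a tool or compiler reading the directive can tell that double refers to double/2 without knowing how map/3 is implemented.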

Fourth, it makes some poor choices regarding built-in predicates and built-in directives. For example, the specification of the predicate_property/2 predicate defines the properties public and private as stating if clause/2 can be used on the predicate clauses. Thus, the properties public and private cannot be used to infer predicate scope, which would be a much better match for most programmer expectations. Another example is the meta-predicate directive, which is awkwardly named metapredicate/1 (while existing module systems use, of course, a meta_predicate/1 directive!).

Fifth, the standard fails to specify a solution for renaming predicates when importing. The consequence is that library developers need to be aware of the predicate names used by other library developers in order to avoid conflicts. So much for the idea that modules provide an encapsulation mechanism. Of course, the standard also states that any module predicate (including non-exported ones) can be called using explicit qualification and leaves as “an allowable extension to provide a mechanism that hides certain procedures (…)”.

Other problems with the current module standard could be described here but the ones stated above are hopefully enough to convince you that the standard needs to be thoroughly revised.

You may think that my criticism of the current module standard (and WG17 policies) is mostly motivated by my work on Logtalk, which provides an alternative to the use of modules. You are wrong. True, Logtalk objects subsume module functionality and the Logtalk compiler is able to compile most modules as objects. But the Logtalk compiler also goes to great lengths to allow programmers to use both modules and objects in the same applications. Case in point: the fact that Logtalk can compile most modules as objects clearly shows that there is enough common core functionality in today’s module systems to warrant a new module standard focused on current practice. But any new or revised module standard should also pay due attention to the advanced module systems found in some Prolog systems such as ECLiPSe and Ciao.

Making the module standard, in its current state, mandatory or required for implementing other standard proposals will either delay standardization efforts or tie implementations to a limited and flawed module system for years to come.


Switching between Logtalk installed versions

Recent Logtalk releases include a shell script, logtalk_select, which allows easy switching between installed Logtalk versions. It’s an experimental script, loosely based on the python_select script, with two major limitations: it doesn’t update the Logtalk user folder and it’s POSIX-only. Nevertheless, it’s useful whenever you want to test your application with a new Logtalk release. Usage is quite simple. In order to list all installed versions type:

$ logtalk_select -l
Available versions: lgt2372 lgt2373 lgt2374 lgt2375

The current installed version can be checked by typing:

$ logtalk_select -s
Current version: lgt2375

In order to switch to another installed version type:

$ sudo logtalk_select lgt2374

Using sudo may or may not be needed depending on your Logtalk installation prefix and on the administrative privileges of your user account. Typing the script name without arguments prints a help message:

$ logtalk_select
This script allows switching between installed Logtalk versions
 
Usage:
logtalk_select [-vlsh] version
 
Optional arguments:
-v print version of logtalk_select
-l list available versions
-s show the currently selected version
-h help

If you’re a shell scripting wizard and able to improve the logtalk_select script, please mail me. As always, feedback and contributions are most welcome.


Stepping down as editor of ISO Prolog standardization proposals

I’m stepping down as editor of ISO Prolog standardization proposals. In recent years, I found myself responsible for four different draft proposals: Core Revision, DCGs, Threads, and POSI (Portable Operating-System Interface). My fault really. With the exception of the DCGs proposal, all the other proposals were born from my initiative. Recently I have been unable to fulfill my duties as editor of the DCGs proposal, failing to meet the deadline for its next revision. This resulted both from the proverbial lack of time and from being weary of the ISO standardization process. This process is mostly broken, unable to meet the needs of the Prolog community. I tried to fix it from within. I failed.

The last straw that resulted in my decision to end my standardization work was the lame events at the WG17 meeting in Pasadena. Tired of the lack of sensible priorities in the discussion of the standardization proposals, I succeeded in changing the meeting’s agenda, convincing the other participants to discuss the Core Revision proposal instead of spending another annual meeting discussing DCGs and Globals. Nothing wrong, of course, with DCGs and Globals. Both are worthy subjects for standardization. But fixing and improving the Core Prolog standard is the most important and urgent task. We discussed the Core Revision proposal in the morning, going from A to Z, making decisions and identifying contentious aspects that would merit further discussion for the next revision of the proposal. At the end of the morning, I was pretty satisfied with the results and left to catch my flight back home, thus unable to attend the WG17 meeting in the afternoon. The remaining participants decided, without me as the editor of the Core Revision proposal being present, to go back and change the decisions made in the morning. I found this behavior regrettable and disrespectful. I also found some of the afternoon decisions ill-informed and resulting from an apparent lack of knowledge of the current, published standards. All the more so as most of the participants weren’t aware that a Core Revision proposal even existed before this meeting and had never participated in previous discussions about this proposal.

I still believe that standardization is vital for the future of Prolog as a programming language. But the current ISO process is the wrong way to do it. Case in point: standardization proposals are voted on by countries, instead of by implementers and user groups. Implementers always decide if a proposal is worthy, by implementing it, or if it is worthless, by ignoring it (think ISO Prolog Part 2: Modules). Users are the ones using the language and asking for a better one.

A saddening aspect of the ISO standardization process is the lack of perception from outsiders that people working on proposals and participating in meetings are volunteers and not necessarily experts on the matters being discussed. I can understand that outsiders find some aspects of the proposals poorly formulated or completely wrong. I cannot understand that, instead of criticizing the proposals and suggesting alternatives, outsiders choose to insult the volunteers that are doing their best to improve the current standards.

Visibility and openness of the standardization work is also a problem. There are standardization discussion forums and a mailing list. Both are mostly ignored. Neither is listed on the official WG17 web site. It gets tiresome quickly to keep explaining and repeating arguments because people either want to remain anonymous when giving feedback on the current proposals or have no knowledge of the reasoning and previous discussions behind the proposals.

Improving the current, published Prolog standards requires the courage to recognize past errors and fix them, even if that results in revoking and replacing them with hopefully better specifications. Some people refuse this path and desperately cling to the past, painting the whole standardization process into a corner. I have no patience left for this nonsense.

More could be said about the problems in the current ISO standardization process but I hope that the few described above are enough for you to understand my decision. Many thanks to all the people who helped me get this far in my standardization efforts. Hopefully others with more time and energy will continue from here. My only advice, if I may give one, is: throw away the current ISO standardization process and start a new, grassroots movement that brings together implementers and user groups. Some good examples can be found in the recent web standardization processes and in other programming language communities.